# Nonlinear regression in Excel tutorial

2019-01-23

#### Included in

This tutorial explains how to set up and interpret a nonlinear regression in Excel with XLSTAT.

## Dataset for nonlinear regression

An Excel sheet with both the data and the results can be downloaded by clicking on the button below:

The data come from [Ratkowsky D.A. (1983). Nonlinear Regression Modeling. New York, Marcel Dekker].

### Goal of this tutorial

Nonlinear regression is used to model complex phenomena which cannot be handled by the linear model. Here, our goal is to see how the dry weight of onions varies with growth time. If it seems intuitive that a 3 parameters logistic regression would be suitable, the model suggested by Ratkowsky is more complex.
2 parameters Logistic model:

3 parameters Logistic model:

4 parameters Ratkowsky's model:

In order to show the capabilities of XLSTAT, and although the 3 parameters logistic model is directly available in XLSTAT, we will show you can create and add a user-defined function to the library.

Since the Ratkowsky model is not proposed in XLSTAT's non-linear regression models and because it has a complex structure, we have to calculate the derivatives of the function with respect to each of the 4 parameters.

The table below groups the four derivatives and their transcription with the conventions imposed by XLSTAT (Excel syntax, with pri for the parameter i and Xj for the variable j). Note: If a derivative begins with a -, it must be preceded by ' to prevent Excel from detecting an error.
Mathematical writing:

XLSTAT syntax: (pr3/pr4)*exp(-pr1-pr2*X1)/(1+exp(-pr1-pr2*X1))^(1+1/pr4)
Mathematical writing

XLSTAT syntax (pr3*X1/pr4)*exp(-pr1-pr2*X1)/(1+exp(-pr1-pr2*X1))^(1+1/pr4)
Mathematical writing:

XLSTAT syntax: 1/(1+exp(-pr1-pr2*X1))^(1/pr4)
Mathematical writing:

XLSTAT syntax: (pr1/pr4^2)*ln(1+exp(-pr1-pr2*X1))/(1+exp(-pr1-pr2*X1))^(1/pr4)
When functions have a complex structure, it is advisable to give XLSTAT a starting point. In our case, it seems that the point [0, 0, 725, 1] is reasonable, where 725 is the maximum of the dependent variable.
Note that the starting point and the derivatives must be entered into a column of your Excel sheet (see demo file).

## Setting up a nonlinear regression with XLSTAT

After opening XLSTAT, select the XLSTAT / Modeling data / Nonlinear regression feature.

The nonlinear regression dialog box pops up. Select the data on the Excel sheet.

The Dependent variable (also called response variable) is in our case the column Dry weight.
The quantitative explanatory variable corresponds to the column Time. We want to explain the variability of the Dry Weight by the Time.

As the first row contains the column headers, we left the option Variable labels activated.

In the Options tab, we select the starting points that corresponds to the initial value of each parameter. Note: a header must not be selected here.

In the Functions tab, click Add to add the Ratkowsky model to the list of built-in functions (you will notice that the 3-parameter logistic function is available). We can then enter the model in the Function box: Y =.

The four derivatives previously computed can be found in column D of the worksheet one below the other. Their selection is then possible by checking the Derivatives box.

In order to add our model to the library of user-defined functions and to reuse it later, we click Save. The template is then added and automatically selected.

The computations begin once you have clicked on the OK button. The results will then be displayed.

## Interpreting the results of a nonlinear regression

The first table gives basic statistics for the dependent and explanatory variables. The second table (see below) displays the goodness of fit coefficients, including the R² (coefficient of determination), and the SSE (sum of square of errors), the latter being the criterion used for the model optimization. The R² indicates the % of the variability of the dependent variable (the dry weight) explained by the explanatory variable (the time). The closer to 1 the R² is, the better the fit.

In our case, 99% of the variability of the Dry weight is explained by Time, which is an excellent result.
The next table shows the detailed results for the model parameters. Parameter pr3, which initial value vas 725, is equal to 699.64 after the fitting. The standard deviation gives an idea of the reliability of result obtained for each parameter. The value of the parameter pr4 being close 1, one can imagine that a 3 parameters logistic regression would give an almost as good result, a hypothesis that you can easily test as this function is available in the preprogrammed functions list.

The equation of the model is displayed and can easily be reused in Excel.

The next table (see in the demo Excel file) shows the analysis of residuals. One can notice that the model is not well fitted for the observations 11 and 14.
The first chart (see below) allows to visualize the data and the fitted curve. The other charts allow to visually analyze the residuals and are useful when the number of data is big.
[IMG08]

### Conclusion

As a conclusion, in the context of this analysis and of the selected model, the growth time explains a significant amount of the dry weight variability.

1c26995d494fb3061dd0ae8571ffc0a4@xlstat.desk-mail.com
https://cdn.desk.com/
false
desk