This tutorial explains how to set up and interpret a multiple nonlinear regression in Excel with XLSTAT.
Dataset to run a nonlinear multiple regression
An Excel sheet with both the data and the results can be downloaded by clicking on the button below:
Download the data
Nonlinear regression is used to model complex phenomena which cannot be handled by linear models. Our purpose is to study the effect of the concentration of two components, C1 and C2, on the viscosity of a yogurt. The model that we want to fit writes:
F(C1, C2) = pr5 / (1+Exp(-pr1-pr2*C1-pr3*C2-pr4*C1*C2))
Where pr1, ..., pr5 are the parameters of the model. This logistic-like model allows taking into account both the concentrations of the components and their interaction.
Setting up a nonlinear multiple regression
After opening XLSTAT, select the XLSTAT / Modeling data / Nonlinear regression feature.
The nonlinear regression dialog box pops up. Select the data on the Excel sheet. The Dependent variable (or response variable) is in our case the Viscosity. The quantitative explanatory variables are the concentration of the two components C1and C2.
As we selected the column headers, we need to activate the option Variable labels. We select the Residuals option as well in order to analyze the predictions and the residuals.
In the Options tab, we select the values of the initial values of the five parameters.
In the Functions tab, we have to possibilities. We can either choose one of the build-in functions (see below) or define our own function.
Here, we choose the second option. Click Add and enter the equation F (C1, C2) of our model. Check the Derivatives option and select their values from the Excel sheet. Click Save to add the model to the library of user-defined functions.
The computations begin once you have clicked on the OK button.
Interpreting the results of a nonlinear multiple regression
The first table gives the basic statistics for the selected variables. The second table (see below) displays the goodness of fit coefficients, including the R² (coefficient of determination), and the SSE (sum of square of errors), the latter being the criterion used for the model optimization. The R² corresponds to the % of the variability of the dependent variable (the viscosity) that is explained by the two explanatory variables (the components). The closer to 1 the R² is, the better the fit.
In our case, 99% of the variability is explained by the two variables and their interaction, which is an excellent result that confirms that the selected model is appropriate.
The next table shows the results for the model parameters. As we can see, the ratios (parameter)/(std deviation) are larger for pr5 and pr4. As the same ratio is the largest for pr5 we deduce that the interaction between the two components has a greater effect on the viscosity than the concentrations themselves.
The following chart visualizes the quality of the fit by comparing the predicted values to the observed values.