Comparison of Supervised Machine Learning Algorithms
The following two grids compare the main Supervised Machine Learning algorithms available in XLSTAT. One grid is for classification tasks (qualitative Y), the other is for regression tasks (quantitative Y). For a short introduction to the principles of Supervised Machine Learning, check out this article.
The algorithms are compared on the following criteria:

Can they work with more variables than observations?

Do they easily adapt to nonlinear relationships between the predictors and the outcome?

Can the algorithm be used for explanatory purposes? In other words, can it be used to describe the relative impacts of predictors on the outcome?

Can they automatically detect and learn interactions among predictors?

What are the main hyperparameters to tune?
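The first criterion can be made concrete with a very small example. The sketch below is not XLSTAT code; it is a plain-Python illustration with made-up data (one observation, two predictors) and hypothetical helper names (`gram_matrix`, `det2`, `solve2`). It shows why ordinary least squares has no unique solution when there are more variables than observations, and why a ridge penalty restores one. The penalty `lam` follows the textbook ridge formulation and is not necessarily scaled like the lambda of glmnet.

```python
def gram_matrix(X):
    """Return X^T X for a matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def solve2(A, b):
    """Solve a 2x2 linear system A x = b with Cramer's rule."""
    d = det2(A)
    if abs(d) < 1e-12:
        raise ValueError("matrix is singular: no unique solution")
    x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / d
    x1 = (A[0][0] * b[1] - A[1][0] * b[0]) / d
    return [x0, x1]


# Illustrative data: 1 observation, 2 predictors (more variables than observations).
X = [[1.0, 1.0]]
y = [2.0]

G = gram_matrix(X)  # X^T X
Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(2)]  # X^T y

# Ordinary least squares needs to invert X^T X, which is singular here.
ols_det = det2(G)

# Ridge adds lam to the diagonal, which makes the system solvable.
lam = 0.5  # assumed penalty value, chosen only for illustration
G_ridge = [[G[i][j] + (lam if i == j else 0.0) for j in range(2)] for i in range(2)]
ridge_coefs = solve2(G_ridge, Xty)

print("det(X^T X) =", ols_det)
print("ridge coefficients =", ridge_coefs)
```

With these numbers the determinant of X^T X is zero, so the unpenalized system cannot be solved, while the penalized system yields one pair of equal, shrunken coefficients.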
Classification algorithms
| Algorithm | Works with more variables than observations? | Adapts to nonlinear situations? | Explanatory intelligibility | Automatically learns relevant interactions among predictors? | Main hyperparameters | XLSTAT menu | Remarks |
|---|---|---|---|---|---|---|---|
| Logistic Regression | No |  | +++ | No | none | Modeling data | Good option for explanatory intelligibility (provides log-odds coefficients and p-values) |
| Penalized regression (Ridge, Lasso, Elastic Net) | Yes |  | ++ | No | lambda, alpha | XLSTAT-R, glmnet | Select Binomial or Multinomial family |
| Linear Discriminant Analysis | No |  | + | No | none | Analyzing data / Discriminant Analysis; activate Equality of Covariance Matrices in the Options tab |  |
| Quadratic Discriminant Analysis | No | + | + | No | none | Analyzing data / Discriminant Analysis; deactivate Equality of Covariance Matrices in the Options tab |  |
| Partial Least Squares Discriminant Analysis (PLS-DA) | Yes |  | + | No | number of components | Modeling data | Typically used with few observations and many variables (chemometrics) |
| Generalized Additive Models | No | ++ | + | No | Method, add extra penalty | XLSTAT-R, gam |  |
| Naive Bayes | Yes |  |  | No | Smoothing parameter | Machine Learning | Fast computations on large data sets |
| Support Vector Machines (SVM) | Yes | ++ (RBF kernel recommended for nonlinear situations) |  | No | C, kernel and kernel-specific hyperparameters | Machine Learning | Computationally intensive on large data sets |
| K Nearest Neighbors (KNN) | Yes | ++ |  | No | Number of neighbors | Machine Learning |  |
| Classification trees (C&RT) | Yes | ++ | ++ | Yes | CP | Machine Learning | Binary splits at each node |
| Classification trees (CHAID) | Yes | ++ | ++ | Yes | CP | Machine Learning | Multiple splits at each node |
| Classification Random Forests | Yes | ++ | + | Yes | CP, mtry | Machine Learning | Better predictive performance compared to classification trees |
| Neural networks | Yes | ++ |  | Yes | Network architecture, error function, activation functions | XLSTAT-R, neuralnet | Requires advanced expertise |
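To illustrate what "adapts to nonlinear situations" and "number of neighbors" mean for K Nearest Neighbors, here is a minimal plain-Python sketch. It is not how XLSTAT implements KNN; the function name `knn_predict` and the XOR-shaped toy data are assumptions made for the example. The classes cannot be separated by a single straight line, yet a small neighborhood classifies them correctly, while a neighborhood that is too large does not.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x1, x2), label) pairs; squared Euclidean
    distance is used, which gives the same ranking as Euclidean distance.
    """
    by_distance = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy XOR-shaped data: class A in two opposite corners, class B in the other two.
train = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.0, 0.1), "A"),
    ((1.0, 1.0), "A"), ((0.9, 1.0), "A"), ((1.0, 0.9), "A"),
    ((0.0, 1.0), "B"), ((0.1, 1.0), "B"), ((0.0, 0.9), "B"),
    ((1.0, 0.0), "B"), ((0.9, 0.0), "B"), ((1.0, 0.1), "B"),
]

# A small neighborhood follows the local structure of the data.
small_k_a = knn_predict(train, (0.05, 0.05), k=3)
small_k_b = knn_predict(train, (0.95, 0.05), k=3)

# A neighborhood that is too large reaches into the other class's corners.
large_k_a = knn_predict(train, (0.05, 0.05), k=7)

print(small_k_a, small_k_b, large_k_a)
```

With k = 3 both query points receive the label of their own corner. With k = 7 the first query collects four votes from the class B corners against three from its own corner and is misclassified, which is why the number of neighbors is the main hyperparameter to tune.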
Regression algorithms
| Algorithm | Works with more variables than observations? | Adapts to nonlinear situations? | Explanatory intelligibility | Automatically learns relevant interactions among predictors? | Main hyperparameters in XLSTAT | XLSTAT menu | Remarks |
|---|---|---|---|---|---|---|---|
| Linear regression | No |  | +++ | No | none | Modeling data | Good option for explanatory intelligibility (slope coefficients and p-values) |
| Penalized regression (Ridge, Lasso, Elastic Net) | Yes |  | ++ | No | lambda, alpha | XLSTAT-R, glmnet | Select Gaussian family |
| Quantile Regression | Yes |  | + | No | none | Modeling data |  |
| Generalized Additive Models | No | ++ | + | No | Method, add extra penalty | XLSTAT-R, gam |  |
| Partial Least Squares (PLS) | Yes |  | + | No | number of components | Modeling data | Typically used with few observations and many variables (chemometrics) |
| Principal Component Regression (PCR) | Yes |  | + | No | Standardize variables | Modeling data |  |
| K Nearest Neighbors (KNN) | Yes | ++ |  | No | number of neighbors | Machine Learning |  |
| Regression trees (C&RT) | Yes | ++ | ++ | Yes | Minimum parent size, minimum son size, maximum depth, CP | Machine Learning | Binary splits at each node |
| Regression trees (CHAID) | Yes | ++ | ++ | Yes | Minimum parent size, minimum son size, maximum depth, CP | Machine Learning | Multiple splits at each node |
| Random Forests | Yes | ++ | + | Yes | CP, mtry | Machine Learning | Better predictive performance compared to regression trees |
| Neural Network | Yes | ++ |  | Yes | Network architecture, error function, activation functions | XLSTAT-R, neuralnet | Requires advanced expertise |
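The difference between a method that adapts to nonlinear relationships and one that does not can be sketched on a toy regression problem. The example below is an illustration in plain Python, not XLSTAT output: the data (y = x squared), the helper names (`fit_line`, `knn_regress`, `mse`) and the choice of k = 2 are all assumptions. A straight line fitted by least squares is compared with KNN regression on held-out points.

```python
def fit_line(xs, ys):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope


def knn_regress(xs, ys, query, k):
    """Predict the mean outcome of the k training points closest to `query`."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


def mse(predictions, truth):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)


# Toy nonlinear relationship: y = x^2 on a grid from -3 to 3.
train_x = [i * 0.5 for i in range(-6, 7)]
train_y = [x ** 2 for x in train_x]

# Held-out points located between the training points.
test_x = [-2.75, -1.25, 0.25, 1.75, 2.75]
test_y = [x ** 2 for x in test_x]

intercept, slope = fit_line(train_x, train_y)
linear_mse = mse([intercept + slope * x for x in test_x], test_y)
knn_mse = mse([knn_regress(train_x, train_y, x, k=2) for x in test_x], test_y)

print("linear regression test MSE:", linear_mse)
print("KNN (k=2) test MSE:", knn_mse)
```

Because the data are symmetric around zero, the fitted line is flat and misses the curvature entirely, whereas KNN follows it closely. On real data the comparison depends on noise, sample size and the chosen number of neighbors.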