Comparison of Supervised Machine Learning Algorithms
The following two grids compare the main Supervised Machine Learning algorithms available in XLSTAT. One grid is for classification tasks (qualitative Y), the other is for regression tasks (quantitative Y). For a short introduction to the principles of Supervised Machine Learning, check out this article.
The algorithms are compared with regard to several criteria:
- Can they work with more variables than observations?
- Do they easily adapt to non-linear relationships between the predictors and the outcome?
- Can the algorithm be used for explanatory purposes? In other words, can it be used to describe the relative impacts of predictors on the outcome?
- Can they automatically detect and learn interactions among predictors?
- What are the main hyperparameters to tune?
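To make the first criterion concrete, here is a minimal sketch in plain Python (not XLSTAT code) of why unpenalized regression struggles when there are more variables than observations, and why a ridge-type penalty helps. The toy data, the helper names `gram` and `det3`, and the penalty value `lam` are illustrative assumptions. With 2 observations and 3 predictors, the matrix XᵀX is singular, so the least-squares coefficients are not uniquely defined; adding a positive penalty to its diagonal makes it invertible.

```python
def gram(X):
    """Return X^T X for a matrix X given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Toy data (assumption): 2 observations, 3 predictors.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

G = gram(X)

# Hypothetical ridge penalty added to the diagonal of X^T X.
lam = 0.5
G_ridge = [[G[i][j] + (lam if i == j else 0.0) for j in range(3)]
           for i in range(3)]

det_ols = det3(G)          # zero: the unpenalized system has no unique solution
det_ridge = det3(G_ridge)  # positive: the penalized system can be solved

print(det_ols, det_ridge)
```

This is why the penalized regression rows in the grids below are marked "Yes" for this criterion while plain linear and logistic regression are marked "No".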
Classification algorithms
Algorithm | Works with more variables than observations? | Adapts to non-linear situations? | Explanatory intelligibility | Automatically learns relevant interactions among predictors? | Main Hyperparameters | XLSTAT menu | Remarks |
---|---|---|---|---|---|---|---|
Logistic Regression | No | - | +++ | No | none | Modeling data | Good option for explanatory intelligibility (provides log-odds coefficients and p-values) |
Penalized regression (Ridge, Lasso, Elastic Net) | Yes | - | ++ | No | lambda, alpha | XLSTAT-R, glmnet | Select Binomial or Multinomial family |
Linear Discriminant Analysis | No | - | + | No | none | Analyzing data / Discriminant Analysis; Activate Equality of Covariance Matrices in the Options tab | |
Quadratic Discriminant Analysis | No | + | + | No | none | Analyzing data / Discriminant Analysis; Deactivate Equality of Covariance Matrices in the Options tab | |
Partial Least Squares Discriminant Analysis (PLS-DA) | Yes | - | + | No | number of components | Modeling data | Typically used with few observations & many variables (chemometrics) |
Generalized Additive Models | No | ++ | + | No | Method, add extra penalty | XLSTAT-R, gam | |
Naive Bayes | Yes | - | - | No | Smoothing parameter | Machine Learning | Fast computations on large data sets |
Support Vector Machines (SVM) | Yes | ++ (RBF kernel recommended for non-linear situations) | - | No | C, kernel and kernel-specific hyperparameters | Machine Learning | Computationally intensive on large data sets |
K Nearest Neighbors (KNN) | Yes | ++ | - | No | Number of neighbors | Machine Learning | |
Classification trees (C&RT) | Yes | ++ | ++ | Yes | CP | Machine Learning | Binary splits at each node |
Classification trees (CHAID) | Yes | ++ | ++ | Yes | CP | Machine Learning | Multiple splits at each node |
Classification Random Forests | Yes | ++ | + | Yes | CP, mtry | Machine Learning | Better predictive performance compared to classification trees |
Neural networks | Yes | ++ | - | Yes | Network architecture, error function, activation functions | XLSTAT-R, neuralnet | Requires advanced expertise |
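As an illustration of the "Adapts to non-linear situations?" column, the sketch below implements a bare-bones K Nearest Neighbors classifier in plain Python. It is not XLSTAT's implementation; the function name `knn_predict`, the toy data, and the choice of squared Euclidean distance are assumptions made for the example. The two classes are arranged in a checkerboard pattern that no single straight line can separate, yet a majority vote among the `k` nearest points classifies the query points correctly.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training points.

    `train` is a list of (features, label) pairs; distance is squared Euclidean.
    """
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy checkerboard data (assumption): class "A" in two opposite corners,
# class "B" in the other two corners.
train = [
    ((0.0, 0.0), "A"), ((0.1, 0.1), "A"), ((0.0, 0.1), "A"),
    ((1.0, 1.0), "A"), ((0.9, 0.9), "A"), ((1.0, 0.9), "A"),
    ((0.0, 1.0), "B"), ((0.1, 0.9), "B"), ((0.0, 0.9), "B"),
    ((1.0, 0.0), "B"), ((0.9, 0.1), "B"), ((1.0, 0.1), "B"),
]

# k is the "Number of neighbors" hyperparameter listed in the grid.
pred_a = knn_predict(train, (0.05, 0.05), k=3)
pred_b = knn_predict(train, (0.95, 0.05), k=3)

print(pred_a, pred_b)
```

An odd `k` is used here so that a two-class vote cannot tie. In practice, the number of neighbors is usually chosen by comparing prediction quality on held-out data.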
Regression algorithms
Algorithm | Works with more variables than observations? | Adapts to non-linear situations? | Explanatory intelligibility | Automatically learns relevant interactions among predictors? | Main Hyperparameters in XLSTAT | XLSTAT menu | Remarks |
---|---|---|---|---|---|---|---|
Linear regression | No | - | +++ | No | none | Modeling data | Good option for explanatory intelligibility (slope coefficients and p-values) |
Penalized regression (Ridge, Lasso, Elastic Net) | Yes | - | ++ | No | lambda, alpha | XLSTAT-R, glmnet | Select Gaussian family |
Quantile Regression | Yes | - | + | No | none | Modeling data | |
Generalized Additive Models | No | ++ | + | No | Method, add extra penalty | XLSTAT-R, gam | |
Partial Least Squares (PLS) | Yes | - | + | No | number of components | Modeling data | Typically used with few observations & many variables (chemometrics) |
Principal Component Regression (PCR) | Yes | - | + | No | Standardize variables | Modeling data | |
K Nearest Neighbors (KNN) | Yes | ++ | - | No | number of neighbors | Machine Learning | |
Regression trees (C&RT) | Yes | ++ | ++ | Yes | Minimum parent size, minimum son size, maximum depth, CP | Machine Learning | Binary splits at each node |
Regression trees (CHAID) | Yes | ++ | ++ | Yes | Minimum parent size, minimum son size, maximum depth, CP | Machine Learning | Multiple splits at each node |
Random Forests | Yes | ++ | + | Yes | CP, mtry | Machine Learning | Better predictive performance compared to regression trees |
Neural networks | Yes | ++ | - | Yes | Network architecture, error function, activation functions | XLSTAT-R, neuralnet | Requires advanced expertise |
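To give a sense of what a binary split and a minimum son size mean for regression trees, here is a simplified sketch in plain Python. It is not XLSTAT's algorithm: it searches a single quantitative predictor for the threshold that minimizes the total within-node sum of squared errors, and it interprets the minimum son size as the smallest number of observations allowed in each child node. The function names, the parameter `min_son_size`, and the toy data are assumptions made for the example.

```python
def sse(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(x, y, min_son_size=1):
    """Return (threshold, total_sse) for the best split of the form x <= threshold.

    Each child node must contain at least `min_son_size` observations.
    Returns None if no admissible split exists.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_son_size, len(pairs) - min_son_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Toy data (assumption): the outcome jumps between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]

threshold, total = best_binary_split(x, y, min_son_size=2)
print(threshold, total)
```

A full tree repeats this search over every predictor and then recursively within each child node, stopping according to settings such as the maximum depth and CP listed in the grid.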