ML.REGRESSION.RANDOM_FOREST_REG¶
Creates a Random Forest Regression object.
Syntax¶
ML.REGRESSION.RANDOM_FOREST_REG(n_estimators, criterion, max_depth, min_samples_split, min_samples_leaf, max_features, bootstrap, random_state)
Arguments¶
| Name | Type | Default | Description |
|---|---|---|---|
| n_estimators | int | 100 | The number of trees in the forest. |
| criterion | str | "squared_error" | The function to measure the quality of a split. Supported criteria: 'squared_error' for the mean squared error, 'absolute_error' for the mean absolute error, 'friedman_mse' for the mean squared error with improvement by Friedman, 'poisson' for the Poisson loss. |
| max_depth | int | None | The maximum depth of the tree. |
| min_samples_split | int | 2 | The minimum number of samples required to split an internal node. |
| min_samples_leaf | int | 1 | The minimum number of samples required to be at a leaf node. |
| max_features | str \| float \| int | 1.0 | The number of features to consider when looking for the best split: a float is a fraction of the features, an int is an absolute count, and "sqrt" or "log2" applies that function to the feature count. |
| bootstrap | bool | TRUE | Whether bootstrap samples are used when building trees. |
| random_state | int | None | Controls the randomness of the bootstrapping procedure. |
Returns¶
A Random Forest regressor handle, ready to pass into ML.FIT.
When to use¶
Reach for a Random Forest regressor when the relationship between your features and the numeric target is non-linear, when features interact in complex ways, or when you simply want a strong tabular-data baseline with little tuning. It handles mixed-scale features and outliers gracefully and rarely needs feature scaling.
Compared to the alternatives in this namespace:
- Use `ML.REGRESSION.LINEAR`/`RIDGE`/`LASSO`/`ELASTIC_NET` when the relationship looks linear and you want interpretable coefficients.
- Use `ML.REGRESSION.RANDOM_FOREST_REG` when the relationship is non-linear, features interact, or you don't yet know what shape the relationship takes.
Examples¶
Fit a forest with the default 100 trees on features in A2:E100 and target
in F2:F100, then predict ten new rows in A101:E110:
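One way this might look, assuming `ML.FIT` takes the model, the feature range, and the target range in that order, and that a companion `ML.PREDICT` function accepts a fitted model and a feature range (both assumptions; check the `ML.FIT` reference page for the exact signatures):

```
=ML.FIT(ML.REGRESSION.RANDOM_FOREST_REG(), A2:E100, F2:F100)
=ML.PREDICT(H1, A101:E110)
```

Here `H1` is a placeholder for whichever cell holds the fitted model handle.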
Grow more trees for a small accuracy bump (at the cost of fit time):
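A hedged sketch with `n_estimators` raised to 500, assuming `ML.FIT` takes the model, feature range, and target range in that order:

```
=ML.FIT(ML.REGRESSION.RANDOM_FOREST_REG(500), A2:E100, F2:F100)
```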
Cap depth and set a seed for reproducible fits:
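Assuming the constructor's arguments are purely positional (so the intermediate defaults must be spelled out to reach `max_depth` and `random_state`), capping depth at 10 with seed 42 might look like:

```
=ML.FIT(ML.REGRESSION.RANDOM_FOREST_REG(100, "squared_error", 10, 2, 1, 1.0, TRUE, 42), A2:E100, F2:F100)
```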
Remarks¶
- `n_estimators` is the number of trees. More trees rarely hurt accuracy but always cost fit time; 100 is a sensible default, and 500 is a common choice for a final model.
- `max_depth` defaults to `None` (no limit). Set it to a small integer (e.g. `5` or `10`) to curb overfitting on small datasets.
- Random Forests are largely scale-invariant; you usually do not need to scale features beforehand.
- Random Forests cannot extrapolate beyond the range of the target seen during training. For predictions far outside the training distribution, prefer `ML.REGRESSION.LINEAR` or its regularized variants.
- For reproducible runs, pass an integer to `random_state`.