ML.REGRESSION.RANDOM_FOREST_REG

Creates a Random Forest Regression object.

Syntax

ML.REGRESSION.RANDOM_FOREST_REG(n_estimators, criterion, max_depth, min_samples_split, min_samples_leaf, max_features, bootstrap, random_state)

Arguments

Name Type Default Description
n_estimators int 100 The number of trees in the forest.
criterion str "squared_error" The function to measure the quality of a split. Supported criteria: 'squared_error' for the mean squared error, 'absolute_error' for the mean absolute error, 'friedman_mse' for the mean squared error with improvement by Friedman, 'poisson' for the Poisson loss.
max_depth int None The maximum depth of the tree.
min_samples_split int 2 The minimum number of samples required to split an internal node.
min_samples_leaf int 1 The minimum number of samples required to be at a leaf node.
max_features str, float or int 1.0 The number of features to consider when looking for the best split: an int is an absolute count, a float is a fraction of all features, and 'sqrt' or 'log2' takes the corresponding function of the feature count.
bootstrap bool TRUE Whether bootstrap samples are used when building trees.
random_state int None Controls the randomness of the bootstrapping procedure.
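The argument names and defaults above mirror scikit-learn's RandomForestRegressor (an assumption — the add-in's backend is not documented here). A minimal sketch of the equivalent constructor call in Python, with every default from the table spelled out:

```python
from sklearn.ensemble import RandomForestRegressor

# Each keyword corresponds to one argument in the table above.
model = RandomForestRegressor(
    n_estimators=100,
    criterion="squared_error",
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features=1.0,
    bootstrap=True,
    random_state=None,
)
print(model.n_estimators, model.criterion)
```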

Returns

A Random Forest regressor handle, ready to pass into ML.FIT.

When to use

Reach for a Random Forest regressor when the relationship between your features and the numeric target is non-linear, when features interact in complex ways, or when you simply want a strong tabular-data baseline with little tuning. It handles mixed-scale features and outliers gracefully and rarely needs feature scaling.

Compared to the alternatives in this namespace:

  • Use ML.REGRESSION.LINEAR / RIDGE / LASSO / ELASTIC_NET when the relationship looks linear and you want interpretable coefficients.
  • Use random_forest_reg when the relationship is non-linear, features interact, or you don't yet know what shape the relationship takes.
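The contrast above can be sketched in Python (assuming a scikit-learn backend, which this page does not confirm): on a deliberately curved target, a forest fits well while a straight line cannot.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = X[:, 0] ** 2  # a non-linear target: a symmetric parabola

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
linear = LinearRegression().fit(X, y)

# The forest captures the curve; the line's R^2 is near zero because
# x and x^2 are uncorrelated on a symmetric interval.
print(forest.score(X, y))
print(linear.score(X, y))
```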

Examples

Fit a forest with the default 100 trees on features in A2:E100 and target in F2:F100, then predict ten new rows in A101:E110 (here the first formula is entered in H1, so its handle is referenced as H1, and the ML.FIT call sits in H2):

=ML.REGRESSION.RANDOM_FOREST_REG()
=ML.FIT(H1, A2:E100, F2:F100)
=ML.PREDICT(H2, A101:E110)

Grow more trees for a small accuracy bump (at the cost of fit time):

=ML.REGRESSION.RANDOM_FOREST_REG(500)

Cap depth and set a seed for reproducible fits:

=ML.REGRESSION.RANDOM_FOREST_REG(200, "squared_error", 8, 2, 1, 1.0, TRUE, 42)
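Why the seed in the last argument matters: assuming a scikit-learn backend (an assumption, not confirmed by this page), two forests built with the same random_state produce bit-identical predictions, while unseeded fits generally differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])

# Same hyperparameters as the formula above: 200 trees, depth 8, seed 42.
a = RandomForestRegressor(n_estimators=200, max_depth=8, random_state=42).fit(X, y)
b = RandomForestRegressor(n_estimators=200, max_depth=8, random_state=42).fit(X, y)

# Identical seeds give identical bootstrap draws and splits, hence
# identical predictions.
print(np.array_equal(a.predict(X), b.predict(X)))
```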

Remarks

  • n_estimators is the number of trees. More trees rarely hurt accuracy but always cost fit time — 100 is a sensible default; 500 is a reasonable choice for a final model.
  • max_depth defaults to None (no limit). Set it to a small integer (e.g. 5 or 10) to curb overfitting on small datasets.
  • Random Forests are largely scale-invariant — you usually do not need to scale features beforehand.
  • Random Forests cannot extrapolate beyond the range of the target seen during training. For predictions far outside the training distribution, prefer ML.REGRESSION.LINEAR or its regularized variants.
  • For reproducible runs, pass an integer to random_state.
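The extrapolation caveat above can be demonstrated directly (again assuming a scikit-learn backend, which this page does not confirm): a tree can only return averages of training targets, so a forest's prediction saturates near the largest target it has seen.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X_train = rng.uniform(0, 10, size=(200, 1))
y_train = 3 * X_train[:, 0]  # linear trend; targets span roughly 0..30

forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

# x = 100 lies far outside the training range. The true value is 300,
# but every leaf average is bounded by the training targets, so the
# forest's prediction stays near the training maximum (~30).
pred = forest.predict([[100.0]])[0]
print(pred)
```

A linear model fitted on the same data would extrapolate the trend and predict near 300, which is why the remark above points to ML.REGRESSION.LINEAR for out-of-range predictions.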

See also