Skip to content

ML.PREPROCESSING.STANDARD_SCALER

Standardizes features by removing the mean and scaling to unit variance.

Syntax

ML.PREPROCESSING.STANDARD_SCALER()

Returns

A StandardScaler transformer handle, ready to pass into ML.FIT_TRANSFORM or ML.PIPELINE.

When to use

Reach for standard_scaler whenever a model is sensitive to feature magnitude — which covers most algorithms in formulaML except tree-based ones. It centers each column on zero and scales it to unit variance, so a feature measured in millions doesn't drown out a feature measured in tenths.

Use standard_scaler as the default scaler. Choose ML.PREPROCESSING.MIN_MAX_SCALER instead when you need values strictly in [0, 1], and ML.PREPROCESSING.ROBUST_SCALER when your data has heavy outliers that would distort the mean and standard deviation.

Always required before:

  • ML.CLASSIFICATION.LOGISTIC / ML.CLASSIFICATION.SVM
  • ML.REGRESSION.RIDGE / LASSO / ELASTIC_NET
  • ML.CLUSTERING.KMEANS
  • ML.DIM_REDUCTION.PCA

Examples

Scale features in A2:E100 and read the standardized values back into the sheet:

=ML.PREPROCESSING.STANDARD_SCALER()
=ML.FIT_TRANSFORM(H1, A2:E100)

Combine the scaler and a model into a single pipeline so the scaler is fit on training data and reused at predict time:

=ML.PREPROCESSING.STANDARD_SCALER()
=ML.CLASSIFICATION.LOGISTIC()
=ML.PIPELINE(H1, H2)
=ML.FIT(H3, A2:E100, F2:F100)
=ML.PREDICT(H4, A101:E110)

After fitting on training data, apply the same scaler to a held-out test set with ML.TRANSFORM:

=ML.TRANSFORM(H1, A101:E110)

Remarks

  • Always fit the scaler on the training data only, then reuse it on the test or production data via ML.TRANSFORM. Re-fitting on test data leaks information from the test set into your evaluation.
  • The cleanest way to avoid that mistake is to wrap the scaler and model in ML.PIPELINE; the pipeline calls fit on the scaler with the training data and transform automatically at predict time.
  • Tree-based models (ML.CLASSIFICATION.RANDOM_FOREST_CLF, ML.REGRESSION.RANDOM_FOREST_REG) are scale-invariant — you usually do not need a scaler in front of them.

See also