ML.CLASSIFICATION.SVM¶
Creates a Support Vector Machine (SVM) object.
Syntax¶
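A sketch of the call signature, with every argument optional (defaults as listed in the Arguments table below). The bracketed optional-argument notation is an assumption inferred from that table, not a confirmed grammar:

```
=ML.CLASSIFICATION.SVM([C], [kernel], [degree], [gamma], [coef0])
```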
Arguments¶
| Name | Type | Default | Description |
|---|---|---|---|
| C | float | 1.0 | Regularization parameter. Must be a positive float; the strength of the regularization is inversely proportional to C. |
| kernel | str | "rbf" | Specifies the kernel type to be used in the algorithm. One of: 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed'. |
| degree | int | 3 | Degree of the polynomial kernel function ('poly'). Ignored by all other kernels. |
| gamma | str | "scale" | Kernel coefficient for 'rbf', 'poly', and 'sigmoid'. If 'scale' (default), uses 1 / (n_features * X.var()). If 'auto', uses 1 / n_features. |
| coef0 | float | 0.0 | Independent term in kernel function. It is only significant in 'poly' and 'sigmoid' kernels. |
Returns¶
A Support Vector Machine model handle, ready to pass into `ML.FIT`.
When to use¶
Reach for an SVM when classes are not linearly separable and a kernel can carve a curved decision boundary through the feature space. SVMs shine on smaller, well-curated datasets — a few thousand rows or fewer — where every row matters and the cost of fitting a heavier model is worth it.
Compared to the alternatives in this namespace:
- Use `ML.CLASSIFICATION.LOGISTIC` when a linear boundary is enough and you need speed or interpretability.
- Use `ML.CLASSIFICATION.SVM` when you suspect a non-linear boundary and your dataset is small to medium-sized; the `"rbf"` kernel is a strong default.
- Use `ML.CLASSIFICATION.RANDOM_FOREST_CLF` when you have lots of rows, many features, or a mix of numeric and categorical inputs.
Examples¶
Fit an RBF-kernel SVM (the default) on labeled data in A2:E100 / F2:F100
and predict ten new rows in A101:E110:
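A sketch of how the calls could be chained, assuming `ML.FIT` takes `(model, features, labels)` and that a companion `ML.PREDICT` function exists; only `ML.FIT` is named on this page, so both the argument order and the `ML.PREDICT` name are assumptions:

```
=ML.PREDICT(ML.FIT(ML.CLASSIFICATION.SVM(), A2:E100, F2:F100), A101:E110)
```

Calling `ML.CLASSIFICATION.SVM()` with no arguments uses the defaults (`C` of 1.0, `"rbf"` kernel).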
Switch to a linear kernel for a fast, interpretable baseline when you suspect the classes are linearly separable:
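Assuming the same ranges as above, positional arguments in table order (`C` first), and the hypothetical `ML.PREDICT` call:

```
=ML.PREDICT(ML.FIT(ML.CLASSIFICATION.SVM(1, "linear"), A2:E100, F2:F100), A101:E110)
```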
Use a polynomial kernel of degree 3 when you expect curved class boundaries:
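Under the same assumptions, passing `degree` as the third positional argument (3 is also the default, but spelling it out keeps the intent visible):

```
=ML.PREDICT(ML.FIT(ML.CLASSIFICATION.SVM(1, "poly", 3), A2:E100, F2:F100), A101:E110)
```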
Remarks¶
- `C` controls the trade-off between margin width and misclassification: small `C` keeps the margin wide (more regularization), large `C` punishes errors more aggressively.
- The `"rbf"` kernel is the most common starting point. Try `"linear"` first if you have many features relative to rows; `"poly"` if you expect polynomial-like boundaries.
- Always scale your features (e.g. with `ML.PREPROCESSING.STANDARD_SCALER`) before fitting — SVMs are very sensitive to feature magnitude.
- SVMs scale poorly with the number of rows. For datasets above ~10,000 rows, prefer `ML.CLASSIFICATION.LOGISTIC` or `ML.CLASSIFICATION.RANDOM_FOREST_CLF`.
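The scaling advice might look like the sketch below. This page only names `ML.PREPROCESSING.STANDARD_SCALER`, so treating it as a function that can be applied directly to a feature range is an assumption; consult its own reference page for the actual calling convention:

```
=ML.FIT(ML.CLASSIFICATION.SVM(), ML.PREPROCESSING.STANDARD_SCALER(A2:E100), F2:F100)
```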