Machine Learning Toolkit Macros in Splunk Enterprise Security
Machine Learning Toolkit macros act as shortcuts and wrappers. To find the macros, select Settings > Advanced Search > Search macros from the Splunk Enterprise menu.
For example, the following search uses a macro to apply data to the model app:failures_by_src_count_1d with qualitative_id=medium and field=failure:
... | `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`
The same search without the macro:
... | apply app:failures_by_src_count_1d
[| inputlookup append=T qualitative_thresholds_lookup where qualitative_id="medium"
| rename threshold as upper_threshold
| return upper_threshold | eval search=replace(search,"\"","")]
| search "IsOutlier(failure)"=1
Macros used in SPL
You might use the following macros to apply data to your models.
[mltk_apply]
This is approximately equivalent to the xsWhere command, for applying to either upper or lower bounds.
[mltk_apply(3)]
args = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1
The macro takes the following arguments:
- model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
- field: The name of the field that you're searching or counting to find outliers, such as failure.
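The way the three arguments slot into the definition can be sketched outside of Splunk. The following Python is only an illustration of the text substitution, not Splunk's own expansion logic: the helper name expand_macro is hypothetical, nested macros are left unexpanded, and the argument values are passed without quotation marks so that the result matches the expanded search shown at the top of this topic.

```python
import re

# Argument list and definition copied from the [mltk_apply(3)] stanza.
ARGS = "model,qualitative_id,field"
DEFINITION = (
    'apply $model$ [| `get_qualitative_threshold($qualitative_id$)`] '
    '| search "IsOutlier($field$)"=1'
)


def expand_macro(definition, arg_names, values):
    """Replace each $name$ token with the value supplied for that argument.

    Hypothetical helper: a plain text substitution that stands in for
    macro expansion. Nested macros in backticks are not expanded.
    """
    names = [name.strip() for name in arg_names.split(",")]
    if len(names) != len(values):
        raise ValueError(f"expected {len(names)} arguments, got {len(values)}")
    mapping = dict(zip(names, values))
    # Tokens that do not name a declared argument are left untouched.
    return re.sub(
        r"\$(\w+)\$",
        lambda match: mapping.get(match.group(1), match.group(0)),
        definition,
    )


expanded = expand_macro(
    DEFINITION, ARGS, ["app:failures_by_src_count_1d", "medium", "failure"]
)
print(expanded)
```

The printed string still contains the nested `get_qualitative_threshold(medium)` macro, which would be expanded in turn using the definitions under "Macros used by other macros".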
[mltk_apply_lower]
This is approximately equivalent to the xsWhere command, for applying to lower bounds.
[mltk_apply_lower(3)]
args = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_lower_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1
The macro takes the following arguments:
- model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
- field: The name of the field that you're searching or counting to find outliers, such as failure.
[mltk_apply_upper]
This is approximately equivalent to the xsWhere command, for applying to upper bounds.
[mltk_apply_upper(3)]
args = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_upper_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1
The macro takes the following arguments:
- model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
- field: The name of the field that you're searching or counting to find outliers, such as failure.
[mltk_findbest]
This is approximately equivalent to the xsFindBestConcept command. For each value, this macro tells you in which threshold range the value falls on the distribution curve.
[mltk_findbest(1)]
args = model
definition = apply $model$ as findbest [| `get_findbest_thresholds`] | eval [| `get_findbest_qualitative`] | fields - BoundaryRanges,findbest*
The macro takes the following argument:
- model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
Note that, unlike the other macros, mltk_findbest doesn't take a field parameter. It performs the findbest operation on the exact field that the Model Gen fit command was performed on. For example:
- If the Model Gen performed:
... | fit DensityFunction current_count dist=norm into app:total_risk_1d
then the mltk_findbest() search only matches on the current_count field.
- This means that the portion of the search that comes before the mltk_findbest() macro must contain the current_count field.
Macros used by other macros
The macros used in SPL call the following macros.
[get_qualitative_threshold]
This is a building block for [mltk_apply]. You might not use this one by itself.
[get_qualitative_threshold(1)]
args = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | return threshold | eval search=replace(search,"\"","")
The macro takes the following argument:
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
[get_qualitative_lower_threshold]
This is a building block for [mltk_apply_lower]. You might not use this one by itself.
[get_qualitative_lower_threshold(1)]
args = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as lower_threshold | return lower_threshold | eval search=replace(search,"\"","")
The macro takes the following argument:
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
[get_qualitative_upper_threshold]
This is a building block for [mltk_apply_upper]. You might not use this one by itself.
[get_qualitative_upper_threshold(1)]
args = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as upper_threshold | return upper_threshold | eval search=replace(search,"\"","")
The macro takes the following argument:
- qualitative_id: The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
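To make the string handling in the get_qualitative_upper_threshold definition easier to follow, here is a rough Python imitation of its steps. It assumes that the return command emits the first matching row as a field="value" pair in the search field, and that the final eval strips the quotation marks. The lookup rows and threshold values are made up for illustration, and the function name is hypothetical.

```python
# Hypothetical rows standing in for qualitative_thresholds_lookup.
# The threshold values are invented for this illustration.
LOOKUP_ROWS = [
    {"qualitative_id": "high", "threshold": "0.01"},
    {"qualitative_id": "medium", "threshold": "0.05"},
    {"qualitative_id": "low", "threshold": "0.1"},
]


def upper_threshold_fragment(rows, qualitative_id):
    """Imitate the steps of the get_qualitative_upper_threshold definition."""
    # inputlookup ... where qualitative_id="$qualitative_id$"
    matches = [row for row in rows if row["qualitative_id"] == qualitative_id]
    if not matches:
        # Choice made for this sketch only, not documented Splunk behavior.
        raise LookupError(f"no threshold for qualitative_id={qualitative_id!r}")
    # rename threshold as upper_threshold | return upper_threshold
    # (assumed to yield the first row as upper_threshold="<value>")
    search = 'upper_threshold="' + matches[0]["threshold"] + '"'
    # eval search=replace(search,"\"","") removes the quotation marks
    return search.replace('"', "")


fragment = upper_threshold_fragment(LOOKUP_ROWS, "medium")
print(fragment)
```

In this sketch the fragment takes the place of the subsearch in mltk_apply_upper, so the apply command ends up with an upper_threshold argument, as in the expanded search at the top of this topic.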
[get_findbest_thresholds]
This is a building block for [mltk_findbest]. You might not use this one by itself.
[get_findbest_thresholds]
definition = inputlookup append=T qualitative_thresholds_lookup | stats values(threshold) as search | eval search="threshold=\"".mvjoin(mvsort(search), ",")."\""
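As a minimal sketch of what this definition builds, the Python below collects the distinct threshold values, sorts them as strings, and joins them into a single threshold="..." fragment. The lookup rows are invented for illustration, the function name is hypothetical, and the lexicographic ordering is an assumption about how values and mvsort treat these strings.

```python
# Hypothetical rows standing in for qualitative_thresholds_lookup.
# The threshold values are invented for this illustration.
LOOKUP_ROWS = [
    {"qualitative_id": "medium", "threshold": "0.05"},
    {"qualitative_id": "high", "threshold": "0.01"},
    {"qualitative_id": "low", "threshold": "0.1"},
    {"qualitative_id": "medium_copy", "threshold": "0.05"},
]


def findbest_thresholds_fragment(rows):
    """Imitate the steps of the get_findbest_thresholds definition."""
    # stats values(threshold) as search: keep the distinct values
    distinct = {row["threshold"] for row in rows}
    # mvsort(search): sort the values as strings
    ordered = sorted(distinct)
    # "threshold=\"" . mvjoin(..., ",") . "\""
    return 'threshold="' + ",".join(ordered) + '"'


thresholds = findbest_thresholds_fragment(LOOKUP_ROWS)
print(thresholds)
```

With these sample rows the duplicate 0.05 collapses into one entry, so the fragment lists three thresholds.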
[get_findbest_qualitative]
This is a building block for [mltk_findbest]. You might not use this one by itself.
[get_findbest_qualitative]
definition = inputlookup append=T qualitative_thresholds_lookup | eval threshold_id="findbest_th=".threshold | sort threshold | eval subcase="'".threshold_id."'=\"1.0\",\"".qualitative_label."\"" | stats values(subcase) as search | eval search="qualitative=case(".mvjoin(search, ",").")"
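The nested quoting in this definition is hard to read, so the following Python sketch reproduces only the string concatenation to show the shape of the case() expression that mltk_findbest passes to eval. The lookup rows and labels are invented, the function name is hypothetical, and the sketch assumes that values returns distinct entries in lexicographic order, which is why the earlier sort step is not modeled separately.

```python
# Hypothetical rows standing in for qualitative_thresholds_lookup.
# Thresholds and labels are invented for this illustration.
LOOKUP_ROWS = [
    {"threshold": "0.05", "qualitative_label": "medium"},
    {"threshold": "0.01", "qualitative_label": "high"},
    {"threshold": "0.1", "qualitative_label": "low"},
]


def findbest_qualitative_fragment(rows):
    """Imitate the string building in the get_findbest_qualitative definition."""
    subcases = set()
    for row in rows:
        # eval threshold_id="findbest_th=".threshold
        threshold_id = "findbest_th=" + row["threshold"]
        # eval subcase="'".threshold_id."'=\"1.0\",\"".qualitative_label."\""
        subcases.add(
            "'" + threshold_id + "'=\"1.0\",\"" + row["qualitative_label"] + "\""
        )
    # stats values(subcase): distinct entries, assumed lexicographic order
    ordered = sorted(subcases)
    # eval search="qualitative=case(".mvjoin(search, ",").")"
    return "qualitative=case(" + ",".join(ordered) + ")"


case_expression = findbest_qualitative_fragment(LOOKUP_ROWS)
print(case_expression)
```

Each condition in the resulting expression tests one findbest_th=<threshold> field against "1.0" and maps it to that row's qualitative_label.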
This documentation applies to the following versions of Splunk® Enterprise Security: 8.0.0, 8.0.1, 8.0.2