Machine Learning Toolkit Macros in Splunk Enterprise Security

Machine Learning Toolkit macros act as shortcuts and wrappers. To find the macros, go to Settings > Advanced Search > Search macros in the Splunk Enterprise menu.

The following example uses a macro to apply data to the model app:failures_by_src_count_1d, with a qualitative_id of medium and a field of failure:

... | `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`

Versus doing it without the macro:
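
As a sketch of what the macro saves you from typing, expanding mltk_apply_upper and get_qualitative_upper_threshold by hand gives approximately the following search. The expansion is derived from the macro definitions listed later in this topic, not copied from a shipped search, and it assumes that the quotation marks around the macro arguments are dropped during expansion.

```spl
... | apply app:failures_by_src_count_1d
    [| inputlookup append=T qualitative_thresholds_lookup where qualitative_id="medium"
     | rename threshold as upper_threshold
     | return upper_threshold
     | eval search=replace(search,"\"","")]
| search "IsOutlier(failure)"=1
```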

Macros used in SPL

You might use the following macros to apply data to your models.

[mltk_apply]

This is approximately equivalent to the xsWhere command, for applying to either upper or lower bounds.

[mltk_apply(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.
field: The name of the field that you're searching or counting to find outliers, such as failure.

[mltk_apply_lower]

This is approximately equivalent to the xsWhere command, for applying to lower bounds.

[mltk_apply_lower(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_lower_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.
field: The name of the field that you're searching or counting to find outliers, such as failure.

[mltk_apply_upper]

This is approximately equivalent to the xsWhere command, for applying to upper bounds.

[mltk_apply_upper(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_upper_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.
qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.
field: The name of the field that you're searching or counting to find outliers, such as failure.
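
A fuller search might look like the following sketch, which counts failures per source and then keeps only the rows that the model flags as upper-bound outliers. The tstats portion is an assumption for illustration: it presumes that the Authentication data model is populated and that the model was fit on a field named failure computed in a comparable way. Check the search that generates the model before relying on it.

```spl
| tstats count from datamodel=Authentication where Authentication.action="failure" by Authentication.src
| rename count as failure, Authentication.src as src
| `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`
```

Because the macro ends with a search on IsOutlier(failure), any rows that remain are the sources whose failure count the model treats as an outlier at the medium threshold.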

[mltk_findbest]

This is approximately equivalent to the xsFindBestConcept command. For each value, this macro tells you in which threshold range the value falls on the distribution curve.

[mltk_findbest(1)]
args       = model
definition = apply $model$ as findbest [| `get_findbest_thresholds`] | eval [| `get_findbest_qualitative`] | fields - BoundaryRanges,findbest*

The macro takes the following arguments:

model: The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.

Note that this macro doesn't take a field parameter like the other macros. It performs the findbest operation on the exact field that the Model Gen search passed to the fit command. For example:

If the Model Gen search ran ... | fit DensityFunction current_count dist=norm into app:total_risk_1d, the mltk_findbest macro matches only on the current_count field.
This means that the portion of the search that comes before the mltk_findbest macro must contain the current_count field.
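
The requirement above can be tried with a minimal sketch like the following. It assumes that the app:total_risk_1d model from the example exists in your environment and was fit on current_count alone. The value 42 is an arbitrary placeholder, not a meaningful risk count.

```spl
| makeresults
| eval current_count=42
| `mltk_findbest("app:total_risk_1d")`
```

Based on the macro definition, the result should carry a qualitative field that holds the label of the threshold range that the value falls in. If the preceding search doesn't produce current_count, the apply step has nothing to match on.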

Macros used by other macros

The macros used in SPL call the following macros.

[get_qualitative_threshold]

This is a building block for [mltk_apply]. You typically don't use this macro by itself.

[get_qualitative_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | return threshold | eval search=replace(search,"\"","")

The macro takes the following arguments:

qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.
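
To see which qualitative_id values are available and which threshold each one maps to, you can read the lookup directly. This is a sketch that uses only the lookup name and field names that appear in the macro definitions in this topic. The rows returned depend on your deployment.

```spl
| inputlookup qualitative_thresholds_lookup
| table qualitative_id, qualitative_label, threshold
```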

[get_qualitative_lower_threshold]

This is a building block for [mltk_apply_lower]. You typically don't use this macro by itself.

[get_qualitative_lower_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as lower_threshold | return lower_threshold | eval search=replace(search,"\"","")

The macro takes the following arguments:

qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.
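
The tail of the definition, rename followed by return and eval, turns a lookup row into an argument that the apply command can accept. The following sketch reproduces those steps on a made-up row so that you can inspect the intermediate result. The value 0.01 is an arbitrary placeholder and isn't taken from the lookup.

```spl
| makeresults
| eval threshold=0.01
| rename threshold as lower_threshold
| return lower_threshold
| eval search=replace(search,"\"","")
```

The return command writes a key-value pair into a field named search, and the replace call strips any quotation marks from it, so the subsearch should hand apply something of the form lower_threshold=0.01.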

[get_qualitative_upper_threshold]

This is a building block for [mltk_apply_upper]. You typically don't use this macro by itself.

[get_qualitative_upper_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as upper_threshold | return upper_threshold | eval search=replace(search,"\"","")

The macro takes the following arguments:

qualitative_id: One of the default IDs, such as medium, that correspond to percentages of deviation. The ID determines where on the distribution curve to start looking for outliers.

[get_findbest_thresholds]

This is a building block for [mltk_findbest]. You typically don't use this macro by itself.

[get_findbest_thresholds]
definition = inputlookup append=T qualitative_thresholds_lookup | stats values(threshold) as search | eval search="threshold=\"".mvjoin(mvsort(search), ",")."\""
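
Although this macro is meant to run inside a subsearch, running it on its own can help when you debug mltk_findbest. The following sketch assumes that the lookup is readable from your current app context.

```spl
| `get_findbest_thresholds`
```

Going by the definition, the search should return a single row whose search field has the form threshold="..." with every threshold from the lookup sorted and joined by commas. That string is what mltk_findbest passes to the apply command.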

[get_findbest_qualitative]

This is a building block for [mltk_findbest]. You typically don't use this macro by itself.

[get_findbest_qualitative]
definition = inputlookup append=T qualitative_thresholds_lookup | eval threshold_id="findbest_th=".threshold | sort threshold | eval subcase="'".threshold_id."'=\"1.0\",\"".qualitative_label."\"" | stats values(subcase) as search | eval search="qualitative=case(".mvjoin(search, ",").")"
