Machine Learning Toolkit Searches in Splunk Enterprise Security

Extreme Search (XS) context-generating searches, whose names end in "Context Gen", are revised to use the Machine Learning Toolkit (MLTK) and renamed to end in "Model Gen". Other saved searches, correlation searches, key indicator searches, and rules that used XS keep their names but are also revised to use MLTK. If you have locally modified any XS searches, you must port them to MLTK yourself.

Because the correlation searches that formerly used XS now apply MLTK models, you must run the corresponding Model Gen searches first so that the models exist. As mentioned in the overview, MLTK does not merge daily data into an existing model. Instead, each run of a Model Gen search replaces the model. If you want to experiment with running and tuning a model without overwriting it, see Machine Learning Toolkit Troubleshooting in Splunk Enterprise Security.
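The replace-on-every-run behavior can be pictured with a small Python sketch. This is an illustration only, not MLTK code: the `fit_model` helper and the in-memory `models` dictionary are hypothetical stand-ins for `fit ... into <model_name>` and the saved model store, and the sample counts are made up.

```python
from statistics import NormalDist

# Hypothetical in-memory stand-in for saved models, keyed by model name.
models = {}

def fit_model(name, samples):
    """Fit a normal distribution to samples and store it under name.

    An existing model with the same name is replaced, not merged.
    """
    models[name] = NormalDist.from_samples(samples)
    return models[name]

# First run: the model reflects only the first batch of data.
first = fit_model("failures_by_src_count_1h", [10, 12, 11, 13, 9])

# Second run: the model is rebuilt from the new batch alone.
second = fit_model("failures_by_src_count_1h", [100, 120, 110, 130, 90])

# The stored model now matches the second batch; the first batch is gone.
print(round(first.mean), round(models["failures_by_src_count_1h"].mean))
```

In this sketch, the time range of each run fully determines the stored model, so a run over a short or unrepresentative window leaves nothing of the earlier fit behind.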

Searches migrating from XS to MLTK

The following default searches, correlation searches, key indicators, and rules are revised from XS to MLTK. They are grouped by app, and each entry shows the XS version followed by the MLTK version.

DA-ESS-AccessProtection

XS: Access - Total Access Attempts \| tstats `summariesonly` count as current_count from datamodel=authentication.authentication where earliest=-24h@h latest=+0s \| appendcols [\| tstats `summariesonly` count as historical_count from datamodel=authentication.authentication where earliest=-48h@h latest=-24h@h] \| `get_ksi_fields(current_count,historical_count)` \| xsfindbestconcept current_count from count_1d in authentication as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Access - Total Access Attempts \| tstats `summariesonly` count as current_count from datamodel=Authentication.Authentication where earliest=-24h@h latest=+0s \| appendcols [\| tstats `summariesonly` count as historical_count from datamodel=Authentication.Authentication where earliest=-48h@h latest=-24h@h] \| `get_ksi_fields(current_count,historical_count)` \| `mltk_findbest("app:authentication_count_1d")` \| `get_percentage_qualitative(delta, delta_qual)`

DA-ESS-EndpointProtection

XS: Change - Abnormally High Number of Endpoint Changes By User - Rule \| `tstats` count from datamodel=endpoint.filesystem where filesystem.tag="change" by filesystem.user \| eval change_type="filesystem",user='filesystem.user' \| `tstats` append=t count from datamodel=endpoint.registry where registry.tag="change" by registry.user \| eval change_type=if(isnull(change_type),"registry",change_type),user=if(isnull(user),'registry.user',user) \| `tstats` append=t count from datamodel=change.all_changes where nodename="all_changes.endpoint_changes" by all_changes.change_type,all_changes.user \| eval change_type=if(isnull(change_type),'all_changes.change_type',change_type),user=if(isnull(user), 'all_changes.user',user) \| stats count as change_count by change_type,user \| xswhere change_count from change_count_by_user_by_change_type_1d in change_analysis by change_type is above high
MLTK: Change - Abnormally High Number of Endpoint Changes By User - Rule \| `tstats` count from datamodel=Endpoint.Filesystem where Filesystem.tag="change" by Filesystem.user \| eval change_type="filesystem",user='Filesystem.user' \| `tstats` append=T count from datamodel=Endpoint.Registry where Registry.tag="change" by Registry.user \| eval change_type=if(isnull(change_type),"registry",change_type),user=if(isnull(user),'Registry.user',user) \| `tstats` append=T count from datamodel=Change.All_Changes where nodename="All_Changes.Endpoint_Changes" by All_Changes.change_type,All_Changes.user \| eval change_type=if(isnull(change_type),'All_Changes.change_type',change_type),user=if(isnull(user), 'All_Changes.user',user) \| stats count as change_count by change_type,user \| `mltk_apply_upper("app:change_count_by_user_by_change_type_1d", "extreme", "change_count")`
XS: Endpoint - Host Sending Excessive Email - Rule \| tstats `summariesonly` sum(all_email.recipient_count) as count,dc(all_email.dest) as dest_count from datamodel=email.all_email where not all_email.src_category="email_servers" by "all_email.src",_time span=1h \| `drop_dm_object_name("all_email")` \| xswhere count from recipients_by_src_1h in email is above medium or dest_count from destinations_by_src_1h in email is above medium
MLTK: Endpoint - Host Sending Excessive Email - Rule \| tstats `summariesonly` sum(All_Email.recipient_count) as recipient_count,dc(All_Email.dest) as dest_count from datamodel=Email.All_Email where NOT All_Email.src_category="email_servers" by "All_Email.src",_time span=1h \| `drop_dm_object_name("All_Email")` \| apply app:recipients_by_src_1h [\|`get_qualitative_upper_threshold(high)`] \| apply app:destinations_by_src_1h [\|`get_qualitative_upper_threshold(high)`] \| search "IsOutlier(recipient_count)"=1 OR "IsOutlier(dest_count)"=1
XS: Malware - Total Infection Count \| tstats `summariesonly` dc(malware_attacks.signature) as infection_count from datamodel=malware.malware_attacks where earliest=-24h@h latest=+0s malware_attacks.action=allowed by malware_attacks.dest \| stats sum(infection_count) as current_count \| appendcols [\| tstats `summariesonly` dc(malware_attacks.signature) as infection_count from datamodel=malware.malware_attacks where earliest=-48h@h latest=-24h@h malware_attacks.action=allowed by malware_attacks.dest \| stats sum(infection_count) as historical_count] \| `get_ksi_fields(current_count,historical_count)` \| xsfindbestconcept current_count from count_1d in malware as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Malware - Total Infection Count \| tstats `summariesonly` dc(Malware_Attacks.signature) as infection_count from datamodel=Malware.Malware_Attacks where earliest=-24h@h latest=+0s Malware_Attacks.action=allowed by Malware_Attacks.dest \| stats sum(infection_count) as current_count \| appendcols [\| tstats `summariesonly` dc(Malware_Attacks.signature) as infection_count from datamodel=Malware.Malware_Attacks where earliest=-48h@h latest=-24h@h Malware_Attacks.action=allowed by Malware_Attacks.dest \| stats sum(infection_count) as historical_count] \| `get_ksi_fields(current_count,historical_count)` \| `mltk_findbest("app:malware_infection_count_by_1d")` \| `get_percentage_qualitative(delta, delta_qual)`

DA-ESS-IdentityManagement

XS: Identity - High Volume Email Activity with Non-corporate Domains - Rule \| tstats `summariesonly` sum(all_email.size) as bytes, values(all_email.recipient) as recipient from datamodel=email.all_email where not `cim_corporate_email_domain_search("all_email.recipient")` by all_email.src_user \| `drop_dm_object_name("all_email")` \| xsfindbestconcept bytes from email_volume_1h_noncorp \| eval risk_score=case(bestconcept="extreme",80,bestconcept="high",50,bestconcept="medium",20, 1==1, 0) \| search risk_score>0
MLTK: Identity - High Volume Email Activity with Non-corporate Domains - Rule \| tstats `summariesonly` sum(All_Email.size) as bytes, values(All_Email.recipient) as recipient from datamodel=Email.All_Email where NOT `cim_corporate_email_domain_search("All_Email.recipient")` by All_Email.src_user, All_Email.src_user_bunit \| `drop_dm_object_name("All_Email")` \| `mltk_apply_upper("app:email_activity_to_non_corporate_by_user_1h", "medium", "bytes")`
XS: Identity - Web Uploads to Non-corporate Domains - Rule \| tstats `summariesonly` sum(web.bytes) as bytes from datamodel=web.web where (web.http_method="post" or web.http_method="put") not (`cim_corporate_web_domain_search("web.url")`) by web.user \| `drop_dm_object_name("web")` \| xsfindbestconcept bytes from web_volume_1h_noncorp \| eval risk_score=case(bestconcept="extreme",80,bestconcept="high",50,bestconcept="medium",20, 1==1, 0) \| search risk_score>0
MLTK: Identity - Web Uploads to Non-corporate Domains - Rule \| tstats `summariesonly` sum(Web.bytes) as bytes from datamodel=Web.Web where (Web.http_method="POST" OR Web.http_method="PUT") NOT (`cim_corporate_web_domain_search("Web.url")`) by Web.user, Web.user_bunit \| `drop_dm_object_name("Web")` \| `mltk_apply_upper("app:web_upload_to_non_corporate_by_user_1h", "medium", "bytes")`

DA-ESS-NetworkProtection

XS: Network - Unusual Volume of Network Activity - Rule \| tstats `summariesonly` dc(all_traffic.src) as src_count,count from datamodel=network_traffic.all_traffic \| localop \| xswhere count from count_30m in network_traffic is extreme or src_count from src_count_30m in network_traffic is extreme \| eval const_dedup_id="network - unusual volume of network activity - rule"
MLTK: Network - Unusual Volume of Network Activity - Rule \| tstats `summariesonly` dc(All_Traffic.src) as src_count,count as total_count from datamodel=Network_Traffic.All_Traffic \| localop \| apply network_traffic_src_count_30m [\|`get_qualitative_upper_threshold(extreme)`] \| apply network_traffic_count_30m [\|`get_qualitative_upper_threshold(extreme)`] \| search "IsOutlier(src_count)"=1 OR "IsOutlier(total_count)"=1
XS: Web - Abnormally High Number of HTTP Method Events By Src - Rule \| tstats `summariesonly` count as web_event_count from datamodel=web.web by web.src, web.http_method \| `drop_dm_object_name("web")` \| xswhere web_event_count from count_by_http_method_by_src_1d in web by http_method is above high
MLTK: Web - Abnormally High Number of HTTP Method Events By Src - Rule \| tstats `summariesonly` count as web_event_count from datamodel=Web.Web by Web.src, Web.http_method \| `drop_dm_object_name("Web")` \| `mltk_apply_upper("app:count_by_http_method_by_src_1d", "extreme", "web_event_count")`
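Rules such as Network - Unusual Volume of Network Activity - Rule apply two models and then keep a result when either `IsOutlier` field equals 1. The Python sketch below illustrates that OR combination under simplifying assumptions: each model is a plain normal distribution, the "extreme" threshold is assumed to be an upper-tail probability of 0.01, and the history values are invented. It does not reproduce how MLTK computes `IsOutlier`.

```python
from statistics import NormalDist

def above_upper_tail(model, value, tail_probability):
    """Return True when value exceeds the model's upper-tail cutoff."""
    return value > model.inv_cdf(1.0 - tail_probability)

# Hypothetical per-30-minute history for the two measured fields.
src_count_model = NormalDist.from_samples([40, 42, 38, 41, 39, 40])
total_count_model = NormalDist.from_samples([900, 1000, 1100, 950, 1050, 1000])

# Assumed tail probability standing in for the "extreme" threshold.
EXTREME = 0.01

def is_unusual(src_count, total_count):
    """Flag the interval when either field is an outlier (logical OR)."""
    return (above_upper_tail(src_count_model, src_count, EXTREME)
            or above_upper_tail(total_count_model, total_count, EXTREME))

print(is_unusual(41, 1020))  # both fields close to their history
print(is_unusual(41, 5000))  # only the total event count is far above history
print(is_unusual(90, 1020))  # only the distinct source count is far above history
```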

SA-AccessProtection

XS: Access - Authentication Failures By Source - Context Gen \| tstats `summariesonly` count as failures from datamodel=authentication.authentication where authentication.action="failure" by authentication.src,_time span=1h \| stats median(failures) as median, min(failures) as min, count as count \| eval max = median*2 \| xsupdateddcontext app="sa-accessprotection" name=failures_by_src_count_1h container=authentication scope=app \| stats count
MLTK: Access - Authentication Failures By Source - Model Gen \| tstats `summariesonly` count as failure from datamodel=Authentication.Authentication where Authentication.action="failure" by Authentication.src,_time span=1h \| fit DensityFunction failure dist=norm into app:failures_by_src_count_1h
XS: Access - Authentication Failures By Source Per Day - Context Gen \| tstats `summariesonly` count as failures from datamodel=authentication.authentication where authentication.action="failure" by authentication.src,_time span=1d \| stats median(failures) as median, min(failures) as min, count as count \| eval max = median*2 \| xscreateddcontext app="sa-accessprotection" name=failures_by_src_count_1d container=authentication scope=app type=domain terms=`xs_default_magnitude_concepts` \| stats count
MLTK: Access - Authentication Failures By Source Per Day - Model Gen \| tstats `summariesonly` count as failure from datamodel=Authentication.Authentication where Authentication.action="failure" by Authentication.src,_time span=1d \| fit DensityFunction failure dist=norm into app:failures_by_src_count_1d
XS: Access - Authentication Volume Per Day - Context Gen \| tstats `summariesonly` count as count_1d from datamodel=authentication.authentication by _time span=1d \| stats count, median(count_1d) as median, stdev(count_1d) as size \| search size>0 \| xscreateddcontext name=count_1d container=authentication type=median_centered scope=app app=sa-accessprotection terms=`xs_default_magnitude_concepts` \| stats count
MLTK: Access - Authentication Volume Per Day - Model Gen \| tstats `summariesonly` count as current_count from datamodel=Authentication.Authentication by _time span=1d \| fit DensityFunction current_count dist=norm into app:authentication_count_1d
XS: Access - Brute Force Access Behavior Detected - Rule `\| from datamodel:"authentication"."authentication" \| stats values(tag) as tag,values(app) as app,count(eval('action'=="failure")) as failure,count(eval('action'=="success")) as success by src \| search success>0 \| xswhere failure from failures_by_src_count_1h in authentication is above medium`
MLTK: Access - Brute Force Access Behavior Detected - Rule \| from datamodel:"Authentication"."Authentication" \| stats values(tag) as tag,values(app) as app,count(eval('action'=="failure")) as failure,count(eval('action'=="success")) as success by src \| search success>0 \| `mltk_apply_upper("app:failures_by_src_count_1h", "high", "failure")`
XS: Access - Brute Force Access Behavior Detected Over 1d - Rule \| tstats `summariesonly` values(authentication.app) as app,count from datamodel=authentication.authentication by authentication.action,authentication.src \| `drop_dm_object_name("authentication")` \| eval success=if(action="success",count,0),failure=if(action="failure",count,0) \| stats values(app) as app,sum(failure) as failure,sum(success) as success by src \| where success > 0 \| xswhere failure from failures_by_src_count_1d in authentication is above medium
MLTK: Access - Brute Force Access Behavior Detected Over 1d - Rule \| tstats `summariesonly` values(Authentication.app) as app,count from datamodel=Authentication.Authentication by Authentication.action,Authentication.src \| `drop_dm_object_name("Authentication")` \| eval success=if(action="success",count,0),failure=if(action="failure",count,0) \| stats values(app) as app,sum(failure) as failure,sum(success) as success by src \| where success > 0 \| `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`
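The Model Gen and Rule searches in this group work as a pair: the Model Gen search fits a distribution to historical failure counts, and the rule flags counts in the upper tail of that distribution. A rough Python sketch of the idea follows. The tail probabilities in `UPPER_TAIL` are assumed, illustrative values rather than the thresholds that Splunk Enterprise Security ships with, the helper names are hypothetical, and the cutoff logic is a simplification of what `DensityFunction` does.

```python
from statistics import NormalDist

# Assumed, illustrative upper-tail probabilities per qualitative label.
# The values that ES actually uses are configurable and may differ.
UPPER_TAIL = {"medium": 0.10, "high": 0.05, "extreme": 0.01}

def fit_density(samples):
    """Model Gen step: fit a normal distribution to historical counts."""
    return NormalDist.from_samples(samples)

def is_upper_outlier(model, value, label):
    """Rule step: flag value if it lies above the label's upper-tail cutoff."""
    cutoff = model.inv_cdf(1.0 - UPPER_TAIL[label])
    return value > cutoff

# Hypothetical hourly failure counts per source.
history = [3, 5, 4, 6, 5, 4, 7, 5, 6, 5]
model = fit_density(history)

for failures in (6, 8, 12):
    print(failures, is_upper_outlier(model, failures, "high"))
```

A stricter label such as "extreme" moves the cutoff further into the tail, so fewer sources are flagged for the same model.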

SA-EndpointProtection

XS: Change - Total Change Count By User By Change Type Per Day - Context Gen \| `tstats` count from datamodel=endpoint.filesystem where filesystem.tag="change" by _time,filesystem.user span=24h \| eval change_type="filesystem",user='filesystem.user' \| `tstats` append=t count from datamodel=endpoint.registry where registry.tag="change" by _time,registry.user span=24h \| eval change_type=if(isnull(change_type),"registry",change_type),user=if(isnull(user),'registry.user',user) \| `tstats` append=t count from datamodel=change.all_changes by _time,all_changes.change_type,all_changes.user span=24h \| eval change_type=if(isnull(change_type),'all_changes.change_type',change_type),user=if(isnull(user),'all_changes.user',user) \| stats count as change_count by _time,change_type,user \| `context_stats(change_count, change_type)` \| eval min=0 \| eval max=median*2 \| xsupdateddcontext name=change_count_by_user_by_change_type_1d container=change_analysis class=change_type type=domain app="sa-endpointprotection" scope=app terms=`xs_default_magnitude_concepts` \| stats count
MLTK: Change - Total Change Count By User By Change Type Per Day - Model Gen \| `tstats` count from datamodel=Endpoint.Filesystem where Filesystem.tag="change" by _time,Filesystem.user span=24h \| eval change_type="filesystem",user='Filesystem.user' \| `tstats` append=T count from datamodel=Endpoint.Registry where Registry.tag="change" by _time,Registry.user span=24h \| eval change_type=if(isnull(change_type),"registry",change_type),user=if(isnull(user),'Registry.user',user) \| `tstats` append=T count from datamodel=Change.All_Changes by _time,All_Changes.change_type,All_Changes.user span=24h \| eval change_type=if(isnull(change_type),'All_Changes.change_type',change_type),user=if(isnull(user),'All_Changes.user',user) \| stats count as change_count by _time,change_type,user \| fit DensityFunction change_count by change_type dist=norm into app:change_count_by_user_by_change_type_1d
XS: Endpoint - Emails By Destination Count - Context Gen `\| tstats summariesonly=false dc(all_email.dest) as dest_count from datamodel=email.all_email where not all_email.src_category="email_servers" by "all_email.src",_time span=1h \| stats avg(dest_count) as avg, count \| eval min=0 \| eval max=avg * 2 \| xsupdateddcontext app=sa-endpointprotection name=destinations_by_src_1h container=email type=domain scope=app \| stats count`
MLTK: Endpoint - Emails By Destination Count - Model Gen `\| tstats summariesonly=false dc(All_Email.dest) as dest_count from datamodel=Email.All_Email where NOT All_Email.src_category="email_servers" by "All_Email.src",_time span=1h \| fit DensityFunction dest_count dist=norm into app:destinations_by_src_1h`
XS: Endpoint - Emails By Source - Context Gen `\| tstats summariesonly=false sum(all_email.recipient_count) as recipient_count from datamodel=email.all_email where not all_email.src_category="email_servers" by "all_email.src",_time span=1h \| stats avg(recipient_count) as avg, count \| eval min=0 \| eval max=avg * 2 \| xsupdateddcontext app=sa-endpointprotection name=recipients_by_src_1h container=email type=domain scope=app \| stats count`
MLTK: Endpoint - Emails By Source - Model Gen `\| tstats summariesonly=false sum(All_Email.recipient_count) as recipient_count from datamodel=Email.All_Email where NOT All_Email.src_category="email_servers" by "All_Email.src",_time span=1h \| fit DensityFunction recipient_count dist=norm into app:recipients_by_src_1h`
XS: Endpoint - Malware Daily Count - Context Gen \| tstats `summariesonly` dc(malware_attacks.signature) as infection_count from datamodel=malware.malware_attacks where earliest=-31d@d latest=-1d@d malware_attacks.action=allowed by malware_attacks.dest,_time span=1d \| stats sum(infection_count) as total_infection_count by _time \| stats count,median(total_infection_count) as median by _time \| eval min=0 \| eval max=median*2 \| xscreateddcontext name=count_1d container=malware type=domain terms="minimal,small,medium,large,extreme" scope=app app=sa-networkprotection \| stats count
MLTK: Endpoint - Malware Daily Count - Model Gen \| tstats `summariesonly` dc(Malware_Attacks.signature) as infection_count from datamodel=Malware.Malware_Attacks where earliest=-31d@d latest=-1d@d Malware_Attacks.action=allowed by Malware_Attacks.dest,_time span=1d \| stats sum(infection_count) as current_count by _time \| fit DensityFunction current_count dist=norm into app:malware_infection_count_by_1d

SA-IdentityManagement

XS: Identity - Email Activity to Non-corporate Domains by Users Per 1d - Context Gen \| tstats `summariesonly` sum(all_email.size) as bytes, values(all_email.recipient) as recipient from datamodel=email.all_email where not `cim_corporate_email_domain_search("all_email.recipient")` by _time, all_email.src_user, all_email.src_user_bunit span=1h \| `drop_dm_object_name("all_email")` \| stats avg(bytes) as avg, stdev(bytes) as stdev, count by src_user_bunit \| eval min=0 \| eval max=avg + 3*stdev \| xsupdateddcontext name="email_volume_1h_noncorp" class=src_user_bunit scope=app terms=`xs_default_magnitude_concepts` uom="email_volume_bytes" type=domain app=sa-identitymanagement \| stats count
MLTK: Identity - Email Activity to Non-corporate Domains by Users Per 1d - Model Gen \| tstats `summariesonly` sum(All_Email.size) as bytes, values(All_Email.recipient) as recipient from datamodel=Email.All_Email where NOT `cim_corporate_email_domain_search("All_Email.recipient")` by _time, All_Email.src_user, All_Email.src_user_bunit span=1h \| `drop_dm_object_name("All_Email")` \| fit DensityFunction bytes by src_user_bunit dist=norm into app:email_activity_to_non_corporate_by_user_1h
XS: Identity - Web Uploads to Non-corporate Domains by Users Per 1d - Context Gen \| tstats `summariesonly` sum(web.bytes) as bytes from datamodel=web.web where not(`cim_corporate_web_domain_search("web.url")`) (web.http_method="post" or web.http_method="put") by _time, web.user, web.user_bunit span=1h \| `drop_dm_object_name("web")`\| stats avg(bytes) as avg, stdev(bytes) as stdev, count by user_bunit \| eval min=0 \| eval max=avg + 3*stdev \| xsupdateddcontext name="web_volume_1h_noncorp" class=user_bunit scope=app terms=`xs_default_magnitude_concepts` uom="web_volume_bytes" type=domain app=sa-identitymanagement \| stats count
MLTK: Identity - Web Uploads to Non-corporate Domains by Users Per 1d - Model Gen \| tstats `summariesonly` sum(Web.bytes) as bytes from datamodel=Web.Web where NOT(`cim_corporate_web_domain_search("Web.url")`) (Web.http_method="POST" OR Web.http_method="PUT") by _time, Web.user, Web.user_bunit span=1h \| `drop_dm_object_name("Web")` \| fit DensityFunction bytes by user_bunit dist=norm into app:web_upload_to_non_corporate_by_user_1h
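The `by src_user_bunit` and `by user_bunit` clauses in these Model Gen searches fit a separate distribution for each business unit, so a volume that is normal for one unit can still be an outlier for another. The Python sketch below illustrates the grouping idea only. The `fit_by_group` helper, the business unit names, and the byte counts are all hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist

def fit_by_group(rows, value_field, group_field):
    """Fit one normal distribution per distinct value of group_field."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_field]].append(row[value_field])
    # NormalDist.from_samples needs at least two samples per group.
    return {g: NormalDist.from_samples(v) for g, v in grouped.items() if len(v) >= 2}

# Hypothetical hourly upload volumes (bytes) per business unit.
rows = [
    {"user_bunit": "emea", "bytes": 1000},
    {"user_bunit": "emea", "bytes": 1200},
    {"user_bunit": "emea", "bytes": 1100},
    {"user_bunit": "americas", "bytes": 50000},
    {"user_bunit": "americas", "bytes": 52000},
    {"user_bunit": "americas", "bytes": 51000},
]

models = fit_by_group(rows, "bytes", "user_bunit")

# The same 5000-byte upload sits far above one unit's history
# and far below the other's.
for bunit, model in sorted(models.items()):
    print(bunit, round(model.mean), model.cdf(5000) > 0.99)
```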

SA-NetworkProtection

XS: Network - Event Count By Signature Per Hour - Context Gen \| tstats `summariesonly` count as count_by_signature_1h from datamodel=intrusion_detection.ids_attacks by _time,ids_attacks.signature span=1h \| `drop_dm_object_name("ids_attacks")` \| `context_stats(count_by_signature_1h, signature)` \| search size>0 \| xscreateddcontext name=count_by_signature_1h class=signature container=ids_attacks type=median_centered terms="minimal,low,medium,high,extreme" scope=app app=sa-networkprotection \| stats count
MLTK: Network - Event Count By Signature Per Hour - Model Gen \| tstats `summariesonly` count as ids_attacks from datamodel=Intrusion_Detection.IDS_Attacks by _time,IDS_Attacks.signature span=1h \| `drop_dm_object_name("IDS_Attacks")` \| fit DensityFunction ids_attacks by signature dist=norm into app:count_by_signature_1h
XS: Network - Port Activity By Destination Port - Context Gen \| tstats `summariesonly` count as dest_port_traffic_count from datamodel=Network_Traffic.All_Traffic by All_Traffic.dest_port,_time span=1d \| `drop_dm_object_name("All_Traffic")` \| `context_stats(dest_port_traffic_count, dest_port)` \| search size>0 \| xscreateddcontext name=count_by_dest_port_1d class=dest_port container=network_traffic type=median_centered terms="minimal,low,medium,high,extreme" width=3 scope=app app=SA-NetworkProtection \| stats count
MLTK: Network - Port Activity By Destination Port - Model Gen \| tstats `summariesonly` count as dest_port_traffic_count from datamodel=Network_Traffic.All_Traffic by All_Traffic.dest_port,_time span=1d \| `drop_dm_object_name("All_Traffic")` \| fit DensityFunction dest_port_traffic_count by dest_port dist=norm into app:count_by_dest_port_1d
XS: Network - Substantial Increase In Intrusion Events - Rule \| tstats `summariesonly` count,values(ids_attacks.tag) as tag from datamodel=intrusion_detection.ids_attacks by ids_attacks.signature \| `drop_dm_object_name("ids_attacks")` \| xswhere count from count_by_signature_1h in ids_attacks by signature is above medium
MLTK: Network - Substantial Increase In Intrusion Events - Rule \| tstats `summariesonly` count as ids_attacks,values(IDS_Attacks.tag) as tag from datamodel=Intrusion_Detection.IDS_Attacks by IDS_Attacks.signature \| `drop_dm_object_name("IDS_Attacks")` \| `mltk_apply_upper("app:count_by_signature_1h", "high", "ids_attacks")`
XS: Network - Substantial Increase in Port Activity - Rule \| tstats `summariesonly` count,values(all_traffic.tag) as tag from datamodel=network_traffic.all_traffic by all_traffic.dest_port \| `drop_dm_object_name("all_traffic")` \| xswhere count from count_by_dest_port_1d in network_traffic by dest_port is extreme
MLTK: Network - Substantial Increase in Port Activity - Rule \| tstats `summariesonly` count as dest_port_traffic_count,values(All_Traffic.tag) as tag from datamodel=Network_Traffic.All_Traffic by All_Traffic.dest_port \| `drop_dm_object_name("All_Traffic")` \| `mltk_apply_upper("app:count_by_dest_port_1d", "extreme", "dest_port_traffic_count")`
XS: Network - Traffic Source Count Per 30m - Context Gen \| tstats `summariesonly` dc(all_traffic.src) as src_count from datamodel=network_traffic.all_traffic by _time span=30m \| stats count, median(src_count) as median, stdev(src_count) as size \| search size>0 \| xsupdateddcontext name=src_count_30m container=network_traffic terms="minimal,low,medium,high,extreme" type=median_centered width=3 app=sa-networkprotection scope=app \| stats count
MLTK: Network - Traffic Source Count Per 30m - Model Gen \| tstats `summariesonly` dc(All_Traffic.src) as src_count from datamodel=Network_Traffic.All_Traffic by _time span=30m \| fit DensityFunction src_count dist=norm into app:network_traffic_src_count_30m
XS: Network - Traffic Volume Per 30m - Context Gen \| tstats `summariesonly` count as total_count from datamodel=network_traffic.all_traffic by _time span=30m \| stats count, median(total_count) as median, stdev(total_count) as size \| search size>0 \| xsupdateddcontext name=count_30m container=network_traffic terms="minimal,low,medium,high,extreme" type=median_centered width=3 app=sa-networkprotection scope=app \| stats count
MLTK: Network - Traffic Volume Per 30m - Model Gen \| tstats `summariesonly` count as total_count from datamodel=Network_Traffic.All_Traffic by _time span=30m \| fit DensityFunction total_count dist=norm into app:network_traffic_count_30m
XS: Web - Web Event Count By Src By HTTP Method Per 1d - Context Gen \| tstats `summariesonly` count as web_event_count from datamodel=web.web by web.src, web.http_method, _time span=24h \| `drop_dm_object_name("web")` \| where match(http_method, "^[A-Za-z]+$") \| `context_stats(web_event_count, http_method)` \| eval min=0 \| eval max=median*2 \| xscreateddcontext name=count_by_http_method_by_src_1d container=web class=http_method app="sa-networkprotection" scope=app type=domain terms=`xs_default_magnitude_concepts` \| stats count
MLTK: Web - Web Event Count By Src By HTTP Method Per 1d - Model Gen \| tstats `summariesonly` count as web_event_count from datamodel=Web.Web by Web.src, Web.http_method, _time span=24h \| `drop_dm_object_name("Web")` \| where match(http_method, "^[A-Za-z]+$") \| fit DensityFunction web_event_count by http_method dist=norm into app:count_by_http_method_by_src_1d

SA-ThreatIntelligence

XS: Risk - Aggregated Other Risk \| tstats `summariesonly` sum(all_risk.risk_score) as current_count from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="other" by all_risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as historical_count from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="other" by all_risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from total_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Aggregated Other Risk \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="other" by All_Risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as historical_count from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="other" by All_Risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:total_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Aggregated Risk \| tstats `summariesonly` sum(all_risk.risk_score) as current_count from datamodel=risk.all_risk where earliest=-24h@h latest=+0s \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as historical_count from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h] \| `get_ksi_fields(current_count, historical_count)` \| xsfindbestconcept current_count from total_risk_by_object_type_1d in risk as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Aggregated Risk \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as historical_count from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h] \| `get_ksi_fields(current_count, historical_count)` \| `mltk_findbest("app:total_risk_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Aggregated System Risk \| tstats `summariesonly` sum(all_risk.risk_score) as current_count from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="system" by all_risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as historical_count from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="system" by all_risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from total_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Aggregated System Risk \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="system" by All_Risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as historical_count from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="system" by All_Risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:total_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Aggregated User Risk \| tstats `summariesonly` sum(all_risk.risk_score) as current_count from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="user" by all_risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as historical_count from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="user" by all_risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from total_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Aggregated User Risk \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="user" by All_Risk.risk_object_type \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as historical_count from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="user" by All_Risk.risk_object_type] \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:total_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Median Object Risk Per Day - Context Gen \| tstats `summariesonly` sum(all_risk.risk_score) as object_risk from datamodel=risk.all_risk by _time,all_risk.risk_object,all_risk.risk_object_type span=1d \| `drop_dm_object_name("all_risk")` \| `context_stats(object_risk, risk_object_type)` \| eval min=0 \| eval max=median*2 \| xsupdateddcontext app=sa-threatintelligence name=median_object_risk_by_object_type_1d container=risk class=risk_object_type type=domain scope=app \| stats count
MLTK: Risk - Median Object Risk Per Day - Model Gen \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk by _time,All_Risk.risk_object,All_Risk.risk_object_type span=1d \| `drop_dm_object_name("All_Risk")` \| fit DensityFunction current_count dist=norm into app:median_object_risk_1d
XS: Risk - Median Object Risk Per Day by Object Type - Context Gen N/A. This search has no XS equivalent. The original Risk - Median Object Risk Per Day - Context Gen search is split into two MLTK searches: Risk - Median Object Risk Per Day - Model Gen and Risk - Median Object Risk Per Day by Object Type - Model Gen.
MLTK: Risk - Median Object Risk Per Day by Object Type - Model Gen \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk by _time,All_Risk.risk_object,All_Risk.risk_object_type span=1d \| `drop_dm_object_name("All_Risk")` \| fit DensityFunction current_count by risk_object_type dist=norm into app:median_object_risk_by_object_type_1d
XS: Risk - Median Risk Score \| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-24h@h latest=+0s by all_risk.risk_object \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h by all_risk.risk_object \| stats median(accum_risk) as historical_count] \| `get_ksi_fields(current_count, historical_count)` \| xsfindbestconcept current_count from median_object_risk_by_object_type_1d in risk as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Median Risk Score \| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s by All_Risk.risk_object \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h by All_Risk.risk_object \| stats median(accum_risk) as historical_count] \| `get_ksi_fields(current_count, historical_count)` \| `mltk_findbest("app:median_object_risk_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Median Risk Score By Other \| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="other" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="other" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="other" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from median_object_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Median Risk Score By Other \| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="other" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="other" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="other" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:median_object_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Median Risk Score By System \| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="system" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="system" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="system" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from median_object_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Median Risk Score By System \| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="system" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="system" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="system" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:median_object_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Median Risk Score By User \| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-24h@h latest=+0s all_risk.risk_object_type="user" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk where earliest=-48h@h latest=-24h@h all_risk.risk_object_type="user" by all_risk.risk_object, all_risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="user" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("all_risk")` \| xsfindbestconcept current_count from median_object_risk_by_object_type_1d in risk by risk_object_type as current_count_qual \| xsfindbestconcept delta from percentile in default as delta_qual
MLTK: Risk - Median Risk Score By User \| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-24h@h latest=+0s All_Risk.risk_object_type="user" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as current_count \| appendcols [\| tstats `summariesonly` sum(All_Risk.risk_score) as accum_risk from datamodel=Risk.All_Risk where earliest=-48h@h latest=-24h@h All_Risk.risk_object_type="user" by All_Risk.risk_object, All_Risk.risk_object_type \| stats median(accum_risk) as historical_count] \| eval risk_object_type="user" \| `get_ksi_fields(current_count, historical_count)` \| `drop_dm_object_name("All_Risk")` \| `mltk_findbest("app:median_object_risk_by_object_type_1d")` \| `get_percentage_qualitative(delta, delta_qual)`
XS: Risk - Total Risk By Risk Object Type Per Day - Context Gen \| tstats `summariesonly` sum(all_risk.risk_score) as accum_risk from datamodel=risk.all_risk by _time,all_risk.risk_object_type span=1d \| `drop_dm_object_name("all_risk")` \| `context_stats(accum_risk, risk_object_type)` \| eval min=0 \| eval max=median*2 \| xsupdateddcontext app=sa-threatintelligence name=total_risk_by_object_type_1d container=risk class=risk_object_type type=domain scope=app \| stats count
MLTK: Risk - Total Risk By Risk Object Type Per Day - Model Gen \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk by _time,All_Risk.risk_object_type span=1d \| `drop_dm_object_name("All_Risk")` \| fit DensityFunction current_count by risk_object_type dist=norm into app:total_risk_by_object_type_1d
XS: Risk - Total Risk Per Day - Context Gen N/A. No XS equivalent exists: the original Risk - Total Risk By Risk Object Type Per Day - Context Gen was split into two MLTK searches, Risk - Total Risk By Risk Object Type Per Day - Model Gen and Risk - Total Risk Per Day - Model Gen.
MLTK: Risk - Total Risk Per Day - Model Gen \| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk by _time span=1d \| `drop_dm_object_name("All_Risk")` \| fit DensityFunction current_count dist=norm into app:total_risk_1d
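
After a Model Gen search has run, you might want to check how its model scores the data it was built from. The following is a minimal sketch, not a search that ships with Splunk Enterprise Security. It reuses the tstats clause, the current_count field, and the app:total_risk_1d model name from Risk - Total Risk Per Day - Model Gen. It assumes that apply accepts the same app: prefix that fit uses, and that DensityFunction writes its result to a field named IsOutlier(current_count).

```spl
| tstats `summariesonly` sum(All_Risk.risk_score) as current_count from datamodel=Risk.All_Risk by _time span=1d
| apply app:total_risk_1d
| search "IsOutlier(current_count)"=1
```

Unlike fit, apply only reads the model, so running this sketch does not replace the model that the Model Gen search generated. The field name passed to apply must match the one used at fit time, which is why the sum is aliased to current_count here.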

SA-Utils

XS: ESS - Percentile - Context Gen \| xscreateudcontext scope=app container=default name=percentile terms="extreme,high,medium,low,minimal,low,medium,high,extreme" type=domain uom="percentage" min=-100 max=100 count=1 \| stats count

Audit searches using an MLTK Model

The "Audit - Searches using an MLTK Model" saved search helps you audit your model-generating searches and the rules that apply the resulting models.

For example, the following search finds the search named "Network - Traffic Source Count Per 30m - Model Gen", which builds the network_traffic_src_count_30m model with fit densityfunction. It also finds the rule named "Network - Unusual Volume of Network Activity - Rule", which uses apply to score data against the model and the `get_qualitative_upper_threshold(extreme)` macro to set the threshold for outliers.

Example search:

| savedsearch "Audit - Searches using an MLTK Model" model_name=network_traffic_src_count_30m

Example results:

eai:acl.app	title	search
SA-NetworkProtection	Network - Traffic Source Count Per 30m - Model Gen	tstats `summariesonly` dc(all_traffic.src) as src_count from datamodel=network_traffic.all_traffic by _time span=30m \| fit densityfunction src_count dist=norm into app:network_traffic_src_count_30m
DA-ESS-NetworkProtection	Network - Unusual Volume of Network Activity - Rule	tstats `summariesonly` dc(all_traffic.src) as src_count,count as total_count from datamodel=network_traffic.all_traffic \| localop \| apply network_traffic_src_count_30m [\|`get_qualitative_upper_threshold(extreme)`] \| apply network_traffic_count_30m [\|`get_qualitative_upper_threshold(extreme)`] \| search "isoutlier(src_count)"=1 or "isoutlier(total_count)"=1
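
You can point the same audit at any other model name listed in this topic. As a sketch, assuming the saved search matches the model_name argument the same way it does in the example above, the following looks for searches that reference the median_object_risk_by_object_type_1d model:

```spl
| savedsearch "Audit - Searches using an MLTK Model" model_name=median_object_risk_by_object_type_1d
```

Based on the searches listed in this topic, you would expect the results to include Risk - Median Object Risk Per Day by Object Type - Model Gen, which fits the model, and key indicators such as Risk - Median Risk Score By User, which pass the model to the `mltk_findbest` macro. Whether macro-based references are matched depends on how the audit search is implemented, so verify the output rather than relying on it.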
