Troubleshooting the deep dives
Use the following questions and answers to troubleshoot the deep dives.
Can I include additional source types in my model?
In the Deep dive: Using ML to identify user access anomalies, if you want this analytic to cover other data sources, you need to change the base tstats search to include those other source types. As long as the count still represents failed logons, the rest of the search does not need to change.
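As a minimal sketch, a base search built on the Authentication data model might look like the following. The data model, action value, span, and field names here are assumptions; substitute your own source types and fields:

| tstats count from datamodel=Authentication where Authentication.action="failure" by _time span=10m, Authentication.user
| rename Authentication.user as user

Any base search that produces a count of failed logons per entity per time bucket can feed the rest of the pipeline unchanged.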
Can I use this approach to find outliers in error rates from other data sources?
In the Deep dive: Using ML to detect outliers in error message rates, the timechart, eval, fit, and apply stages of the search can run against other data sources. Just update the base search to find errors in other log types.
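As a minimal sketch, assuming splunkd internal logs as the new data source (the index, sourcetype, field names, span, and model name here are illustrative):

index=_internal sourcetype=splunkd log_level=ERROR
| timechart span=1h count as error_count
| eval HourOfDay=strftime(_time, "%H")
| fit DensityFunction error_count by HourOfDay into app:hourly_error_rates

On later searches, replace the fit stage with | apply app:hourly_error_rates to score new data against the saved model.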
Can I use this approach to find outliers in server response time from other data sources?
In the Deep dive: Using ML to detect outliers in server response time, the timechart, eval, fit, and apply stages of the search can run against other data sources. Just update the base search to extract response times from other log types.
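For example, assuming web access logs that carry a response_time field (the index, sourcetype, field names, and model name here are illustrative):

index=web sourcetype=access_combined
| timechart span=10m avg(response_time) as avg_response_time
| fit DensityFunction avg_response_time into app:response_time_baseline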
Can I use the host field in the DensityFunction model training by clause?
While the host or even user fields might provide an added layer of granularity to your searches by creating baselines at an entity level, they can quickly increase the processing time and requirements for DensityFunction. Typically, if there are more than 1,000 distinct combinations of values in the by clause, DensityFunction is not the best approach. You can take an alternative approach using stats and lookups, as in the sketch below. For a great example from IG Group at .conf21 on how to handle high cardinality data with this pattern, see Anomaly Detection, Sealed with a KISS.
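A minimal sketch of the stats-and-lookups pattern, assuming a per-user count of failures (all field, file, and search fragments here are hypothetical). First, compute and save a baseline:

... your base search ...
| stats avg(failures) as avg_failures, stdev(failures) as stdev_failures by user
| outputlookup user_logon_baselines.csv

Then, in the detection search, retrieve the baseline and flag entities beyond, say, three standard deviations:

... your base search ...
| lookup user_logon_baselines.csv user
| eval isOutlier=if(failures > avg_failures + 3*stdev_failures, 1, 0)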
Calculating a 5-minute aggregate is taking a long time to compute. Is there anything I can do?
Although the search here aggregates over 5-minute time spans, you might find that it performs better if you aggregate over larger time frames, such as hourly.
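For example, in the timechart stage of the search (an illustrative snippet), changing

| timechart span=5m count as error_count

to

| timechart span=1h count as error_count

reduces the number of rows DensityFunction has to fit by a factor of twelve.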
I'm finding too many outliers in my data. What can I do?
See the Tune the model section of the deep dive topic you are working on. In particular, look at how you can tune detection sensitivity with the threshold parameter.
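For example, lowering threshold from the DensityFunction default of 0.01 flags a smaller fraction of points as outliers (the field and model names here are illustrative):

| fit DensityFunction error_count threshold=0.005 into app:hourly_error_rates

You can also override the threshold at apply time, for example | apply app:hourly_error_rates threshold=0.005, without retraining the model.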
I don't understand how DensityFunction is identifying outliers. How can I find out more about what the algorithm is doing with my data?
You can use the summary command for information about the models generated using DensityFunction. You can see the distribution type the model has mapped your data to, some statistics about the data distribution, and a cardinality field that tells you how many records have been used to train the model.
Two key metrics to investigate are cardinality and the Wasserstein distance. For cardinality, higher is better. For the Wasserstein distance, which tells you how closely the fitted probability distribution matches your actual data, lower is better.
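For example, to inspect a saved model (the model name here is illustrative):

| summary app:hourly_error_rates

The output includes the fitted distribution type along with the cardinality and Wasserstein distance values described above.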
Are there any gotchas I need to know about?
There are situations where DensityFunction incorrectly identifies outliers when data is mapped to the beta distribution with certain parameters, for example alpha=beta=0.5. Rather than running with the default setting of dist_type=auto, select the distribution type explicitly when fitting the model, for example by choosing dist_type=normal in the fit step.
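For example, pinning the distribution type during training (using the dist_type value named above; the field and model names are illustrative):

| fit DensityFunction error_count dist_type=normal into app:hourly_error_rates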