Predictive Analytics performance considerations in ITSI
When you train and test models in ITSI, take several performance considerations into account. This topic provides guidance for choosing services to model, selecting training inputs, and retraining models.
While it's impossible to provide prescriptive advice for maximizing performance in every situation, the following observations and tips can help you tune and improve performance in your unique environment:
- Avoid creating models for services that contain more than approximately 20 KPIs and 50 entities. The training time for a model depends on three factors: the number of KPIs in the service, the number of entities in the service, and the frequency of KPI searches. For a service with 20 total KPIs, 50 total entities, and 1-minute KPI searches, the average training time is 5 minutes. Scaling any of these three factors increases the training time.
- Do not create models for more than 75 total services. Management of models is manual and can be difficult if you create too many.
- Configure the MLTK to handle more memory and events. For more information, see Set up Predictive Analytics in ITSI.
- Use at least 14 days' worth of data to train your model; 30 days or more is recommended. The training period is determined by the time period you specify before training a model. In general, it is best to use as much data as you have available to train a model.
- Base your training/test split ratio on the size of your dataset. Use a large split when dealing with large datasets (for example, 1 million data points). Use a smaller split when dealing with smaller datasets (for example, 10,000 data points). For more information, see Split your data into training and test sets.
- Do not change the Test Period until you select a final model for the service. Leaving the data in separate train and test partitions, as configured in the training/test split, provides an honest assessment of the model's performance.
- Retrain a service's model if KPIs or entities are added, removed, or changed. Changes in service architecture can cause changes in KPI behavior and service health score trends. For more information, see Retrain a predictive model in ITSI.
- Before retraining a model, test it on at least 90 minutes of data. If you test on fewer than 90 minutes, the results could be inaccurate or incomplete.
- Do not use the training search as a scheduled search. Predictive Analytics models do not require repeated retraining, such as retraining every night, and retraining too often can cause problems at scale. Training is specifically designed to be an intermittent expense.
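The numeric guidelines in the list above can be collected into a simple pre-training check. The following Python sketch is illustrative only and is not part of ITSI or the MLTK: the `ModelPlan` class, the `check_plan` function, and all field names are hypothetical, and the values would have to be gathered by hand from your own environment. It also assumes that exceeding either the KPI or the entity guideline is worth a warning, since each factor increases training time on its own.

```python
from dataclasses import dataclass

# Thresholds taken from the guidelines above. The KPI and entity limits
# are approximate in the guidance, so treat them as soft limits.
MAX_KPIS = 20
MAX_ENTITIES = 50
MAX_MODELED_SERVICES = 75
MIN_TRAINING_DAYS = 14
RECOMMENDED_TRAINING_DAYS = 30
MIN_TEST_MINUTES = 90


@dataclass
class ModelPlan:
    """Hypothetical description of one planned model; not an ITSI object."""
    service_name: str
    kpi_count: int
    entity_count: int
    training_days: int
    test_minutes: int


def check_plan(plan: ModelPlan, modeled_service_count: int) -> list[str]:
    """Return one warning per guideline that the plan does not meet.

    modeled_service_count is the total number of services that would have
    models, including this one.
    """
    warnings = []
    if plan.kpi_count > MAX_KPIS:
        warnings.append(f"{plan.service_name}: more than {MAX_KPIS} KPIs")
    if plan.entity_count > MAX_ENTITIES:
        warnings.append(f"{plan.service_name}: more than {MAX_ENTITIES} entities")
    if modeled_service_count > MAX_MODELED_SERVICES:
        warnings.append(f"more than {MAX_MODELED_SERVICES} services have models")
    if plan.training_days < MIN_TRAINING_DAYS:
        warnings.append(f"{plan.service_name}: fewer than {MIN_TRAINING_DAYS} days of training data")
    elif plan.training_days < RECOMMENDED_TRAINING_DAYS:
        warnings.append(f"{plan.service_name}: fewer than the recommended {RECOMMENDED_TRAINING_DAYS} days of training data")
    if plan.test_minutes < MIN_TEST_MINUTES:
        warnings.append(f"{plan.service_name}: tested on fewer than {MIN_TEST_MINUTES} minutes of data")
    return warnings
```

For example, `check_plan(ModelPlan("web", 25, 10, 10, 60), 80)` returns four warnings, while a plan that sits exactly at every guideline returns an empty list.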
More performance help
If you experience performance issues, or want to receive feedback tailored to your setup, you have the following options:
- Post a request to the community on Splunk Answers.
- Contact Splunk Support.
This documentation applies to the following versions of Splunk® IT Service Intelligence: 4.11.0, 4.11.1, 4.11.2, 4.11.3, 4.11.4, 4.11.5, 4.11.6, 4.12.0 Cloud only, 4.12.1 Cloud only, 4.12.2 Cloud only, 4.13.0, 4.13.1, 4.13.2, 4.13.3, 4.14.0 Cloud only, 4.14.1 Cloud only, 4.14.2 Cloud only, 4.15.0, 4.15.1, 4.15.2, 4.15.3, 4.16.0 Cloud only, 4.17.0, 4.17.1, 4.18.0, 4.18.1, 4.19.0, 4.19.1, 4.19.2