Splunk® Enterprise

Knowledge Manager Manual

Accelerate data models

Data model acceleration speeds up data models that represent extremely large datasets. After acceleration, pivots based on accelerated data model objects complete more quickly than they did before, as do reports and dashboard panels that are based on those pivots.

Data model acceleration does this with the help of the High Performance Analytics Store functionality, which builds data summaries behind the scenes in a manner similar to that of report acceleration. Like report acceleration summaries, data model acceleration summaries are easy to enable and disable, and are stored on your indexers parallel to the index buckets that contain the events that are being summarized.

This topic covers:

  • The differences between data model acceleration, report acceleration, and summary indexing.
  • How you enable persistent acceleration for data models.
  • How the Splunk platform builds data model acceleration summaries.
  • How you can query data model acceleration summaries with the tstats command.
  • Advanced configurations for persistently accelerated data models.

This topic also explains ad hoc data model acceleration. The Splunk platform applies ad hoc data model acceleration whenever you build a pivot with an unaccelerated object, including search- and transaction-based objects, which can't be accelerated in a persistent fashion. However, any acceleration benefit you obtain is lost the moment you leave the Pivot Editor or switch objects during a session with the Pivot Editor. These disadvantages do not apply to "persistently" accelerated objects, which always load with acceleration whenever they're accessed via Pivot. In addition, unlike persistent data model acceleration, ad hoc acceleration is not applied to reports or dashboard panels built with Pivot.

How data model acceleration differs from report acceleration and summary indexing

This is how data model acceleration differs from report acceleration and summary indexing:

  • Report acceleration and summary indexing speed up individual searches, on a report-by-report basis. They do this by building collections of precomputed search result aggregates.
  • Data model acceleration speeds up reporting for the entire set of attributes (fields) that you define in a data model and that you and your Pivot users want to report on. In effect, it accelerates the dataset represented by that collection of fields rather than a particular search against that dataset.

What is a high-performance analytics store?

Data model acceleration summaries are composed of multiple time-series index files, which have the .tsidx file extension. Each .tsidx file contains records of the indexed field::value combos in the selected dataset and all of the index locations of those field::value combos. It's these .tsidx files that make up the high-performance analytics store. Collectively, the .tsidx files are optimized to accelerate a range of analytical searches involving a specific set of fields--the set of fields defined as attributes in the accelerated data model.

An accelerated data model's high-performance analytics store spans a "summary range". This is a range of time that you select when you enable acceleration for the data model. When you run a pivot on an accelerated dataset, the pivot's time range must fall at least partly within this summary range in order to get an acceleration benefit. For example, if you have a data model that accelerates the last month of data but you create a pivot using one of this data model's objects that runs over the past year, the pivot will initially only get acceleration benefits for the portion of the search that runs over the past month.

The .tsidx files that make up a high-performance analytics store for a single data model are always distributed across one or more of your indexers. This is because the Splunk platform creates .tsidx files on the indexer, parallel to the buckets that contain the events referenced in the file and which cover the range of time that the summary spans.

Note: The high-performance analytics store created through persistent data model acceleration is different from the summaries created through ad hoc data model acceleration. Ad hoc summaries are always created in a dispatch directory at the search head. For more information about ad hoc data model acceleration, see the subtopic "About ad hoc data model acceleration," below.

Enable persistent acceleration for a data model

You use the Edit Acceleration dialog to enable persistent acceleration for a data model.

1. Open the Edit Acceleration dialog. There are three ways to get to this dialog:

  • Navigate to the Data Models management page, find the model you want to accelerate, and click Edit and select Edit Acceleration.
  • Navigate to the Data Models management page, expand the row of the data model you want to accelerate, and click Add for ACCELERATION.
  • Open the Data Model Editor for a data model, click Edit and select Edit Acceleration.

2. Select Accelerate to enable acceleration for the data model.

3. Choose a Summary Range.

The Summary Range can span 1 Day, 7 Days, 1 Month, 3 Months, 1 Year, or All Time. It represents the time range over which you plan to run pivots against the accelerated objects in the data model. For example, if you only want to run pivots over periods of time within the last seven days, choose 7 Days. For more information about this setting, see the subtopic "About the summary range," below.
If you require a different summary range than the ones supplied by the Summary Range field, you can configure it for your data model in datamodels.conf.

Note: Smaller time ranges mean smaller .tsidx files that require less time to build and take up less space on disk, so you may want to choose shorter ranges when you can.

For more details on the summary range setting and how it works, see "About the summary range," below.
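For example, a custom summary range of two months could be set with the acceleration.earliest_time parameter in datamodels.conf. This is a sketch; the stanza name your_model_name is a placeholder for your data model's actual name:

[your_model_name]
acceleration = true
acceleration.earliest_time = -2mon

The parameter takes a relative time modifier; events that are older than this time are not summarized.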

Data model acceleration caveats

There are a number of restrictions on the kinds of data model objects that can be accelerated.

  • Data model acceleration only affects event object hierarchies. Object hierarchies based on root search and root transaction objects are not accelerated.
    • Pivots that use unaccelerated objects fall back to _raw data, which means that they initially run slower. However, they can receive some acceleration benefit from ad hoc data model acceleration. See "About ad hoc data model acceleration" at the end of this topic.
  • Data model acceleration is most efficient if the root event objects being accelerated include in their initial constraint search the index(es) that the Splunk platform should search over. A single high-performance analytics store can span several indexes across multiple indexers. If you know that all of the data that you want to pivot on resides in a particular index or set of indexes, you can speed things up by telling the Splunk platform where to look. Otherwise, the Splunk platform may waste time accelerating data that is not of use to you.

For the full list of restrictions and caveats on data model usage see the list in "Managing Data Models," in this manual.

After you enable acceleration for a data model

After you enable persistent acceleration for your data model, the Splunk platform begins building a data model acceleration summary for the data model that spans the summary range that you've specified. The Splunk platform creates the .tsidx files for the summary in indexes that contain events that have the fields specified in the data model. It stores the .tsidx files parallel to their corresponding index buckets in a manner identical to that of report acceleration summaries.

After the Splunk platform builds the data model acceleration summary, it runs scheduled searches on a 5 minute interval to keep it updated. Every 30 minutes, the Splunk platform runs a maintenance process to remove old, outdated .tsidx summary files. You can adjust these intervals in datamodels.conf and limits.conf, respectively.
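For example, to have the update searches run every 15 minutes instead of every 5, you might override acceleration.cron_schedule in the model's datamodels.conf stanza. This is a sketch; the stanza name your_model_name is a placeholder:

[your_model_name]
acceleration = true
acceleration.cron_schedule = */15 * * * *

Longer intervals reduce scheduler load but mean that the newest events take longer to appear in the summary.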

A few facts about data model acceleration summaries:

  • Each bucket in each index in a Splunk platform instance can have one or more data model acceleration summary .tsidx files, one for each accelerated data model for which it has relevant data. The Splunk platform creates these summaries as it collects data.
  • The Splunk platform restricts summaries to a particular search head (or search head pool id) to account for different extractions that may produce different results for the same search string.
  • You can only accelerate data models that you have shared to all users of an app or shared globally to all users of your Splunk platform instance. You cannot accelerate data models that are private. This prevents individual users from taking up disk space with private data model acceleration summaries.

Note: If necessary you can configure the location of data model acceleration summaries via indexes.conf.

About the summary range

Data model acceleration summary ranges span an approximate range of time. A data model acceleration summary can occasionally store slightly more data than its summary range, but it never covers less than that range, except while the Splunk platform is first building it.

When the Splunk platform finishes building a data model acceleration summary, its data model summarization process ensures that the summary always covers its summary range. The process periodically removes older summary data that passes out of the summary range.

If you have a pivot that is associated with an accelerated data model object, that pivot completes fastest when you run it over a time range that falls within the summary range of the data model. The Splunk platform runs the pivot against the data model acceleration summary rather than the source index _raw data. The summary has far less data than the source index, which means that the pivot completes faster than it does on its initial run.

If you run the same pivot over a time range that is only partially covered by the summary range, the pivot is slower to complete. The Splunk platform has to run at least part of the pivot search over the source index _raw data in the index, which means it must parse through a larger set of events. So it is best to set the Summary Range for a data model wide enough that it captures all of the searches you plan to run against it.

Note: There are advanced settings related to Summary Range that you can use if you have a large Splunk implementation that involves multi-terabyte datasets. Implementations of that size can lead to situations where the search required to build the initial data model acceleration summary runs too long and/or is resource intensive. For more information, see the subtopic "Advanced configurations for persistently accelerated data models," below.

Summary range example

You create a data model and accelerate it with a Summary Range of 7 days. The Splunk platform builds a summary for your data model that approximately spans the past 7 days and then maintains it over time, periodically updating it with new data and removing data that is older than seven days.

You run a pivot over a time range that falls within the last week, and it completes fairly quickly. But if you run the same pivot over a range that starts 10 days ago and ends 3 days ago, it does not complete as quickly, even though this search also covers 7 days of data. Only the part of the search that runs over the last 3 to 7 days benefits from running against the data model acceleration summary. The portion of the search that runs over the last 8 to 10 days runs over raw data and is not accelerated. In cases like this, the Splunk platform returns the accelerated results from summaries first, and then fills in the gaps at a slower speed.

Keep this in mind when you set the Summary Range value. If you always plan to run pivots over time ranges that exceed the past 7 days but don't extend further back than 30 days, select a Summary Range of 1 month when you set up acceleration for that data model.

How the Splunk platform builds data model acceleration summaries

When you enable acceleration for a data model, the Splunk platform builds the initial set of .tsidx file summaries for the data model and then runs scheduled searches in the background every 5 minutes to keep those summaries up to date. Each update ensures that the entire configured time range is covered without a significant gap in data. This method of summary building also ensures that late-arriving data is summarized without complication.

Parallel summarization

Data model acceleration summaries utilize parallel summarization by default. This means that the Splunk platform runs two concurrent search jobs to build .tsidx summary files instead of one. It also runs two concurrent searches on a 5 minute schedule to maintain those summary files. Parallel summarization decreases the amount of time it takes for the Splunk platform to build and maintain data model acceleration summaries.

There is a cost for this improvement in summarization search performance. The concurrent searches count against the total number of concurrent search jobs that your Splunk platform instance can run, which means that they can cause increased indexer resource usage.

If you find that the default parallel summarization setting of two concurrent summary building and maintenance searches per summary is a burden, you can reduce it to a single search by changing a setting in datamodels.conf for the data model or models in question.

1. Open the datamodels.conf file in your Splunk platform instance that has the data model that you want to update summarization settings for.

2. Locate the stanza for the data model.

3. Add acceleration.max_concurrent = 1 if that parameter is not present in the stanza.

If it is present, change its value to 1.

4. Save your changes.

In general we do not recommend increasing acceleration.max_concurrent to a value higher than 2. However, if your Splunk platform implementation has the capacity for a large amount of search concurrency, you can try setting acceleration.max_concurrent to 3 or higher for selected accelerated data models.
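Put together, the steps above amount to a stanza like the following sketch, where your_model_name stands in for the data model's actual name:

[your_model_name]
acceleration = true
acceleration.max_concurrent = 1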

Review summary creation metrics

The speed of summary creation depends on the number of events involved and the size of the summary range. You can track progress toward summary completion on the Data Models management page. Find the accelerated data model that you want to inspect, expand its row, and review the information that appears under ACCELERATION.

Status tells you whether the acceleration summary for the data model is complete. If the summary is in Building status, Status tells you what percentage of the summary is complete. Data model acceleration summaries constantly update with new data. A summary that is "complete" now will return to "building" status later when it updates with new data.

Note: When the Splunk software calculates the acceleration status for a data model, it bases its calculations on the Schedule Window that you have set for the data model. However, if you have set a backfill relative time range for the data model, the Splunk software uses that time range to calculate acceleration status instead.

You might set up a backfill time range for a data model when the search that populates the data model acceleration summaries takes an especially long time to run. See Advanced configurations for persistently accelerated data models, in this manual.

Verify that the Splunk platform is scheduling summary update searches

You can verify that the Splunk platform is scheduling searches to update your data model acceleration summaries. In log.cfg, set category.SavedSplunker=DEBUG and then watch scheduler.log for events like:

04-24-2013 11:12:02.357 -0700 DEBUG SavedSplunker - Added 1 scheduled searches for accelerated datamodels to the end of ready-to-run list

Data model acceleration summary size on disk

You can use the data model metrics on the Data Models management page to track the total size of a data model's summary on disk. Summaries do take up space, and sometimes a significant amount of it, so it's important that you avoid overuse of data model acceleration. For example, you may want to reserve data model acceleration for data models whose pivots are heavily used in dashboard panels.

The amount of space that a data model takes up is related to the number of events that you are collecting for the summary range you've chosen. It can also be negatively affected if the data model includes attributes with high cardinality (that have a large set of unique values), such as a Name attribute.

If you are particularly size constrained you may want to test the amount of space a data model acceleration summary will take up by enabling acceleration for a small Summary Range first, and then moving to a larger range if you think you can afford it.

Where data model acceleration summaries are created and stored

By default, the Splunk platform creates each data model acceleration summary on the indexer, parallel to the bucket or buckets that cover the range of time that the summary spans. For example, for the "index1" index, the summaries reside under $SPLUNK_HOME/var/lib/splunk/index1/datamodel_summary.

You can override this default index path by making it the definition of the tstatsHomePath parameter in indexes.conf. For information about using tstatsHomePath when setting up size-based retention for your summary indexes, see "Configure size-based retention for data model acceleration summaries," in this topic.

Indexer clusters do not replicate data model acceleration summaries or otherwise accommodate their presence. If primacy is reassigned from the original copy of a bucket to another copy (for example, because the peer holding the primary copy fails), the summary does not move to the peer with the new primary copy, so it becomes unavailable. It is not available again until the next time the Splunk platform attempts to update the summary, finds that it is missing, and regenerates it.

Note: Until the Splunk platform regenerates your summaries, searches against your accelerated data models still return correct results, but they run more slowly.

For more information about how indexer clusters handle data model acceleration summaries, see "How search works in an indexer cluster" in the Managing Indexers and Clusters of Indexers manual.

Configure size-based retention for data model acceleration summaries

Do you set size-based retention limits for your indexes so they do not take up too much disk storage space? By default, data model acceleration summaries can take up an unlimited amount of disk space. This can be a problem if you're also locking down the maximum data size of your indexes or index volumes. The good news is that you can optionally configure similar retention limits for your data model acceleration summaries.

Note: Before attempting to configure size-based retention for your data model acceleration summaries, you should first understand how to use volumes to configure limits on index size across indexes, as many of the principles are the same. For more information, see "Configure index size" in Managing Indexers and Clusters.

By default, data model acceleration summaries reside in a predefined volume titled _splunk_summaries at the following path:

 $SPLUNK_DB/<index_name>/datamodel_summary/<bucket_id>/<search_head_or_pool_id>/DM_<datamodel_app>_<datamodel_name>

This volume initially has no maximum size specification, which means that it has infinite retention.

Also by default, the tstatsHomePath parameter is specified only once as a default setting in indexes.conf. Its path is inherited by all indexes. In etc/system/default/indexes.conf:

[default]
[....]
tstatsHomePath = volume:_splunk_summaries/$_index_name/datamodel_summary
[....]

You can override this default behavior by specifying tstatsHomePath for a specific index and pointing to a different volume that you have defined. You can also add size limits to any volume (including _splunk_summaries) by setting a maxVolumeDataSizeMB parameter in the volume configuration.

Here are the steps you take to set up size-based retention for data model acceleration summaries. All of the configurations described are made within indexes.conf.

1. (Optional) If you want to have data model acceleration summary results go into volumes other than _splunk_summaries, create them.

2. Add maxVolumeDataSizeMB parameters to the volume or volumes that will be the home for your data model acceleration summary data, such as _splunk_summaries.

This parameter manages size-based retention for data model acceleration summaries across your indexers.

3. Update your index definitions.

Set a tstatsHomePath parameter for each index that deals with data model acceleration summary data. Ensure that the path is pointing to the data model acceleration summary data volume that you identified in Step 2.
If you defined multiple volumes for your data model acceleration summaries, make sure that the tstatsHomePath settings for your indexes point to the appropriate volumes.

This example configuration sets up data size limits for data model acceleration summaries on the _splunk_summaries volume.

########################
# Default settings
########################

# When you do not provide the tstatsHomePath value for an index, 
# the index inherits the default volume, which gives the index a data 
# size limit of 1TB. 
[default]
maxDataSize = 1000000
tstatsHomePath = volume:_splunk_summaries/$_index_name/datamodel_summary

#########################
# Volume definitions
#########################

# Indexes with tstatsHomePath values pointing at this partition have 
# a data size limit of 100GB.  
[volume:_splunk_summaries]
path = $SPLUNK_DB
maxVolumeDataSizeMB = 100000

#########################
# Index definitions
#########################

[main]
homePath   = $SPLUNK_DB/defaultdb/db
coldPath   = $SPLUNK_DB/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
maxMemMB = 20
maxConcurrentOptimizes = 6
maxHotIdleSecs = 86400
maxHotBuckets = 10
maxDataSize = auto_high_volume

[history]
homePath   = $SPLUNK_DB/historydb/db
coldPath   = $SPLUNK_DB/historydb/colddb
thawedPath = $SPLUNK_DB/historydb/thaweddb
tstatsHomePath = volume:_splunk_summaries/historydb/datamodel_summary
maxDataSize = 10
frozenTimePeriodInSecs = 604800

[dm_acceleration]
homePath   = $SPLUNK_DB/dm_accelerationdb/db
coldPath   = $SPLUNK_DB/dm_accelerationdb/colddb
thawedPath = $SPLUNK_DB/dm_accelerationdb/thaweddb

[_internal]
homePath   = $SPLUNK_DB/_internaldb/db
coldPath   = $SPLUNK_DB/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
tstatsHomePath = volume:_splunk_summaries/_internaldb/datamodel_summary

When a data model acceleration summary volume reaches its size limit, the Splunk platform volume manager removes the oldest summary in the volume to make room. When the volume manager removes a summary, it leaves the preexisting "done" file intact. The summarizer does not rebuild the bucket when the "done" file is present.

Note: You can configure size-based retention for report acceleration summaries in much the same way that you do for data model acceleration summaries. The primary difference is that there is no default volume for report acceleration summaries. If you are trying to manage both kinds of summaries you may decide it is easiest to have both summaries use the default _splunk_summaries volume. For more information about this strategy for managing size-based retention of both acceleration summary types, see "Manage report acceleration" in this manual.

Query data model acceleration summaries

You can query the high-performance analytics store for a specific accelerated data model in Search with the tstats command.

tstats can sort through the full set of .tsidx file summaries that belong to your accelerated data model even when they are distributed among multiple indexes.

This can be a way to quickly run a stats-based search against a particular data model just to see if it's capturing the data you expect for the summary range you've selected.

To do this, you identify the data model using FROM datamodel=<datamodel-name>:

| tstats avg(foo) FROM datamodel=buttercup_games WHERE bar=value2 baz>5

The above query returns the average of the field foo in the "Buttercup Games" data model acceleration summaries, specifically where bar is value2 and the value of baz is greater than 5.

Note: You don't have to specify the app of the data model, as the Splunk platform takes this from the search context (the app you are in). However, you cannot query an accelerated data model in App B from App A unless the data model in App B is shared globally.
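You can also split tstats results with a BY clause over fields in the data model. For example, this sketch, which reuses the hypothetical buttercup_games model and its bar field from the example above, counts summarized events for each value of bar:

| tstats count FROM datamodel=buttercup_games BY bar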

Using the summariesonly argument

The summariesonly argument of the tstats command enables you to get specific information about data model acceleration summaries.

This example uses the summariesonly argument to get the time range of the summary for an accelerated data model named mydm.

| tstats summariesonly=t min(_time) as min, max(_time) as max from datamodel=mydm | eval prettymin=strftime(min, "%c") | eval prettymax=strftime(max, "%c")

This example uses summariesonly in conjunction with timechart to reveal what data has been summarized over a selected time range for an accelerated data model titled mydm.

| tstats summariesonly=t prestats=t count from datamodel=mydm by _time span=1h | timechart span=1h count

For more about the tstats command, including the usage of tstats to query normal indexed data, see the entry for tstats in the Search Reference.

Enable multi-eval to improve data model acceleration

Searches against root event objects within data models iterate through many eval commands, which can be an expensive operation to complete during data model acceleration. You can improve data model search efficiency by enabling multi-eval calculations for search in limits.conf.

enable_datamodel_meval = <bool>
* Enable concatenation of successively occurring evals into a single
  comma-separated eval during generation of data model searches.
* Default: true
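To set this explicitly, you would add the setting to limits.conf. This sketch assumes the setting belongs in the [search] stanza; check the limits.conf specification for your Splunk version before applying it:

[search]
enable_datamodel_meval = true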

If you disabled automatic rebuilds for any accelerated data model, you need to rebuild that data model manually after enabling multi-eval calculations. For more information about rebuilding data models, see "Manage data models".

Advanced configurations for persistently accelerated data models

There are a few very specific situations that may require you to set up advanced configurations for your persistently accelerated data models in datamodels.conf.

When summary-populating searches take too long to run

If your Splunk platform implementation processes an extremely large amount of data on a regular basis you may find that the initial creation of persistent data model acceleration summaries is resource intensive. The searches that build these summaries may run too long, causing them to fail to summarize incoming events. To deal with this situation, the Splunk platform gives you two configuration parameters, both in datamodels.conf. These parameters are acceleration.max_time and acceleration.backfill_time.

Important: Most Splunk platform users do not need to adjust these settings. The default acceleration.max_time setting of 1 hour should ensure that long-running summary creation searches do not impede the addition of new events to the summary. We advise that you not change these advanced summary range configurations unless you know it is the only solution to your summary creation issues.

Change the maximum period of time that a summary-populating search can run

The acceleration.max_time parameter causes summary-populating searches to quit after a specified amount of time has passed. After a summary-populating search stops, the Splunk platform runs a search to catch all of the events that have come in since the initial summary-populating search began, and then continues building the summary where the last summary-populating search left off. The acceleration.max_time parameter is set to 3600 seconds (60 minutes) by default, a setting that should ensure proper summary creation for the majority of Splunk platform instances.

For example: You have enabled acceleration for a data model, and you want its summary to retain events for the past three months. Because your organization indexes large amounts of data, the search that initially creates this summary would take about four hours to complete. Unfortunately, you can't let the search run uninterrupted for that amount of time, because it might fail to summarize some of the new events that come in while that four-hour search is in process.

The acceleration.max_time parameter stops the search after an hour, and another search takes its place to pull in the new events that have come in during that time. It then continues running to add events from the last three months to the summary. This second search also stops after an hour and the process repeats until the summary is complete.

Note: The acceleration.max_time parameter is an approximate time limit. After the 60 minutes elapses, the Splunk platform has to finish summarizing the current bucket before kicking off the summary search. This prevents wasted work.
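For example, to let summary-populating searches for a model run for up to two hours before they stop, you might set the parameter, in seconds, in the model's datamodels.conf stanza. The stanza name your_model_name is a placeholder:

[your_model_name]
acceleration = true
acceleration.max_time = 7200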

Set a backfill time range that is shorter than the summary time range

If you are indexing a tremendous amount of data with your Splunk implementation and you don't want to adjust the acceleration.max_time range for a slow-running summary-populating search, you have an alternative option: the acceleration.backfill_time parameter.

The acceleration.backfill_time parameter creates a second "backfill time range" that you set within the summary range. The Splunk platform builds a partial summary that initially covers only this shorter time range. After that, the summary expands with each new event summarized until it reaches the limit of the larger summary time range. At that point the full summary is complete, and the Splunk platform stops retaining events that age out of the summary range.

For example, say you want to set your Summary Range to 1 Month but you know that your system would be taxed by a search that built a summary for that time range. To deal with this, you set acceleration.backfill_time = -7d to have the Splunk platform run a search that creates a partial summary that initially just covers the past week. After that limit is reached, the Splunk platform would only add new events to the summary, causing the range of time covered by the summary to expand. But the full summary would still only retain events for one month, so once the partial summary expands to the full Summary Range of the past month, it starts dropping its oldest events, just like an ordinary data model acceleration summary does.
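The scenario above could be sketched in datamodels.conf like this, where your_model_name is a placeholder and the one-month summary range is expressed with the acceleration.earliest_time parameter:

[your_model_name]
acceleration = true
acceleration.earliest_time = -1mon
acceleration.backfill_time = -7d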

When you do not want persistently accelerated data models to be rebuilt automatically

By default the Splunk platform automatically rebuilds persistently accelerated data models whenever it finds that those models are outdated. Data models can become outdated when the search stored in the data model configuration in savedsearches.conf no longer matches the search for the actual data model. This can happen if the JSON file for an accelerated model is edited on disk without first disabling the model's acceleration.

In very specific cases you may want to disable this feature for specific accelerated data models, so that those data models are not rebuilt automatically when they become outdated. Instead it is up to admins to initiate rebuilds manually. Admins can manually rebuild a data model on the Data Models management page by expanding the row for the affected data model and clicking Rebuild. For more about this page, see "Manage data models" in this manual.

To disable automatic rebuilds for a specific persistently accelerated data model, open datamodels.conf, find the stanza for the data model, and set acceleration.manual_rebuilds = true.
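
In datamodels.conf, the resulting stanza might look like this sketch; the model name Web_Traffic is a hypothetical placeholder:

```ini
[Web_Traffic]
acceleration = true
# Do not rebuild this model's summaries automatically when they
# become outdated; an admin must initiate the rebuild manually
# from the Data Model Manager page.
acceleration.manual_rebuilds = true
```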

About ad hoc data model acceleration

Even when you're building a pivot that is based on a data model object that is not accelerated in a persistent fashion, that pivot can benefit from what we call "ad hoc" data model acceleration. In these cases, the Splunk platform builds a summary in a search head dispatch directory when you work with an object to build a pivot in the Pivot Editor.

The search head begins building the ad hoc data model acceleration summary after you select an object and enter the Pivot Editor. You can follow the progress of the ad hoc summary construction with the progress bar:

[Image: Pivot Editor progress bar]

When the progress bar reads Complete, the ad hoc summary is built, and the search head uses it to return pivot results faster going forward. But this summary only lasts while you work with the object in the Pivot Editor. If you leave the editor and return, or switch to another object and then return to the first one, the search head will need to rebuild the ad hoc summary.

Ad hoc data model acceleration summaries complete faster when they collect data for a shorter range of time. You can change this range for root event objects, root transaction objects, and their children by resetting the time Filter in the Pivot Editor. See "About ad hoc data model acceleration summary time ranges," below, for more information.

Ad hoc data model acceleration works for all object types, including root search objects and root transaction objects. Its main disadvantage relative to persistent data model acceleration is that a persistent summary is always available, keeping pivot performance speedy until acceleration is disabled for the data model, whereas an ad hoc summary must be rebuilt each time you enter the Pivot Editor.

About ad hoc data model acceleration summary time ranges

The search head always tries to make ad hoc data model acceleration summaries fit the range set by the time Filter in the Pivot Editor. When you first enter the Pivot Editor for an object, the pivot time range is set to All Time. If your object represents a large dataset this can mean that the initial pivot will complete slowly as it builds the ad hoc summary behind the scenes.

When you give the pivot a time range other than All Time, the search head builds an ad hoc summary that fits that range as efficiently as possible. For any given data model object, the search head completes an ad hoc summary for a pivot with a short time range more quickly than it does for the same pivot with a longer time range.

The search head only rebuilds the ad hoc summary from start to finish if you replace the current time range with a new time range that has a different "latest" time. This is because the search head builds each ad hoc summary backwards, from its latest time to its earliest time. If you keep the latest time the same but change the earliest time, the search head at most collects any extra data required to cover the new range.

Note: Root search objects and their child objects are a special case here as they do not have time range filters in Pivot (they do not extract _time as an attribute). Pivots based on these objects always build summaries for all of the events returned by the search. However, you can design the root search object's search string so it includes "earliest" and "latest" dates, which restricts the dataset represented by the root search object and its children.
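
For example, a root search object's search string might constrain its dataset with time modifiers like the following sketch; the index, sourcetype, and dates are illustrative:

```
index=web sourcetype=access_combined earliest=-30d@d latest=now
```

Because the time bounds are part of the search string itself, they apply to the root search object and every child object in its hierarchy.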

How ad hoc data model acceleration differs from persistent data model acceleration

Here's a summary of the ways in which ad hoc data model acceleration differs from persistent data model acceleration:

  • Ad hoc data model acceleration takes place on the search head rather than the indexer. This enables it to accelerate all three object types (event, search, and transaction).
  • The Splunk platform creates ad hoc data model acceleration summaries in dispatch directories at the search head. It creates and stores persistent data model acceleration summaries in your indexes alongside index buckets.
  • The Splunk platform deletes ad hoc data model acceleration summaries when you leave the Pivot Editor or change the object you are working on while you are in the Pivot Editor. When you return to the Pivot Editor for the same object, the search head must rebuild the ad hoc summary. You cannot preserve ad hoc data model acceleration summaries for later use.
    • Pivot job IDs are retained in the pivot URL, so if you quickly use the back button after leaving Pivot (or return to the pivot job with a permalink) you may be able to use the ad hoc summary for that job without waiting for a rebuild. The search head deletes ad hoc data model acceleration summaries from the dispatch directory a few minutes after you leave Pivot or switch to a different model within Pivot.
  • Ad hoc acceleration does not apply to reports or dashboard panels that are based on pivots. If you want pivot-based reports and dashboard panels to benefit from data model acceleration, base them on objects from persistently accelerated event object hierarchies.
  • Ad hoc data model acceleration can potentially create more load on your search head than persistent data model acceleration creates on your indexers. This is because the search head creates a separate ad hoc data model acceleration summary for each user that accesses a specific data model object in Pivot that is not persistently accelerated. On the other hand, summaries for persistently accelerated data model objects are shared by each user of the associated data model. This data model acceleration summary reuse results in less work for your indexers.

This documentation applies to the following versions of Splunk® Enterprise: 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14
