If you rely on data model acceleration (DMA) to speed up your search performance and your Splunk platform implementation has a number of search heads or search head clusters, you might be running several instances of the same DMA summary across those search heads or search head clusters. This leads to identical summaries doing redundant work and taking up unnecessary space on your indexers.
If you have access to datamodels.conf, you can arrange to share a single DMA summary among data models on multiple search heads or search head clusters. Sharing summaries frees up indexer space and cuts down on processing overhead across your Splunk platform implementation.
All data models that share a summary must have the following things:
- Data model acceleration enabled with acceleration = true.
- Root data model dataset constraints and acceleration time ranges that are very similar to each other, if not identical.
If you use Splunk Cloud Platform
If you use Splunk Cloud Platform and would like to use this feature to share data model summaries between clusters in your Splunk Cloud Platform environment, file a ticket with Splunk Support to get it enabled and configured.
Provide the GUID of the source search head or search head cluster
To set up a data model to share the summary of a data model on another search head or search head cluster, you need to add an acceleration.source_guid setting to the data model's stanza in datamodels.conf. The acceleration.source_guid setting specifies the GUID (globally unique identifier) of the search head or search head cluster that holds the source DMA summary. The datamodels.conf file needs to be in the same app namespace as the data model that is sharing its summary. See "DMA summary sharing and app namespaces".
The GUID for a search head cluster is defined in server.conf, by the id setting of the [shclustering] stanza. If you are running a single instance, you can find the GUID in
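As a rough sketch of the search head cluster case, the following Python reads the id setting from the [shclustering] stanza of a server.conf file. It assumes the file parses as INI text with the standard configparser module, which holds for simple stanzas but is not guaranteed for every Splunk configuration file. The function name read_shc_guid and the sample GUID are hypothetical.

```python
import configparser


def read_shc_guid(server_conf_text):
    """Return the id setting of the [shclustering] stanza, or None if absent."""
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # repeated stanzas or settings instead of raising an error.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(server_conf_text)
    return parser.get("shclustering", "id", fallback=None)


# Hypothetical server.conf contents; the GUID is a made-up placeholder.
sample = """
[shclustering]
id = 11111111-2222-3333-4444-555555555555
"""

print(read_shc_guid(sample))
```

The returned value is what you would supply for acceleration.source_guid on the search heads that share the summary.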
Simple example configuration
Say you have two search heads that you've labeled Search Head One and Search Head Two. You have an accelerated data model on Search Head One, and you want to share its summary with an accelerated data model on Search Head Two.
On datamodels.conf for Search Head One, you have the following configuration for the source data model:
[internal_audit_logs]
acceleration = true
acceleration.earliest_time = -1w
acceleration.backfill_time = -1d
On datamodels.conf for Search Head Two, you have configured this accelerated data model to share the summary of the accelerated data model from Search Head One:
[internal_audit_logs]
acceleration = true
acceleration.earliest_time = -1w
acceleration.backfill_time = -1d
acceleration.source_guid = <search_head_one_GUID>
Note that both data models have acceleration enabled, and that the data model on Search Head Two has identical settings to those of the data model on Search Head One, with the exception of the acceleration.source_guid setting. For best results, all of the target data models should have settings that are identical to the settings of the source data model.
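One way to check that a target stanza matches its source is to compare the two stanzas while ignoring acceleration.source_guid. The sketch below is illustrative only: it assumes both files parse with Python's configparser, and the helper names load_stanza and setting_differences are hypothetical rather than part of any Splunk tooling.

```python
import configparser

SHARING_KEY = "acceleration.source_guid"


def load_stanza(conf_text, stanza):
    """Parse conf text and return one stanza's settings as a dict."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return dict(parser[stanza])


def setting_differences(source_text, target_text, stanza):
    """Return {setting: (source_value, target_value)} for mismatched settings.

    The target's acceleration.source_guid is ignored, because it is the one
    setting that is expected to differ.
    """
    source = load_stanza(source_text, stanza)
    target = load_stanza(target_text, stanza)
    target.pop(SHARING_KEY, None)
    keys = sorted(set(source) | set(target))
    return {
        key: (source.get(key), target.get(key))
        for key in keys
        if source.get(key) != target.get(key)
    }


search_head_one = """
[internal_audit_logs]
acceleration = true
acceleration.earliest_time = -1w
acceleration.backfill_time = -1d
"""

search_head_two = search_head_one + "acceleration.source_guid = <search_head_one_GUID>\n"

print(setting_differences(search_head_one, search_head_two, "internal_audit_logs"))
```

An empty result means the target stanza differs from the source only by acceleration.source_guid.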
DMA summary sharing and app namespaces
When a data model is accelerated, its data model definition is stored on the search head under the data model's app namespace. When data models share a summary, each of the data models involved needs to be defined under the same app on its respective search head. This enables search heads to seek shared summaries across their mutual app namespaces.
For example, let's say you have an "Authentication and Web" data model that you have defined on Search Head 1 under the Splunk_SA_CIM app. If you want to share its summary with an "Authentication and Web" data model on Search Head 2, the Search Head 2 data model must also be defined under the Splunk_SA_CIM app. If you share the summary to a data model associated with a different app on Search Head 2, Search Head 2 will not be able to find the summary.
In other words, if you want to share the summary for the data model defined on Search Head 1, you must apply the acceleration.source_guid setting to the appropriate data model stanza in /etc/apps/Splunk_SA_CIM/local/datamodels.conf on Search Head 2.
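The following sketch shows one way to work out the app-local datamodels.conf path and add the setting to a stanza. Several things here are assumptions made for illustration: the installation directory /opt/splunk, the stanza name Authentication, the placeholder GUID, and the helper names. It also assumes the file parses with Python's configparser, which discards comments when it rewrites a file.

```python
import configparser
import io
from pathlib import PurePosixPath


def local_datamodels_conf(splunk_home, app):
    """Build the path to an app's local datamodels.conf."""
    return PurePosixPath(splunk_home) / "etc" / "apps" / app / "local" / "datamodels.conf"


def with_source_guid(conf_text, stanza, guid):
    """Return conf text with acceleration.source_guid set on the stanza."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section(stanza):
        parser.add_section(stanza)
    parser.set(stanza, "acceleration.source_guid", guid)
    out = io.StringIO()
    parser.write(out)  # note: comments in the original text are not preserved
    return out.getvalue()


# Hypothetical install directory and stanza name.
print(local_datamodels_conf("/opt/splunk", "Splunk_SA_CIM"))
print(with_source_guid("[Authentication]\nacceleration = true\n",
                       "Authentication", "<search_head_one_GUID>"))
```

Using the same app name on both search heads keeps the target data model in the namespace where the shared summary is looked up.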
After you set acceleration.source_guid for a data model, searches of that data model draw upon the summaries associated with the provided GUID when possible. When a data model is sharing the summary of another data model, the data model has the following conditions applied to it:
- You cannot use Splunk Web to edit the data model.
- The acceleration settings in datamodels.conf are ignored by the Splunk software for the data model, because it is sharing the summary of another accelerated model.
- Existing summaries for the data model are removed by the Splunk software. Work towards summary maintenance ceases.
- The allow_old_summaries setting is set to true for the data model. This happens because there can be slight differences between the definition of the data model and the definition of the data model at the remote search head or search head cluster whose summary it will be using. See "When the data model definition changes and your summaries have not been updated to match it".
When allow_old_summaries=true is set for data models that share remote DMA summaries, they run the risk of using mismatched data if the root dataset constraints of the data model at the remote search head or search head cluster are changed.
Identify data models that are sharing DMA summaries
You can see whether a data model is sharing a DMA summary on the Data Models management page.
- Navigate to Settings > Data Models.
- Expand a row for an accelerated data model.
If the data model you have selected is sharing another model's DMA summary, you will see the following message at the top of the Acceleration section: "Source GUID detected. The summary information displayed will be that of the specified search head."
You will also see the Source GUID for the search head or search head cluster listed among the other DMA summary details in the Acceleration section. As the message indicates, this DMA information relates to the summary at the source GUID, not a summary that is generated for the data model that you are inspecting.
The Rebuild and Edit actions are removed from data models that share another model's summary.
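Outside of Splunk Web, you could also scan a datamodels.conf file for stanzas that carry an acceleration.source_guid setting. This is a minimal sketch, not a Splunk-provided tool: it assumes the file parses with Python's configparser, and the function name, stanza names, and GUID placeholder are hypothetical.

```python
import configparser

SHARING_KEY = "acceleration.source_guid"


def find_sharing_models(conf_text):
    """Map each stanza that sets acceleration.source_guid to its source GUID."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        stanza: parser[stanza][SHARING_KEY]
        for stanza in parser.sections()
        if parser.has_option(stanza, SHARING_KEY)
    }


# Hypothetical datamodels.conf with one sharing and one non-sharing model.
sample_conf = """
[internal_audit_logs]
acceleration = true
acceleration.source_guid = <search_head_one_GUID>

[web_traffic]
acceleration = true
"""

print(find_sharing_models(sample_conf))
```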
In multisite indexer clusters, the summaries reside with the primary bucket copy. Because a multisite cluster has multiple primaries, one for each site that supports search affinity, the summaries reside with the particular primary that the generating search head accessed when it ran its summary-creation search. Due to site affinity, that usually means that the summaries reside on primaries on the same site as the generating search head.
The issue here is this: when you have several search head clusters operating within a multisite indexer cluster, each of those SHCs is "assigned" to a particular site within that cluster. They won't automatically know which site within the indexer cluster to check for the summary they are sharing without potentially duplicating results. Here are two things you can do to deal with this situation:
- Replicate summaries on all indexers. Go to server.conf and enable summary replication in the [clustering] stanza. This causes all searchable bucket copies to have associated summaries.
- Make sure the involved search head clusters are searching the same site. You can direct the search head clusters to point at the site that holds the shared summary.
When you run a tstats search against a data model with a shared summary, set summariesonly=t to ensure search consistency. Otherwise you are running searches that might include differing sources of unsummarized data in their results.
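If you generate searches from a script, a small helper can make sure summariesonly=t is always included. The sketch below only builds the search string. The function name is hypothetical, and the data model name and the count aggregation are example values.

```python
def tstats_summaries_only(datamodel, aggregation="count"):
    """Build a tstats search string that reads only summarized data."""
    return f"| tstats summariesonly=t {aggregation} from datamodel={datamodel}"


print(tstats_summaries_only("internal_audit_logs"))
```

How the resulting string is submitted to the search head is left out here and depends on your own tooling.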
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4