Share data model acceleration summaries among search heads

If you rely on data model acceleration (DMA) to speed up your searches and your Splunk platform implementation has several search heads or search head clusters, you might be running several instances of the same DMA summary across those search heads or search head clusters. Each instance repeats the same summarization work and takes up unnecessary space on your indexers.

If you have access to datamodels.conf, you can arrange to share a single DMA summary among data models on multiple search heads or search head clusters. Sharing summaries frees up indexer space and cuts down on processing overhead across your Splunk platform implementation.

To set up data model summary sharing, you need a reader data model, and a writer data model on a separate search head. The following table defines these terms and explains how these data models must be set up to facilitate sharing of data model summaries.

	Reader data model	Writer data model
Definition	Reads summary data shared by the writer data model. Must be on a separate search head from the writer data model. Must share the same app namespace as the writer data model.	Shares summary data with the reader data model. Must be on a separate search head from the reader data model. Must share the same app namespace as the reader data model.
Is data model acceleration required?	The reader data model can read shared data model acceleration summaries whether or not it has acceleration turned on.	The writer data model must be accelerated to share its data model acceleration summary with the reader data model.

A reader data model can have data model acceleration settings that differ from those of an accelerated writer data model and still share the writer data model's summary. However, when this is the case, the reader data model won't display accurate summary creation metrics for the writer data model in Splunk Web. If you want accurate summary creation metrics, ensure that your reader and writer data models have root data model dataset constraints and Summary range and Backfill range values that are similar if not identical.

Using different data model JSON files for the writer and reader nodes might impact search performance and accuracy of results. To minimize the possibility of issues with data model summary sharing, make sure the data model JSON files are the same for the reader and writer.

For more information about data model acceleration summary creation metrics, see Accelerate data models.

If you use Splunk Cloud Platform

If you use Splunk Cloud Platform and would like to use this feature to share data model summaries between clusters in your Splunk Cloud Platform environment, file a ticket with Splunk Support to get it enabled and configured.

Provide the GUID of the source search head or search head cluster

To set up a reader data model to share the summary of a writer data model on another search head or search head cluster, you need to add an acceleration.source_guid setting to the reader data model's stanza in datamodels.conf. The acceleration.source_guid setting specifies the GUID (globally unique identifier) of the search head or search head cluster that holds the writer DMA summary. The datamodels.conf file needs to be in the same app namespace as the data model that is sharing its summary. See DMA summary sharing and app namespaces.

The GUID for a search head cluster is defined in server.conf, by the id setting of the [shclustering] stanza. If you are running a single search head instance, you can find its GUID in etc/instance.cfg.
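
As a rough sketch, the two locations described above might look like the following. The values are placeholders, not real GUIDs. The [general] stanza and guid setting name shown for instance.cfg are an assumption based on the usual layout of that file, so confirm them against your own instance.

```ini
# server.conf on a search head cluster member
[shclustering]
id = <search_head_cluster_GUID>

# etc/instance.cfg on a single search head
# (stanza and setting name assumed from the typical file layout)
[general]
guid = <search_head_GUID>
```

Whichever value applies to the writer's search head or search head cluster is the one to supply for acceleration.source_guid on the reader.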

Simple example configuration

Say you have two search heads that you've labeled Search Head One and Search Head Two. You have an accelerated data model on Search Head One, and you want to share its summary with an unaccelerated data model on Search Head Two. In this scenario, the data model on Search Head One is the writer data model, and the data model on Search Head Two is the reader data model.

In datamodels.conf on Search Head One, you have the following configuration for the writer data model:

[internal_audit_logs]
acceleration = true
acceleration.earliest_time = -1w
acceleration.backfill_time = -1d

In datamodels.conf on Search Head Two, you have configured this unaccelerated data model to share the summary of the accelerated data model from Search Head One:

[internal_audit_logs]
acceleration.earliest_time = -1w
acceleration.backfill_time = -1d
acceleration.source_guid = <search_head_one_GUID>

Note that the writer data model on Search Head One has acceleration turned on, and that the reader data model on Search Head Two has settings identical to those of the writer data model, with the exception of the acceleration.source_guid setting. For best results, give every reader data model settings that are identical to the settings of the writer data model.

DMA summary sharing and app namespaces

When you accelerate a data model, Splunk software stores that data model's definition on the search head under the data model's app namespace. When data models share a summary, the reader and writer data models involved must be defined under the same app on their respective search heads. This lets search heads seek shared summaries across their mutual app namespaces.

For example, say you have an "Authentication and Web" data model that you have defined on Search Head 1 under the Splunk_SA_CIM app. If you want to share its summary with an "Authentication and Web" data model on Search Head 2, you must have also defined the Search Head 2 data model under the Splunk_SA_CIM app. If you share the summary to a data model associated with a different app on Search Head 2, Search Head 2 cannot find the summary.

In other words, if you want to share the summary for the writer data model defined on Search Head 1, you must apply the acceleration.source_guid setting to the appropriate reader data model stanza in /etc/apps/Splunk_SA_CIM/local/datamodels.conf on Search Head 2.
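
A minimal sketch of that reader stanza follows. The stanza name and GUID are placeholders: replace <data_model_id> with the ID of the reader data model as it is defined in the Splunk_SA_CIM app, and supply the GUID of Search Head 1.

```ini
# datamodels.conf in the Splunk_SA_CIM app's local directory on Search Head 2
# <data_model_id> is a placeholder for the reader data model's stanza name
[<data_model_id>]
acceleration.source_guid = <search_head_1_GUID>
```

Placing the same stanza in a different app's datamodels.conf on Search Head 2 would put the reader in a different app namespace, and Search Head 2 would not find the shared summary.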

What changes for the data model that shares the DMA summary of another model

After you set acceleration.source_guid for a reader data model, searches of that data model draw upon the summaries associated with the provided GUID when possible. When a reader data model shares the summary of a writer data model, Splunk software applies the following conditions to the reader data model:

You cannot use Splunk Web to edit the reader data model.
Splunk software utilizes the acceleration.earliest_time and acceleration.backfill_time settings in datamodels.conf (or the Summary range and Backfill range settings in Splunk Web) to generate data model summary creation metrics in Splunk Web for the reader data model. To ensure accurate metrics, these reader data model acceleration settings must have values that are identical or very similar to the corresponding acceleration settings of the writer data model.
Splunk software removes the reader data model's existing summaries. Work towards summary maintenance ceases.
Splunk software removes the Rebuild and Edit actions from the reader data model's summary controls in Splunk Web.
Splunk software sets the reader data model's allow_old_summaries setting to true. This happens because there can be slight differences between the definition of the reader data model and the definition of the writer data model. See When the data model definition changes and your summaries have not been updated to match it.

Because allow_old_summaries is set to true for reader data models that share remote DMA summaries, those data models run the risk of using mismatched summary data if you change the root dataset constraints of the writer data model.

Identify data models that are sharing DMA summaries

You can see whether a data model is sharing a DMA summary on the Data Models management page.

Steps

Navigate to Settings > Data Models.
Expand a row for an accelerated data model.

If the data model you have selected is sharing another model's DMA summary, you will see the following message at the top of the Acceleration section: "Source GUID detected. The summary information displayed will be that of the specified search head."

You will also see the Source GUID for the search head or search head cluster listed among the other DMA summary details in the Acceleration section. As the message indicates, this DMA information relates to the summary at the source GUID, not a summary that is generated for the data model that you are inspecting.

If the summary you want to share is in a multisite indexer cluster

In multisite indexer clusters, the summaries reside with the primary bucket copy. Because a multisite cluster has multiple primaries, one for each site that supports search affinity, the summaries reside with the particular primary that the generating search head accessed when it ran its summary-creation search. Due to site affinity, that usually means that the summaries reside on primaries on the same site as the generating search head.

When you have several search head clusters (SHCs) operating within a multisite indexer cluster, each SHC is assigned to a particular site within that cluster. The SHCs don't automatically know which site within the indexer cluster to check for the summary they are sharing without potentially duplicating results. You can deal with this situation in one of two ways:

Replicate summaries on all indexers. Go to server.conf and set summary_replication=true in the [clustering] stanza. This causes all searchable bucket copies to have associated summaries.
Make sure the involved search head clusters are searching the same site. You can direct the search head clusters to point at the site that holds the shared summary.
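
For the first option, the server.conf change might look like the following sketch. It shows only the setting named above. Any other settings already present in your [clustering] stanza stay as they are. Which node's server.conf you edit (typically the one where the cluster-wide [clustering] settings are managed) is an assumption to verify for your deployment.

```ini
# server.conf
[clustering]
# Replicate summaries so that all searchable bucket copies have them
summary_replication = true
```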

Considerations for searching data models with shared summaries

When you run a tstats search against a reader data model that is sharing a writer data model's summary, set summariesonly=t to ensure search consistency. Otherwise you are running searches that might include differing sources of unsummarized data in their results.
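
As a minimal sketch, a saved search that queries a reader data model with summariesonly=t might be defined as follows. The stanza name is hypothetical, and the internal_audit_logs data model is borrowed from the simple example configuration. Adjust both for your environment.

```ini
# savedsearches.conf on the reader search head (stanza name is hypothetical)
[Audit event count from shared summary]
# summariesonly=t restricts tstats to summarized data only
search = | tstats summariesonly=t count from datamodel=internal_audit_logs
```

The same tstats string can be run ad hoc from the search bar. The saved search form is used here only to keep the example in .conf syntax.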
