Splunk® Enterprise

Knowledge Manager Manual

Share data model acceleration summaries among search head clusters

If you rely on data model acceleration (DMA) to speed up your search performance and your Splunk platform implementation has a number of search heads or search head clusters, you may be running several instances of the same DMA summary across those search heads or search head clusters. Maintaining these identical summaries is redundant work, and the summaries take up unnecessary space on your indexers.

If you have access to datamodels.conf, you can arrange to share a single DMA summary among data models on multiple search heads or search head clusters. This will free up indexer space and cut down on processing overhead across your Splunk platform implementation.

All of the data models that share a summary must have root dataset constraints and acceleration time ranges that are very similar to each other, if not identical.

If you are upgrading to Splunk Enterprise 8.0.0 from an earlier version, you can share DMA summaries as long as your search heads and search head clusters are fully upgraded to 8.0.x. Your indexer nodes can remain at an earlier version.

Provide the GUID of the source search head or search head cluster

To set up a data model to share the summary of a data model on another search head or search head cluster, you need to add an acceleration.source_guid setting to the data model's stanza in datamodels.conf. The acceleration.source_guid setting specifies the GUID (globally unique identifier) of the search head or search head cluster that holds the source DMA summary. The datamodels.conf file needs to be in the same app namespace as the data model that is sharing its summary. See "DMA summary sharing and app namespaces".

You can find the GUID for a search head cluster in the [shclustering] stanza of server.conf. If you are running a single instance, you can find the GUID in etc/instance.cfg.
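
As a rough sketch, the GUID entries in those two files typically look like the following. The GUID values are placeholders, and the setting names shown (id under [shclustering], guid under [general]) should be verified against the files in your own deployment.

```ini
# server.conf on a search head cluster member
[shclustering]
# Placeholder value; the cluster-wide GUID appears as the id setting
id = <search-head-cluster-GUID>

# etc/instance.cfg on a single search head
[general]
# Placeholder value
guid = <search-head-GUID>
```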

DMA summary sharing and app namespaces

When a data model is accelerated, its data model definition is stored on the search head under the data model's app namespace. When data models share a summary, each of the data models involved needs to be defined under the same app on its respective search head. This enables search heads to seek shared summaries across their mutual app namespaces.

For example, let's say you have an "Authentication and Web" data model that you have defined on Search Head 1 under the Splunk_SA_CIM app. If you want to share its summary with an "Authentication and Web" data model on Search Head 2, the Search Head 2 data model must also be defined under the Splunk_SA_CIM app. If you share the summary to a data model associated with a different app on Search Head 2, Search Head 2 will not be able to find the summary.

In other words, if you want to share the summary for the data model defined on Search Head 1, you must apply the acceleration.source_guid setting to the appropriate data model stanza in /etc/apps/Splunk_SA_CIM/local/datamodels.conf on Search Head 2.
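
A minimal sketch of that stanza follows. The stanza name Authentication_and_Web is a hypothetical ID for the example data model, and the GUID is a placeholder rather than a real value.

```ini
# /etc/apps/Splunk_SA_CIM/local/datamodels.conf on Search Head 2
# Stanza name is a hypothetical data model ID
[Authentication_and_Web]
# Placeholder: GUID of Search Head 1 (or of its search head cluster)
acceleration.source_guid = <GUID-of-Search-Head-1>
```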

What changes for the data model that shares the DMA summary of another model

After you set acceleration.source_guid for a data model, searches of that data model draw upon the summaries associated with the provided GUID when possible. The following conditions apply to a data model that shares the summary of another data model:

  • You cannot use Splunk Web to edit the data model.
  • The acceleration settings in datamodels.conf are ignored by the Splunk software for the data model, because it is sharing the summary of another accelerated model.
  • Existing summaries for the data model are removed by the Splunk software. Work towards summary maintenance ceases.
  • The allow_old_summaries setting is set to true for the data model. This happens because there can be slight differences between the definition of this data model and the definition of the data model at the remote search head or search head cluster whose summary it uses. See "When the data model definition changes and your summaries have not been updated to match it".

Because allow_old_summaries=true for data models that share remote DMA summaries, these data models risk using mismatched data if the root dataset constraints of the data model at the remote search head or search head cluster change.

Identify data models that are sharing DMA summaries

You can see whether a data model is sharing a DMA summary on the Data Models management page.

Steps

  1. Navigate to Settings > Data Models.
  2. Expand a row for an accelerated data model.

If the data model you have selected is sharing another model's DMA summary, you will see the following message at the top of the Acceleration section: "Source GUID detected. The summary information displayed will be that of the specified search head."

You will also see the Source GUID for the search head or search head cluster listed among the other DMA summary details in the Acceleration section. As the message indicates, this DMA information relates to the summary at the source GUID, not a summary that is generated for the data model that you are inspecting.

The Rebuild and Edit actions are removed from data models that share another model's summary.

If the summary you want to share is in a multisite indexer cluster

In multisite indexer clusters, the summaries reside with the primary bucket copy. Because a multisite cluster has multiple primaries, one for each site that supports search affinity, the summaries reside with the particular primary that the generating search head accessed when it ran its summary-creation search. Due to search affinity, that usually means that the summaries reside on primaries on the same site as the generating search head.

When several search head clusters (SHCs) operate within a multisite indexer cluster, each SHC is assigned to a particular site within that cluster. An SHC that shares a summary cannot automatically determine which site in the indexer cluster holds that summary, and checking more than one site can produce duplicate results. You can deal with this situation in two ways:

  • Replicate summaries on all indexers. Go to server.conf and set summary_replication=true in the [clustering] stanza. This causes all searchable bucket copies to have associated summaries.
  • Make sure the involved search head clusters are searching the same site. You can direct the search head clusters to point at the site that holds the shared summary.
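
The first option might look like the following fragment. This is a sketch only: it assumes you apply the setting in server.conf on the node that manages the indexer cluster, which you should confirm for your deployment.

```ini
# server.conf
[clustering]
# Give all searchable bucket copies an associated summary
summary_replication = true
```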

Considerations for searching data models with shared summaries

When you run a tstats search against a data model with a shared summary, set summariesonly=t to ensure search consistency. Otherwise, your searches might include unsummarized data from differing sources in their results.
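
For illustration, here is a hypothetical scheduled report in savedsearches.conf whose search applies this setting. The stanza name and the Authentication data model name are assumptions made for the example. Only summariesonly=t comes from the guidance above.

```ini
# savedsearches.conf (hypothetical report)
[Shared summary event count]
# summariesonly=t restricts results to summarized data
search = | tstats summariesonly=t count from datamodel=Authentication by _time span=1h
```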

Last modified on 28 July, 2020

This documentation applies to the following versions of Splunk® Enterprise: 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5

