Implement search affinity in a multisite indexer cluster

One of the key benefits of multisite indexer clustering is that it allows you to configure a cluster so that search heads get search results only from data stored on their local sites. This reduces network traffic while still providing access to the entire set of data, because each site contains a full copy of the data. This benefit is known as search affinity.

For example, say you have two data centers in California, one in San Francisco and the other in Los Angeles. You set up a two-site cluster, with each site corresponding to a data center. Search affinity allows you to reduce long-distance network traffic. Search heads at the San Francisco data center get results only from the peers in San Francisco, while search heads in Los Angeles get results only from their local peers.

How search affinity works

For those sites that you want to support search affinity, you must configure multisite clustering so that the site has a full set of searchable data and a local search head. The search head on any particular site then gets data only from its local site, as long as that site is valid.

When search affinity is functioning, each search head sends its searches to all peers, across all sites, but only the local peers search their data (that, is, their primary bucket copies) and return results to the search head.

If a local peer holding some of the primary searchable data goes down and the site temporarily loses its valid state, the search head will, if necessary, get data from peers on remote sites while the local site is undergoing bucket fixing. During this time, the search head will still get as much of the data as possible from the local site.

Once the site regains its valid state, new searches again occur across only the local site.

For more details on how the cluster handles search affinity, see "Multisite indexer cluster architecture" and "Search locally in a multisite cluster".

Implement search affinity

With multisite clusters, search affinity is enabled by default. However, you must perform a few steps to take advantage of it. Specifically, you must ensure that both the searchable data and the search heads are available locally.

To implement search affinity:

1. Configure the site search factor so that you have at least one searchable copy on each site where you require search affinity.

One way to do this is to explicitly specify a search factor for each site that requires search affinity. For example, a four-site cluster with site_search_factor = origin:1, site1:1, site2:1, total:3 ensures that both site1 and site2 have searchable copies of every bucket. The third set of searchable copies will be spread across the two non-explicit sites, with no guarantee that either site will have a full set of searchable copies. Thus, search affinity is enabled for only site1 and site2. Site1 and site2 will each hold primary copies of all buckets.

There are also ways to configure the site search factor to ensure that all sites have searchable copies, even without explicitly specifying some or all of them. For example, a three-site cluster with site_search_factor = origin:1, total:3 guarantees one searchable copy per site, and thus enables search affinity for each site. Each site will have primary copies of all buckets.

For more information on how replication and search factors distribute copies across sites, see "Configure the site replication factor" and "Configure the site search factor".

2. Deploy a search head on each site where you require search affinity.

Disable search affinity

You can disable search affinity for any search head. When search affinity is disabled, the search head does not attempt to obtain search results from a single site only. Rather, it can obtain results from multiple sites. This can be useful, for example, if you have two data centers in close proximity with low latency, and you want to improve overall performance by spreading the processing across indexers on both sites.

What happens when search affinity is disabled

When search affinity is disabled on a search head, search results can come from indexers on any or all sites. If the site search factor stipulates searchable bucket copies on multiple sites, the search head uses undefined criteria to choose which of the searchable copies to search. It is likely to choose some bucket copies from one site and other bucket copies from other sites, so the results will come from multiple sites.

Search heads always select from primary bucket copies. For example, say you have a two site cluster with this search factor:

site_search_factor = origin:2, total:3

The origin site will store two searchable copies and the second site will store one searchable copy of each bucket. So, for some buckets (those originating on site1), site1 will have two searchable copies and for other buckets (those originating on site2), site2 will have two searchable copies. Each site, however, has only a single primary copy.

A search head with search affinity enabled limits its searches to the primary copies on its own site, when possible.

In contrast, a search head with search affinity disabled distributes its search across primary copies on both sites. For a given bucket, you cannot know whether it will select the primary on site1 or the primary on site2. It does tend to use the same primaries from one search to the next.

How to disable search affinity

To disable search affinity for a search head, set the search head's site value to "site0" in server.conf:

[general]
site = site0

[clustering]
multisite = true
...

By setting site=site0, you cause searches to behave like they would on a single-site cluster, with no preference for any particular site.

For more information on configuring multisite search heads, see "Configure the search heads."

Limit primaries to sites with search heads

By default, the indexer cluster limits primary bucket copies to sites with search heads. This behavior significantly reduces primary fixup and rebalance activity, particularly in cases where all search heads are assigned to only one or a few sites and search affinity is disabled.

This behavior can be changed in the manager node's server.conf file, with the setting assign_primaries_to_all_sites. By default, the setting is "false", which causes the cluster to limit primaries to sites with search heads:

[clustering]
assign_primaries_to_all_sites=false

When assign_primaries_to_all_sites=false, the manager assigns primaries only to sites currently with an online search head. For example, if no online search head is assigned to site1, no primaries will be assigned to site1. Similarly, if no search head is assigned to site0, no primaries will be assigned to site0.

When a search head joins a site that previously did not have an assigned search head, the manager gradually assigns primary status to one searchable copy of each bucket on that site, for all buckets with searchable copies on that site.

If a site loses all its search heads, all existing primaries on that site remain as such but the manager does not assign primary status to any new bucket copies on the site.

You can also use the CLI on the manager node to change this setting:

splunk edit cluster-config -assign_primaries_to_all_sites false|true

Related answers from Splunk Community

Implement search affinity in a multisite indexer cluster

How search affinity works

Implement search affinity

Disable search affinity

What happens when search affinity is disabled

How to disable search affinity

Limit primaries to sites with search heads

Comments

Implement search affinity in a multisite indexer cluster

Was this topic useful?