Implement search affinity in a multisite indexer cluster

One of the key benefits of multisite indexer clustering is that it allows you to configure a cluster so that search heads get search results only from data stored on their local sites. This reduces network traffic while still providing access to the entire set of data, because each site contains a full copy of the data. This benefit is known as search affinity.

For example, say you have two data centers in California, one in San Francisco and the other in Los Angeles. You set up a two-site cluster, with each site corresponding to a data center. Search affinity allows you to reduce long-distance network traffic. Search heads at the San Francisco data center get results only from the peers in San Francisco, while search heads in Los Angeles get results only from their local peers.

How search affinity works

For those sites that you want to support search affinity, you must configure multisite clustering so that the site has a full set of searchable data and a local search head. The search head on any particular site then gets data only from its local site, as long as that site is valid.

When search affinity is functioning, each search head sends its searches to all peers, across all sites, but only the local peers search their data (that, is, their primary bucket copies) and return results to the search head.

If a local peer holding some of the primary searchable data goes down and the site temporarily loses its valid state, the search head will, if necessary, get data from peers on remote sites while the local site is undergoing bucket fixing. During this time, the search head will still get as much of the data as possible from the local site.

Once the site regains its valid state, new searches again occur across only the local site.

For more details on how the cluster handles search affinity, see "Multisite indexer cluster architecture" and "Search locally in a multisite cluster".

Implement search affinity

With multisite clusters, search affinity is enabled by default. However, you must perform a few steps to take advantage of it. Specifically, you must ensure that both the searchable data and the search heads are available locally.

To implement search affinity:

1. Configure the site search factor so that you have at least one searchable copy on each site where you require search affinity.

One way to do this is to explicitly specify a search factor for each site that requires search affinity. For example, a four-site cluster with site_search_factor = origin:1, site1:1, site2:1, total:3 ensures that both site1 and site2 have searchable copies of every bucket. The third set of searchable copies will be spread across the two non-explicit sites, with no guarantee that either site will have a full set of searchable copies. Thus, search affinity is enabled for only site1 and site2. Site1 and site2 will each hold primary copies of all buckets.

There are also ways to configure the site search factor to ensure that all sites have searchable copies, even without explicitly specifying some or all of them. For example, a three-site cluster with site_search_factor = origin:1, total:3 guarantees one searchable copy per site, and thus enables search affinity for each site. Each site will have primary copies of all buckets.

For more information on how replication and search factors distribute copies across sites, see "Configure the site replication factor" and "Configure the site search factor".

2. Deploy a search head on each site where you require search affinity.

Disable search affinity

You can disable search affinity for any search head. When search affinity is disabled, the search head does not attempt to obtain search results from a single site only. Rather, it can obtain results from multiple sites. This can be useful, for example, if you have two data centers in close proximity with low latency, and you want to improve overall performance by spreading the processing across indexers on both sites.

What happens when search affinity is disabled

By setting a search head's site to "site0", you disable site affinity for that search head. When search affinity is disabled for a search head, its search results can come from indexers on any or all sites.

Like specified sites such as site1, site2, and so on, site0 has its own designated set of primaries. Ordinarily, the site 0 primaries consist of primaries from all sites in the cluster. For example, the site0 primary for bucketA might be on site1, the site0 primary for bucketB on site2, and so on. In this way, a search head set to site0 accesses data from multiple sites.

For a given bucket, you cannot know which site the site0 primary resides on. For more information on how site0 primaries are chosen, see Search globally in a multisite cluster.

How to disable search affinity

To disable search affinity for a search head, set the search head's site value to "site0" in server.conf:

[general]
site = site0

[clustering]
multisite = true
...

By setting site=site0, you cause searches to behave like they would on a single-site cluster, with no preference for any particular site.

For more information on configuring multisite search heads, see Configure the search heads.

Related answers from Splunk Community

Implement search affinity in a multisite indexer cluster

How search affinity works

Implement search affinity

Disable search affinity

What happens when search affinity is disabled

How to disable search affinity

Comments

Implement search affinity in a multisite indexer cluster

Was this topic useful?