
Accommodate many simultaneous searches
This topic provides high-level guidance on how to accommodate many simultaneous searches, search types, and concurrent users with your distributed Splunk Enterprise deployment.
Primary search performance impacts
The biggest performance impacts on a Splunk Enterprise deployment are:
- Number of concurrent users.
- Number of concurrent searches.
- Types of searches you use during the course of operation.
Each of these elements affects the deployment in different ways.
How concurrent users and searches affect performance
By default, each search that a user submits can occupy up to one CPU core on each indexer while it runs. Each additional search that the user submits occupies another CPU core. You can adjust the number of global concurrent searches that a machine can run. See "Expected performance and known limitations of real-time searches and reports" in the Search Manual.
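As a rough illustration of that adjustable limit, the sketch below assumes the formula described in the limits.conf specification for the [search] stanza: the ceiling on concurrent historical searches is max_searches_per_cpu multiplied by the number of CPU cores, plus base_max_searches. The default values used here (1 and 6) are assumptions; check them against the limits.conf.spec file for your version. The function name is hypothetical.

```python
def max_concurrent_historical_searches(cpu_cores,
                                       max_searches_per_cpu=1,
                                       base_max_searches=6):
    """Estimate the concurrent historical search ceiling for one machine.

    Assumed formula (verify against limits.conf.spec for your version):
        max_searches_per_cpu * cpu_cores + base_max_searches
    The defaults of 1 and 6 are assumptions, not values from this topic.
    """
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return max_searches_per_cpu * cpu_cores + base_max_searches


if __name__ == "__main__":
    for cores in (8, 16):
        limit = max_concurrent_historical_searches(cores)
        print(f"{cores} cores: up to {limit} concurrent historical searches")
```

Raising either setting allows more searches to start, but it does not add CPU or memory, so each search gets a smaller share of the machine.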
The type of search the user invokes also impacts resource usage. The additional amount of CPU or disk usage varies depending on the search type. See "How search types affect Splunk Enterprise performance" in this manual.
How to maximize search performance
The best way to address the resource overhead for many concurrent searches is to configure the deployment to handle search requests completely within available physical machine memory. Do this by adding as much memory to systems as is feasible. While this is important for the machines that you designate as search heads, it is particularly important for indexers in your deployment, because indexers must index incoming data and search existing data.
For example, in a deployment that has up to 48 concurrent searches happening at a time, a single search that uses 200MB of memory translates to nearly 10GB of memory required to satisfy the 48 concurrent search requests. The amount of available memory is an important statistic. While performance on an indexer declines gradually with increased CPU usage from concurrent search jobs, it drops dramatically when the machine exhausts all available physical memory. A single reference indexer cannot handle 48 concurrent searches. At the very least, the machine needs additional memory for satisfactory performance.
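The memory arithmetic in this example can be sketched as follows. The 200MB per-search figure comes from the example above; the function name is hypothetical, and the conversion assumes decimal units (1GB = 1000MB), which matches the "nearly 10GB" estimate.

```python
def search_memory_gb(concurrent_searches, mb_per_search=200):
    """Memory needed to hold all concurrent searches in RAM, in decimal GB."""
    return concurrent_searches * mb_per_search / 1000


if __name__ == "__main__":
    needed = search_memory_gb(48)
    print(f"48 searches x 200MB = {needed:.1f}GB")
```

This covers search processes only. An indexer also needs memory for indexing and for the operating system, so treat the result as a lower bound.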
Search performance: The ideal
Memory constraints aside, the run time of a search increases in proportion to the number of concurrent searches that share each free CPU core. For example, on an idle system with 8 available cores, the machine services the first 8 searches to arrive in parallel, and each finishes within a short period of time: for this example, 10 seconds.
If 48 searches run concurrently, then completion time increases significantly, as the following calculation shows.
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
48 | 8 | 6 | 10 | 60 |
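The calculation in the table can be written as a small function. This is a sketch of the simplified model this topic uses (searches share cores evenly, and run time scales linearly with the number of searches per core), not a measurement tool. The function name is hypothetical.

```python
def approx_seconds_per_search(concurrent_searches, available_cores,
                              seconds_per_search=10.0):
    """Approximate run time per search under the linear sharing model.

    searches per core = concurrent searches / available cores
    time per search   = searches per core * baseline seconds per search
    """
    if available_cores <= 0:
        raise ValueError("available_cores must be positive")
    searches_per_core = concurrent_searches / available_cores
    return searches_per_core * seconds_per_search


if __name__ == "__main__":
    # 48 searches on one idle 8-core indexer, 10-second baseline.
    print(approx_seconds_per_search(48, 8))
```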
Because indexers do the bulk of the work in search operations (reading data off disk, decompressing it, extracting knowledge and reporting), it is best practice to add indexers to decrease the amount of time per search. In the next example, to return to the performance level of 10-second searches, deploy 6 indexers with 8 cores each.
8 concurrent searches - 6 indexers with 8 cores per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
8 | 48 | 0.17 | 10 | 1.7 |
48 concurrent searches - 6 indexers with 8 cores per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
48 | 48 | 1 | 10 | 10 |
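Working backward from a target run time gives the number of indexers to deploy. The sketch below inverts the same simplified linear model used in the tables; the function and parameter names are hypothetical, and it assumes idle indexers with all cores free for searching.

```python
import math


def indexers_needed(concurrent_searches, cores_per_indexer,
                    seconds_per_search=10.0, target_seconds=10.0):
    """Smallest number of indexers that meets the target time per search.

    From: time = searches / (indexers * cores) * seconds_per_search
    Solve for indexers and round up to a whole machine.
    """
    if cores_per_indexer <= 0 or target_seconds <= 0:
        raise ValueError("cores_per_indexer and target_seconds must be positive")
    required = (concurrent_searches * seconds_per_search
                / (cores_per_indexer * target_seconds))
    return max(1, math.ceil(required))


if __name__ == "__main__":
    # 48 concurrent searches, 8-core indexers, 10-second target.
    print(indexers_needed(48, 8))
```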
In the previous example, a single search head is sufficient to service search requests. It might be appropriate to set aside a second search head to create summary indexes.
Search performance: The reality
In many cases, the system is not idle before searches arrive. For example, if an indexer handles 150GB/day of data at peak times, then it can use up to 4 of the 8 available cores for indexing that data. In this case, search times increase significantly.
4 concurrent searches - one indexer with 4 of 8 cores available per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
4 | 4 | 1 | 10 | 10 |
48 concurrent searches - one indexer with 4 of 8 cores available per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
48 | 4 | 12 | 10 | 120 |
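To account for indexing load, subtract the cores that indexing occupies before dividing. This sketch assumes, as the example does, that indexing holds a fixed number of cores at peak. The function and parameter names are hypothetical.

```python
def seconds_per_search_under_load(concurrent_searches, total_cores,
                                  indexing_cores, seconds_per_search=10.0):
    """Approximate time per search when indexing occupies some cores."""
    search_cores = total_cores - indexing_cores
    if search_cores <= 0:
        raise ValueError("no cores left for searching")
    return concurrent_searches / search_cores * seconds_per_search


if __name__ == "__main__":
    # One 8-core indexer with 4 cores busy indexing.
    for searches in (4, 48):
        seconds = seconds_per_search_under_load(searches, 8, 4)
        print(f"{searches} searches: about {seconds:.0f} seconds each")
```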
Increasing the number of cores per machine decreases the amount of time taken per search, but is not the most effective way to streamline search operations. One system with 16 cores has the following performance.
16 concurrent searches - one indexer with 12 of 16 cores available:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
16 | 12 | 1.33 | 10 | 13.3 |
48 concurrent searches - one indexer with 12 of 16 cores available:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
48 | 12 | 4 | 10 | 40 |
Two 8-core machines have the following performance profile.
8 concurrent searches - two indexers with 4 of 8 cores available per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
8 | 8 | 1 | 10 | 10 |
16 concurrent searches - two indexers with 4 of 8 cores available per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
16 | 8 | 2 | 10 | 20 |
48 concurrent searches - two indexers with 4 of 8 cores available per indexer:
No. of concurrent searches | / No. of avail. cores | = No. of searches per core | x No. of sec. per individual search | = Approx. time (sec.) per search |
---|---|---|---|---|
48 | 8 | 6 | 10 | 60 |
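The two hardware options can be compared side by side with the same simplified model. The available-core counts below are taken from the tables (12 of 16 cores on the single machine, 4 of 8 on each of the two machines); the names in the code are hypothetical.

```python
# Cores free for searching in each configuration, per the tables.
CONFIGS = {
    "one 16-core indexer": 12,
    "two 8-core indexers": 2 * 4,
}


def approx_seconds(concurrent_searches, available_cores,
                   seconds_per_search=10.0):
    """Linear sharing model: searches per core times the baseline time."""
    return concurrent_searches / available_cores * seconds_per_search


def compare(loads=(16, 48)):
    """Return approximate seconds per search for each config and load."""
    return {
        name: [round(approx_seconds(n, cores), 1) for n in loads]
        for name, cores in CONFIGS.items()
    }


if __name__ == "__main__":
    for name, times in compare().items():
        print(f"{name}: {times}")
```

This model counts CPU cores only. It says nothing about disk throughput, which also limits indexer performance.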
Two 8-core machines cost only slightly more than one 16-core machine. In addition, two 8-core machines provide significantly more available disk throughput than one 16-core machine does, based on hard disk spindle count. This is important for indexers because of the high disk bandwidth that they require.
Adding indexers reduces the indexing load on each system and frees CPU cores for searching. Also, because the performance of almost all types of search scales with the number of indexers, searches run faster, which mitigates the performance cost of sharing resources between indexing and searching. Faster searches also reduce the chance that searches from concurrent users overlap.
In real-world situations with hundreds of users, each user runs a search every few minutes, although not at the exact same time as other users. By adding indexers and reducing the search time, you reduce the concurrency factor and lower the resultant I/O and memory contention.
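One way to put a number on the concurrency factor is Little's law (average items in a system = arrival rate x time each item spends in the system). This topic does not name that technique; it is applied here as an assumption, and the user count and search interval below are hypothetical illustrations.

```python
def average_concurrent_searches(users, minutes_between_searches,
                                seconds_per_search):
    """Estimate average concurrent searches with Little's law.

    arrival rate (searches/sec) = users / seconds between each user's searches
    average concurrency         = arrival rate * seconds per search
    """
    if minutes_between_searches <= 0:
        raise ValueError("minutes_between_searches must be positive")
    arrival_rate = users / (minutes_between_searches * 60.0)
    return arrival_rate * seconds_per_search


if __name__ == "__main__":
    # Hypothetical: 300 users, each running one search every 5 minutes.
    for seconds in (60, 10):
        avg = average_concurrent_searches(300, 5, seconds)
        print(f"{seconds}-second searches: about {avg:.0f} running at once")
```

Under these assumptions, cutting search time by a factor of six cuts average concurrency by the same factor, which is the effect the paragraph above describes.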
This documentation applies to the following versions of Splunk® Enterprise: 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15