How Splunk Enterprise looks through your data
As you scale up your deployment, it becomes more important to understand the different types of search and how they affect performance. This knowledge helps you determine how many indexers and search heads to add to your distributed deployment.
Search types: the details
There are four basic types of search that you can invoke against data stored in a Splunk Enterprise index. Each of these search types affects the indexer in a different way. The search types are:
Dense. A dense search matches a large percentage (10% or more) of the events in a given set of data over a given period of time. A reference indexer should be able to fetch up to 50,000 matching events per second for a dense search. Dense searches usually tax an indexer's CPU first, because of the overhead required to decompress the raw data stored in an index.
Sparse. A sparse search matches a much smaller percentage of events (anywhere from 0.01% to 1%) than a dense search does, for the same set of data over the same period of time. A reference indexer should be able to fetch up to 5,000 matching events per second when executing a sparse search.
Super-sparse. A super-sparse search is a "needle in the haystack" search that retrieves only a very small number of results across the same set of data within the same time period as the other searches. A super-sparse search is very I/O intensive because the indexer must look through all of the buckets of an index to find the desired results. This can take up to two seconds per searched bucket. If you have a large amount of data stored on your indexer, there are a lot of buckets, and a super-sparse search can take a very long time to complete.
Rare. Rare searches are like super-sparse searches in that they match just a handful of results across a number of index buckets. The major difference is that rare searches use bloom filters (data structures that test whether or not an element is a member of a set) to significantly reduce the number of buckets that need to be searched, by eliminating those buckets that do not contain events that match the search request. This allows a rare search to complete anywhere from 20 to 100 times faster than a super-sparse search, for the same amount of data searched.
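To make the bucket-elimination idea concrete, here is a minimal sketch of a bloom filter in Python. It is a toy illustration, not Splunk Enterprise's actual implementation: the `BloomFilter` class, the `buckets_to_search` helper, the bucket names, and the filter sizing are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        # False means the term is definitely absent; True means "maybe present".
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def buckets_to_search(bucket_filters, term):
    """Return the buckets whose filter cannot rule out the search term."""
    return [name for name, bf in bucket_filters.items() if bf.might_contain(term)]


def build_demo_filters():
    # Hypothetical buckets and the terms indexed in each one.
    bucket_terms = {
        "bucket_01": ["error", "timeout", "host-a"],
        "bucket_02": ["login", "host-b"],
        "bucket_03": ["error", "disk", "host-c"],
    }
    filters = {}
    for name, terms in bucket_terms.items():
        bf = BloomFilter()
        for term in terms:
            bf.add(term)
        filters[name] = bf
    return filters


if __name__ == "__main__":
    filters = build_demo_filters()
    print(buckets_to_search(filters, "timeout"))
```

A bucket is skipped only when its filter answers "definitely not present". Because a bloom filter can return false positives, a bucket that does not contain the term is occasionally still searched, but a bucket that does contain it is never skipped.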
The following table summarizes the different search types. Note that for dense and sparse searches, performance is measured based on number of matching events, while with super-sparse and rare searches, performance is measured based on total indexed volume.
| Search type | Description | Reference indexer throughput | Performance impact |
|---|---|---|---|
| Dense | Dense searches return a large percentage of results for a given set of data in a given period of time. | Up to 50,000 matching events per second | Generally CPU-bound |
| Sparse | Sparse searches return a smaller number of results for a given set of data in a given period of time than dense searches do. | Up to 5,000 matching events per second | Generally CPU-bound |
| Super-sparse | Super-sparse searches return a very small number of matching results from each index bucket. Depending on how large the set of data is, these searches can take a long time. | Up to 2 seconds per index bucket | Primarily I/O bound |
| Rare | Rare searches are similar to super-sparse searches, but are assisted by bloom filters, which help eliminate index buckets that do not match the search request. Rare searches return results anywhere from 20 to 100 times faster than super-sparse searches do. | From 10 to 50 index buckets per second | Primarily I/O bound |
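The throughput figures in the table can be turned into rough time estimates. The following Python sketch is back-of-the-envelope arithmetic only: the function names are hypothetical, and the results are bounds implied by the reference-indexer figures above, not measurements from a real deployment.

```python
# Reference-indexer figures taken from the table above.
DENSE_EVENTS_PER_SEC = 50_000        # "up to" figure, so it gives a minimum time
SPARSE_EVENTS_PER_SEC = 5_000        # "up to" figure, so it gives a minimum time
SUPER_SPARSE_SECS_PER_BUCKET = 2.0   # "up to" figure, so it gives a maximum time
RARE_BUCKETS_PER_SEC = (10, 50)      # slowest and fastest rates


def dense_min_seconds(matching_events):
    """Shortest expected time to fetch the matching events of a dense search."""
    return matching_events / DENSE_EVENTS_PER_SEC


def sparse_min_seconds(matching_events):
    """Shortest expected time to fetch the matching events of a sparse search."""
    return matching_events / SPARSE_EVENTS_PER_SEC


def super_sparse_max_seconds(buckets):
    """Longest expected time for a super-sparse search over this many buckets."""
    return buckets * SUPER_SPARSE_SECS_PER_BUCKET


def rare_seconds_range(buckets):
    """(fastest, slowest) expected time for a rare search over this many buckets."""
    slowest_rate, fastest_rate = RARE_BUCKETS_PER_SEC
    return (buckets / fastest_rate, buckets / slowest_rate)


if __name__ == "__main__":
    print(dense_min_seconds(1_000_000))
    print(sparse_min_seconds(10_000))
    print(super_sparse_max_seconds(500))
    print(rare_seconds_range(500))
```

For an index with 500 buckets, the super-sparse bound is 1,000 seconds, while the rare-search range is 10 to 50 seconds. That ratio is the same 20 to 100 times speedup the table attributes to bloom filters.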
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14