When you size hardware for a Splunk Enterprise environment, a reference machine helps you understand when it is time to scale and distribute the deployment. The following is an example of such a machine. Treat this configuration as the standard for the remainder of this chapter.
The reference machine described below produces the following index and search performance metrics for a given sample of data:
- Up to 20 megabytes per second (about 1,700 GB per day) of raw indexing performance, provided no other Splunk activity is occurring
- Up to 50,000 events per second for dense searches
- Up to 5,000 events per second for sparse searches
- Up to 2 seconds per index bucket for super-sparse searches
- From 10 to 50 buckets per second for rare searches with bloom filters
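The per-second and per-day indexing figures above are related by simple arithmetic. The following is a minimal sketch of that conversion, assuming decimal units (1 GB = 1000 MB) and a full day of sustained indexing; the function name is illustrative and is not part of any Splunk tool.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def mb_per_sec_to_gb_per_day(mb_per_sec: float) -> float:
    """Convert a sustained rate in MB/s to GB/day (assumes 1 GB = 1000 MB)."""
    return mb_per_sec * SECONDS_PER_DAY / 1000


# Reference machine: up to 20 MB/s of raw indexing throughput.
daily_gb = mb_per_sec_to_gb_per_day(20)
print(f"{daily_gb:.0f} GB per day")  # 1728, which the text rounds to 1700
```

The same function can be used to check whether an expected daily data volume fits within the reference machine's best-case indexing rate.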
To find out more about the types of searches and how they affect Splunk Enterprise performance, read "How search types affect Splunk Enterprise performance" in this manual.
The reference machine has the following specifications:
- Intel x86 64-bit chip architecture
- 2 CPUs, 6 cores per CPU (12 cores total), at least 2 GHz per core
- 12 GB RAM
- Standard 1 Gb Ethernet NIC, optional 2nd NIC for a management network
- Standard 64-bit Linux or Windows distribution
The reference machine's disk subsystem should be capable of sustaining a high number of average input/output operations per second (IOPS).
IOPS measure how many read and write operations a hard drive can complete each second. Because a hard drive reads and writes at different speeds, there are separate IOPS figures for disk reads and disk writes. The average IOPS is a blend of those two figures.
The more average IOPS a hard drive can produce, the more data it can index and search in a given period of time. While many variables affect the number of IOPS that a hard drive can produce, the three most important are:
- its rotational speed (in revolutions per minute)
- its average latency (the amount of time it takes to spin its platters half a rotation)
- its average seek time (the amount of time it takes to position the drive head over a requested block of data)
To get the most IOPS out of a hard drive, choose drives that have high rotational speeds and low average latency and seek times. Every drive manufacturer provides this information (and some provide much more).
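As a rough illustration of how the three factors above combine, a common rule of thumb estimates a drive's random IOPS as one operation per interval of average latency plus average seek time. The sketch below assumes that rule of thumb. The seek times are illustrative values, not figures from any vendor data sheet, and the function names are hypothetical.

```python
def average_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a rotation, in ms."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2


def estimated_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rule-of-thumb random IOPS: 1000 ms / (latency + seek time)."""
    return 1000 / (average_latency_ms(rpm) + avg_seek_ms)


# (rpm, assumed average seek time in ms): illustrative values only.
drives = [(7_200, 8.5), (10_000, 4.5), (15_000, 3.5)]

for rpm, seek_ms in drives:
    iops = estimated_iops(rpm, seek_ms)
    print(f"{rpm:>6} RPM, {seek_ms} ms seek: about {iops:.0f} IOPS")
```

Under these assumptions a faster spindle and a shorter seek time both raise the estimate, which is why the guidance favors high rotational speed and low latency and seek times. Real drives vary with caching, queue depth, and workload, so treat the result as an approximation.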
For additional information on IOPS and how to calculate them, review the following articles:
- "Getting the hang of IOPS" (http://www.symantec.com/connect/articles/getting-hang-iops-v13) on Symantec's Connect Community.
- "Analyzing I/O performance in Linux" (http://www.cmdln.org/2010/04/22/analyzing-io-performance-in-linux) on CMDLN.ORG (a sysadmin blog).
For the reference machine, we use eight 146-gigabyte, 15,000 RPM serial-attached SCSI (SAS) hard drives in a Redundant Array of Independent Disks (RAID) 1+0 fault tolerance scheme as the disk subsystem. Each hard drive is capable of about 200 average IOPS. The combined array produces a little over 800 IOPS.
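The text does not say how the array figure was derived. One common rule of thumb, assumed here, is that a RAID 1+0 array serves reads from every drive but pays a write penalty of 2, because each logical write lands on both halves of a mirror. The sketch below applies that rule to the drive count and per-drive IOPS given above; the function name and the write fractions are hypothetical.

```python
def raid10_effective_iops(drive_count: int,
                          iops_per_drive: float,
                          write_fraction: float) -> float:
    """Estimate usable IOPS for a RAID 1+0 array (rule of thumb only).

    write_fraction is the share of operations that are writes (0.0 to 1.0).
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0.0 and 1.0")
    raw_iops = drive_count * iops_per_drive
    write_penalty = 2  # assumption: each write is duplicated to a mirror
    read_fraction = 1.0 - write_fraction
    return raw_iops / (read_fraction + write_fraction * write_penalty)


# Eight drives at about 200 IOPS each, as in the reference machine.
for write_fraction in (0.0, 0.5, 1.0):
    iops = raid10_effective_iops(8, 200, write_fraction)
    print(f"{write_fraction:.0%} writes: about {iops:.0f} IOPS")
```

Under this assumption, a write-dominated workload such as indexing yields roughly 800 IOPS from the eight drives, which is close to the figure quoted for the combined array. Controller caching and stripe layout can move the real number in either direction.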
Important: Disk I/O often constrains Splunk Enterprise first, so always consider disk infrastructure first when specifying your hardware.
Splunk Enterprise performs fastest when deployed directly on bare-metal hardware, as described above. However, Splunk Enterprise can and does run on virtual hardware, and we fully support deploying it there.
Using the bare-metal hardware as a baseline, Splunk Enterprise generally indexes data about 30% slower on a virtual machine (VM) than it does on a standard reference machine. Search performance is on par with the physical hardware.
This is a best-case scenario that does not account for resource contention with other active VMs on the same physical server. It also does not take into account certain vendor-specific I/O enhancement techniques (such as Direct I/O or Raw Device Mapping).
Splunk Enterprise in the cloud
While you can run Splunk Enterprise in the cloud, there are various concerns that you must be aware of when doing so. In addition to the security concerns of running Splunk Enterprise in a public cloud, note that performance degrades significantly compared to bare-metal hardware. Using the reference machine as a baseline again, Splunk Enterprise indexing performance on a cloud-based machine is roughly half that of a physical one. Searching suffers, too: results return anywhere from 15 to 20 percent slower than on a physical machine.
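To compare the deployment options side by side, the sketch below applies the approximate indexing slowdowns stated in this topic (about 30% on a VM, roughly half in the cloud) to the reference machine's best-case rate of 20 MB per second. The factors are best-case approximations taken from the text, and the names are illustrative rather than part of any Splunk tool.

```python
BARE_METAL_INDEXING_MB_S = 20.0  # reference machine, best case
SECONDS_PER_DAY = 86_400

# Fraction of bare-metal indexing throughput retained in each environment.
INDEXING_FACTOR = {
    "bare_metal": 1.00,
    "virtual": 0.70,  # about 30% slower than the reference machine
    "cloud": 0.50,    # roughly half of the reference machine
}


def estimated_indexing_mb_s(environment: str) -> float:
    """Best-case indexing rate in MB/s for the given environment."""
    return BARE_METAL_INDEXING_MB_S * INDEXING_FACTOR[environment]


for env in INDEXING_FACTOR:
    rate = estimated_indexing_mb_s(env)
    gb_per_day = rate * SECONDS_PER_DAY / 1000  # assumes 1 GB = 1000 MB
    print(f"{env:<10} {rate:5.1f} MB/s  about {gb_per_day:.0f} GB per day")
```

These estimates ignore resource contention from other tenants or VMs, so actual throughput in a shared environment may be lower.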
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14