How much space you will need

This topic describes how to estimate the size of your Splunk index and associated data so that you can plan your storage capacity requirements.

When Splunk indexes your data, the resulting data falls into two categories: the compressed, persisted raw data and the indexes that point to this data. With a little experimentation, you can estimate how much disk space you will need.

Typically, the compressed, persisted data amounts to approximately 10% of the size of the raw data that comes into Splunk. The associated indexes range in size from 10% to 110% of the data they access. Where in that range an index falls depends strongly on the number of unique terms in the data. Depending on the data's characteristics, you might want to tune your segmentation settings.
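
As a rough illustration of these rules of thumb, the following sketch turns a raw data size into a low and a high estimate. It assumes that "the data they access" means the compressed, persisted data, so the index is taken as 10% to 110% of that figure; the text does not state this outright. The function name estimate_range_kb is made up for this example.

```shell
#!/bin/sh
# Sketch only: applies the rule-of-thumb percentages to a raw data size.
# Assumption: index size is 10% to 110% of the COMPRESSED data size.
# estimate_range_kb RAW_KB  ->  prints "LOW_KB HIGH_KB"
estimate_range_kb() {
    awk -v raw="$1" 'BEGIN {
        compressed = raw * 10 / 100          # about 10% of incoming raw data
        index_low  = compressed * 10 / 100   # index at 10% of compressed data
        index_high = compressed * 110 / 100  # index at 110% of compressed data
        printf "%.0f %.0f\n", compressed + index_low, compressed + index_high
    }'
}

# Example: 100,000 KiB of incoming raw data.
estimate_range_kb 100000   # prints: 11000 21000
```

The spread between the two figures is wide, which is why measuring a real sample gives a better estimate than the percentages alone.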

The best way to gauge your space needs is to experiment: install a copy of Splunk, index a representative sample of your data, and then check the sizes of the resulting directories in defaultdb.

To do this, first index your sample. Then:

1. Go to $SPLUNK_HOME/var/lib/splunk/defaultdb/db.

2. Run du -shc hot_v*/rawdata to determine the size of the compressed, persisted raw data. Typically, this amounts to about 10% of the size of the original sample data set.

3. Run du -ch hot_v* and look at the last total line to see the size of the index.

4. Add the two values together.
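
The two du measurements above can be collected in a small script. This is a minimal sketch, not a Splunk tool: the function names are invented for the example, and it reports sizes in KiB (du -k) rather than human-readable units so the numbers can be used in later arithmetic.

```shell
#!/bin/sh
# Sketch only: reports the du figures from steps 2 and 3, in KiB.
# The function names are illustrative and are not part of Splunk.

# Step 2: combined size of all hot_v*/rawdata directories under directory $1.
rawdata_kb() {
    ( cd "$1" && du -skc hot_v*/rawdata | tail -n 1 | cut -f1 )
}

# Step 3: combined size of all hot_v* directories under directory $1.
# du descends into subdirectories, so this total covers rawdata as well.
hot_buckets_kb() {
    ( cd "$1" && du -skc hot_v* | tail -n 1 | cut -f1 )
}

# Default to the directory from step 1; pass another path as the first argument.
db_dir="${1:-${SPLUNK_HOME:-}/var/lib/splunk/defaultdb/db}"
if [ -d "$db_dir" ]; then
    echo "rawdata (KiB):     $(rawdata_kb "$db_dir")"
    echo "hot buckets (KiB): $(hot_buckets_kb "$db_dir")"
else
    echo "directory not found: $db_dir" >&2
fi
```

Saved under a name of your choosing, for example measure_sample.sh, the script can be run as sh measure_sample.sh with an optional path to the db directory. It prints both figures and leaves combining them to you.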

The sum from step 4 is the total size of the index and associated data for the sample you indexed. You can now use this figure to extrapolate the size requirements for your Splunk index and rawdata directories over time.
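
One simple way to extrapolate is to scale the measured size linearly with data volume. The sketch below assumes that disk usage grows in proportion to the amount of incoming data, which holds only if the sample is representative. The function name project_kb and its parameters (daily volume, retention period) are hypothetical and chosen for this example.

```shell
#!/bin/sh
# Sketch only: linear extrapolation from a measured sample.
# project_kb SAMPLE_KB MEASURED_KB DAILY_KB RETENTION_DAYS
#   SAMPLE_KB       size of the original sample data, before indexing
#   MEASURED_KB     total on-disk size measured for that sample
#   DAILY_KB        expected daily volume of incoming raw data
#   RETENTION_DAYS  number of days of data to keep
project_kb() {
    awk -v s="$1" -v m="$2" -v d="$3" -v r="$4" \
        'BEGIN { printf "%.0f\n", m * d * r / s }'
}

# Example: a 1,000 KiB sample took 250 KiB on disk; project for
# 2,000 KiB of new data per day kept for 30 days.
project_kb 1000 250 2000 30   # prints 15000
```

Repeating the measurement with a second, larger sample is a quick check on whether the linear assumption is reasonable for your data.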

