How to limit data collection
About Data Volumes
This topic describes ways to control the type and quantity of data that you bring into Splunk for VMware. Collecting the right type of data is important, but so is limiting the quantity, because the amount of data you collect can directly affect your licensing requirements.
The amount of data collected varies, but around 300 MB per host per day is a reasonable rough estimate.
See Data volume requirements for more information on how to calculate the estimated peak data volume for your environment.
As a systems administrator you can limit data volume in a number of ways:
- Reduce the number of hosts from which to collect data.
- Reduce the number of performance metrics collected.
- Use NullQueue to filter log data.
Configure hosts from which to collect data
You can control which hosts performance data and log data are collected from by modifying the whitelist or blacklist in the vCenter settings on the Collection Configuration dashboard. See Configure hosts for more information on how to do this.
Configure the performance metrics being collected
You can configure the performance metrics being collected using regular expressions in the configuration file ta_vmware_collection.conf on the search head (where the scheduler resides). This is an advanced administration task. By default, all performance metrics are collected. See Collection configuration reference for a sample ta_vmware_collection.conf file.
To change the default configuration, create the file $SPLUNK_HOME/etc/apps/Splunk_TA_vmware/local/ta_vmware_collection.conf. Refer to $SPLUNK_HOME/etc/apps/Splunk_TA_vmware/default/ta_vmware_collection.conf to see all the settings.
You can set whitelists for performance metrics for hosts, virtual machines, resource pools, and clusters. The file should have a [default] stanza followed by any or all of the following:
- host metrics, by setting a regular expression for the attribute host_instance_whitelist
- virtual machine metrics, by setting a regular expression for the attribute vm_instance_whitelist
- resource pool metrics, by setting a regular expression for the attribute rp_instance_whitelist
- cluster metrics, by setting a regular expression for the attribute cluster_instance_whitelist
This example limits host and virtual machine performance metrics to CPU metrics only. See the complete list of performance metrics in this manual.
[default]
host_instance_whitelist = ^p_[^_]*_cpu.*
vm_instance_whitelist = ^p_[^_]*_cpu.*
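As a further sketch, the same pattern can be widened to keep more than one metric group. The fragment below assumes that memory metric names follow the same naming scheme as the CPU metrics above, with mem in place of cpu. That is an assumption, so check it against the list of performance metrics before relying on it.

```ini
# Hypothetical local/ta_vmware_collection.conf
# Assumption: memory metrics are named like the CPU metrics, with "mem" as the group.
[default]
# Keep only CPU and memory metrics for hosts and virtual machines.
host_instance_whitelist = ^p_[^_]*_(cpu|mem).*
vm_instance_whitelist = ^p_[^_]*_(cpu|mem).*
```

The alternation (cpu|mem) matches either group name in the third underscore-separated position of the metric name, so metrics from any other group would be excluded for these entity types.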
Use NullQueue to filter log data
You can exclude vCenter log data and ESXi log data using nullQueue, Splunk's /dev/null equivalent. In this case, the data is sent to nullQueue (dropped) when the technology add-on, TA-vmware (which resides on the indexers), receives it from the vCenter forwarder. Remember that vCenter system logs are collected by configuring syslog.
When you filter out data in this way, the filtered data is not forwarded or added to the Splunk index, and does not count toward your indexing volume. The forwarder discards the data and does not forward it to the indexer. The data at the source is unchanged: it is still generated and written to the logs on the local vCenter machine.
To exclude log data
- For vCenter log data, edit the props.conf file for Splunk_TA-vcenter on the forwarder on the vCenter.
- For ESXi logs, edit the props.conf file for Splunk_TA_esxilogs on the intermediate forwarders for syslog data.
Example props.conf for TA-vcenter
Uncomment the transforms-routing attributes that determine how to route the vpxd events, as follows.
For sourcetype = vmware:vclog:vpxd, uncomment:
#TRANSFORMS-null1 = vmware_vpxd_level_null
#TRANSFORMS-null4 = vmware_vpxd_retrieveContents_null
#TRANSFORMS-null5 = vmware_vpxd_null
For sourcetype = vmware:vclog:vpxd-alert, uncomment:
#TRANSFORMS-null2 = vmware_vpxd_level_null,vmware_vpxd_level_null2
For sourcetype = vmware:vclog:vpxd-profiler, uncomment:
#TRANSFORMS-null3 = vmware_vpxd_level_null,vmware_vpxd_level_null2
When uncommented, props.conf works with transforms.conf to route the specified source types to nullQueue. The actual routing is done in transforms.conf.
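To illustrate how the two files fit together, here is a minimal sketch of a props.conf attribute and the transforms.conf stanza it names. The stanza name example_vpxd_null and its REGEX are hypothetical and are not the definitions shipped with the add-on. DEST_KEY = queue and FORMAT = nullQueue are the standard Splunk settings for discarding matching events.

```ini
# props.conf (sketch): attach a transform to the vpxd source type
[vmware:vclog:vpxd]
TRANSFORMS-example = example_vpxd_null

# transforms.conf (sketch): hypothetical stanza, not shipped with the add-on
[example_vpxd_null]
# Hypothetical pattern: match events that contain the word "verbose"
REGEX = \bverbose\b
# Send matching events to nullQueue, which discards them
DEST_KEY = queue
FORMAT = nullQueue
```

Events that do not match REGEX continue through the pipeline unchanged.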
For more information on nullQueue, see "Filter event data and send it to queues".
This documentation applies to the following versions of Splunk® App for VMware (Legacy): 3.0, 3.0.1