How to limit data collection
About Data Volumes
This topic describes how to control the type and quantity of data that comes into the Splunk App for VMware. Collecting the correct type of data is important, and so is limiting the quantity of data that you collect, because data volume directly affects your licensing requirements.
The amount of data collected varies, but roughly 300 MB per host per day is a reasonable estimate.
See "Data volume requirements" for more information on how to calculate the estimated peak data volume for your environment.
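As a quick illustration of that rough estimate, the following Python sketch multiplies the 300 MB per host per day figure by a host count. The function name and the sample host counts are hypothetical, and the result is only a ballpark figure, not a substitute for the peak volume calculation referenced above.

```python
# Rough daily volume estimate based on the ~300 MB per host per day figure.
# The function name and host counts are illustrative assumptions.
MB_PER_HOST_PER_DAY = 300


def estimated_daily_volume_gb(host_count, mb_per_host=MB_PER_HOST_PER_DAY):
    """Return the estimated daily data volume in GB (1 GB = 1024 MB)."""
    if host_count < 0:
        raise ValueError("host_count must be non-negative")
    return host_count * mb_per_host / 1024


if __name__ == "__main__":
    for hosts in (10, 50, 200):
        print(f"{hosts} hosts: ~{estimated_daily_volume_gb(hosts):.1f} GB/day")
```

Comparing the estimate for your current host count with the estimate for a reduced host list gives a first idea of how much the options below could save.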
As a systems administrator you can limit data volume in a number of ways:
- Reduce the number of hosts from which to collect data.
- Reduce the number of performance metrics collected.
- Use NullQueue to filter log data.
Configure hosts from which to collect data
You can limit the hosts from which the app collects performance data and log data. To do this, modify the whitelist or blacklist in the vCenter settings on the Collection Configuration dashboard. See "Configure hosts" for more information.
Configure the performance metrics collected
This is an advanced administration task.
You can configure which performance metrics are collected by using regular expressions in the configuration file ta_vmware_collection.conf on the search head (where the scheduler resides). By default, all performance metrics are collected. See "Collection configuration reference" for a sample ta_vmware_collection.conf file.
To change the default configuration, create the file:
$SPLUNK_HOME/etc/apps/Splunk_TA_vmware/local/ta_vmware_collection.conf
To see all of the available settings, refer to:
$SPLUNK_HOME/etc/apps/Splunk_TA_vmware/default/ta_vmware_collection.conf
You can set whitelists for performance metrics for hosts, virtual machines, resource pools and clusters in the configuration file.
The file must have a [default] stanza followed by any or all of the following:
- host metrics, by setting a regular expression for the attribute host_instance_whitelist
- virtual machine metrics, by setting a regular expression for the attribute vm_instance_whitelist
- resource pool metrics, by setting a regular expression for the attribute rp_instance_whitelist
- cluster metrics, by setting a regular expression for the attribute cluster_instance_whitelist
In this example, host and virtual machine performance metrics are limited to CPU metrics only. See the complete list of performance metrics in this manual.
[default]
host_instance_whitelist = ^p_[^_]*_cpu.*
vm_instance_whitelist = ^p_[^_]*_cpu.*
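Before deploying a whitelist, it can help to check which metric names the regular expression actually matches. The Python sketch below applies the pattern from the example to a few metric names. The names are hypothetical samples chosen for illustration, so substitute names from the performance metrics list in this manual. Python's regular expression engine may also differ in detail from the one the add-on uses, although this simple pattern should behave the same way in both.

```python
import re

# Pattern copied from the whitelist example above.
CPU_WHITELIST = re.compile(r"^p_[^_]*_cpu.*")

# Hypothetical metric names, used only to illustrate the matching.
sample_metrics = [
    "p_average_cpu_usage_percent",
    "p_summation_cpu_ready_millisecond",
    "p_average_mem_usage_percent",
    "p_average_disk_read_kiloBytesPerSecond",
]


def whitelisted(names, pattern=CPU_WHITELIST):
    """Return the names that match the whitelist pattern."""
    return [name for name in names if pattern.match(name)]


if __name__ == "__main__":
    for name in whitelisted(sample_metrics):
        print(name)
```

The pattern requires the prefix p_, then one segment without underscores, then _cpu. Names whose second segment is followed by anything other than cpu, such as the mem and disk samples, are therefore excluded.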
Use NullQueue to filter log data
You can exclude vCenter log data and ESXi log data by using nullQueue, Splunk's equivalent of /dev/null. The data is sent to nullQueue (dropped) when the technology add-on, TA-vmware (which resides on the indexers), receives it from the vCenter forwarder. Remember that vCenter system logs are collected by configuring syslog.
When you filter out data in this way, the filtered data is not forwarded or added to the Splunk index, and does not count toward your indexing volume. The forwarder discards the data and does not forward it to the indexer. The data at the source is unchanged: it is still generated and written to the logs on the local vCenter machine.
To exclude log data
- For vCenter log data, edit the props.conf file for Splunk_TA-vcenter on the forwarder on the vCenter.
- For ESXi logs, edit the props.conf file for Splunk_TA_esxilogs on the intermediate forwarders for syslog data.
Example props.conf for TA-vcenter
Uncomment the TRANSFORMS attributes that determine how to route the vpxd events, as follows.
For sourcetype = vmware:vclog:vpxd, uncomment:
#TRANSFORMS-null1 = vmware_vpxd_level_null
#TRANSFORMS-null4 = vmware_vpxd_retrieveContents_null
#TRANSFORMS-null5 = vmware_vpxd_null
For sourcetype = vmware:vclog:vpxd-alert, uncomment:
#TRANSFORMS-null2 = vmware_vpxd_level_null,vmware_vpxd_level_null2
For sourcetype = vmware:vclog:vpxd-profiler, uncomment:
#TRANSFORMS-null3 = vmware_vpxd_level_null,vmware_vpxd_level_null2
When uncommented, these props.conf attributes work with transforms.conf to route the specified source types to nullQueue. The actual routing is done in transforms.conf.
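In Splunk, a transform that discards events uses REGEX to select events and sets DEST_KEY = queue and FORMAT = nullQueue to drop them. The Python sketch below only imitates that decision so you can see its effect on a handful of events. The pattern and the sample log lines are hypothetical and are not taken from the transforms.conf shipped with the add-on.

```python
import re

# Hypothetical pattern standing in for the REGEX of a nullQueue transform.
# The add-on's real transforms.conf patterns may differ.
DROP_PATTERN = re.compile(r"\b(verbose|trivia)\b")

# Hypothetical vpxd-style log lines, for illustration only.
events = [
    "2014-01-01T00:00:00Z [7F1 info 'vpxd'] Task completed",
    "2014-01-01T00:00:01Z [7F1 verbose 'vpxd'] Polling host",
    "2014-01-01T00:00:02Z [7F1 warning 'vpxd'] Connection retry",
]


def route(event):
    """Return the queue an event would be sent to in this simulation."""
    return "nullQueue" if DROP_PATTERN.search(event) else "indexQueue"


# Events routed to nullQueue are discarded; the rest would be indexed.
kept = [event for event in events if route(event) != "nullQueue"]

if __name__ == "__main__":
    for event in events:
        print(route(event), "<-", event)
```

Running candidate patterns against a sample of your own log lines in this way is one means of confirming that a filter drops only the events you intend before you uncomment the corresponding attributes.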
For more information on nullQueue, see "Filter event data and send it to queues".
This documentation applies to the following versions of Splunk® App for VMware (Legacy): 3.0.2