Splunk® App for VMware

Configuration Guide


On August 31, 2022, the Splunk App for VMware will reach its end of life. After this date, Splunk will no longer maintain or develop this product. The functionality in this app is migrating to a content pack in Data Integrations. Learn about the Content Pack for VMware Dashboards and Reports.
This documentation does not apply to the most recent version of the Splunk App for VMware.

Manage data collection

As a Splunk administrator monitoring a large environment, you can use the Distributed Collection Scheduler to prioritize data collection tasks.

Assign task priorities

Assign task priorities when you have a complex environment with multiple data collection nodes.

Task priorities let you control the resource distribution for data collection nodes. For example, if you have a memory-intensive task, run that task on data collection nodes that you provisioned with more memory than other data collection nodes in your environment.

You can change how data is collected by manually editing the configuration files. The configuration files hydra_node.conf and ta_vmware_collection.conf govern task execution on the scheduler node.

Set task priorities

The <task>_priority field in ta_vmware_collection.conf specifies a number that is added to the priority number of each job for a task. The value can be zero, a negative number, or a positive number.

  • Zero is the default value for <task>_priority. A value of zero for this field indicates that there is no change in default data collection priorities for tasks.
  • A negative value increases the job priority. It lowers the priority number, which raises the relative priority of the task. The Distributed Collection Scheduler works on jobs in ascending order of priority number. That is, 1 is higher priority than 5.
  • A positive value decreases the job priority. It raises the priority number, which lowers the relative priority of the task. A positive value can result in job expiration if the environment is overloaded.

ta_vmware_collection.conf lists all tasks.

# These are all the tasks that should run everywhere
task = hostvmperf, otherperf, hierarchyinv, hostinv, vminv, clusterinv, datastoreinv, rpinv, task, event

The <task>_priority field sets the priority of each task.

# The number to add to the priority number for jobs of a given task; a negative number makes the priority higher
task_priority = -60
event_priority = -60
hierarchyinv_priority = -120

To set a task priority:

  1. Stop the Distributed Collection Scheduler.
  2. Edit $SPLUNK_HOME/etc/apps/local/ta_vmware_collection.conf on the scheduler node (typically on the search head).
  3. Add the <task>_priority field to a task.
  4. Enter a value for the field.
  5. Restart Splunk Enterprise.
  6. Restart the Distributed Collection Scheduler.
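
After editing the file, it can help to read the settings back and confirm which value applies to each task. The following is a minimal sketch, not part of the app: it uses Python's standard configparser module rather than any Splunk API, it assumes the settings sit under a [default] stanza (the stanza name is an assumption, not shown in this topic), and the names SAMPLE_CONF and read_task_priorities are hypothetical.

```python
import configparser

# Hypothetical file contents; the [default] stanza name is an assumption.
SAMPLE_CONF = """
[default]
task = hostvmperf, otherperf, hierarchyinv, hostinv, vminv, clusterinv, datastoreinv, rpinv, task, event
task_priority = -60
event_priority = -60
hierarchyinv_priority = -120
"""


def read_task_priorities(conf_text, stanza="default"):
    """Return {task name: <task>_priority value}, using 0 when the field is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser[stanza]

    # The "task" field is a comma-separated list of task names.
    tasks = [name.strip() for name in section["task"].split(",") if name.strip()]

    priorities = {}
    for task in tasks:
        # Zero is the documented default for <task>_priority.
        priorities[task] = int(section.get(f"{task}_priority", "0"))
    return priorities


priorities = read_task_priorities(SAMPLE_CONF)
for task, value in priorities.items():
    print(f"{task}: {value}")
```

Tasks without a <task>_priority field are reported as 0, which matches the default described above.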

Task priorities example

Assign the following values to task, event, and hierarchyinv in ta_vmware_collection.conf.

task_priority = -60

event_priority = -60

hierarchyinv_priority = -120

The priority number for a job starts from the Unix epoch time, and the <task>_priority value is added to it. For example, if the epoch time is currently 188, then with the values above for task_priority and hierarchyinv_priority, hierarchyinv jobs have a priority number of 68 (188 + (-120)) and task jobs have a priority number of 128 (188 + (-60)). Because 68 is lower than 128, the Distributed Collection Scheduler collects hierarchyinv data before task data.
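
The arithmetic in this example can be reproduced with a short sketch. It is an illustration of the rule described in this topic (epoch time plus the <task>_priority value, lowest number first), not the scheduler's actual code; the function name priority_number is hypothetical, and hostinv is included only to show the default value of zero.

```python
def priority_number(epoch_seconds, task_priority=0):
    """Priority number of a job: epoch time plus the <task>_priority value."""
    return epoch_seconds + task_priority


epoch = 188  # example epoch value used in this topic
offsets = {"task": -60, "hierarchyinv": -120, "hostinv": 0}

numbers = {name: priority_number(epoch, offset) for name, offset in offsets.items()}

# Jobs are worked on in ascending order of priority number.
order = sorted(numbers, key=numbers.get)

print(numbers)  # {'task': 128, 'hierarchyinv': 68, 'hostinv': 188}
print(order)    # ['hierarchyinv', 'task', 'hostinv']
```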

How jobs are assigned

The Distributed Collection Scheduler sets the capabilities of the workers on the data collection nodes and assigns jobs only to nodes that are capable of running them, as defined in hydra_node.conf on the scheduler node. The scheduler sorts ready jobs based on their task weight, then load balances the jobs and checks the capabilities required by each job, so that jobs are distributed to optimize resources and no single data collection node is overloaded. A warning appears if the environment is unbalanced.

A data collection node cannot execute a task that has a task weight of zero. The node reports an error and ignores all jobs associated with that task.
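
The capability check and load balancing described above can be pictured with a simplified sketch. This is not the scheduler's algorithm: it assumes each node's capabilities are a plain set of task names (a task with a weight of zero is modeled as absent from the set), it takes jobs in the order given rather than sorting them, and it balances by job count only. The node names and the assign_jobs function are hypothetical.

```python
def assign_jobs(jobs, node_capabilities):
    """Assign each job to the least-loaded node that is capable of running its task.

    jobs: list of task names, one entry per ready job.
    node_capabilities: {node name: set of task names the node can run}.
    Returns (assignments, unassigned).
    """
    load = {node: 0 for node in node_capabilities}
    assignments = {node: [] for node in node_capabilities}
    unassigned = []

    for task in jobs:
        # Only nodes that list the task as a capability are eligible.
        eligible = [node for node, caps in node_capabilities.items() if task in caps]
        if not eligible:
            unassigned.append(task)
            continue
        # Pick the eligible node with the fewest jobs; break ties by name.
        target = min(eligible, key=lambda node: (load[node], node))
        assignments[target].append(task)
        load[target] += 1

    return assignments, unassigned


node_capabilities = {
    "dcn-a": {"hostvmperf", "hostinv"},
    "dcn-b": {"hostvmperf"},
}
jobs = ["hostvmperf", "hostvmperf", "hostinv", "event"]

assignments, unassigned = assign_jobs(jobs, node_capabilities)
print(assignments)  # {'dcn-a': ['hostvmperf', 'hostinv'], 'dcn-b': ['hostvmperf']}
print(unassigned)   # ['event']
```

In this sketch the event job stays unassigned because no node lists it as a capability, which mirrors the rule that jobs go only to nodes able to run them.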

Last modified on 22 June, 2016

This documentation applies to the following versions of Splunk® App for VMware: 3.1.1, 3.1.2, 3.1.3, 3.1.4, 3.2.0, 3.2.1, 3.2.2
