Your environment

In an enterprise environment, data collection demands increase as your environment scales, and you must scale Splunk for VMware to meet them.

At some point the FA runs out of CPU and memory resources, a vCenter reaches the limit of its ability to respond to requests, and data collection takes longer for some data types than for others, which can prevent other data from being collected. Because data collection depends on running the actions associated with the data, some actions may not get a chance to run, or may not run often enough to collect the complete data set. This creates gaps in the data that you can see in the views and dashboards of the App. There are many factors to take into account, and all of them can vary widely within a single VMware environment and from one VMware environment to another. The best way to find out what works for you is to know your environment, know your needs, and experiment a little.

There are two primary ways to scale the Solution:

Run multiple engines within a single FA.
Deploy multiple FAs, possibly with multiple engines running in each one.

The most difficult step is to take a single engine.conf file and distribute it across multiple FAs or multiple engines / engine.conf files so that you can scale the solution to cover a much larger environment. Generally speaking, we recommend that you maximize the processing power of a given FA before deploying more FAs. However, there are many considerations and practical limitations that make it important to be able to use both methods effectively.

Splitting up a single engine.conf file to meet the scale of your environment is difficult because there are no clear-cut rules for when to create a new FA or start a new instance of the engine. Scaling decisions depend on the size and nature of your environment relative to the resources available to the FA VM(s) and your vCenter (VC) instance(s), as well as on how long certain actions take to complete (which also varies with the size and nature of the environment).

You can attempt to scale on your own. For very large scale environments, we also recommend that you engage Splunk professional services.

Configuration of a large scale complex environment

To scale up the solution and maximize your use of the FA VM, we recommend that you:

Run multiple engines and
Use multiple (uniquely named) engine.conf files.

To do this you must also:

Scale your inputs.conf file to run them and
Create a scripted input in inputs.conf that calls Engine.pm and takes the absolute path of the engine.conf file as an argument. Put your inputs.conf file here:

$SPLUNK_HOME/etc/apps/splunk_for_vmware_appliance/local

Using multiple engines and multiple engine.conf files

This is an example of how to configure the FA VM to use multiple engines and multiple engine.conf files.

To configure the FA VM to work in a complex environment:

Log into the FA VM as the splunkadmin user.
Create an engine.conf file in the $SPLUNK_HOME/etc/apps/splunk_for_vmware_appliance/local directory.
Create an inputs.conf file in the local directory, $SPLUNK_HOME/etc/apps/splunk_for_vmware_appliance/local.
In the inputs.conf file, create a [default] stanza at the beginning of the file to assign the "host" setting for the FA’s log files. These files are forwarded automatically to the indexer(s) by a stanza in the default/inputs.conf file. The host setting applies to all FA-specific data that is sent to the indexer(s), which ensures that the FA’s logs are assigned the correct host field. Use the same value that you used for the FA VM's OS hostname when you configured the FA VM. It is also the same value used for the “fa” setting in the engine.conf [default] stanza. In this example the FA VM's OS hostname is splunkfa1:
[default]
host = splunkfa1
Create a new scripted input stanza for each engine instance you want to run. Each scripted input calls Engine.pm with the absolute path of engine.conf as an argument.
Restart Splunk:
splunk restart

Note: Default scripted inputs (located in default/inputs.conf) are disabled by default.

How to invoke the engine from the command line

Engine.pm is the main script for gathering data from VMware. You can invoke the Engine from the command line to create an inputs.conf file.

Invoke the engine from the command line as follows:

perl Engine.pm [-h] 
perl Engine.pm [-e config_file] [-c credentials_file] [-d module_name(s)] 

Options:
-h, --help: show this menu
-e, --engine: specify file to read engine configuration from (default: engine.conf)
-c, --credentials: specify file to read credentials from (default: credentials.conf)
-d, --debug: run in debug mode. You can also specify modules for debug mode in the form of a regex. For example, TaskDiscovery|EventDiscovery means run both TaskDiscovery and EventDiscovery in debug mode.

The Engine module reads the configuration files (engine.conf and credentials.conf by default), collects the data from your VMware environment, and then sends it to Splunk.
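As a rough illustration of how the options above combine, the following Python sketch assembles an Engine.pm command line. The helper name `build_engine_command` is hypothetical and not part of the product. It only arranges the documented flags (-e, -c, -d) into an argument list and prints a shell-safe form of it.

```python
import shlex


def build_engine_command(engine_conf=None, credentials_conf=None, debug=None):
    """Assemble an argument list for Engine.pm from the documented options.

    debug may be None (off), True (bare -d), or a regex string of module names.
    This helper is illustrative only; it is not shipped with the FA VM.
    """
    args = ["perl", "Engine.pm"]
    if engine_conf:
        args += ["-e", engine_conf]
    if credentials_conf:
        args += ["-c", credentials_conf]
    if debug is True:
        args.append("-d")
    elif debug:
        args += ["-d", debug]
    return args


cmd = build_engine_command(
    engine_conf="/path/to/custom/engine.conf",  # placeholder path
    debug="TaskDiscovery|EventDiscovery",
)
# shlex.join quotes the regex so a shell does not treat "|" as a pipe.
print(shlex.join(cmd))
```

Omitting an argument leaves the corresponding flag off, so Engine.pm falls back to its documented defaults (engine.conf and credentials.conf).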

The next section describes how to structure an inputs.conf file.

How to specify a scripted input stanza in inputs.conf to invoke the engine

In inputs.conf you invoke the engine using a scripted input.

[script://./bin/Engine.pm -e /path/to/custom/engine.conf -c /path/to/custom/credentials.conf]

Note: Specify engine.conf (the configuration file) using an absolute path, that is, a path that starts at the root of the file system (“/”), not a relative path. This is how it is specified in a stanza (when not using a credentials.conf file):

[script://./bin/Engine.pm -e /home/splunkadmin/opt/splunk/etc/apps/splunk_for_vmware_appliance/local/engine.conf]
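The absolute-path requirement is easy to get wrong when stanzas are written by hand. A minimal Python sketch of a check is shown below. The function name `script_stanza_header` is an assumption made for illustration. It builds the stanza header in the form shown above and rejects relative paths.

```python
import posixpath


def script_stanza_header(engine_conf, credentials_conf=None):
    """Return a scripted input stanza header for inputs.conf.

    Raises ValueError if a configuration path is not absolute.
    Hypothetical helper, shown only to illustrate the path rule.
    """
    for path in (engine_conf, credentials_conf):
        if path is not None and not posixpath.isabs(path):
            raise ValueError(f"not an absolute path: {path}")
    command = f"./bin/Engine.pm -e {engine_conf}"
    if credentials_conf is not None:
        command += f" -c {credentials_conf}"
    return f"[script://{command}]"


print(script_stanza_header(
    "/home/splunkadmin/opt/splunk/etc/apps/splunk_for_vmware_appliance/local/engine.conf"
))
```

Calling the function with a relative path such as local/engine.conf raises an error instead of producing a stanza.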

Note: When naming configuration files, it is a good practice to name the files to reflect their function. In this case, we use "engine" as a prefix when naming the configuration file, engine.conf. When we run multiple engines this naming convention makes it easier for us to identify the specific conf files responsible for performing specific functions. You can give the actual conf file any name you want.

The debug parameter (-d, --debug) is optional and can be located anywhere on the command line, before or after the conf file locations.

Invoke the engine on the command line with the "debug" parameter specified:

   perl Engine.pm -d

Invoke the engine on the command line with "debug" specified with a regex of module names. This turns on debugging for all listed modules.

   perl Engine.pm -d Discovery
   perl Engine.pm -d EventDiscovery
   perl Engine.pm -d 'TaskDiscovery|EventDiscovery'

You can also add the debug flags in the above examples to an inputs.conf stanza.

Note: Do not place spaces around the pipe character (|) when listing multiple modules on the command line. The module names are the same as the values used in the “action” setting in an engine.conf file, for example HierarchyDiscovery, InventoryDiscovery, and so on. We recommend reserving the use of the debug option for a qualified Splunk support engineer.
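To see why the spacing matters, the following Python sketch applies a debug pattern to a list of action names. It assumes the engine treats the pattern as an unanchored regular expression matched against each module name, which the examples above suggest but do not state. The function `modules_in_debug` is hypothetical.

```python
import re

# Action names mentioned in this topic.
ACTIONS = ["HierarchyDiscovery", "InventoryDiscovery", "TaskDiscovery", "EventDiscovery"]


def modules_in_debug(pattern, actions):
    """Return the actions whose names the pattern matches.

    Assumption: the pattern is searched for anywhere in the name.
    """
    regex = re.compile(pattern)
    return [name for name in actions if regex.search(name)]


print(modules_in_debug("Discovery", ACTIONS))
print(modules_in_debug("TaskDiscovery|EventDiscovery", ACTIONS))
# With spaces around the pipe, each alternative includes a space,
# so none of the names match under this assumption.
print(modules_in_debug("TaskDiscovery | EventDiscovery", ACTIONS))
```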

Sample local/inputs.conf file for multiple engine instances

Now that you know how to invoke Engine.pm in an inputs.conf stanza, we will look at other settings that must be specified in a simple example.

In addition to invoking Engine.pm in the stanza header, you must also provide values for the "interval", "disabled", "sourcetype", and "index" settings. All of these settings take a known value that never changes, except for sourcetype (as explained below). In general you don't have to change the values for these settings.

The standard settings and values to be included in each inputs.conf stanza are:

interval = 60
disabled = false
sourcetype = <some unique string per stanza>
index = vmware

Important: Set a unique sourcetype label in each scripted input stanza. This keeps the data streams from colliding and prevents garbage data and data loss.

In this example, the sourcetype label is not actually placed on the data. It is provided to disambiguate the data streams when the same input script (Engine.pm) is listed in multiple stanzas. The engine code applies the correct sourcetype itself at the time that the data is gathered. Each inputs.conf stanza should follow the form shown in the example above. It is important to include all of the settings.
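The two rules above (every stanza carries all four settings, and every sourcetype label is unique) can be checked mechanically. The sketch below uses Python's standard configparser module to read inputs.conf-style text. The function name `check_inputs_conf` and the /path/to/... locations are placeholders, and the parser is a general INI reader, not Splunk's own.

```python
import configparser

REQUIRED = ("interval", "disabled", "sourcetype", "index")


def check_inputs_conf(text):
    """Return a list of problems found in the scripted input stanzas."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = []
    seen = {}
    for section in parser.sections():
        if not section.startswith("script://"):
            continue  # skip [default] and any non-script stanzas
        for key in REQUIRED:
            if key not in parser[section]:
                problems.append(f"{section}: missing {key}")
        sourcetype = parser[section].get("sourcetype")
        if sourcetype is not None:
            if sourcetype in seen:
                problems.append(f"{section}: sourcetype {sourcetype!r} already used")
            seen[sourcetype] = section
    return problems


SAMPLE = """\
[default]
host = splunkfa1

[script://./bin/Engine.pm -e /path/to/engine-hierarchy.conf]
interval = 60
disabled = false
sourcetype = vm_hierarchy
index = vmware

[script://./bin/Engine.pm -e /path/to/engine-other.conf]
interval = 60
disabled = false
sourcetype = vm_hierarchy
"""

for problem in check_inputs_conf(SAMPLE):
    print(problem)
```

In this sample the second stanza is reported twice: once for the missing index setting and once for reusing the vm_hierarchy label.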

When creating an inputs.conf file, you should also provide a [default] stanza at the top that sets the "host" setting for this FA VM. The value should be the FA VM's "OS hostname"; the same value used for the “fa” setting in the engine.conf [default] stanza. This ensures that the FA’s logs will be assigned the correct "host" field before being sent to the indexer(s).

Sample local/inputs.conf file set to run multiple engine instances

Using this sample, you can create an inputs.conf file that works for your VMware environment, taking into account the number of FA VMs you need to deploy and the number of engine instances and engine.conf files you need to run.

# This is an example file showing a local/inputs.conf file running multiple engine instances.
# Set the host setting for all FA-specific data that is sent to the indexer(s). 
# This ensures that the FA’s logs will be assigned the right host field when sent to the indexer(s).
# The value is the same as the one you used for the FA VM's OS hostname (during FA VM configuration steps).
# It should also be the same value used for the “fa” setting in the engine.conf [default] stanza.
# This example assumes that the FA VM's OS hostname was set to "splunkfa1".
[default]
host = splunkfa1

# Create an engine instance to get hierarchy data
[script://./bin/Engine.pm -e /home/splunkadmin/opt/splunk/etc/apps/splunk_for_vmware_appliance/local/engine-hierarchy.conf]
interval = 60
disabled = false
sourcetype = vm_hierarchy
index = vmware

# Create an engine instance to get the remainder of the data
[script://./bin/Engine.pm -e /home/splunkadmin/opt/splunk/etc/apps/splunk_for_vmware_appliance/local/engine-other.conf]
interval = 60
disabled = false
sourcetype = vm_other
index = vmware
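If you run many engine instances, writing each stanza by hand becomes repetitive. The following Python sketch generates a file in the same shape as the sample above from a list of engine.conf paths and sourcetype labels. The function `render_inputs_conf` is a hypothetical helper, and the interval and index defaults are simply the standard values listed earlier in this topic.

```python
def render_inputs_conf(host, engines, index="vmware", interval=60):
    """Render local/inputs.conf text for several engine instances.

    engines is a list of (absolute engine.conf path, unique sourcetype label).
    """
    labels = [label for _, label in engines]
    if len(set(labels)) != len(labels):
        raise ValueError("sourcetype labels must be unique per stanza")
    lines = ["[default]", f"host = {host}", ""]
    for conf_path, label in engines:
        lines += [
            f"[script://./bin/Engine.pm -e {conf_path}]",
            f"interval = {interval}",
            "disabled = false",
            f"sourcetype = {label}",
            f"index = {index}",
            "",
        ]
    return "\n".join(lines)


LOCAL = "/home/splunkadmin/opt/splunk/etc/apps/splunk_for_vmware_appliance/local"
print(render_inputs_conf(
    "splunkfa1",
    [
        (f"{LOCAL}/engine-hierarchy.conf", "vm_hierarchy"),
        (f"{LOCAL}/engine-other.conf", "vm_other"),
    ],
))
```

Review the generated text before placing it in the local directory, since the helper does not verify that the engine.conf files exist.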

Now that we have created the inputs.conf file we can examine how to split a single engine.conf file into multiple files to monitor much higher-scale environments.
