Troubleshoot Splunk App for VMware
1. Review the release notes to determine if the trouble you are experiencing is a known issue.
2. Consider enabling the troubleshooting logs on data collection nodes to facilitate root cause investigation. See Enable troubleshooting logs.
3. Review the following problems for advice on how to resolve them.
Problems to troubleshoot
Problem
Splunk App for VMware does not seem able to make read-only API calls to vCenter Server systems.
Cause
You do not have the appropriate vCenter Server service login credentials for each vCenter Server.
Resolution
Obtain vCenter Server service login credentials for each vCenter Server. See Permissions in vSphere.
Problem
You have configured vCenter Server 5.0 in Splunk App for VMware, but no data is coming in.
Cause
vCenter Server 5.0 and 5.1 are missing the following WSDL files, which are required for Splunk App for VMware to make API calls to vCenter Server:
- reflect-message.xsd
- reflect-types.xsd
Resolution
Install the missing VMware WSDL files as documented in the vSphere Web Services SDK WSDL workaround in the VMware documentation. Note that the programdata folder is typically a hidden folder.
Problem
The individual tests in the deployment section work, but the Splunk App for VMware main dashboard is empty.
Cause
This is typically caused by one of two issues:
- Time synchronization issues between the indexer, DCN, and vCenter Server
- Incorrect permissions assignments in Splunk Enterprise
Resolution
- Check for time gaps between the indexer, DCN, and vCenter Server. To adjust or disable the Network Time Protocol (NTP) on your DCN, change the NTP servers that the DCN uses by editing the /etc/ntp.conf file.
  - Access the /etc/ntp.conf file. The following values are defaults for the servers in the file.
    # Use public servers from the pool.ntp.org project.
    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    server 0.centos.pool.ntp.org
    server 1.centos.pool.ntp.org
    server 2.centos.pool.ntp.org
  - Replace the default values in the file with your NTP server values.
  - Restart ntpd using the following command:
    sudo service ntpd restart
- Make sure that you have assigned the correct roles to users of Splunk Enterprise.
User | Role |
---|---|
admin user | splunk_vmware_admin |
all users of the Splunk App for VMware | splunk_vmware_user |
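The NTP step above replaces the default `server` lines in /etc/ntp.conf with your own servers. The following Python sketch illustrates that substitution on the file's text only. The function name `replace_ntp_servers` and the `ntp1.example.com` and `ntp2.example.com` host names are hypothetical placeholders, not part of Splunk App for VMware. The sketch does not write to /etc/ntp.conf or restart ntpd.

```python
def replace_ntp_servers(conf_text, servers):
    """Return conf_text with its `server` lines replaced by the given servers.

    Hypothetical helper for illustration: it only transforms text and never
    touches /etc/ntp.conf itself.
    """
    kept = []
    inserted = False
    for line in conf_text.splitlines():
        if line.split()[:1] == ["server"]:
            # Put the new list where the first default server line was,
            # and drop every other default server line.
            if not inserted:
                kept.extend(f"server {name}" for name in servers)
                inserted = True
            continue
        kept.append(line)
    if not inserted:
        # The file had no server lines, so append the new ones at the end.
        kept.extend(f"server {name}" for name in servers)
    return "\n".join(kept) + "\n"


# Default server entries as listed in this topic.
DEFAULT_CONF = """\
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.centos.pool.ntp.org
server 1.centos.pool.ntp.org
server 2.centos.pool.ntp.org
"""

if __name__ == "__main__":
    # ntp1/ntp2.example.com are placeholder names; substitute your NTP servers.
    print(replace_ntp_servers(DEFAULT_CONF, ["ntp1.example.com", "ntp2.example.com"]), end="")
```

Comment lines and any other directives in the file are left unchanged. Restart ntpd afterwards as described in the steps above.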
Problem
A search of index=_internal shows that the DCNs are forwarding data, but Splunk App for VMware is not collecting any API data.
Cause
This is typically caused by one of two issues:
- Network connectivity issues from the Scheduler to the DCNs.
- You have not changed the DCN admin account password from its default value.
Resolution
- In the Splunk for VMware App Settings page, verify the accuracy of the settings in the collection page.
- Verify that the admin password for each DCN is not set to changeme.
- Verify that each DCN has a fixed IP address. If Splunk App for VMware uses DCN host names instead of fixed IP addresses, verify that DNS lookups resolve to the correct IP addresses.
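The DNS check in the last step can be scripted. The following Python sketch compares the address that each DCN host name resolves to against the address you expect. The function name `check_dcn_dns` and the host names and addresses in the usage comment are hypothetical. The resolver is passed in as a parameter so that the logic can be exercised without a live DNS server.

```python
import socket


def check_dcn_dns(expected, resolve=socket.gethostbyname):
    """Return {host: (expected_ip, actual_ip)} for every host that mismatches.

    `expected` maps DCN host names to the IP address each should resolve to.
    actual_ip is None when the lookup fails. Hypothetical helper for
    illustration only.
    """
    mismatches = {}
    for host, expected_ip in expected.items():
        try:
            actual_ip = resolve(host)
        except OSError:
            # socket.gaierror (failed lookup) is a subclass of OSError.
            actual_ip = None
        if actual_ip != expected_ip:
            mismatches[host] = (expected_ip, actual_ip)
    return mismatches


# Example usage with placeholder names and addresses:
#   check_dcn_dns({"dcn1.example.com": "10.0.0.11"})
# An empty result means every host name resolved to the expected address.
```

Run a check like this from the machine that hosts the Scheduler, because that is where the DCN host names must resolve correctly.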
Problem
Splunk App for VMware works for 60 days, then stops.
Cause
The 60-day trial license for Splunk App for VMware has expired.
Resolution
Configure the DCN to join your Splunk Enterprise license pool.
splunk edit licenser-localslave -master_uri https://myhost:8089
Problem
Splunk App for VMware seems to be collecting only partial data. For example, some hosts are missing.
Cause
There are insufficient DCNs to handle the data volume coming from the ESXi environment.
Resolution
1. From the Settings page in Splunk App for VMware, review the list of hosts for each vCenter Server environment.
2. Verify that there is one DCN for every 30 ESXi hosts and every 750 virtual machines.
3. In the Splunk App for VMware configuration pane for the DCN, make sure that the number of worker processes is one fewer than the number of CPU cores assigned to the machine.
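The sizing guidance in steps 2 and 3 reduces to simple arithmetic. The following Python sketch assumes that the two ratios are independent limits, so the required number of DCNs is whichever limit demands more. The function names are hypothetical, and the floor of one worker process on a single-core machine is an assumption that this topic does not state.

```python
import math

HOSTS_PER_DCN = 30   # one DCN for every 30 ESXi hosts
VMS_PER_DCN = 750    # one DCN for every 750 virtual machines


def required_dcns(esxi_hosts, virtual_machines):
    """Return the minimum DCN count that satisfies both ratios.

    Assumption: the host ratio and the virtual machine ratio are separate
    limits, so the larger requirement wins.
    """
    by_hosts = math.ceil(esxi_hosts / HOSTS_PER_DCN)
    by_vms = math.ceil(virtual_machines / VMS_PER_DCN)
    return max(1, by_hosts, by_vms)


def worker_processes(cpu_cores):
    """Return one fewer worker than CPU cores (assumed minimum of 1)."""
    return max(1, cpu_cores - 1)


if __name__ == "__main__":
    print(required_dcns(100, 1000))   # 100 hosts need 4 DCNs; 1000 VMs need 2
    print(worker_processes(4))        # 4 cores leave room for 3 workers
```

For example, an environment with 100 ESXi hosts and 1000 virtual machines needs four DCNs under this reading, because the host count is the tighter limit.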
Problem
The DCNs are not delivering data to the Splunk Enterprise indexers.
Cause
If the DCNs are configured correctly, this problem is typically the result of a connectivity issue.
Resolution
- Make sure that the DCN has an IP address and can resolve DNS.
- Verify that no firewalls are preventing communication between the DCN and port 9997 on the indexers and that the Scheduler can connect to ports 8089 and 8008 on each DCN. On the search head, search index=_internal host=DCN-hostname-here.
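The firewall check above can be approximated with a TCP connection test. The following Python sketch tries to open a connection to each host and port pair and reports which ones succeed. The function names and the host names in the usage comment are hypothetical. A successful TCP connection shows only that the port is reachable, not that the Splunk service behind it is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return False


def check_ports(targets, timeout=3.0):
    """Return {(host, port): reachable} for each (host, port) pair."""
    return {(host, port): port_open(host, port, timeout) for host, port in targets}


# Example usage with placeholder host names:
#   From a DCN:        check_ports([("indexer1.example.com", 9997)])
#   From the Scheduler: check_ports([("dcn1.example.com", 8089),
#                                    ("dcn1.example.com", 8008)])
```

Run each check from the machine that originates the connection, because a firewall rule can block one direction or one source address only.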
Problem
In Host Detail view of Splunk Enterprise version 6.0 or later, you see the following warning message:
Events may not be returned in sub-second order due to search memory limits configured in limits.conf:[search]:max_rawsize_perchunk. See search.log for more information.
Or, in search.log of Splunk Enterprise version 6.0 or later, you see the following warning message:
02-06-2014 15:47:04.353 ERROR databasePartitionPolicy - Max Raw Size Limit Exceeded
02-06-2014 15:47:04.467 INFO UnifiedSearch - Error in 'databasePartitionPolicy': Max Raw Size Limit Exceeded
02-06-2014 15:47:04.467 WARN CursoredSearch - Events may be returned not in exact sub-second order: M=1368 > N=1250, where M is the number of events read in the 1390841799th second, and N is max number of events to read in a single span. Note that N was scaled back because we exceeded limits.conf:[search]:max_rawsize_perchunk value=100000000
Cause
A known bug causes this error message not to be suppressed (SOLNVMW-3587).
Resolution
1. On indexers, add the following configuration to the limits.conf file in the $SPLUNK_HOME/etc/system/local directory:
   [search]
   max_rawsize_perchunk = 800000000
2. Restart the indexer.
3. Check whether you still see the error message in the Splunk App for VMware Host Detail view.
Note that setting max_rawsize_perchunk = 400000000 suppresses the warning message in the Host Detail view. However, in the search.log file, you still see the following messages:
02-07-2014 14:52:43.008 ERROR databasePartitionPolicy - Max Raw Size Limit Exceeded
02-07-2014 14:52:43.127 INFO UnifiedSearch - Error in 'databasePartitionPolicy': Max Raw Size Limit Exceeded
To mitigate the appearance of these error messages, set max_rawsize_perchunk to at least 600000000.
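After you change max_rawsize_perchunk, you can confirm whether the messages still appear by scanning search.log. The following Python sketch counts the "Max Raw Size Limit Exceeded" errors and extracts the M, N, and limit values from the CursoredSearch warning. The function name `summarize_search_log` is hypothetical, and the patterns are based only on the sample messages shown in this topic, so other Splunk Enterprise versions might format these lines differently.

```python
import re

# Pattern derived from the sample WARN line in this topic (an assumption,
# not a documented log format).
CURSORED = re.compile(r"M=(\d+) > N=(\d+).*max_rawsize_perchunk value=(\d+)")
EXCEEDED = "ERROR databasePartitionPolicy - Max Raw Size Limit Exceeded"


def summarize_search_log(lines):
    """Return (error_count, [(M, N, limit), ...]) for the given log lines."""
    error_count = 0
    details = []
    for line in lines:
        if EXCEEDED in line:
            error_count += 1
        match = CURSORED.search(line)
        if match:
            details.append(tuple(int(group) for group in match.groups()))
    return error_count, details


# Sample lines copied from this topic.
SAMPLE = [
    "02-06-2014 15:47:04.353 ERROR databasePartitionPolicy - Max Raw Size Limit Exceeded",
    "02-06-2014 15:47:04.467 INFO UnifiedSearch - Error in 'databasePartitionPolicy': "
    "Max Raw Size Limit Exceeded",
    "02-06-2014 15:47:04.467 WARN CursoredSearch - Events may be returned not in exact "
    "sub-second order: M=1368 > N=1250, where M is the number of events read in the "
    "1390841799th second, and N is max number of events to read in a single span. "
    "Note that N was scaled back because we exceeded "
    "limits.conf:[search]:max_rawsize_perchunk value=100000000",
]

if __name__ == "__main__":
    print(summarize_search_log(SAMPLE))
```

To scan a real file, pass an open file object as `lines`, for example `summarize_search_log(open(path))` where `path` points to the search.log of the search you are investigating.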
This documentation applies to the following versions of Splunk® App for VMware (Legacy): 3.1.1, 3.1.2, 3.1.3, 3.1.4, 3.2.0, 3.2.1, 3.2.2