Splunk® Enterprise

Release Notes

Splunk Enterprise version 9.0 will no longer be supported as of June 14, 2024. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.

Workaround for network accessibility issues on Splunk Windows systems under certain conditions

Introduction

This page discusses how to work around an issue where network-intensive Splunk Enterprise operations on a Windows system can sometimes cause that system to become inaccessible from the network.

Symptoms

A Windows system that hosts a Splunk Enterprise instance that performs network-intensive operations can become inaccessible from the network after a period of time. Problems usually begin within eight to twelve hours, but can start as late as two to three days later, depending on the amount of network activity that the instance sees. When this occurs, any attempt to connect to the system remotely fails, and you must restart the computer to return it to service.

You might see the following error in splunkd.log, or in the search.log file that each search (scheduled or real-time) creates in its own dispatch directory:

01-16-2013 06:55:33.935 WARN  NetUtils - Error connecting - winsock error 10055\n
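To check whether a log file contains this error, you can scan it for the message. The following Python sketch is one possible approach, not a Splunk-provided tool. The function name find_port_exhaustion_events and the regular expression are assumptions modeled on the sample line above; log lines with a different layout might not match.

```python
import re

# Assumed pattern, modeled only on the sample log line shown above.
# Other splunkd.log or search.log layouts might need a different expression.
WINSOCK_10055 = re.compile(
    r"^(?P<timestamp>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<component>\S+) - .*winsock error 10055"
)


def find_port_exhaustion_events(lines):
    """Return (timestamp, component) for each line that reports winsock error 10055."""
    events = []
    for line in lines:
        match = WINSOCK_10055.search(line)
        if match:
            events.append((match.group("timestamp"), match.group("component")))
    return events


if __name__ == "__main__":
    sample = [
        r"01-16-2013 06:55:33.935 WARN  NetUtils - Error connecting - winsock error 10055\n",
        "01-16-2013 06:55:34.101 INFO  Metrics - group=queue, name=parsingqueue",
    ]
    for timestamp, component in find_port_exhaustion_events(sample):
        print(timestamp, component)
```

To scan a real file, pass an open file object, for example find_port_exhaustion_events(open(path)), where path points to your splunkd.log or to a search.log file in a dispatch directory.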

Cause

Several factors combine to cause this problem:

  • By default, Windows configures a low number (5000) of available ephemeral, or short-lived, network (TCP) ports.
  • When you perform network-intensive activities in Splunk Enterprise, Splunk Enterprise generates a large number of short-lived network connections, which use these ports. Network-intensive activities include but are not limited to:
    • Running a large number of concurrent real-time searches (usually from an app).
    • Configuring a deployment client to connect to a deployment server which is on the same computer.
  • Once the Windows system runs out of available ports, it returns WSAENOBUFS (Windows Sockets error 10055) to any application that requests a port for network operations, and immediately becomes inaccessible from the network.

When this happens, the only way to fix the problem is to reboot the affected computer.

Note: While this problem most commonly occurs when you run numerous concurrent real-time searches, any kind of search, and more importantly any kind of network operation, can trigger the issue. The problem is not limited to Splunk Enterprise, but Splunk Enterprise can often cause the problem to appear.

This problem only appears on Windows systems.
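To estimate how close a system is to running out of ports, you can summarize the output of the Windows netstat command. The following Python sketch is illustrative only. The function name summarize_netstat is hypothetical, and the default port range of 1025 to 5000 is an assumption based on the default limit of 5000 described above; adjust it to match your system. The parser assumes the column layout that netstat -an -p tcp prints (protocol, local address, foreign address, state).

```python
from collections import Counter


def summarize_netstat(output, low=1025, high=5000):
    """Count TCP rows by state and distinct local ports in [low, high] used by connections.

    The default range is an assumption, not a value read from the system.
    """
    states = Counter()
    ports_in_range = set()
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like: TCP <local addr:port> <foreign addr:port> <state>
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        local, state = fields[1], fields[3]
        states[state] += 1
        port = int(local.rsplit(":", 1)[1])
        # Listening sockets hold fixed ports, so count only active connections.
        if state != "LISTENING" and low <= port <= high:
            ports_in_range.add(port)
    return states, len(ports_in_range)


if __name__ == "__main__":
    sample = """
      Proto  Local Address          Foreign Address        State
      TCP    0.0.0.0:8089           0.0.0.0:0              LISTENING
      TCP    127.0.0.1:1026         127.0.0.1:8089         ESTABLISHED
      TCP    127.0.0.1:1027         127.0.0.1:8089         TIME_WAIT
      TCP    127.0.0.1:8089         127.0.0.1:1026         ESTABLISHED
    """
    states, used = summarize_netstat(sample)
    print(dict(states), used)
```

A large and growing count of ports in the range, particularly in the TIME_WAIT state, suggests that the system is approaching the limit.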

Workaround

To work around this issue, you can complete one or both of the following steps.

Caution: The steps below require that you make administrative changes to your Windows system. These advanced changes might render your system unstable or unusable. If you are not able to make these changes, or are unsure or uncomfortable about what to do, contact your internal IT support organization for assistance.

1. Modify the Registry to increase the number of available user ports. Follow the instructions at "When you try to connect from TCP ports greater than 5000 you receive the error 'WSAENOBUFS'" (http://support.microsoft.com/kb/196271/en-us) on the Microsoft Support site to modify the Registry and increase the number of ephemeral TCP ports.

Important: We suggest you complete this step first, then restart your system. If the problem persists, then perform the next step.

2. Install a downloadable hotfix from Microsoft. If your system is a multiple-CPU system that runs either Windows Server 2008 R2 or Windows 7, then you can download and install a hotfix which addresses this specific issue. For information and instructions on how to download and apply the hotfix, see "Kernel sockets leak on a multiprocessor computer that is running Windows Server 2008 R2 or Windows 7" (http://support.microsoft.com/kb/2577795) on the Microsoft Support site.

Important: This option is available only for systems with multiple CPUs that run Windows Server 2008 R2 or Windows 7.

You must restart your computer after performing either of these actions.
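For step 1, the following Python sketch generates the text of a .reg file that you can review before you change anything. It does not modify the Registry. It assumes the setting that the Microsoft article linked in step 1 describes: a DWORD value named MaxUserPort under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, with a valid range of 5000 to 65534. Confirm the key, value name, and range against that article for your Windows version before you apply them. The function name build_max_user_port_reg is hypothetical.

```python
# Assumed key and value name; verify against the Microsoft article linked in step 1.
REG_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "MaxUserPort"


def build_max_user_port_reg(max_port=65534):
    """Return .reg file text that sets MaxUserPort to max_port (assumed range 5000-65534)."""
    if not 5000 <= max_port <= 65534:
        raise ValueError("max_port must be between 5000 and 65534")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{REG_KEY}]\n"
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"{VALUE_NAME}"=dword:{max_port:08x}\n'
    )


if __name__ == "__main__":
    # Print the text for review; saving and importing it is a separate, manual step.
    print(build_max_user_port_reg())
```

The sketch only prints the file content so that you or your IT support organization can inspect it first. As with any Registry change, restart the computer after you apply it.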

Last modified on 09 August, 2019

This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2, 9.4.0

