Splunk's architecture and what gets installed
This topic discusses Splunk's internal architecture and processes at a high level. If you're looking for information about third-party components used in Splunk, refer to the credits section in the Release notes.
Processes
A Splunk server runs two processes on your host, splunkd and splunkweb:
- splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data. It also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors.
  - Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML.
  - Processors are individual, reusable C or C++ functions that act on the stream of IT data passing through a pipeline. Pipelines can pass data to one another via queues.
  - splunkd supports a command line interface for searching and viewing results.
- splunkweb is a Python-based application server based on CherryPy that provides the Splunk Web user interface. It allows users to search and navigate IT data stored by Splunk servers and to manage your Splunk deployment through a Web interface.
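The pipeline model described for splunkd (single-threaded pipelines, each a chain of reusable processors, handing data to one another through queues) can be illustrated with a small sketch. This is not Splunk code: splunkd implements its processors in C/C++ and configures pipelines in XML. The processor functions and the two pipeline names used here ("parsing" and "indexing") are hypothetical and chosen only for illustration.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the data stream


def run_pipeline(processors, inbox, outbox):
    """Run one pipeline: a single thread applying its processors in order."""
    while True:
        event = inbox.get()
        if event is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next pipeline
            return
        for proc in processors:
            event = proc(event)
            if event is None:  # a processor may drop an event
                break
        if event is not None:
            outbox.put(event)


# Hypothetical processors: small, reusable functions acting on the stream.
def strip_whitespace(event):
    return event.strip()


def drop_empty(event):
    return event or None


def add_length(event):
    return {"raw": event, "length": len(event)}


def process(lines):
    """Push lines through two pipelines connected by a queue."""
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    parsing = threading.Thread(
        target=run_pipeline, args=([strip_whitespace, drop_empty], q_in, q_mid))
    indexing = threading.Thread(
        target=run_pipeline, args=([add_length], q_mid, q_out))
    parsing.start()
    indexing.start()
    for line in lines:
        q_in.put(line)
    q_in.put(_STOP)
    parsing.join()
    indexing.join()
    results = []
    while True:
        item = q_out.get()
        if item is _STOP:
            break
        results.append(item)
    return results
```

Calling process(["  error 1 ", "   ", "ok"]) passes each line through the first pipeline, drops the blank line, and hands the remaining events to the second pipeline through the intermediate queue.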
splunkweb and splunkd can both communicate with your Web browser via REST:
- splunkd also runs a Web server on port 8089 with SSL/HTTPS turned on by default.
- splunkweb runs a Web server on port 8000 without SSL/HTTPS by default.
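As a quick reference, the default ports and schemes listed above can be captured in a small helper. This is a minimal sketch that only builds base URLs from the defaults stated in this topic; the function name default_endpoints is hypothetical, and a real deployment may have changed either port or the SSL setting.

```python
SPLUNKD_PORT = 8089    # splunkd: SSL/HTTPS on by default
SPLUNKWEB_PORT = 8000  # splunkweb: no SSL/HTTPS by default


def default_endpoints(host):
    """Return the default base URLs for the two Splunk processes on a host."""
    return {
        "splunkd": f"https://{host}:{SPLUNKD_PORT}",
        "splunkweb": f"http://{host}:{SPLUNKWEB_PORT}",
    }
```

For a local installation with default settings, default_endpoints("localhost") gives the address you would open in a browser for Splunk Web and the HTTPS address that splunkd listens on.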
Additional processes for Splunk on Windows
Nearly all of your interaction with Splunk involves the two main processes described above, as those services do the majority of the legwork. On Windows instances of Splunk, additional processes support the data inputs you create on a Splunk instance. These scripted inputs run only when you configure certain types of Windows-specific data inputs.
splunk-admon
splunk-admon.exe is spawned by splunkd whenever you configure an Active Directory (AD) monitoring input. splunk-admon's purpose is to attach to the nearest available AD domain controller and gather change events generated by AD. Those change events are then stored in Splunk.
splunk-perfmon
splunk-perfmon.exe (new for version 4.2) runs when Splunk has been set up to monitor performance data on the local machine. This service attaches to the Performance Data Helper libraries, which query the performance libraries on the system and extract performance metrics both instantaneously and over time.
splunk-regmon
splunk-regmon.exe runs when a Registry monitoring input is configured in Splunk. This scripted input initially writes a baseline for the Registry as it currently exists (if desired), then monitors changes to the Registry over time. Those changes come back into Splunk as searchable events.
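The baseline-then-monitor approach can be illustrated with a simple snapshot comparison. This is only a conceptual sketch and not how splunk-regmon.exe is implemented: it models the Registry as a plain dictionary mapping key paths to values, and the function name diff_registry and the event tuple layout are hypothetical.

```python
def diff_registry(baseline, current):
    """Compare two snapshots and return (action, key, old, new) change events."""
    events = []
    # Keys present now but not in the baseline were added.
    for key in sorted(current.keys() - baseline.keys()):
        events.append(("added", key, None, current[key]))
    # Keys in the baseline but missing now were deleted.
    for key in sorted(baseline.keys() - current.keys()):
        events.append(("deleted", key, baseline[key], None))
    # Keys in both snapshots with different values were modified.
    for key in sorted(baseline.keys() & current.keys()):
        if baseline[key] != current[key]:
            events.append(("modified", key, baseline[key], current[key]))
    return events
```

Each returned tuple corresponds to one change relative to the baseline, which is the kind of record that would then be indexed as a searchable event.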
splunk-winevtlog
This utility is used to test defined event log collections, and can output events as they are collected for investigation. Splunk has a Windows event log input processor built into the engine.
splunk-wmi
When you configure a performance monitoring, event log or other input against a remote computer, this program starts up. Depending on how the input is configured, it either attempts to attach to and read Windows event logs as they come over the wire, or executes a WMI Query Language (WQL) query against the WMI provider on the specified remote machine(s). Those events are then stored in Splunk.
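For readers unfamiliar with WQL, its syntax resembles a small subset of SQL. The sketch below only assembles a query string and does not execute it or show how Splunk is configured to run it. The helper name build_wql is hypothetical; Win32_NTLogEvent is a standard WMI class, used here as an example target.

```python
def build_wql(wmi_class, fields=None, where=None):
    """Assemble a WQL SELECT statement as a string (no execution)."""
    columns = ", ".join(fields) if fields else "*"
    query = f"SELECT {columns} FROM {wmi_class}"
    if where:
        query += f" WHERE {where}"
    return query


# Example: select two properties of Application event log entries.
example = build_wql(
    "Win32_NTLogEvent",
    fields=["Message", "TimeGenerated"],
    where="Logfile = 'Application'",
)
```

The resulting string is the kind of query that would be sent to the WMI provider on the remote machine.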
Architecture diagram
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3
