How to get your data into Splunk
This documentation does not apply to the most recent version of Splunk.
No special plugins are required for Splunk to index data from any network or local source. Splunk can index any IT data from any source in real time. We call this universal indexing.
Seriously, from any data source
A variety of flexible input methods let you index logs, configurations, traps and alerts, messages, scripts, code, and performance data from all your applications, servers, and network devices. Monitor file systems for script and configuration changes, capture archive files, find and tail live application logs, and connect to network ports to receive syslog, SNMP, and other network-based instrumentation.
Splunk consumes any data you point it at. Before indexing data, you must add your data source as an input. The source is then listed as one of Splunk's default fields (whether it's a file, directory or network port).
Important: If you add an input, Splunk adds that input to a copy of inputs.conf that belongs to the app you're currently in. This means that if you navigated to Splunk Manager directly from the Launcher and then added an input there, your input will be added to $SPLUNK_HOME/etc/apps/launcher/local/inputs.conf. Make sure you're in the desired app when you add your inputs.
Ways to get data into Splunk
Specify data inputs via the following methods:
- The Manager in Splunk Web.
- Splunk's CLI.
- The inputs.conf configuration file.
- Forwarding from other systems.
You can add most data sources using Splunk Web. For more extensive configuration options, use
Splunk accepts data inputs in a variety of ways. Here's a basic overview of your options.
Files and directories
A lot of the data you may be interested in comes directly from files and directories. For the most part, you can use Splunk's files and directories monitor input processor to index this data.
You can also configure Splunk's file system change monitor to watch for changes in your file system. However, you shouldn't use both the files and directories monitor and the file system change monitor to follow the same directory or file. If you want to see changes in a directory, use the file system change monitor. If you want to index new events in a directory, use the files and directories monitor.
To monitor files and directories, see "Monitor files and directories".
To enable and configure the file system change monitor, see "Monitor changes to your file system".
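As a rough illustration of the two input types described above, the inputs.conf stanzas below are a minimal sketch, not a complete configuration. The paths, the sourcetype name myapp_log, the index, and the polling interval are hypothetical placeholders; check the setting names against the inputs.conf reference for your Splunk version. The two stanzas deliberately watch different directories, because the same directory should not be followed by both monitors.

```ini
# Files and directories monitor: index new events written to files
# under this (hypothetical) directory.
[monitor:///var/log/myapp]
sourcetype = myapp_log
index = main
disabled = false

# File system change monitor: report changes under a different
# (hypothetical) directory, including its subdirectories.
[fschange:/etc/myapp]
recurse = true
# Seconds between checks for changes (assumed value).
pollPeriod = 3600
```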
TCP network ports
TCP is a reliable, connection-oriented protocol that should be used instead of UDP to transmit and receive data whenever possible. Splunk with an Enterprise license can receive data on any TCP port, allowing Splunk to receive remote data from syslog-ng and any other application that transmits via TCP.
To monitor data via TCP, see "Monitor network ports".
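For orientation, a TCP input defined directly in inputs.conf might look like the sketch below. The port number 9514 and the syslog sourcetype are assumptions chosen for illustration; use whatever port your sending application is configured to transmit to.

```ini
# Listen on TCP port 9514 (hypothetical) and accept data from any host.
[tcp://:9514]
sourcetype = syslog
# Set the host field from a reverse DNS lookup of the sending system.
connection_host = dns
```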
UDP network ports
Splunk supports monitoring over UDP, but recommends using TCP instead whenever possible. UDP is generally undesirable as a transport because:
- It doesn't enforce delivery.
- It's not encrypted.
- There's no accounting for lost datagrams.
Refer to "Working with UDP connections" on the Splunk Community Wiki for recommendations if you must use UDP.
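If you do need a UDP input, the inputs.conf stanza follows the same pattern as a TCP input. This is a minimal sketch that assumes syslog traffic arriving on port 514; both the port and the sourcetype are illustrative, not requirements.

```ini
# Listen on UDP port 514 (assumed syslog port) and accept data from any host.
[udp://:514]
sourcetype = syslog
# Set the host field to the IP address of the sending system.
connection_host = ip
```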
Windows sources
Splunk on Windows ships with the Windows inputs, as well as pages in Splunk Manager for defining the Windows-specific input types listed below. Because of compatibility issues, you will not see Windows-specific inputs or Splunk Manager pages on non-Windows Splunk instances.
Splunk on Windows can add data from these Windows-specific sources:
Important: You can index and search your Windows data on a non-Windows instance of Splunk, but you must first use a Windows instance of Splunk to gather the Windows data. You can easily do this by means of a Splunk forwarder running on Windows, configured to gather Windows inputs and then forward the data to the non-Windows instance of Splunk where searching and indexing will take place. See "Considerations for deciding how to monitor remote Windows data" for the details.
Need some help deciding how to get your Windows data into Splunk? Check out "Considerations for deciding how to monitor remote Windows data" in this manual.
Got custom data? It might need some extra TLC
Splunk can index any time-series data, usually without the need for additional configuration. If you've got logs from a custom application or device, you should try Splunk's defaults first. But if you're not getting the results you want, you can tweak a bunch of different things to make sure your events are indexed correctly.
- Are your events multi-line?
- Is your data in an unusual character set?
- Is Splunk not figuring out the timestamps correctly?
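Each of the three questions above typically maps to settings in props.conf. The stanza below is a sketch under stated assumptions: the sourcetype name myapp_log is hypothetical, and the regular expression, character set, and time format assume events that begin with a timestamp such as 2011-03-04 12:34:56. Adjust all of them to match your own data.

```ini
# Applies to events with the (hypothetical) sourcetype myapp_log.
[myapp_log]
# Multi-line events: merge lines, starting a new event only at a
# line that begins with a date.
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
# Unusual character set: declare the encoding of the source data.
CHARSET = ISO-8859-1
# Timestamps: say where the timestamp starts, what it looks like,
# and how many characters to scan for it.
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
```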
Not finding the events you're looking for?
When you add an input to Splunk, that input gets added relative to the app you're in. Some apps, like the *nix and Windows apps that ship with Splunk, write input data to a specific index (in the case of *nix and Windows, that is the 'os' index). If you're not finding data that you're certain is in Splunk, be sure that you're looking at the right index. You may want to add the 'os' index to the list of default indexes for the role you're using. For more information about roles, refer to the topic about roles in this manual.
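One way to add the 'os' index to a role's default search indexes is through authorize.conf, as in the sketch below. The role name user is an assumption for illustration; substitute the role you actually use, and confirm the setting against the authorize.conf reference for your version.

```ini
# Searches run by this (assumed) role cover both indexes by default,
# without needing index=os in the search string.
[role_user]
srchIndexesDefault = main;os
```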
Note: Splunk looks for the inputs it is configured to monitor every 24 hours starting from the time it was last restarted. This means that if you add a stanza to monitor a directory or file that doesn't exist yet, it could take up to 24 hours for Splunk to start indexing its contents. To ensure that your input is immediately recognized and indexed, add the input using Splunk Web or by using the add command in the CLI.