What Splunk can index
The first step in using Splunk is to feed it data. Once Splunk gets some data, it immediately indexes it, so that it's available for searching. With its universal indexing ability, Splunk transforms your data into a series of individual events, consisting of searchable fields. There's a lot you can do to massage the data before and after Splunk indexes it, but you don't usually need to. In most cases, Splunk can determine what type of data you're feeding it and handle it appropriately.
Basically, you point Splunk at data and it does the rest. In moments, you can start searching the data, or use it to create charts, reports, alerts, and other interesting outputs.
What kind of data?
Any data. In particular, any and all IT streaming and historical data. Stuff like event logs, web logs, live application logs, network feeds, system metrics, change monitoring, message queues, archive files, or anything else of interest. Any data. Really.
Point Splunk at a data source. Tell Splunk a bit about the source. That source then becomes a data input to Splunk. Splunk begins to index the data stream, transforming it into a series of individual events. You can view and search those events right away. If the results aren't exactly what you want, you can tweak the indexing process until you're satisfied.
The data can be on the same machine as the Splunk indexer (local data), or it can be on another machine altogether (remote data). You can easily get remote data into Splunk, either by using network feeds or by installing Splunk forwarders on the machines where the data originates. Forwarders are lightweight versions of Splunk that consume data and then forward it on to the main Splunk instance for indexing and searching. For more information on local vs. remote data, see "Where is my data?".
To make the job easier, Splunk offers lots of free apps and add-ons, with pre-configured inputs for things like Windows- or Linux-specific data sources, Cisco security data, Blue Coat data, and so on. Look in Splunkbase for an app or add-on that fits your needs. Splunk also comes with dozens of recipes for data sources like web logs, J2EE logs, or Windows performance metrics. You can get to these from the Add data section of Splunk Web, described later. If the recipes and apps don't cover your needs, then you can use Splunk's more general input configuration capabilities to specify your particular data source. These generic data sources are discussed below, in "Types of data sources".
How to specify data inputs
You add new types of data to Splunk by telling it about them. There are a number of ways you can specify a data input:
- Apps. Splunk has a large and growing variety of apps and add-ons that offer preconfigured inputs for various types of data sources. Take advantage of Splunk apps and free yourself from having to configure the inputs yourself. For more information, see "Use apps".
- Splunk Web. You can configure most inputs using the Splunk Web data input pages. These provide a GUI-based approach to configuring inputs. You can access the Add data landing page from either Splunk Home or Manager. See "Use Splunk Web".
- Splunk's CLI. You can use the CLI (command line interface) to configure most types of inputs. See "Use the CLI".
- The inputs.conf configuration file. When you specify your inputs with Splunk Web or the CLI, the configurations get saved in an inputs.conf file. You can edit that file directly, if you prefer. To handle some advanced data input requirements, you might need to edit it. See "Edit inputs.conf".
In addition, if you use forwarders to send data from outlying machines to a central indexer, you can specify some inputs during forwarder installation. See "Use forwarders".
For more information on configuring inputs, see "Configure your inputs".
Types of data sources
As described earlier, Splunk provides tools to configure all sorts of data inputs, including many that are specific to particular application needs. Splunk also provides the tools to configure arbitrary data input types. In general, you can categorize Splunk inputs as follows:
- Files and directories
- Network events
- Windows sources
- Other sources
Files and directories
A lot of the data you might be interested in comes directly from files and directories. For the most part, you can use Splunk's files and directories monitor input processor to get that data.
To monitor files and directories, see "Get data from files and directories".
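As a rough illustration of what a file monitor input looks like once it is saved in inputs.conf, the sketch below renders a single monitor stanza as text. It assumes the `[monitor://<path>]` stanza form with the `sourcetype` and `index` attributes. The helper name `render_monitor_stanza`, the path, and the attribute values are hypothetical and chosen only for illustration. This is not a Splunk API, just a way to see the shape of the configuration.

```python
import configparser
import io


def render_monitor_stanza(path, sourcetype=None, index=None):
    """Return inputs.conf text for one file or directory monitor input.

    Hypothetical helper for illustration only; it is not part of Splunk.
    """
    parser = configparser.ConfigParser(interpolation=None)
    # Keep attribute names exactly as written instead of lowercasing them.
    parser.optionxform = str
    section = "monitor://" + path
    parser.add_section(section)
    if sourcetype is not None:
        parser.set(section, "sourcetype", sourcetype)
    if index is not None:
        parser.set(section, "index", index)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative values: a syslog file sent to the default "main" index.
    print(render_monitor_stanza("/var/log/messages", sourcetype="syslog", index="main"))
```

In practice you would more likely let Splunk Web or the CLI write this stanza for you, or type it into inputs.conf by hand. The generator is only meant to show the structure.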
Network events
Splunk can index data from any network port. For example, Splunk can index remote data from syslog-ng or any other application that transmits via TCP. Splunk can also index UDP data, but we recommend using TCP instead whenever possible, for enhanced reliability.
Splunk can also receive and index SNMP events, which are alerts fired off by remote devices.
To get data from network ports, see "Get data from TCP and UDP ports".
To get SNMP data, see "Send SNMP events to Splunk".
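To make the "any application that transmits via TCP" case concrete, here is a minimal sketch of a sender. It assumes a TCP data input has already been configured on the Splunk side. The function name `send_events` is hypothetical, and writing one event per newline-terminated line is an assumption made for simplicity. How Splunk actually splits the stream into events depends on its event-processing configuration.

```python
import socket


def send_events(host, port, events, timeout=5.0):
    """Send each event as one newline-terminated line over a single TCP connection.

    Returns the number of bytes written. Hypothetical helper, not a Splunk API.
    """
    # Normalize every event to exactly one trailing newline before encoding.
    payload = "".join(event.rstrip("\n") + "\n" for event in events).encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

For example, `send_events("127.0.0.1", 9514, ["2012-01-01 12:00:00 action=login status=ok"])` would write one line to a listener on port 9514. Both the address and the port here are placeholders for wherever your TCP input listens.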
Windows sources
The Windows version of Splunk includes a wide range of Windows-specific inputs. It also provides pages in Splunk Manager for defining the Windows-specific input types listed below:
- Windows Event Log data
- Windows Registry data
- WMI data
- Active Directory data
- Performance monitoring data
Important: You can index and search Windows data on a non-Windows instance of Splunk, but you must first use a Windows instance to gather the data. You can do this with a Splunk forwarder running on Windows. You configure the forwarder to gather Windows inputs and then forward the data to the non-Windows instance. See "Considerations for deciding how to monitor remote Windows data" for details.
For a more detailed introduction to using Windows data in Splunk, see "About Windows data and Splunk".
Other sources
Splunk also supports other kinds of data sources. For example:
- FIFO queues
- Scripted inputs -- for getting data from APIs and other remote data interfaces and message queues.
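A scripted input is, at its simplest, a program whose standard output Splunk indexes. The sketch below shows what such a script might look like: it prints one timestamped key=value line describing disk usage. The function name `build_event`, the field names, and the choice of metric are all assumptions for illustration. Any script that writes its results to standard output follows the same pattern.

```python
import os
import shutil
import sys
import time


def build_event(path, now=None):
    """Return one key=value event line describing disk usage for `path`.

    `now` is seconds since the epoch; None means the current time.
    """
    usage = shutil.disk_usage(path)
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S%z", time.localtime(now))
    return '{0} path="{1}" total_bytes={2} used_bytes={3} free_bytes={4}'.format(
        stamp, path, usage.total, usage.used, usage.free
    )


if __name__ == "__main__":
    # Report on the filesystem root; the output line is what would get indexed.
    root = os.path.abspath(os.sep)
    sys.stdout.write(build_event(root) + "\n")
```

Leading each line with a timestamp and using key=value pairs is a design choice meant to make the resulting events easy to search. The script itself has no dependency on Splunk.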
Other things to consider
The topics that follow this one discuss issues to consider when specifying Splunk data:
- "Where is my data?". A concise explanation of remote vs. local data, and why it matters.
- "Use forwarders". How to use forwarders to simplify the remote collection of data.
- "Use apps". How to use Splunk apps to get your data into Splunk quickly.
- "How to get going". An overview of the process of getting and configuring data sources, with tips on best practices.
- "Configure your inputs". The ways you can configure data inputs in Splunk.
- "About Windows data and Splunk". An introduction to getting Windows data into Splunk.
- "What Splunk does with your data (and how to make it do it better)". What happens to your data once it enters Splunk, and how you can configure Splunk to make the data even more useful.