Add data to Splunk
This documentation does not apply to the most recent version of Splunk.
This topic assumes that you have already downloaded, installed, and started a Splunk server. If you haven't yet, go back to the previous topic for instructions to do that.
Once you've started Splunk and logged in, you need to give it data that you can search. This topic walks you through downloading the sample dataset and adding it to Splunk.
Download the sample data file
This tutorial uses sample data from an online store, the Flower & Gift shop, to teach you about using Splunk. The sample data includes:
- Apache web server logs
- MySQL database logs
You can feed Splunk data from files and directories, network ports, and custom scripts, but for this tutorial, you will upload a compressed file directly to Splunk.
(You can of course put lots of other kinds of data into Splunk--check out "What Splunk can index" in the Getting Data In Manual for information about the different sorts of data Splunk can handle.)
To proceed with this tutorial, download the sample data file, sampledata.zip, but do not uncompress it. This sample data file is updated daily.
Important: You will add the compressed file to Splunk as is, so you do not need to uncompress the sample data. This tutorial is designed to be completed in a matter of hours. If you want to spread it out over a few days, download a new sample data file and add it.
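If you want to see what the archive contains without uncompressing it, the following is a minimal sketch using Python's standard zipfile module. It is not part of Splunk; the function name top_level_folders is made up for this illustration, and the sketch only reads the archive's index.

```python
import zipfile
from pathlib import PurePosixPath


def top_level_folders(archive):
    """Return the sorted top-level folder names inside a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    Nothing is extracted to disk; only the archive's index is read.
    """
    folders = set()
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            parts = PurePosixPath(name).parts
            if not parts:
                continue  # entry such as "./" with no real path component
            # Count an entry only if it is a folder or sits inside one;
            # loose files at the root of the archive are skipped.
            if len(parts) > 1 or name.endswith("/"):
                folders.add(parts[0])
    return sorted(folders)
```

Calling top_level_folders("sampledata.zip") from the directory that holds the download (the location is an assumption) should list the folders that the host setting later in this topic refers to.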
Add the sample data to Splunk
Logging into Splunk should have taken you to Splunk Home. If it isn't the first view that you see, use the App list to select Home.
1. In Splunk Home, click Add data.
2. Under Choose how you want Splunk to consume your data, click From files and directories.
This takes you to the Home > Add data > Files & Directories > Add new view. This is where you will upload the sample data file. Normally, uploading the file is all you need to do, and Splunk handles the rest. For the purposes of this tutorial, however, you will also edit some of the properties.
3. Under Source, select Upload and index a file and browse for the sample data file that you just downloaded.
The source of an event tells you where it came from. If you collect data from files and directories, the "source" is the full pathname of the file or directory. In the case of a network-based source, the source is the protocol and port, such as UDP:514.
4. Select More settings.
5. Under Host and Set host, choose regex on path.
An event's host value is typically the hostname, IP address, or fully qualified domain name of the network host from which the event originated.
If you take a look at the Sampledata.zip file, it contains four directories (folders): three of the folders are named for Apache web servers and one is named for a MySQL server. You want to set the host value to the names of these folders.
By selecting regex on path, you're telling Splunk to use a regular expression (regex) to match the segment of the path within the compressed file that you want to set as your host value.
6. Under Regular expression, copy and paste:
Sampledata.zip:./([^/]+)/
7. Leave the value of Source type as "Automatic".
The source type of an event tells you what kind of data it is, usually based on how it's formatted. Examples of source types are access_combined or cisco_syslog. This classification lets you search for the same type of data across multiple sources and hosts.
8. Under Index, leave the destination index as default.
9. Click Save.
When it's finished, Splunk displays a message saying the upload was successful. Click Start searching and proceed to the next topic in this tutorial to look at your data in the Search app.
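To see what the regular expression from step 6 captures, here is a minimal sketch that applies it with Python's re module. This is an illustration only: Python's regex engine stands in for Splunk's, the helper name host_from_source is made up, and the example paths use hypothetical folder and file names. It assumes, as the expression suggests, that the path is presented as the archive name, a colon, and then the path inside the archive.

```python
import re

# The expression from step 6, copied as-is. The unescaped dots match any
# single character, which includes the literal dots in the path.
HOST_REGEX = re.compile(r"Sampledata.zip:./([^/]+)/")


def host_from_source(source):
    """Return the first captured group (the folder name), or None if no match."""
    match = HOST_REGEX.search(source)
    return match.group(1) if match else None


# Hypothetical source paths; the folder and file names are made up.
examples = [
    "Sampledata.zip:./webserver-a/access.log",
    "/home/user/Sampledata.zip:./dbserver/mysql.log",
    "Sampledata.zip:./loose_file.log",  # no folder segment, so no match
]
for source in examples:
    print(source, "->", host_from_source(source))
```

The capture group ([^/]+) takes everything between "./" and the next slash, which is the first folder inside the archive. A file that sits at the root of the archive has no such folder segment, so the expression does not match it.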
More about getting data in
This topic discusses only one type of input, uploading a local file, which is all you need to run through the tutorial. For information about all the other types of data inputs Splunk can handle and how to add them, refer to the Getting Data In Manual, beginning with the topic "What Splunk can monitor".
This documentation applies to the following versions of Splunk: 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5.
Comments
Why is the regex *after* the filename?
Hi Deveritt, the regex is used to match the path to the source file within the zipped file, not including the zipped file's name.
Sophy