Add a sourcetype
If you want to search your virtual indexes by sourcetype, you must first configure them for your data via
Any common data input format can be a source type, though most source types are log formats. If your data is unusual, you might need to create a source type with customized event processing settings. And if your data source contains heterogeneous data, you might need to assign the source type on a per-event (rather than a per-source) basis.
See "Why sourcetypes matter" in the Splunk Enterprise documentation to learn more about why you might want to use sourcetyping in your HDFS data.
To add a sourcetype to an HDFS data source, add a stanza to
$HUNK_HOME/etc/system/local/props.conf. When defining sourcetypes for HDFS data, keep in mind that Hunk processes HDFS data at search time, not at index time, and that Hunk only reads the latest timestamps and not original HDFS timestamps. As a result, timestamp recognition may not always work as expected.
In the example below, we add two sourcetypes. A new sourcetype
access_combined represents data from the access_combined log files.
mysqld will let you search data from the specified
[source::.../access_combined.log]
sourcetype=access_combined
priority=100

[source::.../mysqld.log]
sourcetype=mysqld
priority=100
(You do not need to restart Hunk.)
Once you do this, you can search your HDFS data by sourcetype. For more information about searching, including searching by sourcetype, see "Use fields to search" in the Splunk Enterprise Search Tutorial.
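To illustrate how the two source:: stanzas in the example select files, the following Python sketch approximates the props.conf wildcard rules, where ... matches any characters including path separators and * matches anything except a path separator. It is not Hunk's implementation. The function names and sample paths are hypothetical, and other pattern syntax (such as | alternation) is not handled.

```python
import re

def source_pattern_to_regex(pattern):
    """Approximate a props.conf source:: pattern as a compiled regex.

    Assumed wildcard rules: '...' crosses directory boundaries,
    '*' stays within a single path segment.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")      # any characters, including '/'
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")   # any characters except '/'
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")

# The two stanzas from the example, as pattern -> sourcetype.
STANZAS = {
    ".../access_combined.log": "access_combined",
    ".../mysqld.log": "mysqld",
}

def sourcetype_for(path):
    """Return the sourcetype of the first matching stanza, or None."""
    for pattern, sourcetype in STANZAS.items():
        if source_pattern_to_regex(pattern).match(path):
            return sourcetype
    return None

if __name__ == "__main__":
    # Hypothetical HDFS paths, for illustration only.
    for path in ("/data/weblogs/2014/access_combined.log",
                 "/var/log/mysqld.log",
                 "/var/log/other.log"):
        print(path, "->", sourcetype_for(path))
```

A file whose path matches neither pattern gets no sourcetype from these stanzas, so a search on sourcetype=access_combined or sourcetype=mysqld would not return its events.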
Note the following when adding a sourcetype:
- Structured Data Header Extractions from props.conf do not work with Hunk
- While search-time extractions should work with Hunk, if the file has a header it is easier to use the
SimpleCSVRecordReader by appending it to the default list:
#append the SimpleCSVRecordReader to the default list:
vix.splunk.search.recordreader = ...,com.splunk.mr.input.SimpleCSVRecordReader
vix.splunk.search.recordreader.csv.regex = <a regex to match csv files>
vix.splunk.search.recordreader.csv.dialect = tsv
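Before setting vix.splunk.search.recordreader.csv.regex, it can help to check a candidate regex against a few file paths. The Python sketch below does this under two assumptions: that the regex is applied to the file path as an unanchored search, and that a simple pattern behaves the same in Python as in Hunk's Java regex engine. The regex value and the paths are hypothetical.

```python
import re

# Hypothetical candidate for vix.splunk.search.recordreader.csv.regex.
CSV_REGEX = r"\.tsv$"

def files_matching(regex, paths):
    """Return the paths the regex selects (assumes unanchored search)."""
    compiled = re.compile(regex)
    return [p for p in paths if compiled.search(p)]

if __name__ == "__main__":
    # Hypothetical HDFS paths, for illustration only.
    sample_paths = [
        "/data/exports/users.tsv",
        "/data/weblogs/access_combined.log",
        "/data/exports/users.tsv.gz",
    ]
    print(files_matching(CSV_REGEX, sample_paths))
```

Only paths ending in .tsv are selected by this candidate, so the compressed .tsv.gz file would be handled by the other record readers in the list.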
This documentation applies to the following versions of Hunk®(Legacy): 6.1, 6.1.1, 6.1.2, 6.1.3, 6.2, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11