Splunk Cloud

Getting Data In

Use the Set Sourcetype page on a heavy forwarder for Splunk Cloud

The Set Sourcetype page lets you improve event processing by previewing how your data will be indexed. Use this page to confirm that the forwarder indexes your data as you want it to appear.

Preview data prior to indexing

The Set Sourcetype page appears after you use the Upload or Monitor pages to specify a single file as a source of data.

On the Set Sourcetype page, you can adjust how the software indexes your data. You can refine the indexing process interactively so that when Splunk software indexes and stores the incoming data, the event data ends up in the format that you want.

Assign the correct source type to your data

The Set Sourcetype page helps you apply the correct source type to your incoming data. The source type is one of the default fields that Splunk software assigns to all incoming data, and it determines how the software formats the data during indexing. When you assign the correct source type to your data, the indexed version of the data (the event data) looks the way you want it to, with correct timestamps and event breaks.

Splunk Cloud comes with many predefined source types and attempts to assign the correct source type to your data based on its format. In some cases, you might need to manually select a different predefined source type for the data. In other cases, you might need to create a new source type with customized event processing settings.

The page displays how Splunk software will index the data based on the application of a predefined source type. You can modify the settings interactively and save those modifications as a new source type.

Use the Set Sourcetype page to:

  • See what your data will look like without any changes, using the default event-processing configuration.
  • Apply a different source type to see whether it offers results more to your liking.
  • Modify settings for timestamps and event breaks to improve the quality of the indexed data and save the modifications as a new source type.
  • Create a new source type from scratch. The source types you create here also need to be created on your search head. On your Splunk Cloud instance, go to Settings > Source types to create a new source type.

For information on source types, see Why source types matter in this manual.

Use the Set Sourcetype page

When the Set Sourcetype page loads, the forwarder chooses a source type based on the data you specified. You can accept that recommendation or change it.

Here is an example of the Set Sourcetype page.

[Screenshot: Set Sourcetype page]

  1. Look at the preview pane on the right side of the page to see how the forwarder will index the data. Review event breaks and time stamps.
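  2. (Optional) Click the View event summary link on the right to view the event summary. Splunk Web displays the event summary in a new window. See View event summary.
  3. If the data appears the way that you want, then proceed to Step 5. Otherwise, choose from one of the following options:
    • Choose an existing source type to change how the data is formatted. See Choose an existing source type.
    • Adjust time stamps and event breaks manually, then save the changes as a new source type. See Adjust time stamps and event breaks.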
  4. After making the changes, return to Step 1 to preview the data again.
  5. After you are satisfied with the results, click Next to proceed to the Input Settings page.

Choose an existing source type

If the data does not appear the way that you want, see whether or not an existing source type fixes the problem.

If the forwarder can detect a source type, the Sourcetype: <sourcetype> button displays that source type. If the forwarder cannot determine a source type, the button displays Sourcetype: System Defaults.

[Screenshot: choosing an existing source type]

  1. Click the Sourcetype: <sourcetype> button to see a list of source type categories. Under each category is a list of source types within that category.
  2. Hover over the category that best represents your data. As you do, the source types under that category appear in a pop-up menu to the right.
  3. Select the source type that best represents your data. Splunk Web updates the data preview pane to show how the data looks under the new source type. You might need to scroll to see all source types in a category.
  4. Review your data again.

View event summary

[Screenshot: event summary]

You can see a summary of the events within the data sample by clicking the View event summary link on the right side of the page. This summary shows the following information:

  • The size of the sample data, in bytes.
  • The number of events that were present in the sample.
  • A chart that represents the distribution of the events over time. Splunk software uses date stamps within the file to determine how to display this chart.
  • A breakdown of how many lines each event in the sample spans.

Adjust time stamps and event breaks

If no existing source type produces the results you want, you can manually adjust how the forwarder processes timestamps and event line breaks for the incoming data.

To manually adjust time stamp and event line breaking parameters, use the Event breaks, Timestamps, Delimited settings, and Advanced drop-down tabs in the left pane of the Set Sourcetype page. The preview pane updates as you make changes to the settings.

Some tabs appear only if the forwarder detects that a file is a certain type, or if you select a specific source type for a file.

  • The Event breaks tab appears when the forwarder cannot determine how to line-break the file, or if you select a source type that does not have line breaking defined.
  • The Delimited settings tab appears only when the forwarder detects that you want to import a structured data file, or you select a source type for structured data (such as csv).

For more information about how to adjust time stamps and event breaks, see Modify event processing.

  1. Click the Event breaks tab. The tab displays the Break type buttons, which control how the forwarder line-breaks the file into events.
    • Auto: Detects event breaks based on the location of the time stamp.
    • By Line: Treats every line as a separate event.
    • Regex…: Uses the specified regular expression to determine line breaking.
  2. Click the Timestamps tab. The tab expands to show options for extraction. Select one of the following options.
    • Auto: Extracts timestamps automatically by looking for timestamps in the file.
    • Current time: Applies the current time to all events detected.
    • Advanced: Lets you specify the time zone, the timestamp format (in strptime() syntax), and any fields that comprise the timestamp.
  3. Click the Delimited settings tab to display delimiting options.
    [Screenshot: Delimited settings tab]
    • Field delimiter: The delimiting character for structured data files, such as comma-separated value (CSV) files.
    • Quote character: The character that the forwarder uses to determine when something is in quotes.
    • File preamble: A regular expression that tells the forwarder to ignore one or more preamble lines (lines that don't contain any actual data) in the structured data file.
    • Field names: How the forwarder determines field names. The options are automatically, based on line number, based on a comma-separated list, or through a regular expression.
    After the results look the way you want, save your changes as a new source type, which you can then apply to the data when it is indexed.
  4. Click the Advanced tab to display fields that let you enter attribute/value pairs that get committed directly to the props.conf configuration file.

    The Advanced tab requires advanced knowledge of Splunk features, and changes made here might negatively affect the indexing of your data. Consider consulting a member of Splunk Professional Services for help in configuring these options.
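
As a rough illustration of what these tabs control, the following props.conf sketch shows attribute/value pairs of the kind that the settings above correspond to. The stanza names (my_app_log and my_csv_data), the sample timestamp layout, and all values are assumptions made for this example, and the exact pairs that Splunk Web generates for your data might differ.

```ini
# Hypothetical source type for application logs in which every event
# starts with a timestamp such as 2019-08-14 10:15:30.
[my_app_log]
# Event breaks (Regex): break wherever a newline is followed by a date.
# The text matched by the first capture group is discarded.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2}\s)
# Timestamps (Advanced): time zone, strptime() format, and where to look.
TZ = UTC
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19

# Hypothetical source type for a comma-separated file with a header row.
[my_csv_data]
INDEXED_EXTRACTIONS = csv
# Delimited settings: field delimiter, quote character, preamble, header.
FIELD_DELIMITER = ,
FIELD_QUOTE = "
PREAMBLE_REGEX = ^#
HEADER_FIELD_LINE_NUMBER = 1
```

Treat the values as placeholders and confirm the result in the preview pane before you save the source type.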

Make configuration changes in the Advanced tab

  1. Click a field to edit props.conf entries that the forwarder generates based on your previous choices.
  2. Click the X to the right of an attribute/value field pair to delete that pair.
  3. Click New setting to create a new attribute/value field pair and specify a valid attribute for props.conf. To view valid attributes, see props.conf in the Splunk Enterprise Admin Manual.
  4. Click Apply settings to commit the changes to the props.conf file.
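
For illustration only, here are a few attribute/value pairs that someone might add with New setting because the other tabs do not expose them. The attribute names exist in props.conf, but the values are placeholders chosen for this sketch, not recommendations.

```ini
# Hypothetical pairs entered through New setting in the Advanced tab.
# Character encoding of the source file.
CHARSET = UTF-8
# Maximum line length, in bytes, before the line is truncated.
TRUNCATE = 20000
# Maximum number of input lines to merge into a single event.
MAX_EVENTS = 500
```

After you apply pairs like these, check the preview pane again to confirm that the events still break and timestamp correctly.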

Next step

Modify Input Settings

Workaround for previewing data in cases where the Set Sourcetype page is not available

The "Set Sourcetype" page works on single files only, and can only access files that reside on the Splunk deployment or have been uploaded there. Although it does not directly process network data or directories of files, you can work around those limitations.

Preview network data

You can direct some sample network data into a file, which you can then either upload or add as a file monitoring input. Several external tools can do this. On *nix, the most popular tool is netcat.

For example, if you listen for network traffic on UDP port 514, you can use netcat to direct some of that network data into a file.

nc -lu 514 > sample_network_data

For best results, run the command inside a shell script that has logic to kill netcat after the file reaches a size of 2MB. By default, Splunk software reads only the first 2MB of data from a file when you preview it.
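
The following is a minimal sketch of such a script, not a Splunk-provided tool. The function name capture_sample and the script name used in the usage note are made up for this example. The sketch starts a command in the background, polls the size of the output file once per second, stops the command when the file reaches the limit, and then trims any overshoot.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk): run a command in the background,
# write its output to OUT, and stop it once OUT reaches MAX bytes.
# Usage: capture_sample OUT MAX COMMAND [ARGS...]
capture_sample() {
    out=$1
    max=$2
    shift 2

    "$@" > "$out" &
    pid=$!

    # Poll the file size until the limit is reached or the command exits.
    while kill -0 "$pid" 2>/dev/null; do
        size=$(wc -c < "$out" | tr -d ' ')
        if [ "$size" -ge "$max" ]; then
            kill "$pid" 2>/dev/null || true
            break
        fi
        sleep 1
    done
    wait "$pid" 2>/dev/null || true

    # The capture can overshoot between polls, so trim the file to MAX bytes.
    head -c "$max" "$out" > "$out.tmp" && mv "$out.tmp" "$out"
}

# Run the capture only when the script is called with arguments.
if [ "$#" -ge 3 ]; then
    capture_sample "$@"
fi
```

If you save the sketch as capture_sample.sh, you could run it as: sh capture_sample.sh sample_network_data 2097152 nc -lu 514. Listening on a port below 1024 usually requires root privileges, and some netcat variants expect the port to be given with -p, so adjust the command to your environment.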

After you have created the "sample_network_data" file, you can add it as an input, preview the data, and assign any new source types to the file.

Preview directories of files

If all the files in a directory are similar in content, then you can preview a single file and be confident that the results will be valid for all files in the directory. However, if you have directories with files of heterogeneous data, preview a set of files that represents the full range of data in the directory. Preview each type of file separately, because specifying any wildcard causes Splunk Web to disable the "Set Sourcetype" page.

File size limit

Splunk Web displays the first 2MB of data from a file in the "Set Sourcetype" page. In most cases, this amount provides a sufficient sampling of your data. If you have Splunk Enterprise, you can sample a larger quantity of data by changing the max_preview_bytes attribute in limits.conf. For more information, see limits.conf in the Splunk Enterprise Admin Manual. Alternatively, you can edit the file to reduce large amounts of similar data, so that the remaining 2MB of data contains a representation of all the types of data in the original file.
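
One possible way to reduce a large file, sketched here under the assumption that every event occupies exactly one line, is to keep every Nth line so that the smaller file still spans the whole original. The function name thin_file and the default limit of 2000000 bytes are assumptions for this example. The output size is approximate because line lengths vary, and multiline events would be split apart.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk): copy SRC to DST, keeping every
# Nth line so that DST is roughly MAX bytes but still spans the whole file.
# Assumes one event per line.
# Usage: thin_file SRC DST [MAX]
thin_file() {
    src=$1
    dst=$2
    max=${3:-2000000}   # assumed preview limit of about 2 MB

    size=$(wc -c < "$src" | tr -d ' ')
    # Ceiling division: keep one line out of every "step" lines.
    step=$(( (size + max - 1) / max ))
    if [ "$step" -lt 1 ]; then
        step=1
    fi

    # Keep lines 1, 1 + step, 1 + 2 * step, and so on.
    awk -v step="$step" '(NR - 1) % step == 0' "$src" > "$dst"
}
```

For example, thin_file big.log sample.log writes a thinned copy of a hypothetical big.log to sample.log, which you can then upload and preview.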

This documentation applies to the following versions of Splunk Cloud: 7.1.3, 7.1.6, 7.2.3, 7.2.4, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 8.0.0

