Splunk® Enterprise

Getting Data In

Configure event line breaking

Some events consist of more than one line. Splunk software handles most multiline events correctly by default. If you have multiline events that Splunk software doesn't handle properly, you can configure the software to change its line breaking behavior.

How Splunk software determines event boundaries

Splunk software determines event boundaries in two steps:

  1. Line breaking, which uses the LINE_BREAKER attribute regular expression value to split the incoming stream of bytes into separate lines. By default, the LINE_BREAKER is any sequence of newlines and carriage returns (that is, ([\r\n]+)).
  2. Line merging, which occurs only when the SHOULD_LINEMERGE setting is "true" (the default). This step uses the other line merging settings (for example, BREAK_ONLY_BEFORE, BREAK_ONLY_BEFORE_DATE, and MUST_BREAK_AFTER) to merge the previously separated lines into events.

If the second step does not run (because you set the SHOULD_LINEMERGE attribute to "false"), then the events are the individual lines that the LINE_BREAKER attribute determines. The first step is relatively efficient, while the second is relatively slow. Appropriate use of the LINE_BREAKER regular expression can produce the results you want in the first step. This is valuable if a significant amount of your data consists of multiline events.

How to configure event boundaries

Many event logs have a strict one-line-per-event format, but some do not. Splunk software can often recognize the event boundaries, but if event boundary recognition does not work properly, you can set custom rules in props.conf to establish event boundaries.

Edit props.conf to configure multiline events

  1. Examine the event format.
  2. Determine a pattern in the events to set as the start or end of an event.
  3. Edit $SPLUNK_HOME/etc/system/local/props.conf.
  4. In the props.conf file, set the necessary attributes to configure your data.
  5. Save the file and close it.
  6. Restart Splunk Enterprise to commit the changes.

There are two ways to handle multiline events:

Break and reassemble the data stream into events

This method usually simplifies the configuration process, as it gives you access to several attributes that you can use to define line-merging rules.

  1. Specify a stanza in props.conf that represents the stream of data you want to break and reassemble into events.
  2. In that stanza, use the LINE_BREAKER attribute to break the data stream into multiple lines.
  3. Set the SHOULD_LINEMERGE attribute to true.
  4. Set your line-merging attributes (BREAK_ONLY_BEFORE, etc.) to specify how to reassemble the lines into events.

If your data conforms well to the default LINE_BREAKER setting (any number of newlines and carriage returns), you don't need to alter the LINE_BREAKER setting. Instead, set SHOULD_LINEMERGE=true and use the line-merging attributes to reassemble it.
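
As a minimal sketch of this method, the following stanza assumes a hypothetical source type named my_multiline_app in which each event starts with a line that begins with a bracketed date, such as [2019-06-04. The source type name and the pattern are illustrative assumptions, not values taken from your data. LINE_BREAKER is left at its default.

```ini
# props.conf (illustrative sketch; the source type name and pattern are hypothetical)
[my_multiline_app]
# LINE_BREAKER is not set, so the default ([\r\n]+) splits the stream into lines.
SHOULD_LINEMERGE = true
# Start a new event only at a line that begins with "[" followed by a YYYY-MM-DD date.
BREAK_ONLY_BEFORE = ^\[\d{4}-\d{2}-\d{2}
```

Lines that do not match the BREAK_ONLY_BEFORE pattern, such as continuation lines of a stack trace, are merged into the event that precedes them.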

Break the data stream directly into real events with the LINE_BREAKER setting

Using the LINE_BREAKER setting to define event boundaries might increase your indexing speed, but is somewhat more difficult to work with. If you find that indexing is slow and a significant amount of your data consists of multiline events, this method can provide significant improvement.

  1. Specify a stanza in props.conf that represents the stream of data you want to break directly into events.
  2. Configure the LINE_BREAKER setting with a regular expression that matches the boundary that you want to use to break up the raw data stream into events.
  3. Set the SHOULD_LINEMERGE setting to false.
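
A minimal sketch of this method follows. It assumes a hypothetical source type named my_multiline_app in which every event begins with a date in YYYY-MM-DD form at the start of a line. Both the name and the date format are assumptions for illustration.

```ini
# props.conf (illustrative sketch; the source type name and date format are hypothetical)
[my_multiline_app]
# Break only where a run of newlines is followed by a YYYY-MM-DD date.
# The capturing group contains just the newline characters.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
# Skip the line merging step; LINE_BREAKER alone defines the events.
SHOULD_LINEMERGE = false
```

Because only the text in the first capturing group is discarded, the newline characters are dropped and the date that follows them remains at the start of the next event.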

Line breaking general attributes

The following tables list the settings in the props.conf file that affect line breaking.

TRUNCATE = <non-negative integer>
  Description: Change the default maximum line length (in bytes). Although this attribute is a byte measurement, Splunk software rounds down the line length when the limit would otherwise land mid-character for multibyte characters.

  Set to 0 if you never want truncation (very long lines are, however, often a sign of garbage data).

  Default: 10000 bytes

LINE_BREAKER = <regular expression>
  Description: A regular expression that determines how the raw text stream gets broken into initial events, before any line merging takes place (if specified by the SHOULD_LINEMERGE attribute, described below).

  The expression must contain a capturing group (a pair of parentheses that defines an identified subcomponent of the match).

  Wherever the expression matches, Splunk software considers the start of the first capturing group to be the end of the previous event, and considers the end of the first capturing group to be the start of the next event.

  Splunk software discards the contents of the first capturing group. This content will not be present in any event, as Splunk software considers this text to come between lines.

  You can realize a significant boost to processing speed when you use LINE_BREAKER to delimit multiline events (as opposed to using SHOULD_LINEMERGE to reassemble individual lines into multiline events). Consider using this method if a significant portion of your data consists of multiline events.

  See the props.conf specification file for information on how to use LINE_BREAKER with branched expressions and additional information.

  Default: ([\r\n]+) (Splunk software breaks data into an event for each line, delimited by any number of carriage return (\r) or newline (\n) characters.)

LINE_BREAKER_LOOKBEHIND = <integer>
  Description: When there is leftover data from a previous raw chunk, LINE_BREAKER_LOOKBEHIND indicates how many characters before the end of the raw chunk (with the next chunk concatenated) Splunk software begins applying the LINE_BREAKER regular expression. You might want to increase this value from its default if you are dealing with especially large or multiline events.

  Default: 100 characters

SHOULD_LINEMERGE = [true|false]
  Description: When set to true, Splunk software combines several input lines into a single event, with configuration based on the attributes described in the next section.

  Default: true
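
For illustration, the following sketch adjusts TRUNCATE and LINE_BREAKER_LOOKBEHIND for a hypothetical source type named my_long_events that carries unusually long lines. The source type name and the numeric values are assumptions; choose values based on the sizes you observe in your own data.

```ini
# props.conf (illustrative sketch; the source type name and values are hypothetical)
[my_long_events]
# Allow lines of up to 50000 bytes before truncation (the default is 10000).
TRUNCATE = 50000
# Apply LINE_BREAKER starting further back in leftover data (the default is 100).
LINE_BREAKER_LOOKBEHIND = 1000
```

Raising TRUNCATE rather than setting it to 0 keeps a limit in place, which still guards against the very long lines that often indicate garbage data.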

Attributes that apply only when the SHOULD_LINEMERGE setting is true

When you set SHOULD_LINEMERGE=true (the default), use these attributes to define line breaking behavior.

BREAK_ONLY_BEFORE_DATE = [true|false]
  Description: When set to true, Splunk software creates a new event if it encounters a new line with a date.

  Note: If you configure the DATETIME_CONFIG setting to CURRENT or NONE, this attribute is not meaningful, because in those cases, Splunk software does not identify timestamps.

  Default: true

BREAK_ONLY_BEFORE = <regular expression>
  Description: When set, Splunk software creates a new event if it encounters a new line that matches the regular expression.

  Default: empty string

MUST_BREAK_AFTER = <regular expression>
  Description: When set and the regular expression matches the current line, Splunk software always creates a new event for the next input line. Splunk software might still break before the current line if another rule matches.

  Default: empty string

MUST_NOT_BREAK_AFTER = <regular expression>
  Description: When set and the current line matches the regular expression, Splunk software does not break on any subsequent lines until the MUST_BREAK_AFTER expression matches.

  Default: empty string

MUST_NOT_BREAK_BEFORE = <regular expression>
  Description: When set and the current line matches the regular expression, Splunk software does not break the last event before the current line.

  Default: empty string

MAX_EVENTS = <integer>
  Description: Specifies the maximum number of input lines that Splunk software adds to any event. The software breaks the event after it reads the specified number of lines.

  Default: 256 lines
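
The following sketch shows how several of these attributes might be combined. It assumes a hypothetical source type named my_batch_log in which each record ends with a trailer line that reads "-- end of record --" and in which timestamps appear on many lines within a record. The source type name, the trailer text, and the MAX_EVENTS value are all assumptions for illustration.

```ini
# props.conf (illustrative sketch; the source type name and patterns are hypothetical)
[my_batch_log]
SHOULD_LINEMERGE = true
# Dates occur inside records, so do not start a new event at every dated line.
BREAK_ONLY_BEFORE_DATE = false
# The trailer line closes a record; the next input line starts a new event.
MUST_BREAK_AFTER = ^-- end of record --$
# Allow up to 500 lines per event instead of the default of 256.
MAX_EVENTS = 500
```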

Examples of configuring event line breaking

Specify event breaks

[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$

This configuration treats any line that consists only of digits as the start of a new event for any data whose source type is set to my_custom_sourcetype.

Merge multiple lines into a single event

The following log event contains several lines that are part of the same request. The differentiator between requests is "Path". For this example, assume that all these lines need to be shown as a single event entry.

{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}

To index this multiline event properly, use the Path differentiator in your configuration. Add the following to your $SPLUNK_HOME/etc/system/local/props.conf:

[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=

This code tells Splunk software to merge the lines of the event, and only break before the term Path=.
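
As an alternative sketch, the same events can in principle be delimited with the LINE_BREAKER method described earlier in this topic instead of line merging. The regular expression below is an assumption based only on the three sample lines shown above: it breaks at newlines that are followed by a line whose fourth field begins with Path=. Test it against your own data before you rely on it.

```ini
# props.conf (illustrative sketch; the regular expression is derived from the sample lines only)
[source::source-to-break]
# Skip line merging; LINE_BREAKER alone defines the event boundaries.
SHOULD_LINEMERGE = false
# Break at newlines followed by: {{"<timestamp>", <number>, <number>, "Path=
# Only the newlines in the capturing group are discarded.
LINE_BREAKER = ([\r\n]+)\{\{"[^"]+",\s*\d+,\s*\d+,\s*"Path=
```

With this approach, lines that do not begin a new request, such as the UserData and New Cookie lines, stay in the same event as the preceding Path= line because no break occurs before them.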

Multiline event line breaking and segmentation limitations

Splunk software applies line breaking and segmentation limitations to extremely large events:

  • Events over MAX_EVENTS lines. If Splunk software encounters a multiline event that exceeds the number of lines specified in MAX_EVENTS, it breaks the event at that limit, sets the BREAK_ONLY_BEFORE_DATE setting to false (if it is true), and drops any MUST_NOT_BREAK_BEFORE or MUST_NOT_BREAK_AFTER rules. As a result, events might not be broken as you expect. To work around the problem, you can raise the MAX_EVENTS setting, but you might get better results by setting SHOULD_LINEMERGE to false and specifying the event boundary with the LINE_BREAKER setting.
  • Lines over 10,000 bytes. Splunk software breaks lines over 10,000 bytes into multiple lines of 10,000 bytes each when it indexes them. It appends the field meta::truncated to the end of each truncated section. It still groups these lines into a single event.
  • Segmentation for events over 100,000 bytes. In search results, Splunk Web displays the first 100,000 bytes of an event. Segments after those first 100,000 bytes of a very long line are still searchable, however.
  • Segmentation for events over 1,000 segments. In search results, Splunk Web displays the first 1,000 segments of an event as segments separated by whitespace and highlighted on mouseover. It displays the rest of the event as raw text without interactive formatting.

Answers

Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has around line breaking.


This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.3.0

