
Configure event line breaking
Some events consist of more than one line. Splunk Enterprise handles most multiline events correctly by default. If you have multiline events that Splunk Enterprise doesn't handle properly, you need to configure the software to change its line breaking behavior.
How Splunk Enterprise determines event boundaries
Splunk Enterprise determines event boundaries in two steps:
1. Line breaking, which uses the LINE_BREAKER attribute's regular expression (regex) value to split the incoming stream of bytes into separate lines. By default, the LINE_BREAKER is any sequence of newlines and carriage returns (that is, ([\r\n]+)).
2. Line merging, which only occurs when the SHOULD_LINEMERGE attribute is set to "true" (the default). This step uses all the other line merging settings (for example, BREAK_ONLY_BEFORE, BREAK_ONLY_BEFORE_DATE, MUST_BREAK_AFTER, and so on) to merge the previously separated lines into events.
If the second step does not run (because you set the SHOULD_LINEMERGE attribute to "false"), then the events are simply the individual lines determined by LINE_BREAKER. The first step is relatively efficient, while the second is relatively slow. With a carefully written LINE_BREAKER regex, you can often get the desired result by using only the first step and skipping the second. This is particularly valuable if a significant amount of your data consists of multiline events.
How to configure event boundaries
Many event logs have a strict one-line-per-event format, but some do not. Usually, Splunk Enterprise can automatically recognize the event boundaries. However, if event boundary recognition is not working correctly, you can set custom rules in props.conf.
To configure multiline events, first examine the format of the events. Determine a pattern in the events to set as the start or end of an event. Then, edit $SPLUNK_HOME/etc/system/local/props.conf, and set the necessary attributes to configure your data.
There are two ways to handle multiline events:
- Break the data stream into lines and reassemble into events. This method usually simplifies the configuration process, as it gives you access to several attributes that you can use to define line-merging rules. Use the LINE_BREAKER attribute to break the data stream into multiple lines. Along with this, set SHOULD_LINEMERGE=true and set your line-merging attributes (BREAK_ONLY_BEFORE, and so on) to tell Splunk Enterprise how to reassemble the lines into events. If your data conforms well to the default LINE_BREAKER setting (any number of newlines and carriage returns), you don't need to alter LINE_BREAKER. Instead, just set SHOULD_LINEMERGE=true and use the line-merging attributes to reassemble it.
- Break the data stream directly into real events using the LINE_BREAKER feature. This might increase your indexing speed, but is somewhat more difficult to work with. If you're finding that indexing is slow and a significant amount of your data consists of multiline events, this method can provide significant improvement. Use the LINE_BREAKER attribute with SHOULD_LINEMERGE=false.
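The first method might look like the following sketch. It assumes a hypothetical source type in which each event starts with a line beginning with the marker BEGIN, and in which the other lines contain no timestamps, so the default BREAK_ONLY_BEFORE_DATE behavior does not come into play. The source type name and the marker are assumptions for illustration.

```ini
# Hypothetical source type name; replace with your own.
[my_block_sourcetype]
# LINE_BREAKER is left at its default, so the stream is first split on newlines.
SHOULD_LINEMERGE = true
# Break before any line that begins with the hypothetical marker BEGIN.
BREAK_ONLY_BEFORE = ^BEGIN
```

For the second method, you would instead set SHOULD_LINEMERGE = false and express the same boundary in the LINE_BREAKER regex.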
These attributes are described below.
Line breaking general attributes
These are the props.conf attributes that affect line breaking:
Attribute | Description | Default |
---|---|---|
TRUNCATE = <non-negative integer> | Change the default maximum line length (in bytes). Although this attribute is a byte measurement, Splunk Enterprise rounds down line length when this attribute would otherwise land mid-character for multibyte characters. | 10000 bytes |
LINE_BREAKER = <regular expression> | Specifies a regex that determines how the raw text stream gets broken into initial events, before any line merging takes place (if specified by the SHOULD_LINEMERGE attribute, described below). | ([\r\n]+) (This means Splunk Enterprise breaks data into an event for each line, delimited by any number of carriage return (\r) or newline (\n) characters.) |
LINE_BREAKER_LOOKBEHIND = <integer> | When there is leftover data from a previous raw chunk, LINE_BREAKER_LOOKBEHIND indicates the number of characters before the end of the raw chunk (with the next chunk concatenated) at which Splunk Enterprise applies the LINE_BREAKER regex. You might want to increase this value from its default if you are dealing with especially large or multiline events. | 100 characters |
SHOULD_LINEMERGE = [true\|false] | When set to true, Splunk Enterprise combines several input lines into a single event, with configuration based on the attributes described in the next section. | true |
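As an illustration of the size-related attributes, the sketch below raises both limits for a hypothetical source type that produces very long lines. The source type name and the values 50000 and 1000 are assumptions chosen for the example, not recommendations.

```ini
# Hypothetical source type name; replace with your own.
[my_long_line_sourcetype]
# Allow lines of up to 50000 bytes instead of the default 10000.
TRUNCATE = 50000
# Apply LINE_BREAKER further back into leftover data than the default of 100.
LINE_BREAKER_LOOKBEHIND = 1000
```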
Attributes that apply only when SHOULD_LINEMERGE is set to true
When SHOULD_LINEMERGE=true (the default), use these attributes to define line breaking behavior:
Attribute | Description | Default |
---|---|---|
BREAK_ONLY_BEFORE_DATE = [true\|false] | When set to true, Splunk Enterprise creates a new event if, and only if, it encounters a new line with a date. | true |
BREAK_ONLY_BEFORE = <regular expression> | When set, Splunk Enterprise creates a new event if, and only if, it encounters a new line that matches the regular expression. | empty string |
MUST_BREAK_AFTER = <regular expression> | When set and the regular expression matches the current line, Splunk Enterprise always creates a new event for the next input line. | empty string |
MUST_NOT_BREAK_AFTER = <regular expression> | When set and the current line matches the regular expression, Splunk Enterprise does not break on any subsequent lines until the MUST_BREAK_AFTER expression matches. | empty string |
MUST_NOT_BREAK_BEFORE = <regular expression> | When set and the current line matches the regular expression, Splunk Enterprise does not break the last event before the current line. | empty string |
MAX_EVENTS = <integer> | Specifies the maximum number of input lines that will be added to any event. | 256 lines |
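To show how these settings combine, the sketch below assumes a hypothetical source type in which every event starts with a timestamped line followed by a long stack trace whose lines carry no date. The source type name and the limit of 1000 are assumptions chosen for the example.

```ini
# Hypothetical source type name; replace with your own.
[my_stacktrace_sourcetype]
SHOULD_LINEMERGE = true
# Start a new event only at a line that contains a date (the default).
BREAK_ONLY_BEFORE_DATE = true
# Allow up to 1000 input lines per event instead of the default 256.
MAX_EVENTS = 1000
```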
Examples
Specify event breaks
[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$
This example instructs Splunk Enterprise to divide events by assuming that any line that consists of only digits (optionally followed by whitespace) is the start of a new event. It does this for any data whose source type is set to my_custom_sourcetype.
Merge multiple lines into a single event
The following log event contains several lines that are part of the same request. The differentiator between requests is "Path". For this example, assume that all these lines need to be shown as a single event entry.
{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}
To index this multiline event properly, use the Path differentiator in your configuration. Add the following to your $SPLUNK_HOME/etc/system/local/props.conf:
[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=
This configuration tells Splunk Enterprise to merge the lines of the event, and to break only before a line that contains the term Path=.
Multiline event line breaking and segmentation limitations
Splunk Enterprise applies line breaking and segmentation limitations to extremely large events:
- Lines over 10,000 bytes: Splunk Enterprise breaks lines over 10,000 bytes into multiple lines of 10,000 bytes each when it indexes them. It appends the field meta::truncated to the end of each truncated section. However, Splunk Enterprise still groups these lines into a single event.
- Segmentation for events over 100,000 bytes: In search results, Splunk Enterprise only displays the first 100,000 bytes of an event. Segments after those first 100,000 bytes of a very long line are still searchable, however.
- Segmentation for events over 1,000 segments: In search results, Splunk Enterprise displays the first 1,000 individual segments of an event as segments separated by whitespace and highlighted on mouseover. It displays the rest of the event as raw text without interactive formatting.
Answers
Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has around line breaking.
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15