Configure event line breaking
Some events consist of more than one line. The Splunk platform handles most multiline events correctly by default. If you have multiline events that the Splunk platform doesn't handle properly, you can configure it to change its line breaking behavior.
If you use Splunk Cloud Platform, you can do the following:
- Forward any data that requires custom event line breaking, because you can't configure event line breaking in Splunk Web. You can use a heavy forwarder to break incoming data into lines and then merge those lines into events before sending the data to your Splunk Cloud Platform instance.
- If you have access to the Edge Processor solution, use Edge Processors to configure event line breaking. See Using source types to break and merge data in Edge Processors and About the Edge Processor solution in the Splunk Use Edge Processors manual.
If you use Splunk Enterprise, you can configure the settings and follow the procedures in this topic on any instance that indexes the incoming data stream.
How the Splunk platform determines event boundaries
The Splunk platform determines event boundaries in two phases:
- Line breaking, which uses the LINE_BREAKER setting to split the incoming stream of data into separate lines. By default, the LINE_BREAKER value is any sequence of newlines and carriage returns. In regular expression format, this is represented as the following string: ([\r\n]+). You don't normally need to adjust this setting, but in cases where it's necessary, you must configure it in the props.conf configuration file on the forwarder that sends the data to Splunk Cloud Platform or a Splunk Enterprise indexer. The LINE_BREAKER setting expects a value in regular expression format.
- Line merging, which uses the SHOULD_LINEMERGE setting to merge previously separated lines into events. By default, the Splunk platform performs line merging, and the value for SHOULD_LINEMERGE is true. You don't normally need to adjust this setting, but in cases where it is necessary, you must configure this setting in the props.conf configuration file on the forwarder that sends the data to Splunk Cloud Platform. If you configure the Splunk platform to not perform line merging by setting the SHOULD_LINEMERGE attribute to false, then the platform splits the incoming data into lines according to what the LINE_BREAKER setting determines.
Line breaking is relatively efficient for the Splunk platform, while line merging is relatively slow. Using the LINE_BREAKER setting can produce the results you want in the line breaking phase. This is valuable if a significant amount of your data consists of multiline events.
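As a minimal sketch of the two phases, the following props.conf stanza writes out the default values described above. The stanza name my_sourcetype is a hypothetical source type used only for illustration.

```ini
# Hypothetical source type name; replace it with your own.
[my_sourcetype]
# Phase 1, line breaking: split the stream on any run of newlines and carriage returns (the default).
LINE_BREAKER = ([\r\n]+)
# Phase 2, line merging: merge the separated lines back into events (the default).
SHOULD_LINEMERGE = true
```

Because both values match the defaults, this stanza is not expected to change behavior. It only makes explicit which setting drives each phase.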
Additional configuration settings also affect how the Splunk platform breaks your incoming data stream into events. See Line breaking general settings later in this topic.
How to configure event boundaries
Many event logs have a strict one-line-per-event format, but others don't. The Splunk platform can often recognize the event boundaries, but if event boundary recognition doesn't occur, or happens incorrectly, you can set custom rules in the props.conf configuration file to establish event boundaries.
Requirements for configuring event boundaries
Before you attempt to configure event boundaries for your events, confirm that you have the following:
- An understanding of regular expressions. The LINE_BREAKER setting uses a regular expression to determine what the boundary of an event is.
- One of the following, depending on whether you use Splunk Cloud Platform or Splunk Enterprise:
  - A heavy forwarder that has been configured to send data to your Splunk Cloud Platform instance. You can download the Splunk Cloud Platform universal forwarder credentials package that comes with your Splunk Cloud Platform instance and install it on a Splunk heavy forwarder.
  - A Splunk Enterprise indexer or heavy forwarder, if you use Splunk Enterprise.
- A file that represents the data stream where you want to configure custom line breaking.
Edit the props.conf configuration file to configure multiline events
- Examine the file that you want to index to determine its event format.
- In the file, look for a pattern in the events to set as the start or end of an event.
- Using a text editor, on the forwarder you have configured to send data to Splunk Cloud Platform, edit the $SPLUNK_HOME/etc/system/local/props.conf configuration file.
- In the props.conf configuration file, add the necessary line breaking and line merging settings to configure the forwarder to perform the correct line breaking on your incoming data stream.
- Save the file and close it.
- Restart the forwarder to commit the changes.
There are two ways to handle multiline events:
- Break and reassemble the data stream into events.
- Break the data stream directly into real events with the LINE_BREAKER setting.
Break and reassemble the data stream into events
This method often simplifies the configuration process, because it gives you access to several settings that you can use to define line-merging rules.
You must perform these steps on the heavy forwarder that you have designated to send data to your Splunk Cloud Platform instance.
- On the forwarder that is to send data to your Splunk Cloud Platform instance, use a text editor to open $SPLUNK_HOME/etc/system/local/props.conf for editing.
- In this file, specify a stanza that represents the stream of data you want to break and reassemble into events.
- In that stanza, configure the LINE_BREAKER setting with a regular expression that breaks the data stream into multiple lines.
- Add the SHOULD_LINEMERGE setting to the stanza, and set its value to true.
- Configure additional line-merging settings, such as BREAK_ONLY_BEFORE and others, to specify how the forwarder is to reassemble the lines into events. For more information on the line-merging settings, see Attributes that apply only when the SHOULD_LINEMERGE setting is true later in this topic.
If your data conforms well to the default LINE_BREAKER value, which is any number of newlines and carriage returns, you don't need to change the LINE_BREAKER setting. Instead, set SHOULD_LINEMERGE=true and use the line-merging settings to reassemble the data.
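For illustration, a stanza for this approach might look like the following sketch. It assumes a hypothetical source type, my_app_log, in which every event starts with a bracketed severity such as [INFO] and continuation lines don't. Neither the name nor the log format comes from this topic.

```ini
# Hypothetical source type; the log format is an assumption.
[my_app_log]
# LINE_BREAKER is left at its default, so the stream is first split on newlines.
SHOULD_LINEMERGE = true
# Start a new event at any line that begins with a bracketed severity.
BREAK_ONLY_BEFORE = ^\[(INFO|WARN|ERROR)\]
```

The BREAK_ONLY_BEFORE_DATE setting defaults to true, so with this stanza a line that contains a date can also start a new event.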
Break the data stream directly into real events with the LINE_BREAKER setting
Using the LINE_BREAKER setting to define event boundaries might increase your indexing speed, but is somewhat more difficult to work with. If you find that indexing is slow and a significant amount of your data consists of multiline events, this method can provide significant improvement.
- Specify a stanza in props.conf that represents the stream of data you want to break directly into events.
- Under this stanza, configure the LINE_BREAKER setting with a regular expression that matches the boundary that you want to use to break up the raw data stream into events.
- Add the SHOULD_LINEMERGE setting, and configure it to false.
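A hedged sketch of this approach follows. It assumes a hypothetical source type, my_multiline_sourcetype, whose events each begin with a date in YYYY-MM-DD format at the start of a line.

```ini
# Hypothetical source type; the date-first event format is an assumption.
[my_multiline_sourcetype]
# Turn off line merging so that LINE_BREAKER alone defines event boundaries.
SHOULD_LINEMERGE = false
# Break only at newlines that are followed by a date.
# The capturing group holds just the newlines, so the date stays in the next event.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
```

Only the text that the first capturing group matches is discarded, which is why the date pattern sits outside the parentheses. A line inside an event that happens to start with a date would also trigger a break, so the pattern needs to be specific enough for your data.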
Line breaking general settings
The following tables list the settings in the props.conf file that affect line breaking.
Attribute | Description | Default |
---|---|---|
TRUNCATE = <non-negative integer> | Changes the default maximum line length, in bytes. Although this setting is a byte measurement, the Splunk platform rounds down line length when this attribute would otherwise land mid-character for multibyte characters. Set to 0 if you never want truncation. However, very long lines are often a sign of garbage data. | 10000 |
LINE_BREAKER = <regular expression> | A regular expression that determines how the Splunk platform breaks the raw text stream into initial events, before any line merging takes place. This setting is dependent upon the SHOULD_LINEMERGE setting, described later. The expression must contain a capturing group, which is a pair of parentheses that defines an identified subcomponent of the match. Wherever the expression matches, the Splunk platform considers the start of the first capturing group to be the end of the previous event, and considers the end of the first capturing group to be the start of the next event. The platform discards the contents of the first capturing group. This content will not be present in any event, as the platform considers this text to come between lines. You can realize a significant boost to processing speed when you use the LINE_BREAKER setting to delimit multiline events. See the props.conf specification file for information on how to use the LINE_BREAKER setting. | ([\r\n]+) The Splunk platform breaks data into an event for each line, delimited by any number of carriage return (\r) or newline (\n) characters. |
LINE_BREAKER_LOOKBEHIND = <integer> | When there is leftover data from a previous raw chunk, LINE_BREAKER_LOOKBEHIND indicates the number of characters before the end of the raw chunk, with the next chunk concatenated, where the Splunk platform applies the LINE_BREAKER regular expression. You might want to increase this value from its default if you are dealing with especially large or multiline events. | 100 |
SHOULD_LINEMERGE = [true\|false] | When set to true, the Splunk platform combines several input lines into a single event, with configuration based on the settings described in the next section. | true |
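As an illustration of the TRUNCATE and LINE_BREAKER_LOOKBEHIND settings, the following sketch raises both limits for a source with very long lines. The source type name and both values are assumptions chosen for the example, not recommendations.

```ini
# Hypothetical source type; the values are illustrative only.
[my_long_line_sourcetype]
# Allow lines of up to 50,000 bytes before truncation (the default is 10000).
TRUNCATE = 50000
# Apply the LINE_BREAKER expression further back into the previous chunk (the default is 100).
LINE_BREAKER_LOOKBEHIND = 500
```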
Attributes that apply only when the SHOULD_LINEMERGE setting is true
When you set SHOULD_LINEMERGE to the default of true, use these additional settings to define line breaking behavior.
Attribute | Description | Default |
---|---|---|
BREAK_ONLY_BEFORE_DATE = [true\|false] | When set to true, the Splunk platform creates a new event if it encounters a new line with a date. | true. If you configure the … |
BREAK_ONLY_BEFORE = <regular expression> | When set, the Splunk platform creates a new event if it encounters a new line that matches the regular expression. | empty string |
MUST_BREAK_AFTER = <regular expression> | When set, and the regular expression matches the current line, the Splunk platform always creates a new event for the next input line. The platform might still break before the current line if another rule matches. | empty string |
MUST_NOT_BREAK_AFTER = <regular expression> | When set, and the current line matches the regular expression, the Splunk platform doesn't break on any subsequent lines until the MUST_BREAK_AFTER expression matches. | empty string |
MUST_NOT_BREAK_BEFORE = <regular expression> | When set, and the current line matches the regular expression, the Splunk platform doesn't break the last event before the current line. | empty string |
MAX_EVENTS = <integer> | Specifies the maximum number of input lines that the Splunk platform adds to any event. The software breaks the event after it reads the specified number of lines. | 256 lines |
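The following sketch shows how two of these settings might be combined. It assumes a hypothetical source type, my_trace_log, in which continuation lines (for example, stack trace frames) are indented and each record ends with a line containing the marker END_OF_RECORD. The name, the marker, and the log format are all assumptions.

```ini
# Hypothetical source type and log format.
[my_trace_log]
SHOULD_LINEMERGE = true
# Keep indented continuation lines attached to the event before them.
MUST_NOT_BREAK_BEFORE = ^\s+
# Always start a new event on the line after the assumed end-of-record marker.
MUST_BREAK_AFTER = END_OF_RECORD\s*$
```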
Examples of configuring event line breaking
Specify event breaks
The following example configures the Splunk platform to identify any line that consists of only digits as the start of a new event for any data whose source type is set to my_custom_sourcetype.
[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$
Merge multiple lines into a single event
The following log event contains several lines that are part of the same request. The differentiator between requests is "Path".
{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}
To index this multiline event properly, use the Path differentiator in your configuration. Add the following to your $SPLUNK_HOME/etc/system/local/props.conf file.
[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=
This code configures the Splunk platform to merge the lines of the event, and only break before the term Path=.
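If line merging proves slow for this data, the same boundary could in principle be expressed with the LINE_BREAKER setting instead. The following is an untested sketch that assumes every request starts with a line shaped like the first sample line above: two opening braces, a quoted timestamp, two numbers, and then a quoted field that begins with Path=.

```ini
[source::source-to-break]
# Turn off line merging so that LINE_BREAKER alone defines event boundaries.
SHOULD_LINEMERGE = false
# Break at newlines that are followed by a line whose fourth field starts with "Path=".
# The lookahead sits outside the capturing group, so that line stays in the next event.
LINE_BREAKER = ([\r\n]+)(?=\{\{"[^"]+", \d+, \d+, "Path=)
```

Newlines that aren't followed by a Path= line don't match, so the UserData and New Cookie lines remain in the same event as the request that precedes them.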
Multiline event line breaking and segmentation limitations
The Splunk platform applies line breaking and segmentation limitations to extremely large events:
Limitation | Description |
---|---|
Events over MAX_EVENTS lines | If the platform encounters a multiline event that exceeds the number of lines that you specified in MAX_EVENTS, it breaks the event at that limit, sets the BREAK_ONLY_BEFORE_DATE setting to false if it is true, and then drops any MUST_NOT_BREAK_BEFORE or MUST_NOT_BREAK_AFTER rules. This can result in events not being line broken as you would expect. To work around the problem, you can raise the MAX_EVENTS setting, but you might get better results by changing the SHOULD_LINEMERGE setting to false and by specifying the event boundary with the LINE_BREAKER setting. |
Lines that exceed 10,000 bytes in length | The Splunk platform uses the LINE_BREAKER and TRUNCATE settings to evaluate and break events over 10kB into multiple lines of 10kB each. It adds the index time field meta::truncated. If you have also configured SHOULD_LINEMERGE to true, the platform evaluates any additional event data using the props.conf rules until it can create a complete event. |
Segmentation for events over 100,000 bytes | In search results, Splunk Web displays the first 100,000 bytes of an event. Segments after those first 100,000 bytes of a very long line are still searchable, however. |
Segmentation for events over 1,000 segments | In search results, Splunk Web displays the first 1,000 segments of an event as segments separated by whitespace and highlighted on mouseover. It displays the rest of the event as raw text without interactive formatting. |
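As a sketch of the first workaround for events over MAX_EVENTS lines, the following stanza raises the line limit for a hypothetical source type whose events can run to several hundred lines. The name and the value are assumptions.

```ini
# Hypothetical source type; the limit is illustrative only.
[my_large_event_sourcetype]
SHOULD_LINEMERGE = true
# Allow up to 1000 lines in a merged event (the default is 256).
MAX_EVENTS = 1000
```

The alternative workaround sets SHOULD_LINEMERGE to false and defines the event boundary with the LINE_BREAKER setting, which avoids the line limit on merged events altogether.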
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403 (latest FedRAMP release), 9.2.2406