Splunk® Data Stream Processor

Function Reference

Apply Line Break

This topic describes how to use the Apply Line Break function in the Splunk Data Stream Processor.

Description

Breaks and merges universal forwarder events using a specified break type. Events typically come from the universal forwarder in 64 KB chunks and require additional parsing before DSP can process them correctly. Use this function to configure DSP to break and merge these events so that they are in the format that you want.

This function performs two actions: line breaking and line merging.

  • In the line breaking phase, this function uses a specified line breaker pattern to split the incoming stream of data into separate lines.
  • In the line merging phase, this function takes the separated lines from the line breaking phase and merges them into events.
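
The two phases can be illustrated outside of DSP with a minimal Python sketch. This is not the DSP implementation. The function names, the date rule, and the choice to join merged lines with a newline are all assumptions made for illustration.

```python
import re
from typing import Callable, List


def break_lines(data: str, line_breaker: str = r"[\r\n]+") -> List[str]:
    """Line breaking phase: split raw data wherever the pattern matches."""
    return [line for line in re.split(line_breaker, data) if line]


def merge_lines(lines: List[str], starts_event: Callable[[str], bool]) -> List[str]:
    """Line merging phase: a line that starts an event opens a new event;
    any other line is appended to the current event."""
    events: List[str] = []
    for line in lines:
        if events and not starts_event(line):
            # Assumption: merged lines are joined with a newline.
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Hypothetical rule: an event starts with a date such as 2020-06-01.
date_at_start = re.compile(r"^\d{4}-\d{2}-\d{2}")

raw = "2020-06-01 ERROR failed\n  at step A\n  at step B\r\n2020-06-02 INFO ok\n"
events = merge_lines(break_lines(raw), lambda line: bool(date_at_start.match(line)))
for event in events:
    print(repr(event))
```

In this sketch the two indented lines are merged into the first event because they do not start with a date, and the second dated line opens a new event.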

For convenience, this function is also included with the universal forwarder template. See Process data from a universal forwarder in DSP.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<R>>
This function outputs collections of records with schema R.

Syntax

The required arguments are linebreak_type and, when linebreak_type = config, props_conf.

If linebreak_type = auto:

linebreak
linebreak_type = auto

If linebreak_type = advanced:

linebreak
linebreak_type = advanced
line_breaker = <regex>
truncate = <integer>

If linebreak_type = config:

linebreak
linebreak_type = config
props_conf = <string>

Required arguments

linebreak_type
Syntax: auto | advanced | config
Description: See the table for a description of each break type. All three of these break types perform line merging after breaking by default.
Example: auto
Break Type | Description
Auto | Breaks events based on the location of timestamps in the data and merges the lines that follow each timestamp. Creates a new event when another timestamp is detected. This setting uses DSP's built-in timestamp rules to detect timestamps.
Advanced | Breaks events based on a custom regular expression pattern. The default pattern is [\r\n]+, which breaks data into an event for each line, delimited by one or more carriage return (\r) or newline (\n) characters.
Config | Breaks events based on an existing props.conf line breaking configuration. See props.conf. Use this break type if you want to reuse and migrate your existing props.conf line breaking settings to DSP.
props_conf
Syntax: string
Description: Required if linebreak_type is config. The line breaking stanza configurations from props.conf. See Configure event line breaking in the Splunk Enterprise documentation for a description of the available line breaking attributes.
Example: MAX_EVENTS=3 MUST_NOT_BREAK_BEFORE=^.*fifth SHOULD_LINEMERGE=true BREAK_ONLY_BEFORE_DATE=false
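
Before passing a props_conf string to the function, it can help to check that each setting splits into the key and value you expect. The following Python sketch is a hypothetical helper, not part of DSP. It assumes the settings are separated by newlines, as in the SPL2 examples in this topic, and splits each setting on the first equals sign only so that values such as Path= are preserved.

```python
from typing import Dict


def parse_props_conf(props_conf: str) -> Dict[str, str]:
    """Split a newline-separated props.conf-style string into key/value pairs."""
    settings: Dict[str, str] = {}
    for raw_line in props_conf.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Split on the first "=" only; the value itself may contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in setting: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


settings = parse_props_conf("SHOULD_LINEMERGE = True\nBREAK_ONLY_BEFORE = Path=")
print(settings)
```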

Optional arguments

line_breaker
Syntax: Regular expression
Description: A regular expression that determines how the incoming data is broken into initial events, before line merging takes place.
Default: [\r\n]+
Example: [\r\n]+
truncate
Syntax: a non-negative integer
Description: The maximum line length, in bytes. If this limit would fall in the middle of a multibyte character, DSP rounds the line length down so that the character is not split.
Default: 10000
Example: 5000
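
The rounding-down behavior described for truncate can be sketched in Python as follows. This is an illustration only and assumes UTF-8 encoded data. The function name is hypothetical and the code is not taken from DSP.

```python
def truncate_utf8(line: str, limit: int = 10000) -> str:
    """Cut a line to at most `limit` bytes of UTF-8 without splitting a character."""
    encoded = line.encode("utf-8")
    if len(encoded) <= limit:
        return line
    # The slice may end partway through a multibyte character. Decoding with
    # errors="ignore" drops those incomplete trailing bytes, which rounds the
    # length down to the previous character boundary.
    return encoded[:limit].decode("utf-8", errors="ignore")


# "é" occupies two bytes in UTF-8, so a 2-byte limit cannot include it.
print(truncate_utf8("héllo", 2))
print(truncate_utf8("héllo", 3))
```

With a limit of 2 bytes the result is shorter than the limit, because keeping the second byte would split the two-byte character.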

SPL2 examples

Examples of common use cases follow. These examples assume that you are using the SPL2 Pipeline Builder.

If you are using the SPL2 Pipeline Builder, you must escape any backslash ( \ ) characters. If you are using the Canvas Builder, backslash characters are automatically escaped. See Using regular expressions in the Canvas Builder vs the SPL2 Pipeline Builder.
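
If you generate SPL2 statements from a script, the backslash escaping can be done programmatically. The Python sketch below uses a hypothetical helper that doubles each backslash in a regular expression. It does not handle other characters, such as double quotes, that might also need escaping inside an SPL2 string.

```python
def escape_for_spl2(pattern: str) -> str:
    """Double every backslash so the regex can be pasted into an SPL2 string."""
    return pattern.replace("\\", "\\\\")


regex = r"[\r\n]+"
spl2_literal = escape_for_spl2(regex)
print(f'line_breaker="{spl2_literal}"')
```

For the default pattern, this prints the same escaped form that appears in the advanced break type example in this topic.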

1. Reuse an existing props.conf setting that creates a new event when it encounters a new line with a date and merges the lines following the date.

... | apply_line_break props_conf="BREAK_ONLY_BEFORE_DATE=true\nSHOULD_LINEMERGE=true" linebreak_type="config" |...;

Incoming event:

Event {
 body = "This is the first line. 
12/31/2017-05:43:11.325
This is the third line. 
This is the fourth line. 
Fifth. 
"
}

Outgoing events:

Event {
 body = "This is the first line."
}
Event {
 body = "12/31/2017-05:43:11.325This is the third line.This is the fourth line.Fifth."
}

2. Break an event at every new line using a regular expression pattern.

... | apply_line_break line_breaker="[\\r\\n]+" linebreak_type="advanced" |...;

Incoming event:

Event {
 body = "This is the first line. 
12/31/2017-05:43:11.325
This is the third line. 
"
}

Outgoing events:

Event {
 body = "This is the first line."
}
Event {
 body = "12/31/2017-05:43:11.325"
}
Event {
 body = "This is the third line."
} 
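
To check how the [\r\n]+ pattern behaves before using it in a pipeline, you can try it against sample data in Python. This sketch only approximates the example above. Removing the trailing spaces and dropping empty pieces are assumptions made to match the outgoing events as they are displayed.

```python
import re

body = "This is the first line. \n12/31/2017-05:43:11.325\nThis is the third line. \n"

# Split on one or more carriage return or newline characters, then drop empty
# pieces and surrounding whitespace (an assumption, see the note above).
events = [part.strip() for part in re.split(r"[\r\n]+", body) if part.strip()]
for event in events:
    print(event)

# A run of mixed line endings counts as a single break.
print(re.split(r"[\r\n]+", "a\r\n\r\nb"))
```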

3. Break a single event into multiple events using an existing props.conf configuration.

... | apply_line_break props_conf="SHOULD_LINEMERGE = True\nBREAK_ONLY_BEFORE = Path=" linebreak_type="config" |...;

Incoming event:

Event {
 body = "This is first event \r\n{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}\n"
}

Outgoing events:

Event {
 body = "This is first event"
}
Event {
 body = "{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}"
}
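
The merging shown in this example can be approximated with the following Python sketch. It is a simplified reading of BREAK_ONLY_BEFORE, in which a new event starts only at a line that matches the pattern, and it is not the DSP implementation. The function name is hypothetical, the sample lines are shortened stand-ins for the records above, and merged lines are concatenated without a separator to mirror the outgoing events as displayed.

```python
import re
from typing import List


def merge_break_only_before(lines: List[str], pattern: str) -> List[str]:
    """Start a new event only at lines that match `pattern`; merge all others."""
    regex = re.compile(pattern)
    events: List[str] = []
    for line in lines:
        if not events or regex.search(line):
            events.append(line)
        else:
            # Assumption: merged lines are concatenated with no separator.
            events[-1] += line
    return events


# Shortened stand-ins for the records in the incoming event.
lines = [
    "This is first event",
    '{{"2007-09-21, 02:57:11.58", "Path=/LoginUser"}}',
    '{{"2006-09-21, 02:57:11.60", "UserData"}}',
    '{{"2006-09-21, 02:57:11.60", "New Cookie"}}',
]
events = merge_break_only_before(lines, r"Path=")
print(len(events))
```

Only the second line contains Path=, so the sketch produces two events: the first line on its own, and the remaining three lines merged together.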

4. Break events at the timestamp and merge the lines following the timestamp until another timestamp is found.

... | apply_line_break linebreak_type="auto" |...;

Incoming event:

Event {
 body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n
"
}

Outgoing events:

Event {
 body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor"
}
Event {
 body = "06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor"
}
Last modified on 23 October, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0