On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details.
Apply Line Break
This topic describes how to use the Apply Line Break function in the Splunk Data Stream Processor.
Description
The Apply Line Break function breaks and merges universal forwarder events using a specified break type. Events typically arrive from the universal forwarder in 64KB chunks and require additional parsing before they can be processed correctly. Use this function to configure the Splunk Data Stream Processor to break and merge these events so that they are in the format that you want.
This function performs two actions: line breaking and line merging.
- In the line breaking phase, this function uses a specified line breaker pattern to split the incoming stream of data into separate lines.
- In the line merging phase, this function takes the separated lines from the line breaking phase and merges them into events.
This function must be added immediately after the source function or, if there are multiple sources, immediately after the union of those sources. See the "Examples" section for examples of how to use this function.
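The two phases described above can be illustrated with a minimal Python sketch. This is not the Splunk Data Stream Processor implementation. The timestamp pattern, the function names, and the choice to concatenate merged lines without a separator (mirroring how outgoing events are displayed in the "Examples" section) are all assumptions made for illustration.

```python
import re

# Assumed timestamp shape (MM/DD/YYYY-HH:MM:SS.mmm), chosen only for this sketch.
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{4}-\d{2}:\d{2}:\d{2}\.\d{3}")

def break_lines(chunk, line_breaker=r"[\r\n]+"):
    """Line breaking phase: split the raw chunk into non-empty lines."""
    return [line for line in re.split(line_breaker, chunk) if line]

def merge_lines(lines):
    """Line merging phase: start a new event at each line that begins with a timestamp."""
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            events.append(line)   # a timestamp (or the very first line) opens a new event
        else:
            events[-1] += line    # otherwise append the line to the current event
    return events

chunk = "This is the first line.\n12/31/2017-05:43:11.325\nThis is the third line.\n"
lines = break_lines(chunk)
events = merge_lines(lines)
```

In this sketch, `lines` holds three separate lines and `events` holds two events, because the line after the timestamp is merged into the timestamped event.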
Function Input/Output Schema
- Function Input
- collection<record<R>>
- This function takes in collections of records with schema R.
- Function Output
- collection<record<R>>
- This function outputs collections of records with schema R.
Syntax
The linebreak_type argument is always required. The props_conf argument is also required when linebreak_type = config.
If linebreak_type = auto:
- linebreak
- linebreak_type = auto
If linebreak_type = advanced:
- linebreak
- linebreak_type = advanced
- line_breaker = <regex>
- truncate = <integer>
If linebreak_type = config:
- linebreak
- linebreak_type = config
- props_conf = <string>
Required arguments
- linebreak_type
- Syntax: auto | advanced | config
- Description: See the table for a description of each break type. All three of these break types perform line merging after breaking by default.
- Example in Canvas View: auto
Break Type | Description
Auto | Breaks events based on the location of timestamps in the data, and merges the lines that follow each timestamp. Creates a new event when another timestamp is detected. This setting uses built-in timestamp rules to detect timestamps.
Advanced | Breaks events based on a custom regular expression pattern. The default pattern is [\r\n]+, which breaks data into an event for each line, delimited by any number of carriage return (\r) or newline (\n) characters.
Config | Breaks events based on an existing line breaking props.conf configuration. See props.conf. Use this break type if you want to reuse and migrate your existing props.conf line_breaking settings to the Splunk Data Stream Processor.
- props_conf
- Syntax: string
- Description: Required if linebreak_type is set to config. Specifies the line breaking stanza configurations from props.conf. See "Configure event line breaking" in the Splunk Enterprise documentation for a description of the available line breaking attributes.
- Example in Canvas View: MAX_EVENTS=3 MUST_NOT_BREAK_BEFORE=^.*fifth SHOULD_LINEMERGE=true BREAK_ONLY_BEFORE_DATE=false
Optional arguments
- line_breaker
- Syntax: Regular expression
- Description: A regular expression that determines how the incoming data is broken into initial events, before line merging takes place.
- Default: [\r\n]+
- Example in Canvas View: [\r\n]+
- truncate
- Syntax: a non-negative integer
- Description: The maximum line length, in bytes. The Splunk Data Stream Processor rounds the line length down when this value would otherwise land mid-character in a multibyte character.
- Default: 10000
- Example in Canvas View: 5000
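As a rough illustration of how line_breaker and truncate interact, the following Python sketch splits a chunk with a regular expression and then caps each line at a byte limit without cutting a multibyte character in half. The function names are hypothetical, UTF-8 encoding is an assumption, and the sketch only approximates the documented behavior.

```python
import re

def truncate_bytes(line, limit):
    """Cut `line` to at most `limit` UTF-8 bytes without splitting a character."""
    data = line.encode("utf-8")
    if len(data) <= limit:
        return line
    # errors="ignore" drops the incomplete trailing byte sequence, if any,
    # which rounds the cut down to the previous character boundary.
    return data[:limit].decode("utf-8", errors="ignore")

def break_and_truncate(chunk, line_breaker=r"[\r\n]+", truncate=10000):
    """Split `chunk` on `line_breaker`, then truncate every resulting line."""
    return [truncate_bytes(line, truncate)
            for line in re.split(line_breaker, chunk) if line]

short = truncate_bytes("héllo", 2)   # the 2-byte "é" does not fit, so it is dropped
exact = truncate_bytes("héllo", 3)   # "h" (1 byte) plus "é" (2 bytes) fit exactly
lines = break_and_truncate("abc\r\ndefgh\n", truncate=4)
```

Decoding with `errors="ignore"` is one simple way to round down to a character boundary. It is safe here only because the bytes come from a valid UTF-8 string, so the only invalid sequence can be the one cut off at the end.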
Examples
Examples of common use cases follow. These examples assume that you have added the function to your pipeline.
If you are using the SPL2 Pipeline Builder, you must escape any backslash ( \ ) characters. If you are in the Canvas View, backslash characters are automatically escaped. See Using regular expressions in the Canvas View vs the SPL2 Pipeline View.
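The escaping rule can be sketched in Python as doubling every backslash before the pattern is placed in an SPL2 string literal. The helper name is hypothetical, and the generated SPL2 fragment follows the named-argument style used in the examples in this topic. Treat it as an illustration rather than verified syntax.

```python
def escape_for_spl2(pattern):
    """Double every backslash so the pattern can be pasted into an SPL2 string."""
    return pattern.replace("\\", "\\\\")

# The pattern as you would type it in the Canvas View.
pattern = r"[\r\n]+"

# The same pattern, escaped for use in the SPL2 Pipeline Builder.
spl2 = 'apply_line_breaking linebreak_type="advanced" line_breaker="{}"'.format(
    escape_for_spl2(pattern)
)
print(spl2)
```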
1. SPL2 Example: Reuse an existing props.conf setting that creates a new event when it encounters a new line with a date and merges the lines following the date.
This example assumes that you are in the SPL View.
| from forwarders("forwarders:all") | apply_line_breaking props_conf="BREAK_ONLY_BEFORE_DATE=true\nSHOULD_LINEMERGE=true" linebreak_type="config" |...;
Incoming event:
Event { body = "This is the first line. 12/31/2017-05:43:11.325 This is the third line. This is the fourth line. Fifth. " }
Outgoing events:
Event { body = "This is the first line." }
Event { body = "12/31/2017-05:43:11.325This is the third line.This is the fourth line.Fifth." }
2. SPL2 Example: Break an event at every new line using a regular expression pattern.
This example assumes that you are in the SPL View, and shows a partial pipeline with two sources, Splunk DSP Firehose and the Forwarders service.
$firehose = | from splunk_firehose();
$forwarders = | from forwarders("forwarders:all");
| from $forwarders | union $firehose | apply_line_breaking linebreak_type="advanced" line_breaker="[\\r\\n]+";
Incoming event:
Event { body = "This is the first line. 12/31/2017-05:43:11.325 This is the third line. " }
Outgoing events:
Event { body = "This is the first line." }
Event { body = "12/31/2017-05:43:11.325" }
Event { body = "This is the third line." }
3. SPL2 Example: Break a single event into multiple events using an existing props.conf configuration.
This example assumes that you are in the SPL View.
| from forwarders("forwarders:all") | apply_line_breaking props_conf="SHOULD_LINEMERGE = True\nBREAK_ONLY_BEFORE = Path=" linebreak_type="config" |...;
Incoming event:
Event { body = "This is first event \r\n{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}\n" }
Outgoing events:
Event { body = "This is first event" }
Event { body = "{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}" }
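The merging behavior in this example can be approximated in Python: once the chunk has been split into lines, a new event starts only at a line that matches the BREAK_ONLY_BEFORE pattern, and every other line is appended to the current event. The function name and the shortened sample lines are hypothetical, and this is a sketch of the idea rather than how the Splunk Data Stream Processor evaluates props.conf settings.

```python
import re

def merge_break_only_before(lines, pattern):
    """Start a new event only at a line that matches `pattern`."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        if regex.search(line) or not events:
            events.append(line)   # a matching line (or the first line) opens an event
        else:
            events[-1] += line    # non-matching lines are merged into the current event
    return events

# Shortened stand-ins for the lines of the incoming event above.
lines = [
    "This is first event",
    '{{"2007-09-21", "Path=/LoginUser"}}',
    '{{"2006-09-21", "UserData"}}',
    '{{"2006-09-21", "New Cookie"}}',
]
events = merge_break_only_before(lines, "Path=")
```

Here `events` contains two entries: the first line on its own, and a second event that starts at the line containing `Path=` and absorbs the two lines after it.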
4. SPL2 Example: Break events at the timestamp and merge the lines following the timestamp until another timestamp is found.
This example assumes that you are in the SPL View.
| from forwarders("forwarders:all") | apply_line_breaking linebreak_type="auto" |...;
Incoming event:
Event { body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n " }
Outgoing events:
Event { body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor" }
Event { body = "06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor" }
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6