Splunk® Data Stream Processor

Function Reference

Apply Line Break

This topic describes how to use the Apply Line Break function in the Splunk Data Stream Processor.

Description

Breaks and merges universal forwarder events using a specified break type. Events typically arrive from the universal forwarder in 64 KB chunks and require additional parsing before they can be processed correctly. Use this function to configure the Data Stream Processor to break and merge these events so that they are in the format that you want.

This function performs two actions: line breaking and line merging.

  • In the line breaking phase, this function uses a specified line breaker pattern to split the incoming stream of data into separate lines.
  • In the line merging phase, this function takes the separated lines from the line breaking phase and merges them into events.
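The two phases above can be sketched in Python as follows. This is an illustration only, not the function's actual implementation: the TIMESTAMP pattern is a hypothetical stand-in for the built-in timestamp rules, and break_lines and merge_lines are names invented for this sketch. Each event is returned as a list of its lines, so the sketch makes no claim about how merged lines are joined.

```python
import re

# Assumption: a simplified stand-in for the built-in timestamp rules.
TIMESTAMP = re.compile(r"^\d{2}[/-]\d{2}[/-]\d{4}")


def break_lines(body, line_breaker=r"[\r\n]+"):
    """Phase 1: split the raw chunk into lines, dropping empty pieces."""
    return [line for line in re.split(line_breaker, body) if line]


def merge_lines(lines):
    """Phase 2: start a new event at each timestamped line, else append."""
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            events.append([line])      # a timestamp (or the first line) opens an event
        else:
            events[-1].append(line)    # other lines join the current event
    return events


chunk = "first line\n12/31/2017-05:43:11.325\nthird line\nfourth line\n"
events = merge_lines(break_lines(chunk))
print(events)
```

In this sketch the untimestamped first line becomes its own event, and the timestamped line collects the two lines that follow it.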

This function must be added immediately after the source function or, if there are multiple sources, after the union of those sources. See the "SPL2 examples" section for examples of how to use this function.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<R>>
This function outputs collections of records with schema R.

Syntax

The linebreak_type argument is always required. The props_conf argument is required only when linebreak_type = config.

If linebreak_type = auto:

apply_line_breaking
linebreak_type = auto

If linebreak_type = advanced:

apply_line_breaking
linebreak_type = advanced
line_breaker = <regex>
truncate = <integer>

If linebreak_type = config:

apply_line_breaking
linebreak_type = config
props_conf = <string>

Required arguments

linebreak_type
Syntax: auto | advanced | config
Description: See the table for a description of each break type. All three of these break types perform line merging after breaking by default.
Example in Canvas View: auto
Break Type | Description
Auto | Breaks events based on the location of timestamps in the data and merges the lines that follow each timestamp. A new event is created when another timestamp is detected. This setting uses built-in timestamp rules to detect timestamps.
Advanced | Breaks events based on a custom regular expression pattern. The default pattern is [\r\n]+, which breaks data into an event for each line, delimited by one or more carriage return (\r) or newline (\n) characters.
Config | Breaks events based on an existing line breaking props.conf configuration. See props.conf. Use this break type if you want to reuse and migrate your existing props.conf line breaking settings to the Data Stream Processor.
props_conf
Syntax: string
Description: Required if you are using the config linebreak_type. The line breaking stanza configurations from props.conf. See "Configure event line breaking" in the Splunk Enterprise documentation for a description of the available line breaking attributes.
Example in Canvas View: MAX_EVENTS=3 MUST_NOT_BREAK_BEFORE=^.*fifth SHOULD_LINEMERGE=true BREAK_ONLY_BEFORE_DATE=false

Optional arguments

line_breaker
Syntax: Regular expression
Description: A regular expression that determines how the incoming data is broken into initial events, before line merging takes place.
Default: [\r\n]+
Example in Canvas View: [\r\n]+
truncate
Syntax: a non-negative integer
Description: The default maximum line length, in bytes. The Data Stream Processor rounds down the line length when this value would otherwise land mid-character for multibyte characters.
Default: 10000
Example in Canvas View: 5000
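The rounding-down behavior described for truncate can be illustrated with a short Python sketch. It assumes the data is UTF-8 encoded, and truncate_utf8 is a hypothetical helper written for this illustration, not part of the product.

```python
def truncate_utf8(line: str, max_bytes: int) -> str:
    """Cut a line to at most max_bytes bytes without splitting a character."""
    raw = line.encode("utf-8")
    if len(raw) <= max_bytes:
        return line
    cut = max_bytes
    # In UTF-8, continuation bytes look like 10xxxxxx. If the first excluded
    # byte is a continuation byte, the cut falls mid-character, so step back.
    while cut > 0 and (raw[cut] & 0xC0) == 0x80:
        cut -= 1
    return raw[:cut].decode("utf-8")


# "é" occupies two bytes, so a 2-byte limit would land in the middle of it.
print(truncate_utf8("héllo", 2))
print(truncate_utf8("héllo", 3))
```

With a 2-byte limit the sketch keeps only "h", because keeping the second byte would split "é". With a 3-byte limit it keeps "hé".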

SPL2 examples

Examples of common use cases follow. These examples assume that you are in the SPL View.

If you are using the SPL2 Pipeline Builder, you must escape any backslash ( \ ) characters. If you are using the Canvas Builder, backslash characters are automatically escaped. See Using regular expressions in the Canvas Builder vs the SPL2 Pipeline Builder.
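As an analogy only, the following Python sketch shows why doubling the backslashes in a quoted string yields the intended regular expression. It assumes that string escaping in the SPL2 Pipeline Builder behaves like escaping in an ordinary Python string literal; refer to the linked topic for the exact rules.

```python
import re

escaped_literal = "[\\r\\n]+"  # backslashes doubled inside a normal string literal
raw_pattern = r"[\r\n]+"       # the pattern as the regex engine receives it

# Both spellings produce the same 7-character pattern: [ \ r \ n ] +
assert escaped_literal == raw_pattern
assert len(escaped_literal) == 7

parts = re.split(escaped_literal, "line one\r\nline two")
print(parts)
```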

1. Reuse an existing props.conf setting that creates a new event when it encounters a new line with a date and merges the lines following the date.

| from forwarders("forwarders:all") | apply_line_breaking props_conf="BREAK_ONLY_BEFORE_DATE=true\nSHOULD_LINEMERGE=true" linebreak_type="config" |...;

Incoming event:

Event {
 body = "This is the first line. 
12/31/2017-05:43:11.325
This is the third line. 
This is the fourth line. 
Fifth. 
"
}

Outgoing events:

Event {
 body = "This is the first line."
}
Event {
  body = "12/31/2017-05:43:11.325This is the third line.This is the fourth line.Fifth."
}

2. Break an event at every new line using a regular expression pattern.

The following example shows a partial pipeline with two sources, Splunk DSP Firehose and the Forwarders service.

$firehose = | from splunk_firehose();
$forwarders = | from forwarders("forwarders:all");
| from $forwarders | union $firehose | apply_line_breaking linebreak_type="advanced" line_breaker="[\\r\\n]+";

Incoming event:

Event {
 body = "This is the first line. 
12/31/2017-05:43:11.325
This is the third line. 
"
}

Outgoing events:

Event {
 body = "This is the first line."
}
Event {
  body = "12/31/2017-05:43:11.325" 
}
Event {
   body = "This is the third line."
} 

3. Break a single event into multiple events using an existing props.conf configuration.

| from forwarders("forwarders:all") | apply_line_breaking props_conf="SHOULD_LINEMERGE = True\nBREAK_ONLY_BEFORE = Path=" linebreak_type="config" |...;

Incoming event:

Event {
 body = "This is first event \r\n{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}\n{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}\n"
}

Outgoing events:

Event {
 body = "This is first event"
}
Event {
 body = "{{"2007-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}"
}
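The behavior in this example can be approximated with the Python sketch below. It is a simplified model under stated assumptions, not the product's implementation: parse_props and merge are hypothetical helpers, only the SHOULD_LINEMERGE and BREAK_ONLY_BEFORE attributes are handled, and the sample lines are shortened stand-ins for the event body above. Note that each setting is split on its first equals sign only, because the value Path= itself contains one.

```python
import re


def parse_props(props_conf):
    """Parse newline-separated KEY = VALUE settings into a dict."""
    settings = {}
    for entry in props_conf.splitlines():
        # Split on the first "=" only, so a value such as "Path=" stays intact.
        key, sep, value = entry.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def merge(lines, settings):
    """Merge lines, opening a new event only before a matching line."""
    if settings.get("SHOULD_LINEMERGE", "").lower() != "true":
        return [[line] for line in lines]
    # Assumption: BREAK_ONLY_BEFORE is present whenever merging is enabled.
    before = re.compile(settings["BREAK_ONLY_BEFORE"])
    events = []
    for line in lines:
        if before.search(line) or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return events


settings = parse_props("SHOULD_LINEMERGE = True\nBREAK_ONLY_BEFORE = Path=")
lines = [
    "This is first event",
    '"2007-09-21, 02:57:11.58", "Path=/LoginUser"',
    '"2006-09-21, 02:57:11.60", "UserData"',
    '"2006-09-21, 02:57:11.60", "New Cookie"',
]
events = merge(lines, settings)
print(len(events))
```

As in the outgoing events above, the sketch yields two events: the first line alone, then the Path= line together with the two lines that follow it.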

4. Break events at each timestamp and merge the lines following the timestamp until another timestamp is found.

| from forwarders("forwarders:all") | apply_line_breaking linebreak_type="auto" |...;

Incoming event:

Event {
 body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor\n
"
}

Outgoing events:

Event {
 body = "06-01-2020 11:17:29.259 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_try, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor"
}
Event {
 body = "06-01-2020 11:17:29.628 -0700 INFO StatusMgr - destHost=my.host.com, destIp=44.226.233.96, destPort=9997, eventType=connect_done, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor"
}
Last modified on 20 April, 2021

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1

