DSP functions by category
DSP functions are divided into the following categories:
- Streaming functions perform specific actions on the data flowing through a pipeline.
- Source functions are a type of streaming function that ingests data from a data source into a pipeline.
- Sink functions are a type of streaming function that sends data from a pipeline to a data destination.
- Scalar functions perform operations such as calculations, conversions, and comparisons inside of streaming functions.
See the following sections for more information about each category and the functions they contain.
Streaming functions
Streaming functions are the basic building blocks of a data pipeline. Use these functions to perform specific actions on the data flowing through a pipeline.
You can select streaming functions from the function picker in DSP and view them as nodes in a data pipeline. When you stream data through a pipeline, the data moves from one node (or streaming function) to the next, and each function acts on the data as it passes through.
The following table lists all the streaming functions that are available in DSP, except for source and sink functions. See the Source functions and Sink functions sections on this page for more information.
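The node-to-node flow described above can be pictured with a small Python sketch. This is only a conceptual model, not DSP code: the function names `source`, `where_error`, `eval_length`, and `run_pipeline` are hypothetical, and the sample records are made up for illustration.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Node = Callable[[Iterable[Record]], Iterator[Record]]


def source() -> Iterator[Record]:
    """Hypothetical source node: emits records into the pipeline."""
    yield {"body": "error: disk full", "host": "web-1"}
    yield {"body": "ok", "host": "web-2"}
    yield {"body": "error: timeout", "host": "web-1"}


def where_error(records: Iterable[Record]) -> Iterator[Record]:
    """Streaming node: keep only records whose body starts with 'error'."""
    for record in records:
        if record["body"].startswith("error"):
            yield record


def eval_length(records: Iterable[Record]) -> Iterator[Record]:
    """Streaming node: add a new field computed from an existing field."""
    for record in records:
        yield {**record, "body_length": len(record["body"])}


def run_pipeline(src: Iterable[Record], nodes: list[Node]) -> list[Record]:
    """Pass the stream through each node in order and collect the output."""
    stream = src
    for node in nodes:
        stream = node(stream)
    return list(stream)


result = run_pipeline(source(), [where_error, eval_length])
```

Each node consumes the output of the previous node, so reordering the list passed to `run_pipeline` changes which records later nodes see, much as rearranging nodes does in a pipeline.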
Functions | Description |
---|---|
Adaptive Thresholding | Detects anomalies on the stream and dynamically generates threshold values based on observed data values. |
Aggregate and Trigger | Triggers an event output based on a custom condition over a set of aggregated events. |
Apply Line Break | Break and merge events sent from a universal forwarder. |
Apply ML Model | Use this function to see how a trained model performs in the DSP environment. |
Apply Timestamp Extraction | Extract a timestamp using a designated extraction type. |
Batch Bytes | Batches incoming byte arrays by size, count, or milliseconds and outputs a single byte array concatenated by a user-defined separator. |
Batch Records | Batches records before sending them to an index or a third-party sink. |
Bin | Puts continuous numerical values into discrete sets, or bins, by adjusting the value of <field> so that all of the items in a particular set have the same value. |
Break Events | Break events using a Java regular expression as a delimiter. |
Datagen | Function under ongoing development. Generates synthetic data. |
Drift Detection | Identifies large scale shifts and abrupt changes in a time-series data stream. |
Eval | Calculate an expression and put the resulting value into the record as a new field. |
Fields | Select a subset of fields from a record. |
From | Used to retrieve data from a specific source function in the SPL2 Pipeline Builder. |
Into | Used to send data to a specific sink function in the SPL2 Pipeline Builder. |
Key_by | Groups a stream of records by one or more fields and returns a grouped stream. |
Lookup | Invokes field value lookups. |
Merge Events | Parses data received from a universal forwarder into a stream of complete events for a Splunk Index. |
Mvexpand | Expand the values in a multivalue field into separate events, one event for each value in the multivalue field. |
Pairwise Categorical Outlier Detection | Detects anomalous combinations of values from two categorical variables. |
Parse regex (rex) | Extract or rename fields using Java regular expression named capturing groups. |
Rename | Rename one or more fields. |
Select | Assigns an alternative name to a field or applies scalar functions to a group of fields. |
Sentiment Analysis | Classifies raw text as positive, negative, or neutral. |
Sequential Outlier Detection | Identify anomalous events in time-series sequence data. |
Stats | Applies one or more aggregation functions on a stream of events in a specified time window. |
Time Series Decomposition (STL) | Automatically decomposes time series data streams into trend, seasonal, and remainder components in real time. |
To Splunk JSON | Format records to adhere to the Splunk HEC event JSON or the Splunk HEC metric JSON format. |
Union | Combines streams with the same input schema into one stream with all of the events of the input streams. |
Where | Keep records that pass a Boolean function. |
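As a rough illustration of the binning idea in the Bin row of the table above, the following Python sketch snaps each numeric value to the start of a fixed-width bin so that every value in the same bin ends up identical. The helper `bin_value` and its `span` parameter are assumptions made for this example; the options and alignment rules of the actual Bin function are not described on this page.

```python
import math


def bin_value(value: float, span: float) -> float:
    """Snap a value down to the start of its bin of width `span` (hypothetical helper)."""
    if span <= 0:
        raise ValueError("span must be positive")
    return math.floor(value / span) * span


# Made-up records with a continuous numeric field.
records = [{"latency_ms": v} for v in (3, 47, 52, 99, 100)]

# Replace each value with the lower edge of its 50 ms bin.
binned = [{**r, "latency_ms": bin_value(r["latency_ms"], 50)} for r in records]
```

After binning, records that fall in the same range share one value, which makes them straightforward to group or count downstream.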
Source functions
Source functions are a type of streaming function. Use a source function at the beginning of a pipeline to ingest data from a data source into your pipeline.
The following table lists the source functions that are available in DSP.
Functions | Description |
---|---|
Splunk DSP Firehose | Get data sent from Splunk DSP Ingest, Forwarders, and Collect API services. |
Forwarders Service | Get data from the Splunk Forwarders service. |
Ingest Service | Get data from the Ingest service. |
Amazon CloudWatch | Get data from Amazon CloudWatch. |
Amazon Kinesis Data Stream | Get data from Amazon Kinesis Data Stream. |
Amazon Metadata | Get data from the resources and infrastructure in Amazon Web Services (AWS). |
Amazon S3 | Get data from Amazon S3. |
Apache Pulsar | Get data from Apache Pulsar. |
Kafka | Get data from Apache or Confluent Kafka. |
Microsoft Azure Event Hubs | Get data from Microsoft Azure Event Hubs. |
Microsoft Azure Monitor | Get data from Microsoft Azure Monitor. |
Google Cloud Monitoring | Get data from Google Cloud Monitoring. |
Google Cloud Pub/Sub | Get data from Google Cloud Pub/Sub topics. |
Microsoft 365 | Get data from the Office 365 Management Activity API. |
Sink functions
Sink functions are a type of streaming function. Use a sink function at the end of a pipeline to send data from your pipeline to a data destination.
To see the output data of a sink function, you must search for it in the intended data destination. You can't view the output data from a sink function by running a preview session.
The following table lists the sink functions that are available in DSP.
Functions | Description |
---|---|
Send to a Splunk Index with Batching | Sends data to an external Splunk Enterprise environment. This function combines the actions of three underlying DSP functions into one for user convenience: To Splunk JSON, Batch Bytes, Splunk Enterprise. |
Send to a Splunk Index | Sends data to an external Splunk Enterprise environment. |
Send to a Splunk Index (Default for Environment) | Sends data to your default, pre-configured Splunk Enterprise instance. |
Send to Amazon Kinesis Data Streams | Sends data to an Amazon Kinesis Data Stream using AWS access key and secret key authentication. |
Send to Amazon S3 | Sends data to Amazon S3. |
Send to Kafka | Sends data to an Apache Kafka topic using a Kafka connection. |
Send to Microsoft Azure Event Hubs (Beta) | Sends data to Microsoft Azure Event Hubs. |
Send to Null | Sends data to a dev/null sink. |
Send Metrics Data to SignalFx | Sends metric data to SignalFx. |
Send Trace Data to SignalFx | Sends trace data to SignalFx. |
Scalar functions
Scalar functions perform operations such as calculations, conversions, or comparisons in the context of the streaming function where you call them.
When configuring a streaming function, you can call scalar functions that dynamically resolve values instead of specifying a literal value. For example, when configuring the Send to a Splunk Index with Batching sink function, you can specify which Splunk index to send the data to. Instead of specifying a single index name so that the function sends every record to that index, you can dynamically route records to different indexes by calling a scalar function that resolves the index name based on the contents of the record.
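The difference between a literal value and a dynamically resolved value can be sketched in Python as follows. This is a conceptual model rather than DSP syntax: `resolve_index`, `index_from_source_type`, the `source_type` field values, and the index names are all hypothetical.

```python
from typing import Callable, Union

Record = dict
# An index argument is either a literal name or a function of the record.
IndexArg = Union[str, Callable[[Record], str]]


def resolve_index(index: IndexArg, record: Record) -> str:
    """Return the literal index name, or compute one from the record."""
    if callable(index):
        return index(record)
    return index


def index_from_source_type(record: Record) -> str:
    """Hypothetical routing rule: firewall data goes to its own index."""
    return "security" if record.get("source_type") == "firewall" else "main"


records = [
    {"body": "deny tcp", "source_type": "firewall"},
    {"body": "GET /index.html", "source_type": "access_log"},
]

# Literal argument: every record resolves to the same index.
literal_routes = [resolve_index("main", r) for r in records]

# Dynamic argument: the index depends on the contents of each record.
dynamic_routes = [resolve_index(index_from_source_type, r) for r in records]
```

In the literal case both records resolve to `main`. In the dynamic case the first record resolves to `security` and the second to `main`.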
To call a scalar function, you must type an expression using the required syntax. Scalar functions are not visible in the DSP UI as pipeline nodes or options that you can select.
The following table lists the scalar functions that are available in DSP.
Function Category | Function list | Description |
---|---|---|
Casting functions | cast | Converts an expression from one data type to another. |
Casting functions | ucast | Provides a way to cast maps and collections, regardless of the data type that the map or collection may contain. |
Aggregate functions | average | Returns the average of the values in the specified field. |
Aggregate functions | count | Returns the number of non-null values in a time window. |
Aggregate functions | estdc | Calculates an approximated distinct count value for any field. |
Aggregate functions | max | Returns the maximum value in a time window. |
Aggregate functions | mean | Calculates the average (mean) of values in a time window. |
Aggregate functions | min | Returns the minimum value in a time window. |
Aggregate functions | perc | Computes the approximate q-th percentile value of a numeric input field. |
Aggregate functions | sum | Returns the sum of values in a time window. |
Conditional scalar functions | cidrmatch | Returns TRUE or FALSE based on whether an IPv4 address matches an IPv4 CIDR notation. |
Conditional scalar functions | coalesce | Takes a variable number of arguments and returns the first value that is not NULL. |
Conditional scalar functions | if | Assigns an expression if the value is true, and another expression if the value is false. |
Conditional scalar functions | in | Returns TRUE if one of the values in a list matches a value in the field you specify. |
Conditional scalar functions | like | Returns TRUE if TEXT matches PATTERN. |
Conditional scalar functions | null if equal (nullif) | Compares two fields and returns NULL if the two fields are equal. |
Conditional scalar functions | validate | Returns the first string value corresponding to the first test that evaluates to FALSE. |
Conversion scalar functions | base64_decode | Converts a Base64-encoded string to bytes. |
Conversion scalar functions | base64_encode | Converts a byte array value to a Base64-encoded string. |
Conversion scalar functions | deserialize_json_object | Converts a JSON byte string into a map. |
Conversion scalar functions | from_json_array | Converts a JSON string into an array of the JSON structure, including nested keys. |
Conversion scalar functions | from_json_object | Converts a JSON string into a map of the JSON structure, including nested keys. |
Conversion scalar functions | gunzip | Decompresses a gzipped byte array. |
Conversion scalar functions | gzip | Returns gzip-compressed bytes. |
Conversion scalar functions | inet_aton | Converts a string IPv4 or IPv6 IP address and returns the address as type Long. |
Conversion scalar functions | inet_ntoa | Converts a decimal IP address to dotted-decimal form. |
Conversion scalar functions | parse_bool | Parses a string as a boolean. |
Conversion scalar functions | parse_double | Parses a string and returns the numeric value as a Double. |
Conversion scalar functions | parse_float | Parses a string and returns the numeric value as a Float. |
Conversion scalar functions | parse_int | Parses a string as an integer. |
Conversion scalar functions | parse_long | Parses a string and returns the numeric value as a Long. |
Conversion scalar functions | parse_millis | Converts Splunk Enterprise "time" format to DSP "timestamp" format. |
Conversion scalar functions | parse_nanos | Converts Splunk Enterprise "time" format to DSP "nanos" format. |
Conversion scalar functions | serialize_json | Converts the current record into a JSON byte string. |
Conversion scalar functions | serialize_json_collection | Converts a map of JSON structure into a JSON byte array. |
Conversion scalar functions | to_bytes | Converts a string to a byte string. |
Conversion scalar functions | to_json | Converts a map of a JSON object's structure to a JSON string. |
Conversion scalar functions | to_string | Converts a byte array to a String. |
Conversion scalar functions | tostring | Converts a number to a string. |
Cryptographic scalar functions | md5 | Computes and returns the MD5 hash of a byte value X. |
Cryptographic scalar functions | sha1 | Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-1 hash function. |
Cryptographic scalar functions | sha256 | Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-256 hash function. |
Cryptographic scalar functions | sha512 | Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-512 hash function. |
Date and Time scalar functions | relative_time | Applies a relative time specifier to a UNIX time value. |
Date and Time scalar functions | strftime | This function formats a UNIX timestamp into a human-readable timestamp. |
Date and Time scalar functions | strptime | This function parses a date string into a UNIX timestamp. |
Date and Time scalar functions | time | This function returns the wall-clock time, in the UNIX time format, with millisecond resolution. |
Iterator scalar functions | filter | Filters elements of a list. |
Iterator scalar functions | for_each | For each element of a list, evaluate an expression Y and return a new list containing the results. |
Iterator scalar functions | iterator | Iterates through a list and temporarily assigns each element in the list for use in the iterator scalar functions. |
List scalar functions | length | Returns the character length of a given input. |
List scalar functions | mvdedup | Removes duplicates from a list. |
List scalar functions | mvappend | Takes an arbitrary list of arguments, where each argument is a single string or a list of strings, and returns all elements as a single flattened list. |
List scalar functions | mvindex | Returns the element of the list at the specified index. |
List scalar functions | mvjoin | Takes all of the values in a list and appends them together delimited by STR. |
List scalar functions | mvrange | Returns a list for a range of numbers. |
List scalar functions | mvsort | Takes a list and returns the list with the elements of the list sorted lexicographically. |
List scalar functions | split | Splits a string using a delimiter. |
Map scalar functions | contains_key | Checks a map for a specified key. |
Map scalar functions | create_map | Creates a new map object at runtime. |
Map scalar functions | flatten | Flattens a list or a map. |
Map scalar functions | length | Returns the character length of a given input. |
Map scalar functions | map_delete | Removes a key from a map. |
Map scalar functions | map_get | Returns the value corresponding to a key in the map input. |
Map scalar functions | map_keys | Returns a list of keys in a map. |
Map scalar functions | map_merge | Merge two or more maps into a single map. |
Map scalar functions | map_set | Insert or overwrite key-value pairs in a map. |
Map scalar functions | map_values | Returns all values in a map. |
Map scalar functions | spath | Extract a value from a map or collection. |
Mathematical scalar functions | abs | Takes a number and returns its absolute value. |
Mathematical scalar functions | ceil | Rounds a number up to the next highest integer. |
Mathematical scalar functions | exp | Takes a number value and returns the exponential e^value. |
Mathematical scalar functions | floor | Rounds a number down to the nearest whole integer. |
Mathematical scalar functions | log | Takes one or two numbers and returns the logarithm of the first argument value using the second argument base as the base. |
Mathematical scalar functions | natural logarithm (ln) | Takes a number and returns its natural logarithm. |
Mathematical scalar functions | mod | Divides two numbers and returns the remainder. |
Mathematical scalar functions | pi | Returns the constant pi to 11 digits of precision. |
Mathematical scalar functions | power of base (pow) | Takes two numbers, base and exp, and returns base^exp. |
Mathematical scalar functions | random integer (randomint) | Returns a random integer in the range of 0 to 2^31-1. |
Mathematical scalar functions | round value (round) | Takes a number value and returns value rounded to the nearest whole number. |
Mathematical scalar functions | round value (round) | Takes two numbers, value and num_decimals, and returns value rounded to the amount of decimal places specified by num_decimals. |
Mathematical scalar functions | sqrt | Takes a number value and returns its square root. |
String manipulation scalar functions | concat | Combines the first and second strings together. |
String manipulation scalar functions | extract_grok | Extracts matching groups with a Grok-compatible pattern. |
String manipulation scalar functions | extract_key_value | Extract key-value pairs. |
String manipulation scalar functions | extract_regex | Uses a Java regular expression to extract capturing groups from the input. |
String manipulation scalar functions | join | Joins a list of strings using a delimiter and returns a single string. |
String manipulation scalar functions | len | Returns the character length of a string. |
String manipulation scalar functions | lower | Converts a string to lowercase. |
String manipulation scalar functions | ltrim | Trims extra characters from the left side. |
String manipulation scalar functions | match_regex | Matches inputs against a pattern defined with a Java regular expression. |
String manipulation scalar functions | match_wildcard | Matches inputs against a wildcard pattern. |
String manipulation scalar functions | replace | Performs a regex replacement on a string. |
String manipulation scalar functions | rtrim | Trims extra characters from the right side. |
String manipulation scalar functions | substr | Returns a substring of a string. |
String manipulation scalar functions | trim | Trim extra characters. |
String manipulation scalar functions | upper | Converts a string to uppercase. |
String manipulation scalar functions | url_decode | Takes a URL string and returns the unescaped or decoded URL string. |
String manipulation scalar functions | url_encode | Encodes a string for the query string parameters in a URL. |
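To make a few of the descriptions in the table above more concrete, the following Python sketch mimics the described behavior of cidrmatch, inet_aton, inet_ntoa, and coalesce using only the standard library. These are Python analogues written for illustration, not the DSP implementations: the argument order of `cidrmatch` is an assumption, and the address helpers here handle IPv4 only.

```python
import ipaddress
from typing import Any, Optional


def cidrmatch(cidr: str, ip: str) -> bool:
    """Return True if the IPv4 address falls inside the IPv4 CIDR block."""
    network = ipaddress.IPv4Network(cidr, strict=False)
    return ipaddress.IPv4Address(ip) in network


def inet_aton(ip: str) -> int:
    """Convert a dotted-decimal IPv4 string to its integer form."""
    return int(ipaddress.IPv4Address(ip))


def inet_ntoa(number: int) -> str:
    """Convert an integer IPv4 address back to dotted-decimal form."""
    return str(ipaddress.IPv4Address(number))


def coalesce(*values: Any) -> Optional[Any]:
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


in_private_range = cidrmatch("10.0.0.0/8", "10.1.2.3")
as_number = inet_aton("10.1.2.3")
round_trip = inet_ntoa(as_number)
first_known_host = coalesce(None, None, "web-1", "web-2")
```

Here `round_trip` equals the original address string, which shows that `inet_aton` and `inet_ntoa` are inverses of each other for IPv4 input.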
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02