Splunk® Data Stream Processor

Function Reference

On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see the Upgrade the Splunk Data Stream Processor topic.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

DSP functions by category

DSP functions are divided into the following categories:

  • Streaming functions perform specific actions on the data flowing through a pipeline.
  • Source functions are a type of streaming function that ingests data from a data source into a pipeline.
  • Sink functions are a type of streaming function that sends data from a pipeline to a data destination.
  • Scalar functions perform operations such as calculations, conversions, and comparisons inside of streaming functions.

See the following sections for more information about each category and the functions they contain.

Streaming functions

Streaming functions are the basic building blocks of a data pipeline. Use these functions to perform specific actions on the data flowing through a pipeline.

You can select streaming functions from the function picker in the Data Stream Processor and view them as nodes in a data pipeline. When you stream data through a pipeline, the data moves from one node (or streaming function) to the next, and each function acts on the data as it passes through.
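
The node-to-node flow described above can be pictured with a small Python sketch. This is only an analogy, not DSP code or SPL2 syntax: the helper names `where`, `eval_field`, and `run_pipeline`, and the sample records, are hypothetical and were chosen to echo the Where and Eval functions listed on this page.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Node = Callable[[Iterable[Record]], Iterator[Record]]


def where(predicate: Callable[[Record], bool]) -> Node:
    """Hypothetical node: keep only records that satisfy the predicate."""
    def node(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return node


def eval_field(name: str, expression: Callable[[Record], object]) -> Node:
    """Hypothetical node: add a computed field to each record."""
    def node(records: Iterable[Record]) -> Iterator[Record]:
        return ({**r, name: expression(r)} for r in records)
    return node


def run_pipeline(source: Iterable[Record], nodes: List[Node]) -> List[Record]:
    """Pass the stream through each node in order and collect the output."""
    stream: Iterable[Record] = source
    for node in nodes:
        stream = node(stream)
    return list(stream)


# Made-up sample data standing in for a source function.
events = [
    {"host": "web-1", "status": 200},
    {"host": "web-2", "status": 500},
]

result = run_pipeline(
    events,
    [
        where(lambda r: r["status"] >= 500),
        eval_field("severity", lambda r: "error"),
    ],
)
print(result)
```

Each node receives the stream produced by the previous node, which mirrors how every streaming function acts on the data as it passes through the pipeline.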

The following table lists all the streaming functions that are available in the Data Stream Processor, except for source and sink functions. See the Source functions and Sink functions sections on this page for more information.

Functions Description
Adaptive Thresholding Detects anomalies on the stream and dynamically generates threshold values based on observed data values.
Aggregate and Trigger Triggers an event output based on a custom condition over a set of aggregated events.
Apply Line Break Break and merge events sent from a universal forwarder.
Apply ML Model Use this function to see how a trained model performs in the DSP environment.
Apply Timestamp Extraction Extract a timestamp using a designated extraction type.
Batch Bytes Batches incoming byte arrays by size, count, or milliseconds and outputs a single byte array concatenated by a user defined separator.
Batch Records Batches records before sending them to an index or a third-party sink.
Bin Puts continuous numerical values into discrete sets, or bins, by adjusting the value of <field> so that all of the items in a particular set have the same value.
Break Events Break events using a Java regular expression as a delimiter.
Datagen Function under ongoing development. Generates synthetic data.
Drift Detection Identifies large scale shifts and abrupt changes in a time-series data stream.
Eval Calculate an expression and put the resulting value into the record as a new field.
Fields Select a subset of fields from a record.
From Used to retrieve data from a specific source function in the SPL2 Pipeline Builder.
Into Used to send data to a specific sink function in the SPL2 Pipeline Builder.
Key_by Groups a stream of records by one or more fields and returns a grouped stream.
Lookup Invokes field value lookups.
Merge Events Parses data received from a universal forwarder into a stream of complete events for a Splunk Index.
Mvexpand Expand the values in a multivalue field into separate events, one event for each value in the multivalue field.
Pairwise Categorical Outlier Detection Detects anomalous combinations of values from two categorical variables.
Parse regex (rex) Extract or rename fields using Java regular expression-named capturing groups.
Rename Rename one or more fields.
Select Assigns an alternative name to a field or applies scalar functions to a group of fields.
Sentiment Analysis Classifies raw text as positive, negative, or neutral.
Sequential Outlier Detection Identify anomalous events in time-series sequence data.
Stats Applies one or more aggregation functions on a stream of events in a specified time window.
Time Series Decomposition (STL) Automatically decomposes time series data streams into trend, seasonal, and remainder components in real time.
To Splunk JSON Format records to adhere to the Splunk HEC event JSON or the Splunk HEC metric JSON format.
Union Combines streams with the same input schema into one stream with all of the events of the input streams.
Where Keep records that pass a Boolean function.
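
As a rough illustration of two descriptions in the table above, the following Python sketch mimics what Bin and Mvexpand are described as doing. It is not the DSP implementation: the `span` parameter, the choice to snap values to the lower edge of each bin, and the pass-through of records without a list value are all assumptions made for the example.

```python
from typing import Dict, Iterator, List

Record = Dict[str, object]


def bin_value(value: float, span: float) -> float:
    """Snap a value to the lower edge of its bin (assumed binning rule)."""
    return (value // span) * span


def mvexpand(record: Record, field: str) -> Iterator[Record]:
    """Yield one copy of the record per value in a multivalue field."""
    values = record.get(field)
    if not isinstance(values, list):
        # Assumption: records without a list in this field pass through as-is.
        yield dict(record)
        return
    for value in values:
        yield {**record, field: value}


# Values that fall in the same span end up with the same binned value.
binned: List[float] = [bin_value(v, 10) for v in [3, 17, 12, 25]]

# One multivalue record becomes one record per value.
expanded = list(mvexpand({"host": "a", "tags": ["x", "y"]}, "tags"))
print(binned, expanded)
```

After binning, 17 and 12 share the value 10, which is the sense in which all items in a set "have the same value".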

Source functions

Source functions are a type of streaming function. Use a source function at the beginning of a pipeline to ingest data from a data source into your pipeline.

The following table lists the source functions that are available in the Data Stream Processor.

Functions Description
Splunk DSP Firehose Get data sent from Splunk DSP Ingest, Forwarders, and Collect API services.
Forwarders Service Get data from the Splunk Forwarders service.
Ingest Service Get data from the Ingest service.
Amazon CloudWatch Get data from Amazon CloudWatch.
Amazon Kinesis Data Stream Get data from Amazon Kinesis Data Stream.
Amazon Metadata Get data from the resources and infrastructure in Amazon Web Services (AWS).
Amazon S3 Get data from Amazon S3.
Apache Pulsar Get data from Apache Pulsar.
Kafka Get data from Apache or Confluent Kafka.
Microsoft Azure Event Hubs Get data from Microsoft Azure Event Hubs.
Microsoft Azure Monitor Get data from Microsoft Azure Monitor.
Google Cloud Monitoring Get data from Google Cloud Monitoring.
Google Cloud Pub/Sub Get data from Google Cloud Pub/Sub topics.
Microsoft 365 Get data from the Office 365 Management Activity API.

Sink functions

Sink functions are a type of streaming function. Use a sink function at the end of a pipeline to send data from your pipeline to a data destination.

To see the output data of a sink function, you must search for it in the intended data destination. You can't view the output data from a sink function by running a preview session.

The following table lists the sink functions that are available in the Data Stream Processor.

Functions Description
Send to a Splunk Index with Batching Sends data to an external Splunk Enterprise environment. This function combines the actions of three underlying DSP functions into one for convenience: To Splunk JSON, Batch Bytes, and Splunk Enterprise.
Send to a Splunk Index Sends data to an external Splunk Enterprise environment.
Send to a Splunk Index (Default for Environment) Sends data to your default, pre-configured Splunk Enterprise instance.
Send to Amazon Kinesis Data Streams Sends data to an Amazon Kinesis Data Stream using an AWS access key and secret key authentication.
Send to Amazon S3 Sends data to Amazon S3.
Send to Kafka Sends data to an Apache Kafka topic using a Kafka connection.
Send to Microsoft Azure Event Hubs (Beta) Sends data to Microsoft Azure Event Hubs.
Send to Null Sends data to a dev/null sink.
Send Metrics Data to SignalFx Sends metric data to SignalFx.
Send Trace Data to SignalFx Sends trace data to SignalFx.

Scalar functions

Scalar functions perform operations such as calculations, conversions, or comparisons in the context of the streaming function where you call them.

When configuring a streaming function, you can call scalar functions that dynamically resolve values instead of specifying a literal value. For example, when configuring the Send to a Splunk Index with Batching sink function, you can specify which Splunk index to send the data to. Instead of specifying a single index name so that the function sends every record to that index, you can dynamically route records to different indexes by calling a scalar function that resolves the index name based on the contents of the record.
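
The routing idea in this paragraph can be sketched in Python. This is a conceptual analogy only and not the expression syntax that DSP expects: the function name `resolve_index`, the `sourcetype` field, and the index names are hypothetical.

```python
from typing import Dict

Record = Dict[str, object]

# Hypothetical mapping from a record attribute to a destination index.
ROUTES = {"syslog": "network", "access_combined": "web"}


def resolve_index(record: Record, default: str = "main") -> str:
    """Resolve the destination index from the contents of the record."""
    return ROUTES.get(str(record.get("sourcetype")), default)


records = [
    {"sourcetype": "syslog", "body": "link down"},
    {"sourcetype": "access_combined", "body": "GET /"},
    {"body": "no sourcetype"},
]

# A literal index name would send every record to one index;
# the resolver picks an index per record instead.
destinations = [resolve_index(r) for r in records]
print(destinations)
```

The contrast is between a fixed value that is the same for every record and a value computed from each record as it arrives.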

To call a scalar function, you must type an expression using the required syntax. Scalar functions are not visible in the DSP UI as pipeline nodes or options that you can select.

The following table lists the scalar functions that are available in the Data Stream Processor.

Function Category Function list Description
Casting functions cast Converts an expression from one data type to another.
ucast Provides a way to cast maps and collections, regardless of the data type that the map or collection may contain.
Aggregate functions average Returns the average of the values in the specified field.
count Returns the number of non-null values in a time window.
estdc Calculates an approximated distinct count value for any field.
max Returns the maximum value in a time window.
mean Calculates the average (mean) of values in a time window.
min Returns the minimum value in a time window.
perc Computes the approximate q-th percentile value of a numeric input field.
sum Returns the sum of values in a time window.
Conditional scalar functions cidrmatch Returns TRUE or FALSE based on whether an IPv4 address matches an IPv4 CIDR notation.
coalesce Takes a variable number of arguments and returns the first value that is not NULL.
if Assigns an expression if the value is true, and another expression if the value is false.
in Returns TRUE if one of the values in a list matches a value in the field you specify.
like Returns TRUE if TEXT matches PATTERN.
null if equal (nullif) Compares two fields and returns NULL if two fields are equal.
validate Returns the first string value corresponding to the first test that evaluates to FALSE.
Conversion scalar functions base64_decode Converts a Base64-encoded string to bytes.
base64_encode Converts a byte array value to a Base64-encoded string.
deserialize_json_object Converts a JSON byte string into a map.
from_json_array Converts a JSON string into an array of the JSON structure, including nested keys.
from_json_object Converts a JSON string into a map of the JSON structure, including nested keys.
gunzip Decompresses a GZipped byte array.
gzip Returns Gzipped-compressed bytes.
inet_aton Converts a string IPv4 or IPv6 IP address and returns the address as type Long.
inet_ntoa Converts a decimal IP address to dotted-decimal form.
parse_bool Parses a string as a boolean.
parse_double Parses a string and returns the numeric value as a Double.
parse_float Parses a string and returns the numeric value as a Float.
parse_int Parses a string as an integer.
parse_long Parses a string and returns the numeric value as a Long.
parse_millis Converts Splunk Enterprise "time" format to DSP "timestamp" format.
parse_nanos Converts Splunk Enterprise "time" format to DSP "nanos" format.
serialize_json Converts the current record into a JSON byte string.
serialize_json_collection Converts a map of JSON structure into a JSON byte array.
to_bytes Converts a string to a byte string.
to_json Converts a map of a JSON object's structure to a JSON string.
to_string Converts a byte array to a String.
tostring Converts a number to a string.
Cryptographic scalar functions md5 Computes and returns the MD5 hash of a byte value X.
sha1 Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-1 hash function.
sha256 Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-256 hash function.
sha512 Computes and returns the secure hash of a byte value X based on the FIPS compliant SHA-512 hash function.
Date and Time scalar functions relative_time Applies a relative time specifier to a UNIX time value.
strftime This function formats a UNIX timestamp into a human-readable timestamp.
strptime This function parses a date string into a UNIX timestamp.
time This function returns the wall-clock time, in the UNIX time format, with millisecond resolution.
Iterator scalar functions filter Filters elements of a list.
for_each For each element of a list, evaluate an expression Y and return a new list containing the results.
iterator Iterates through a list and temporarily assigns each element in the list for use in the iterator scalar functions.
List scalar functions length Returns the character length of a given input.
mvdedup Removes duplicates from a list.
mvappend Takes an arbitrary list of arguments, where each argument is a single string or a list of strings, and returns all elements as a single flattened list.
mvindex Returns the element of the list at the specified index.
mvjoin Takes all of the values in a list and appends them together delimited by STR.
mvrange Returns a list for a range of numbers.
mvsort Takes a list and returns the list with the elements of the list sorted lexicographically.
split Splits a string using a delimiter.
Map scalar functions contains_key Checks a map for a specified key.
create_map Creates a new map object at runtime.
flatten Flattens a list or a map.
length Returns the character length of a given input.
map_delete Removes a key from a map.
map_get Returns the value corresponding to a key in the map input.
map_keys Returns a list of keys in a map.
map_merge Merge two or more maps into a single map.
map_set Insert or overwrite key-value pairs in a map.
map_values Returns all values in a map.
spath Extract a value from a map or collection.
Mathematical scalar functions abs Takes a number and returns its absolute value.
ceil Rounds a number up to the next highest integer.
exp Takes a number value and returns the exponential e^value.
floor Rounds a number down to the nearest whole integer.
log Takes one or two numbers and returns the logarithm of the first argument value using the second argument base as the base.
natural logarithm (ln) Takes a number and returns its natural logarithm.
mod Divides two numbers and returns the remainder.
pi Returns the constant pi to 11 digits of precision.
power of base (pow) Takes two numbers, base and exp, and returns base^exp.
random integer (randomint) Returns a random integer in the range of 0 to 2^31-1.
round value (round) Takes a number value and returns value rounded to the nearest whole number.
round value (round) Takes two numbers, value and num_decimals, and returns value rounded to the amount of decimal places specified by num_decimals.
sqrt Takes a number value and returns its square root.
String manipulation scalar functions concat Combines the first and second strings together.
extract_grok Extracts matching groups with a Grok-compatible pattern.
extract_key_value Extract key-value pairs.
extract_regex Uses a Java regular expression to extract capturing groups from the input.
join Joins a list of strings using a delimiter and returns a single string.
len Returns the character length of a string.
lower Converts a string to lowercase.
ltrim Trims extra characters from the left side.
match_regex Matches inputs against a pattern defined with a Java regular expression.
match_wildcard Matches inputs against a wildcard pattern.
replace Performs a regex replacement on a string.
rtrim Trims extra characters from the right side.
substr Returns a substring of a string.
trim Trim extra characters.
upper Converts a string to uppercase.
url_decode Takes a URL string and returns the unescaped or decoded URL string.
url_encode Encodes a string for the query string parameters in a URL.
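
To make a few of the descriptions in the table above concrete, here is a minimal Python sketch of analogous behavior for cidrmatch, coalesce, inet_aton, and inet_ntoa. These are Python stand-ins built on the standard `ipaddress` module, not the DSP functions themselves. The argument order and return types are assumptions, so check each function's reference topic for the real signature.

```python
import ipaddress
from typing import Optional


def cidrmatch(cidr: str, ip: str) -> bool:
    """Return True if the IPv4 address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)


def coalesce(*values: Optional[object]) -> Optional[object]:
    """Return the first argument that is not None, or None if all are."""
    return next((v for v in values if v is not None), None)


def inet_aton(ip: str) -> int:
    """Convert an IP address string to its integer form."""
    return int(ipaddress.ip_address(ip))


def inet_ntoa(number: int) -> str:
    """Convert an integer IPv4 address back to dotted-decimal form."""
    return str(ipaddress.IPv4Address(number))


print(cidrmatch("10.0.0.0/8", "10.1.2.3"))
print(coalesce(None, "fallback", "ignored"))
print(inet_ntoa(inet_aton("192.168.0.1")))
```

Round-tripping an address through `inet_aton` and `inet_ntoa` returns the original string, which is a quick way to check that the two conversions are inverses.
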
Last modified on 12 April, 2021

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02