Splunk® Data Stream Processor

Use the Data Stream Processor

On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.

Functions

For the full list of functions that you can use in your pipeline, see Functions by category.

Streaming functions

Streaming functions operate on a stream of records. When your events are sent from the Splunk Forwarders Service or the Ingest REST API to your data pipeline, they arrive in the schema format for events or for metrics, with an additional kind attribute that defaults to the type of data you are sending in. Functions that are passed to streaming functions can access fields from the current record by using get("field-name"). For the full list of streaming functions, see Functions by category.

Scalar functions

Scalar functions are not full nodes in a pipeline; instead, they perform operations inside of a streaming function. You can use scalar functions to do things like addition and subtraction, comparisons, conversions, or other similar tasks. Each scalar function is documented with a description of what it does, the API function name, and an example using the function in the Streams DSL. For the full list of scalar functions, see Functions by category.

Scalar functions operate in the context of the streaming function that they are called in.
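
To make that relationship concrete, the Java sketch below models it. This is illustrative only, not DSP source code: the lt helper stands in for the DSL's lt scalar function, keepFastResponses stands in for a streaming function, and "response_time" is a hypothetical field name.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Conceptual sketch, not DSP code: a streaming function transforms a
// whole collection of records, while a scalar function evaluates per
// value inside it.
public class ScalarInStreaming {
    // Scalar: compares one value, like lt(get("field"), bound) in the DSL.
    static boolean lt(long value, long bound) { return value < bound; }

    // Streaming: filters the whole record stream using the scalar.
    static List<Map<String, Object>> keepFastResponses(List<Map<String, Object>> records) {
        return records.stream()
                .filter(r -> lt((long) r.get("response_time"), 500L)) // "response_time" is hypothetical
                .collect(Collectors.toList());
    }
}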

Records and collections

A collection is a list of records. A record is a structured object that has a specific schema. You can also refer to a record as an event. When your records are sent from the Splunk Forwarders Service or the Ingest REST API to your data pipeline, they arrive in the schema format for events or for metrics.

record<R> denotes a record with a given schema, <R>. If a function inputs a collection of records that follow a given schema and then outputs a collection of records following a different schema, the function's input and output are denoted like this:

  • Input: collection<record<R>>
  • Output: collection<record<S>>

These records flow through your data pipeline, and your pipeline's functions process them. A collection of records flows from one function to the next along an edge between the functions. Depending on the situation, a collection might be an ordered list, a set, or a bag. A bag is unordered and can contain duplicate records.
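
As a conceptual aid, the following Java sketch (illustrative only, not DSP source code) models a record as a map of field names to values and shows the three collection shapes: ordered list, set, and bag.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch, not DSP code: a record modeled as a field-name/value
// map, and the three collection shapes a pipeline edge might carry.
public class CollectionShapes {
    public static void main(String[] args) {
        Map<String, Object> record = Map.of("source", "syslog", "status", 200L);

        List<Map<String, Object>> orderedList = new ArrayList<>(); // ordered, allows duplicates
        Set<Map<String, Object>>  set = new HashSet<>();           // no duplicates
        List<Map<String, Object>> bag = new ArrayList<>();         // unordered in principle;
        // Java has no built-in bag type, so a list stands in for one here.

        orderedList.add(record);
        set.add(record);
        bag.add(record);
        bag.add(record); // a bag can contain duplicate records
        System.out.println(bag.size()); // prints 2
    }
}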

Streams JSON data types

Streams JSON supports the following data types. The sketch after this list shows one way to map them onto familiar Java types.

  • Integer - a 32-bit whole number.
  • Long - a 64-bit whole number.
  • Float - a 32-bit floating point number.
  • Double - a 64-bit floating point number.
  • String - a sequence or string of characters.
  • Bytes - a byte buffer used to read/write data.
  • Boolean - a data type with only two possible values, either true or false.
  • Collection - a group or list of variables that share the same data type.
  • Map - a collection of key/value pairs.
  • Union - a JSON array consisting of more than one data type.
  • Regex - a regular expression that must be contained between two / slashes. You can add regex flags after the second /. For example, /.*teardown.*outside.*inside/i.
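
For orientation, the Java sketch below maps each Streams JSON type onto a familiar Java type. The mapping is an assumption made for illustration; it is not part of the DSP API.

import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative mapping of Streams JSON data types onto Java types.
// This is a conceptual aid, not DSP source code.
public class StreamsJsonTypes {
    int      integerValue;   // Integer: 32-bit whole number
    long     longValue;      // Long: 64-bit whole number
    float    floatValue;     // Float: 32-bit floating point number
    double   doubleValue;    // Double: 64-bit floating point number
    String   stringValue;    // String: sequence of characters
    byte[]   bytesValue;     // Bytes: a byte buffer
    boolean  booleanValue;   // Boolean: true or false
    List<Long> collection;   // Collection: values that share the same data type
    Map<String, Object> map; // Map: key/value pairs
    Object   union;          // Union: one of several possible types at runtime
    Pattern  regex = Pattern.compile(".*teardown.*outside.*inside",
                                     Pattern.CASE_INSENSITIVE); // models /.../i
}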

Casting between data types

Streams JSON provides an explicit cast function to convert between primitive data types. Complex data types are not supported for casting. The following table shows the method used to convert the data type in the first column to the data type in the top row:

From \ To   Integer        Long           Float          Double          String      Bytes     Boolean
Integer     -              Identity *     Identity *     Identity *      toString *  Invalid   Invalid
Long        Identity **    -              Identity *     Identity *      toString *  Invalid   Invalid
Float       Floor **       Floor **       -              Identity *      toString *  Invalid   Invalid
Double      Floor **       Floor **       Identity **    -               toString *  Invalid   Invalid
String      parseInt **    parseLong **   parseFloat **  parseDouble **  -           Invalid   Invalid
Bytes       Invalid        Invalid        Invalid        Invalid         Invalid     -         Invalid
Boolean     Invalid        Invalid        Invalid        Invalid         toString *  Invalid   -
  • Data type pairs marked with one asterisk (*) succeed for all inputs.
  • Data type pairs marked with two asterisks (**) do cast, but can produce unexpected results:
    • Casting a Long to an Integer wraps around if the value is larger than MAX_INT or smaller than MIN_INT.
    • Casting a floating-point number to an Integer or Long results in the maximum or minimum value of the target type if the value is too large.
    • Casting a Double to a Float produces +Inf or -Inf if the value is too large.
    • Casting NaN from a Double or Float produces 0 if the output type is Integer or Long.
    • Casting a String to a numeric type returns NULL if the value is not numeric.
  • Data type pairs marked "Invalid" cannot be converted and fail the type check.

The Java sketch after this list reproduces the two-asterisk behaviors.
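
The two-asterisk cases mirror standard JVM primitive-conversion behavior, so plain Java can reproduce them. The sketch below is illustrative only and is not DSP code; parseLongOrNull is a hypothetical stand-in for DSP's String-to-number cast, which returns NULL on failure.

public class CastingSemantics {
    public static void main(String[] args) {
        // Long to Integer: wraps around when outside the Integer range.
        long big = 4_294_967_298L;              // MAX_INT is 2_147_483_647
        System.out.println((int) big);          // prints 2 (wrap-around)

        // Floating point to Integer: in-range values drop their fractional
        // part (the "Floor" entries in the table).
        System.out.println((int) 3.9);          // prints 3

        // Floating point to Integer/Long: too-large values become the
        // maximum or minimum value of the target type.
        double huge = 1e300;
        System.out.println((int) huge);         // prints 2147483647 (MAX_INT)
        System.out.println((long) -huge);       // prints -9223372036854775808 (MIN_LONG)

        // Double to Float: too-large values become +Inf or -Inf.
        System.out.println((float) 1e300);      // prints Infinity

        // NaN to Integer or Long: produces 0.
        System.out.println((int) Double.NaN);   // prints 0

        // String to a numeric type: NULL when the value is not numeric,
        // modeled here with a try/catch that returns null.
        System.out.println(parseLongOrNull("123"));          // prints 123
        System.out.println(parseLongOrNull("not a number")); // prints null
    }

    static Long parseLongOrNull(String s) {
        try { return Long.parseLong(s); } catch (NumberFormatException e) { return null; }
    }
}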

Casting unions

The runtime data type of a Union can be one of many types. Casting is done to ensure that a Union value is a specific type before using it as an argument in a function.

A Union type can be cast to any of its contained types. If the runtime data type of the field in the Union type does not support a specific type-cast, the value will be set to NULL. For example, if a Union type contains types (Long, Bytes) and is used in a LessThan expression:

lt(get("unionField"), 50)

When the field contains a Long, the value is used as a Long and the less-than comparison works as expected. When the field contains Bytes, the value cannot be cast to Long, so it is cast to NULL, and lt returns NULL as well.
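
As a rough model of this rule, the Java sketch below (illustrative only, not DSP code; castToLongOrNull and lessThan are hypothetical helpers) casts a (Long, Bytes) union value to Long before comparing, and propagates NULL when the runtime type does not support the cast.

// Conceptual sketch only, not DSP code: models casting a (Long, Bytes)
// union field to Long before a less-than comparison.
public class UnionCast {
    static Long castToLongOrNull(Object unionValue) {
        // Long is one of the union's contained types, so it passes through;
        // Bytes cannot be cast to Long, so the result is null (NULL in DSP).
        return (unionValue instanceof Long) ? (Long) unionValue : null;
    }

    static Boolean lessThan(Long value, long bound) {
        // A NULL operand makes the comparison NULL as well.
        return (value == null) ? null : (value < bound);
    }

    public static void main(String[] args) {
        System.out.println(lessThan(castToLongOrNull(42L), 50));           // prints true
        System.out.println(lessThan(castToLongOrNull(new byte[]{1}), 50)); // prints null
    }
}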
