Splunk® Data Stream Processor

Use the Data Stream Processor

On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Working with metrics data

Metrics data often have multiple nested fields within top-level fields. To simplify your data structure, you can extract nested data in different ways. The following use cases show how you can work with your data in these ways:

  • Flatten fields with multivalue data.
  • Flatten fields with nested data.
  • Extract, create, and delete a nested map.
  • Extract a list of nested keys or values from a top-level field.
  • Extract an element from a list.
  • Extract and promote a nested field to a top-level field.

Flatten a field containing an array of JSON objects into multiple records

When a field in your record contains multiple JSON objects corresponding to different metrics, you can flatten that field into multiple records. The following example uses the mvexpand function to flatten a body field containing an array of three metrics into three individual records.

  1. Before you use the mvexpand function, you must cast the body field to a list. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to flatten your data, and then choose Eval from the function picker.
  2. In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
    body = ucast(body, "collection<map<string, any>>", null)
  3. From the Data Pipelines Canvas view, click the + icon after the Eval function and choose MV Expand from the function picker.
  4. In the View Configurations tab of the MV Expand function, complete the following fields:
    Field: field
    Description: Name of the field you want to expand. You can specify only one field to expand.
    Example: body

    Field: limit
    Description: The number of values to expand in the multivalue field array. If there are any remaining values in the array, those values are dropped. If limit = 0 or null, then the limit is treated as 1000, the maximum limit.
    Example: 0
  5. Click Start Preview, and then click the MV Expand function to confirm that the function works as expected. Here's an example of what your data might look like before and after using the mvexpand function:
    SPL2 example:
    ... | mvexpand limit=0 body
    Data before extraction:
    {
       body: 
       [
           { 
                name: "cpu.util",
                unit: "percent",
                type: "g",
                value: 45,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {  
                name: "mem.util",
                unit: "gigabytes",
                type: "g",
                value: 20,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {
                name: "net.in",
                unit: "bytes/second",
                type: "g",
                value: 3000,
                dimensions: { 
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            }
        ],
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: {  
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all"
        }
    }
    Data after extraction:
    { 
        body: { 
            name: "cpu.util",
            unit: "percent",
            type: "g",
            value: 45,
            dimensions: { 
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: { 
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all"
        }
    }
    { 
        body: { 
            name: "mem.util",
            unit: "gigabytes",
            type: "g",
            value: 20,
            dimensions: { 
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: { 
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all"
        }
    }
    { 
        body: { 
            name: "net.in",
            unit: "bytes/second",
            type: "g",
            value: 3000,
            dimensions: { 
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: { 
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all"
        }
    }
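
To reason about this behavior outside DSP, the following Python sketch models what this topic describes: one output record per element of the list, with a limit of 0 or null treated as 1000. It is not DSP code. The mvexpand helper below is a hypothetical stand-in, and its handling of non-list and empty fields is an assumption rather than documented DSP behavior.

```python
import copy

MAX_LIMIT = 1000  # per this topic, limit = 0 or null is treated as 1000


def mvexpand(record, field, limit=0):
    """Return one copy of `record` per element of record[field]."""
    values = record.get(field)
    if not isinstance(values, list):
        # Assumption: a field that is not a list passes through unchanged.
        return [record]
    cap = MAX_LIMIT if not limit else min(limit, MAX_LIMIT)
    expanded = []
    for value in values[:cap]:  # values beyond the cap are dropped
        new_record = copy.deepcopy(record)
        new_record[field] = value  # the list is replaced by a single element
        expanded.append(new_record)
    return expanded


record = {
    "body": [
        {"name": "cpu.util", "value": 45},
        {"name": "mem.util", "value": 20},
        {"name": "net.in", "value": 3000},
    ],
    "source_type": "aws:cloudwatch",
    "kind": "metric",
}

records = mvexpand(record, "body")
print([r["body"]["name"] for r in records])  # ['cpu.util', 'mem.util', 'net.in']
```

Each output record keeps the other top-level fields (source_type, kind) unchanged, which matches the before and after data shown in this section.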

Flatten a map or a list with nested data

Your data might contain a list of nested lists, a map of nested maps, or a combination of both. Flattening fields with such nested data can make extracting data easier. Use the flatten(X,Y) scalar function under the Map category to do this.

  1. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose Eval from the function picker.
  2. In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
    field_name = flatten(field_name)

    If you are flattening a map, you can optionally pass another parameter to specify the delimiter used to separate keys in the returned map:

    field_name = flatten(field_name, your_delimiter)
  3. In the View Configurations tab of the Eval function, click Update to update the records with your changes.
  4. Click Start Preview, and then click the Eval function to confirm that the function works as expected. Here are some examples of what your data might look like before and after flattening:
    SPL2 example: ... | eval flattened_list = flatten(list_field)
    Data before flattening: [1, null, "foo", ["1-deep", ["2-deep"]], [], 100]
    Data after flattening: [1, null, "foo", "1-deep", "2-deep", 100]
    Notes: Returns the flattened list in a new top-level field called flattened_list.

    SPL2 example: ... | eval flattened_map = flatten(map_field)
    Data before flattening: {"baz": {"foo": 1, "bar": "thing"}, "quux": 3}
    Data after flattening: {"quux":3,"baz.foo":1,"baz.bar":"thing"}
    Notes: Returns the flattened map in a new top-level field called flattened_map.

    SPL2 example: ... | eval flattened_map = flatten(map_field, "::")
    Data before flattening: {"baz": {"foo": 1, "bar": "thing"}, "quux": 3}
    Data after flattening: {"quux":3,"baz::bar":"thing","baz::foo":1}
    Notes: Returns the flattened map in a new top-level field called flattened_map. Also delimits the keys in the map with ::.

    SPL2 example: ... | eval flattened_list_with_nested_map = flatten(list_field)
    Data before flattening: [[1, 2, 3], [{"key1": {"innerkey1": "innerval1"}}]]
    Data after flattening: [1,2,3,{"key1":{"innerkey1":"innerval1"}}]
    Notes: Returns the flattened list in a new top-level field called flattened_list_with_nested_map. Does not flatten the nested maps that are included in the original list.
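
The flattening rules illustrated in these examples can be sketched in Python. This is not the DSP implementation. The helper names flatten_list and flatten_map are hypothetical, the default "." delimiter is inferred from the second example, and leaving lists that are nested inside a map untouched is an assumption that this topic does not confirm.

```python
def flatten_list(values):
    """Recursively unpack nested lists; maps and scalars are kept as is."""
    flat = []
    for item in values:
        if isinstance(item, list):
            flat.extend(flatten_list(item))  # an empty list contributes nothing
        else:
            flat.append(item)
    return flat


def flatten_map(mapping, delimiter=".", prefix=""):
    """Join the keys of nested maps with `delimiter`."""
    flat = {}
    for key, value in mapping.items():
        full_key = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_map(value, delimiter, full_key))
        else:
            flat[full_key] = value  # assumption: lists inside maps are left alone
    return flat


def flatten(value, delimiter="."):
    """Dispatch on the input type, loosely mirroring flatten(X, Y)."""
    if isinstance(value, list):
        return flatten_list(value)
    if isinstance(value, dict):
        return flatten_map(value, delimiter)
    return value


print(flatten([1, None, "foo", ["1-deep", ["2-deep"]], [], 100]))
print(flatten({"baz": {"foo": 1, "bar": "thing"}, "quux": 3}, "::"))
```

Python's None plays the role of null here. The order of keys in the flattened map follows Python dictionary insertion order and might differ from the order that DSP returns.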

Extract a nested map from one field and add it to another field

When your data is first read into the Splunk Data Stream Processor (DSP), the body field might contain nested key-value pairs that you want to move to a different top-level field. The following example uses the map_set scalar function under the Map category to add the dimensions key to attributes, and then uses the map_delete function to remove that key from body.

  1. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose Eval from the function picker.
  2. In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
    attributes = map_set(attributes, "some_key", "some_value")
  3. You can also remove the extracted key-value pairs from the body field after extracting them. In the View Configurations tab of the Eval function, click + Add to create a new function field and enter the following SPL2 expression in the newly added function field:
    body = map_delete(body, ["some_key"])
  4. In the View Configurations tab of the Eval function, click Update to update the records with your transformations.
  5. Click Start Preview, and then click the Eval function to confirm the functions work as expected. Here's an example of what your data might look like before and after extraction:
    SPL2 example:
    ... | eval attributes=map_set(attributes, "dimensions", {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"}), body = map_delete(body, ["dimensions"])
    Data before extraction:
    { 
        body: { 
            name: "mem.util",
            unit: "gigabytes",
            type: "g",
            value: 20,
            dimensions: {
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: { 
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all"
        }
    }
    Data after extraction:
    {  
        attributes: { 
            default_unit: "",
            default_type: "",
            _splunk_connection_id: "rest_api:all",
            dimensions: {  
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        body: {  
            name: "mem.util",
            unit: "gigabytes",
            type: "g",
            value: 20
        },
        source_type: "aws:cloudwatch",
        kind: "metric"
    }
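
As a rough model of this two-step move, the following Python sketch copies a key into one map and then deletes it from another. The map_set and map_delete helpers are hypothetical stand-ins for the DSP functions of the same name, and they are assumed here to return a new map rather than modify the input. Unlike the SPL2 example, which passes the dimensions map as a literal, this sketch reads the value from body for convenience.

```python
def map_set(mapping, key, value):
    """Return a copy of `mapping` with `key` set to `value`."""
    updated = dict(mapping)
    updated[key] = value
    return updated


def map_delete(mapping, keys):
    """Return a copy of `mapping` without the listed keys."""
    return {k: v for k, v in mapping.items() if k not in keys}


record = {
    "body": {
        "name": "mem.util",
        "unit": "gigabytes",
        "type": "g",
        "value": 20,
        "dimensions": {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"},
    },
    "attributes": {"default_unit": "", "default_type": ""},
}

# Step 1: add the nested map to attributes.
record["attributes"] = map_set(
    record["attributes"], "dimensions", record["body"]["dimensions"]
)
# Step 2: remove the nested map from body.
record["body"] = map_delete(record["body"], ["dimensions"])

print(sorted(record["body"]))        # ['name', 'type', 'unit', 'value']
print(sorted(record["attributes"]))  # ['default_type', 'default_unit', 'dimensions']
```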

Extract all nested keys or values in a map

When a top-level field in your data is a map of multiple key-value pairs, you can get a list of all the nested keys or all the nested values within this top-level field by using the map_keys and map_values scalar functions under the Map category.

  1. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose Eval from the function picker.
  2. In the View Configurations tab of the Eval function, enter the following SPL2 expression depending on which information you want to extract:
    Information to extract: Keys
    SPL2 expression: keys = map_keys(field_name)
    Output: Create a new keys top-level field containing the list of keys extracted from the top-level field you pass in.

    Information to extract: Values
    SPL2 expression: values = map_values(field_name)
    Output: Create a new values top-level field containing the list of values extracted from the top-level field you pass in.
  3. In the View Configurations tab of the Eval function, click Update to update the records with the newly created field.
  4. Click Start Preview, and then click the Eval function to make sure it's working as expected. Here's an example of what your data might look like:
    SPL2 example:
    ... | eval keys = map_keys(attributes)
    Data example:
    { 
        body: { 
            name: "mem.util",
            unit: "gigabytes",
            type: "g",
            value: 20,
            dimensions: {
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        source_type: "aws:cloudwatch",
        kind: "metric",
        attributes: { 
            default_unit: "gigabytes",
            default_type: "g",
            _splunk_connection_id: "rest_api:all"
        }
    }
    Function output: A new top-level field is added to the data schema:
    keys: [
           "default_unit",
           "default_type",
           "_splunk_connection_id"
        ]
    
    SPL2 example:
    ... | eval values = map_values(attributes)
    Function output: A new top-level field is added to the data schema:
    values: [
           "gigabytes",
           "g",
           "rest_api:all"
        ]
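
The two functions can be modeled in Python as follows. This is only a sketch: the helpers are hypothetical stand-ins for map_keys and map_values, and the output order shown is Python dictionary insertion order, which this topic does not guarantee for DSP.

```python
def map_keys(mapping):
    """Return the keys of a map as a list."""
    return list(mapping.keys())


def map_values(mapping):
    """Return the values of a map as a list."""
    return list(mapping.values())


record = {
    "kind": "metric",
    "attributes": {
        "default_unit": "gigabytes",
        "default_type": "g",
        "_splunk_connection_id": "rest_api:all",
    },
}

# Each result is stored in a new top-level field; attributes is unchanged.
record["keys"] = map_keys(record["attributes"])
record["values"] = map_values(record["attributes"])

print(record["keys"])    # ['default_unit', 'default_type', '_splunk_connection_id']
print(record["values"])  # ['gigabytes', 'g', 'rest_api:all']
```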
    

Extract an element from a list

When a top-level field in your data contains a list but you only want to work with a specific element in that list, you can extract that element if you know its index position in the list. This example uses the mvindex scalar function under the List category to extract an element from a list in the body field.

  1. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose Eval from the function picker.
  2. Before you use the mvindex function, you must cast the body field to a list. In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
    body = ucast(body, "collection<map<string, any>>", null)
  3. In the View Configurations tab of the Eval function, click + Add to create a new function field and enter the following SPL2 expression in the newly added function field:
    extracted_element = mvindex(body, 0)

    The first argument of the mvindex function is the name of the field from which you want to extract data. The second argument is the index indicating the position of the element you want to extract.

    Index numbers can be negative. -1 gets the last element in a list, -2 gets the second to last element in a list, and so on. If the index is out of range or does not exist, the function returns null.

  4. In the View Configurations tab of the Eval function, click Update to update the records with your transformations.
  5. Click Start Preview, and then click the Eval function to confirm the functions work as expected. Here's an example of what your data might look like before and after extraction:
    SPL2 example:
    ... | eval body = ucast(body, "collection<map<string, any>>", null), extracted_element = mvindex(body, 0)
    Data before extraction:
    {
        body: [
            {
                name: "cpu.util",
                unit: "percent",
                type: "g",
                value: 45,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {
                name: "mem.util",
                unit: "gigabytes",
                type: "g",
                value: 20,
                dimensions: { 
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {
                name: "net.in",
                unit: "bytes/second",
                type: "g",
                value: 3000,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            }
        ],
        attributes: {
            default_unit: "",
            default_type: "g",
            _splunk_connection_id: "rest_api:all"
        }
    }
    Data after extraction:
    {
        body: [
            {
                name: "cpu.util",
                unit: "percent",
                type: "g",
                value: 45,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {
                name: "mem.util",
                unit: "gigabytes",
                type: "g",
                value: 20,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            },
            {
                name: "net.in",
                unit: "bytes/second",
                type: "g",
                value: 3000,
                dimensions: {
                    InstanceId: "i-065d598370ac25b90",
                    Region: "us-west-1"
                }
            }
        ],
        extracted_element: {
            name: "cpu.util",
            unit: "percent",
            type: "g",
            value: 45,
            dimensions: {
                InstanceId: "i-065d598370ac25b90",
                Region: "us-west-1"
            }
        },
        attributes: {
            default_unit: "",
            default_type: "g",
            _splunk_connection_id: "rest_api:all"
        }
    }
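
The indexing rules described in step 3 (zero-based positions, negative indexes that count from the end, and null for an out-of-range index) can be sketched in Python. The mvindex helper below is a hypothetical stand-in, not DSP code, and returning None for a field that is not a list is an assumption.

```python
def mvindex(values, index):
    """Return the element at `index`, or None if the index is out of range."""
    if not isinstance(values, list):
        return None  # assumption: a non-list input yields null
    if -len(values) <= index < len(values):
        return values[index]
    return None


body = [
    {"name": "cpu.util", "value": 45},
    {"name": "mem.util", "value": 20},
    {"name": "net.in", "value": 3000},
]

print(mvindex(body, 0)["name"])   # cpu.util
print(mvindex(body, -1)["name"])  # net.in
print(mvindex(body, 5))           # None
```

The explicit range check is needed because Python raises IndexError for an out-of-range index, whereas this topic states that mvindex returns null.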
    

Promote a nested field to a top-level field

Some fields in your data might contain nested values that you want to extract and assign to a top-level field. For example, suppose that the attributes field of your data contains an index key, and you want to extract its value so that your records match the Splunk HEC metric JSON format. You can extract the value of the index key with the map_get scalar function under the Map category.

  1. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose To Splunk JSON from the function picker.
  2. In the View Configurations tab of the To Splunk JSON function, enter the following SPL2 expression in the index field:
    cast(map_get(attributes, "index"), "string")

    The SPL2 expression also casts the extracted value to string so that it can be admitted as an input for the To Splunk JSON function. See Casting and To Splunk JSON in the Data Stream Processor Function Reference for more details.

  3. In the View Configurations tab of the To Splunk JSON function, toggle the keep_attributes button if you want the attributes map to be available as index-extracted fields in the Splunk platform.
  4. Click Start Preview, and then click the To Splunk JSON function to confirm that the function works as expected. Here's an example of what your data might look like before and after extraction:
    SPL2 example:
    ... | to_splunk_json index=cast(map_get(attributes, "index"), "string") keep_attributes=true
    Data before extraction:
    {
        host: "myhost",
        source: "mysource",
        source_type: "mysourcetype",
        kind: "metric",
        body: [ 
            { 
                name: "Hello World"
            }
        ],
        attributes: {  
            attr1: "val1",
            index: "myindex"
        }
    }
    Data after extraction:
    {
        json: "{"event":"Hello World", "source":"mysource", "sourcetype":"mysourcetype", "host":"myhost", "index":"myindex", "fields":{"attr1":"val1"}}"
    }
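
The lookup-and-cast step in this expression can be modeled in Python as follows. This sketch covers only the map_get and cast portion, not the To Splunk JSON function itself. Both helpers are hypothetical stand-ins, and returning None for a missing key or a null value is an assumption.

```python
def map_get(mapping, key):
    """Return the value stored under `key`, or None if the key is absent."""
    return mapping.get(key)


def cast_to_string(value):
    """Assumed stand-in for cast(value, "string"); None stays None."""
    return None if value is None else str(value)


record = {
    "host": "myhost",
    "source": "mysource",
    "attributes": {"attr1": "val1", "index": "myindex"},
}

# Promote the nested value so that it can be used as a top-level setting.
index = cast_to_string(map_get(record["attributes"], "index"))
print(index)  # myindex
```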
Last modified on 31 August, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0

