Working with metrics data
Metrics data often have multiple nested fields within top-level fields. To simplify your data structure, you can extract nested data in different ways. The following use cases show how you can work with your data in these ways:
- Flatten fields with multivalue data.
- Flatten fields with nested data.
- Extract, create, and delete a nested map.
- Extract a list of nested keys or values from a top-level field.
- Extract an element from a list.
- Extract and promote a nested field to a top-level field.
Flatten a field containing an array of JSON objects into multiple records
When a field in your record contains multiple JSON objects corresponding to different metrics, you can flatten that field into multiple records. The following example uses the mvexpand function to flatten a body
field containing an array of three metrics into three individual records.
- Before using the mvexpand function, the body field has to be cast to a list. From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to flatten your data, and then choose Eval from the function picker.
- In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
body = ucast(body, "collection<map<string, any>>", null)
- From the Data Pipelines Canvas view, click the + icon after the Eval function and choose MV Expand from the function picker.
- In the View Configurations tab of the MV Expand function, complete the following fields:
Field | Description | Example
field | Name of the field you want to expand. You can specify only one field to expand. | body
limit | The number of values to expand in the multivalue field array. Any remaining values in the array are dropped. If limit = 0 or null, the limit is treated as 1000, the maximum limit. | 0
- Click Start Preview, and then click the MV Expand function to confirm that the function works as expected.
Here's an example of what your data might look like before and after using the mvexpand function.
SPL2 example:
... | mvexpand limit=0 body
The first record below shows the data before the expansion; the three records that follow show the data after.
{ body: [ { name: "cpu.util", unit: "percent", type: "g", value: 45, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "net.in", unit: "bytes/second", type: "g", value: 3000, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } } ], source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all" } }
{ body: { name: "cpu.util", unit: "percent", type: "g", value: 45, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all" } }
{ body: { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all" } }
{ body: { name: "net.in", unit: "bytes/second", type: "g", value: 3000, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all" } }
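To make the transformation concrete, here is a rough Python analogue (not DSP code) of what mvexpand does: one input record with a multivalue body field becomes one output record per value. The function name and record shape here are illustrative only.

```python
# Illustrative Python analogue (not DSP code) of mvexpand: turn one record
# whose "body" field holds a list into one record per list element.
import copy

def mv_expand(record, field, limit=0):
    """Yield one copy of the record per element of record[field].

    A limit of 0 or None mimics the documented default cap of 1000 values;
    elements beyond the cap are dropped.
    """
    values = record.get(field, [])
    cap = 1000 if not limit else limit
    for value in values[:cap]:
        expanded = copy.deepcopy(record)
        expanded[field] = value  # each output record carries a single metric
        yield expanded

record = {
    "body": [
        {"name": "cpu.util", "value": 45},
        {"name": "mem.util", "value": 20},
        {"name": "net.in", "value": 3000},
    ],
    "kind": "metric",
}

expanded_records = list(mv_expand(record, "body", limit=0))
# Three records, each with a single metric object in "body"
```

Each expanded record keeps the other top-level fields (kind, attributes, and so on) unchanged, matching the before-and-after example above.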
Flatten a map or a list with nested data
Your data might contain a list of nested lists, a map of nested maps, or a combination of both. Flattening fields with such nested data can make extracting data easier. Use the flatten(X,Y)
scalar function under the Map category to do this.
- From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data, and then choose Eval from the function picker.
- In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
field_name = flatten(field_name)
If you are flattening a map, you can optionally pass another parameter to specify the delimiter used to separate keys in the returned map:
field_name = flatten(field_name, your_delimiter)
- In the View Configurations tab of the Eval function, click Update to update the records with your changes.
- Click Start Preview, and then click the Eval function to confirm that the function works as expected. Here are some examples of what your data might look like before and after flattening:
SPL2 example:
... | eval flattened_list = flatten(list_field)
Data before flattening:
[1, null, "foo", ["1-deep", ["2-deep"]], [], 100]
Data after flattening:
[1, null, "foo", "1-deep", "2-deep", 100]
Notes: Returns the flattened list in a new top-level field called flattened_list.

SPL2 example:
... | eval flattened_map = flatten(map_field)
Data before flattening:
{"baz": {"foo": 1, "bar": "thing"}, "quux": 3}
Data after flattening:
{"quux":3,"baz.foo":1,"baz.bar":"thing"}
Notes: Returns the flattened map in a new top-level field called flattened_map.

SPL2 example:
... | eval flattened_map = flatten(map_field, "::")
Data before flattening:
{"baz": {"foo": 1, "bar": "thing"}, "quux": 3}
Data after flattening:
{"quux":3,"baz::bar":"thing","baz::foo":1}
Notes: Returns the flattened map in a new top-level field called flattened_map. Also delimits the keys in the map with ::.

SPL2 example:
... | eval flattened_list_with_nested_map = flatten(list_field)
Data before flattening:
[[1, 2, 3], [{"key1": {"innerkey1": "innerval1"}}]]
Data after flattening:
[1,2,3,{"key1":{"innerkey1":"innerval1"}}]
Notes: Returns the flattened list in a new top-level field called flattened_list_with_nested_map. Does not flatten the nested maps that are included in the original list.
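The examples above can be mimicked with a short Python sketch (not DSP code, just an analogue of the behavior shown): nested lists are flattened recursively, maps nested inside a list pass through untouched, and nested maps are flattened into delimiter-joined keys.

```python
# Illustrative Python analogue (not DSP code) of the flatten() behavior.

def flatten_list(values):
    out = []
    for v in values:
        if isinstance(v, list):
            out.extend(flatten_list(v))  # recurse into nested lists
        else:
            out.append(v)  # scalars and maps pass through unchanged
    return out

def flatten_map(mapping, delimiter="."):
    out = {}
    for key, value in mapping.items():
        if isinstance(value, dict):
            # join nested keys with the delimiter, e.g. "baz.foo" or "baz::foo"
            for inner_key, inner_value in flatten_map(value, delimiter).items():
                out[key + delimiter + inner_key] = inner_value
        else:
            out[key] = value
    return out

flat_list = flatten_list([1, None, "foo", ["1-deep", ["2-deep"]], [], 100])
flat_map = flatten_map({"baz": {"foo": 1, "bar": "thing"}, "quux": 3}, "::")
```

Note that flatten_list leaves the map in [[1, 2, 3], [{"key1": ...}]] intact, matching the last example above.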
Extract a nested map from one field and add it to another field
When your data first gets read into the Splunk Data Stream Processor (DSP), the body field might contain nested key-value pairs that you want to move to a different top-level field. The following example uses the map_set scalar function under the Map category to move the dimensions key from body into attributes.
- From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose Eval from the function picker.
- In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
attributes = map_set(attributes, "some_key", "some_value")
- You can also remove the extracted key-value pairs from the body field after extracting them. In the View Configurations tab of the Eval function, click + Add to create a new function field, and enter the following SPL2 expression in the newly added function field:
body = map_delete(body, ["some_key"])
- In the View Configurations tab of the Eval function, click Update to update the records with your transformations.
- Click Start Preview, and then click the Eval function to confirm the functions work as expected. Here's an example of what your data might look like before and after extraction:
SPL2 example:
... | eval attributes=map_set(attributes, "dimensions", {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"}), body = map_delete(body, ["dimensions"])
The first record below shows the data before extraction; the second shows the data after.
{ body: { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all" } }
{ attributes: { default_unit: "", default_type: "", _splunk_connection_id: "rest_api:all", dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, body: { name: "mem.util", unit: "gigabytes", type: "g", value: 20 }, source_type: "aws:cloudwatch", kind: "metric" }
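As a rough Python analogue (not DSP code), the map_set and map_delete pair behaves like copying a key into one dictionary and removing it from another. The function names below mirror the SPL2 names but are illustrative implementations only.

```python
# Illustrative Python analogue (not DSP code) of moving the "dimensions"
# key out of body and into attributes with map_set and map_delete.

def map_set(mapping, key, value):
    """Return a copy of the map with key set to value."""
    updated = dict(mapping)
    updated[key] = value
    return updated

def map_delete(mapping, keys):
    """Return a copy of the map with the listed keys removed."""
    return {k: v for k, v in mapping.items() if k not in keys}

record = {
    "body": {"name": "mem.util", "value": 20,
             "dimensions": {"Region": "us-west-1"}},
    "attributes": {"default_unit": ""},
}

record["attributes"] = map_set(record["attributes"], "dimensions",
                               record["body"]["dimensions"])
record["body"] = map_delete(record["body"], ["dimensions"])
# attributes now holds dimensions; body no longer does
```

Both helpers return new maps rather than mutating in place, which matches the assignment style of the SPL2 expressions (attributes = map_set(...), body = map_delete(...)).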
Extract all nested keys or values in a map
When a top-level field in your data is a map of multiple key-value pairs, you can get a list of all the nested keys or all the nested values within this top-level field by using the map_keys
and map_values
scalar functions under the Map category.
- From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data, and then choose Eval from the function picker.
- In the View Configurations tab of the Eval function, enter the following SPL2 expression depending on which information you want to extract:
Information to extract | SPL2 expression | Output
Keys | keys = map_keys(field_name) | Creates a new keys top-level field containing the list of keys extracted from the top-level field you pass in.
Values | values = map_values(field_name) | Creates a new values top-level field containing the list of values extracted from the top-level field you pass in.
- In the View Configurations tab of the Eval function, click Update to update the records with the newly created field.
- Click Start Preview, and then click the Eval function to make sure it's working as expected. Here's an example of what your data might look like:
SPL2 example:
... | eval keys = map_keys(attributes)
Data example:
{ body: { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, source_type: "aws:cloudwatch", kind: "metric", attributes: { default_unit: "gigabytes", default_type: "g", _splunk_connection_id: "rest_api:all" } }
A new top-level field is added to the data schema: keys: [ "default_unit", "default_type", "_splunk_connection_id" ]
SPL2 example:
... | eval values = map_values(attributes)
A new top-level field is added to the data schema: values: [ "gigabytes", "g", "rest_api:all" ]
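The map_keys and map_values behavior corresponds directly to reading a dictionary's keys or values into a list. A minimal Python analogue (not DSP code), using the attributes map from the example above:

```python
# Illustrative Python analogue (not DSP code) of map_keys and map_values.
attributes = {
    "default_unit": "gigabytes",
    "default_type": "g",
    "_splunk_connection_id": "rest_api:all",
}

keys = list(attributes.keys())      # list of the map's nested keys
values = list(attributes.values())  # list of the map's nested values
```

Each list becomes a new top-level field alongside the original map, which is left unchanged.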
Extract an element from a list
When a top-level field in your data contains a list but you only want to work with a specific element in that list, you can extract that element if you know its index position in the list. This example uses the mvindex
scalar function under the List category to extract a record from an array in the body
field.
- From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data, and then choose Eval from the function picker.
- Before using the mvindex function, the body field has to be cast to a list. In the View Configurations tab of the Eval function, enter the following SPL2 expression in the function field:
body = ucast(body, "collection<map<string, any>>", null)
- In the View Configurations tab of the Eval function, click + Add to create a new function field, and enter the following SPL2 expression in the newly added function field:
extracted_element = mvindex(body, 0)
The first argument of the mvindex function is the name of the field from which you want to extract data. The second argument is the index indicating the position of the element you want to extract. Index numbers can be negative: -1 gets the last element in a list, -2 gets the second-to-last element, and so on. If the index is out of range or does not exist, the function returns null.
- In the View Configurations tab of the Eval function, click Update to update the records with your transformations.
- Click Start Preview, and then click the Eval function to confirm the functions work as expected. Here's an example of what your data might look like before and after extraction:
SPL2 example:
... | eval body = ucast(body, "collection<map<string, any>>", null), extracted_element = mvindex(body, 0)
The first record below shows the data before extraction; the second shows the data after.
{ body: [ { name: "cpu.util", unit: "percent", type: "g", value: 45, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "net.in", unit: "bytes/second", type: "g", value: 3000, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } } ], attributes: { default_unit: "", default_type: "g", _splunk_connection_id: "rest_api:all" } }
{ body: [ { name: "cpu.util", unit: "percent", type: "g", value: 45, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "mem.util", unit: "gigabytes", type: "g", value: 20, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, { name: "net.in", unit: "bytes/second", type: "g", value: 3000, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } } ], extracted_element: { name: "cpu.util", unit: "percent", type: "g", value: 45, dimensions: { InstanceId: "i-065d598370ac25b90", Region: "us-west-1" } }, attributes: { default_unit: "", default_type: "g", _splunk_connection_id: "rest_api:all" } }
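The indexing rules described above (negative indexes allowed, null on an out-of-range index) can be sketched in Python as follows; this is an illustrative analogue, not DSP code.

```python
# Illustrative Python analogue (not DSP code) of mvindex: fetch one element
# by position, supporting negative indexes and returning None (null) when
# the index is out of range.

def mv_index(values, index):
    try:
        return values[index]
    except IndexError:
        return None

body = [{"name": "cpu.util"}, {"name": "mem.util"}, {"name": "net.in"}]

first = mv_index(body, 0)    # first element
last = mv_index(body, -1)    # negative index counts from the end
missing = mv_index(body, 9)  # out of range, so None
```

Python's negative-index convention happens to match the one described for mvindex, so -1 selects the last element without any extra handling.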
Promote a nested field to a top-level field
Some fields in your data might contain nested values that you want to extract and assign to a top-level field. For example, the attributes
field of your data contains an index
key whose value you want to extract in order to format your records so that they match the Splunk HEC metric JSON format. You can extract the value of the index
key with the map_get
scalar function under the Map category.
- From the Data Pipelines Canvas view, click the + icon at the position on your pipeline where you want to extract data from, and then choose To Splunk JSON from the function picker.
- In the View Configurations tab of the To Splunk JSON function, enter the following SPL2 expression in the index field:
cast(map_get(attributes, "index"), "string")
The SPL2 expression also casts the extracted value to a string so that it can be accepted as an input for the To Splunk JSON function. See Casting and To Splunk JSON in the Data Stream Processor Function Reference for more details.
- In the View Configurations tab of the To Splunk JSON function, toggle the keep_attributes button if you want the attributes map to be available as index-extracted fields in the Splunk platform.
- Click Start Preview, and then click the To Splunk JSON function to confirm that the function works as expected. Here's an example of what your data might look like before and after extraction:
SPL2 example:
... | to_splunk_json index=cast(map_get(attributes, "index"), "string") keep_attributes=true
The first record below shows the data before extraction; the second shows the resulting payload.
{ host: "myhost", source: "mysource", source_type: "mysourcetype", kind: "metric", body: [ { name: "Hello World" } ], attributes: { attr1: "val1", index: "myindex" } }
{ json: "{"event":"Hello World", "source":"mysource", "sourcetype":"mysourcetype", "host":"myhost", "index":"myindex", "fields":{"attr1":"val1"}}" }
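As a rough Python sketch (not DSP code), the transformation shown above amounts to pulling "index" out of the attributes map, keeping the remaining attributes as indexed fields, and serializing the record into a HEC-style JSON payload. The record layout mirrors the example above; the variable names are illustrative.

```python
# Illustrative Python sketch (not DSP code) of the to_splunk_json example:
# extract "index" from attributes and emit a HEC-style JSON string.
import json

record = {
    "host": "myhost", "source": "mysource", "source_type": "mysourcetype",
    "body": [{"name": "Hello World"}],
    "attributes": {"attr1": "val1", "index": "myindex"},
}

# cast(map_get(attributes, "index"), "string") analogue
index = str(record["attributes"].get("index"))

# keep_attributes=true analogue: remaining attributes become indexed fields
fields = {k: v for k, v in record["attributes"].items() if k != "index"}

payload = json.dumps({
    "event": record["body"][0]["name"],
    "source": record["source"],
    "sourcetype": record["source_type"],
    "host": record["host"],
    "index": index,
    "fields": fields,
})
```

The resulting string matches the shape of the "Data after extraction" example: an event plus source, sourcetype, host, index, and a fields map.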
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.1.0