spath

NOTE - Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.


The spath command (the "s" stands for Splunk, or structured) provides a straightforward way to extract information from the structured data formats XML and JSON. It also highlights the syntax in the displayed events list.

You can also use the spath() function with the eval command. For more information, see Functions for eval and where.
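
As a rough illustration of the eval form, the sketch below assumes events whose _raw field holds a JSON document containing a hypothetical user.name value; the output field name user is likewise an assumption. The spath() function takes the field to read and a location path, and returns the extracted value:

```
... | eval user=spath(_raw, "user.name")
```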

Synopsis

Extracts values from structured data (XML or JSON) and stores them in a field or fields.

Syntax

spath [input=<field>] [output=<field>] [path=<datapath> | <datapath>]

Optional arguments

input
Syntax: input=<field>
Description: The field to read and extract values from. Defaults to _raw.
output
Syntax: output=<field>
Description: If specified, the value extracted from the path is written to this field name.
path
Syntax: path=<datapath> | <datapath>
Description: The location path to the value that you want to extract. If you don't use the path argument, the first unlabeled argument will be used as a path. A location path is composed of one or more location steps, separated by periods; for example 'foo.bar.baz'. A location step is composed of a field name and an optional index surrounded by curly brackets. The index can be an integer, to refer to the data's position in an array (this will differ between JSON and XML), or a string, to refer to an XML attribute. If the index refers to an XML attribute, specify the attribute name with an @ symbol. If you don't specify an output argument, this path becomes the field name for the extracted value.
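
To make the arguments concrete, here is a minimal sketch. The field name payload, the output name customer_id, and the path order.customer.id are all hypothetical and only stand in for your own data. The first search names every argument explicitly; the second passes the path as the first unlabeled argument and omits output, so the extracted value would land in a field named order.customer.id:

```
... | spath input=payload output=customer_id path=order.customer.id
... | spath input=payload order.customer.id
```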

Description

When called with no path argument, spath runs in "auto-extract" mode, where it finds and extracts all the fields from the first 5000 characters in the input field (which defaults to _raw if another input source isn't specified). If a path is provided, the value of this path is extracted to a field named by the path or to a field specified by the output argument (if it is provided).
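
For comparison with the path-based form, auto-extract mode might look like the following. The first search reads _raw; the second assumes the structured data lives in a hypothetical field named payload:

```
... | spath
... | spath input=payload
```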

A location path contains one or more location steps, each of which has a context that is specified by the location steps that precede it. The context for the top-level location step is implicitly the top-level node of the entire XML or JSON document.

The location step is composed of a field name and an optional array index indicated by curly brackets around an integer or a string. Array indices mean different things in XML and JSON. For example, in JSON, foo.bar{3} refers to the third element of the bar child of the foo element. In XML, this same path refers to the third bar child of foo.

The spath command lets you use wildcards in place of an array index in JSON. For example, use the location path entities.hashtags{}.text to get the text for all of the hashtags, instead of specifying entities.hashtags{0}.text, entities.hashtags{1}.text, and so on. The referenced path, here entities.hashtags, must refer to an array for this to make sense (otherwise you get an error, just like with regular array indices).

This also works with XML; for example, catalog.book and catalog.book{} are equivalent (both will get you all the books in the catalog).
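
Put into searches, the two wildcard cases above could be written as shown here. The output field names hashtag_text and books are arbitrary choices for illustration; each output field would hold multiple values when several elements match:

```
... | spath output=hashtag_text path=entities.hashtags{}.text
... | spath output=books path=catalog.book
```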

Examples

Example 1: GitHub

As an administrator of a number of large git repositories, I want to:

  • see who has committed the most changes and to which repository
  • produce a list of the commits submitted for each user

I set up Splunk to track all the post-commit JSON information, then use spath to extract fields that I call repository, commit_author, and commit_id:

... | spath output=repository path=repository.url
... | spath output=commit_author path=commits.author.name
... | spath output=commit_id path=commits.id
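
The paths above treat commits as a single object. If commits arrives as a JSON array, as it does in GitHub push payloads that can carry several commits, the wildcard form from the Description section is the likely fit. This is a sketch under that assumption, not a tested search:

```
... | spath output=commit_author path=commits{}.author.name
... | spath output=commit_id path=commits{}.id
```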

Now, if I want to see who has committed the most changes to a repository, I can run the search:

... | top commit_author by repository

and, to see the list of commits by each user:

... | stats values(commit_id) by commit_author

Example 2: Extract a subset of an attribute

This example shows how to extract values from XML attributes and elements.

<vendorProductSet vendorID="2">
    <product productID="17" units="mm">
        <prodName nameGroup="custom">
            <locName locale="all">APLI 01209</locName>
        </prodName>
        <desc descGroup="custom">
            <locDesc locale="es">Precios</locDesc>
            <locDesc locale="fr">Prix</locDesc>
            <locDesc locale="de">Preise</locDesc>
            <locDesc locale="ca">Preus</locDesc>
            <locDesc locale="pt">Preços</locDesc>
        </desc>
    </product>
</vendorProductSet>

To extract the values of the locDesc elements (Precios, Prix, Preise, etc.), use:

... | spath output=locDesc path=vendorProductSet.product.desc.locDesc

To extract the value of the locale attribute (es, fr, de, etc.), use:

... | spath output=locDesc.locale path=vendorProductSet.product.desc.locDesc{@locale}

To extract the attribute of the 4th locDesc (ca), use:

... | spath path=vendorProductSet.product.desc.locDesc{4}{@locale}
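
The same attribute syntax should apply to elements higher up the tree. As an illustrative sketch against the sample XML in this example (the output field names productID and units are arbitrary choices), the attributes of the product element could be extracted with:

```
... | spath output=productID path=vendorProductSet.product{@productID}
... | spath output=units path=vendorProductSet.product{@units}
```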

Example 3: Extract and expand JSON events with multivalued fields

The mvexpand command only works on one multivalued field. This example walks through how to expand a JSON event with more than one multivalued field into individual events for each field's values. For example, given this event, with sourcetype=json:

{"widget": {
    "text": {
        "data": "Click here",
        "size": 36,
        "data": "Learn more",
        "size": 37,
        "data": "Help",
        "size": 38
    }
}}

First, start with a search to extract the fields from the JSON and rename them in a table:

sourcetype=json | spath | rename widget.text.size AS size, widget.text.data AS data | table _time,size,data
           _time            size    data
--------------------------- ---- -----------
2012-10-18 14:45:46.000 BST   36 Click here
                              37 Learn more
                              38 Help

Then, use the eval function mvzip() to create a new multivalued field named x, in which each value pairs a data value with its corresponding size value:

sourcetype=json | spath | rename widget.text.size AS size, widget.text.data AS data | eval x=mvzip(data,size) | table _time,data,size,x
           _time                data    size        x
--------------------------- ----------- ----- --------------
2012-10-18 14:45:46.000 BST Click here   36   Click here,36
                            Learn more   37   Learn more,37
                            Help         38   Help,38

Now, use the mvexpand command to create individual events based on x, and the eval functions split() and mvindex() to redefine the values for data and size:

sourcetype=json | spath | rename widget.text.size AS size, widget.text.data AS data | eval x=mvzip(data,size) | mvexpand x | eval x=split(x,",") | eval data=mvindex(x,0) | eval size=mvindex(x,1) | table _time,data,size
           _time                data   size
--------------------------- ---------- ----
2012-10-18 14:45:46.000 BST Click here  36
2012-10-18 14:45:46.000 BST Learn more  37
2012-10-18 14:45:46.000 BST Help        38


(Thanks to G. Zaimi for this example.)
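
The values are zipped before expanding because mvexpand operates on a single field: expanding data or size alone would lose the pairing between each label and its size. One caveat is that mvzip() joins values with a comma by default, so a data value that itself contains a comma would be split in the wrong place. A hedged variant passes an explicit delimiter as the third argument to mvzip() and splits on the same string. The pipe character here is an arbitrary choice, assumed not to occur in the data:

```
sourcetype=json | spath | rename widget.text.size AS size, widget.text.data AS data | eval x=mvzip(data,size,"|") | mvexpand x | eval x=split(x,"|") | eval data=mvindex(x,0) | eval size=mvindex(x,1) | table _time,data,size
```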

More examples

Example 1:

... | spath output=myfield path=foo.bar
... | spath output=myfield path=foo{1}
... | spath output=myfield path=foo.bar{7}.baz

Example 2:

... | spath output=author path=book{@author}

See also

extract, kvform, multikv, regex, rex, xmlkv, xpath

Answers

Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has using the spath command.

This documentation applies to the following versions of Splunk: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.2.0, 6.2.1.


Comments

Would also like to know a way to use SPATH in transforms, I am using input="field" and can't figure out how to do that using KV_MODE=json.

Rdownie
September 18, 2014

How do I put spath in transforms.conf to enable automatic extraction of the fields instead of manual retrieval?

Jayannah
June 6, 2014

The examples are good, but it is not clear to me why, in "Example 3: Extract and expand JSON events with multivalued fields", you first concatenate all the values in x, then run mvexpand, and then split it again to get separate values.
Could you explain that part a little more?

Sbsbb
November 21, 2012

Example of the JSON payload would help.

Opticsplanet
July 20, 2012
