Extract fields from JSON column values

It is very common to store data in JSON format in MongoDB. Splunk DB Connect parses data retrieved from databases into a key=value structure, which makes it difficult for Splunk to perform field extractions and work with values when dealing with JSON format. In the following sections you will find some ways to achieve this.

MongoDB Atlas

In MongoDB Atlas, when using the SQL interface, you can use the FLATTEN function to extract JSON attributes into columns. For more details see Query with Atlas SQL Statements > FLATTEN

Example: Suppose we have a database named arts and a collection named movies with documents of the following format:

{
  "title": "Schindler's List",
  "director": "Steven Spielberg",
  "year": 1993,
  "metadata": {
    "awards": ["Best Picture Winner"],
    "rating": 9.0
  }
}

In case we need to perform searches in Splunk by metadata, it would be convenient to have field extraction for the metadata attributes. In this case we can use the FLATTEN function to ingest the metadata attributes as independent fields.

SELECT title, director, year, metadata_awards, metadata_rating FROM FLATTEN(arts.movies)

As you can see, after using the FLATTEN function, we will have two new columns available: metadata_awards and metadata_rating.

MongoDB Enterprise

The FLATTEN functionality available in MongoDB Atlas can be achieved using the $project aggregation in MongoDB Enterprise. For more details see $project (aggregation).

Using the above example, we can write a MongoDB Shell query as:

arts.movies.aggregate([{
  $project: 
    {
     title: "$title", 
     director: "$director", 
     year: "$year", 
     metadata_awards: "$metadata.awards", 
     metadata_rating: "$metadata.rating"
    }
}])

Field extraction at search time

In case you have already ingested significant data from MongoDB in JSON format, you can use Splunk's field extraction at search time to make it possible to write more complex and efficient SPL queries.

Using the above example we will define a field extraction for specific sourcetype.

1. Create props.conf file in $SPLUNK_HOME/etc/apps/Splunk_JDBC_mongodb/local.

2. Edit props.conf:

[mongo_movies]
EXTRACT-metadata = metadata=\"(?P<metadata>\{.*?\})\"

3. Restart Splunk.

This will allow Splunk to correctly extract the JSON values for the metadata field. Then, when performing a search, we can do the following:

| spath input=metadata | search rating=9.0

Note: make sure to define the field extraction under the desired sourcetype.

Related answers from Splunk Community

Extract fields from JSON column values

MongoDB Atlas

MongoDB Enterprise

Field extraction at search time

Comments

Extract fields from JSON column values

Was this topic useful?