Splunk Cloud Platform

Use Ingest Processors

Extract JSON fields from data using Ingest Processor

You can create a pipeline that extracts JSON fields from data. Field extraction lets you capture information from your data in a more visible way and configure further data processing based on those fields.

If you're sending data to Splunk Enterprise or Splunk Cloud Platform, be aware that some fields are extracted automatically during indexing. Additionally, be aware that indexing extracted fields can have an impact on indexing performance and search times. Consider the following best practices when configuring field extractions in your pipeline: Extract fields only as necessary. When possible, extract fields at search time instead. Avoid duplicating your data and increasing the size of your events. After extracting a value into a field, either remove the original value or drop the extracted field after you have finished using it to support a data processing action. For more information, see When Splunk software extracts fields in the Splunk Cloud Platform Knowledge Manager Manual.

Pipelines don't extract any fields by default. If a pipeline receives data from a data source that doesn't extract data values into fields, such as a universal forwarder without any add-ons, then the pipeline stores each event as a text string in a field named _raw.

Steps

To create a field extraction pipeline, use the Extract JSON fields from action in the pipeline editor to identify the field names you want to extract. Complete these steps to create a pipeline that receives data associated with a specific source type, optionally processes it, and sends that data to a destination.

  1. From the home page on Splunk Cloud Platform, navigate to the Pipelines page and then select New pipeline, then Ingest Processor pipeline.
  2. On the Get started page, select Blank pipeline, then Next.
  3. On the Define your pipeline's partition page, do the following:
    1. Select how you want to partition your incoming data that you want to send to your pipeline. You can partition by source type, source, and host.
    2. Enter the conditions for your partition, including the operator and the value. Your pipeline will receive and process the incoming data that meets these conditions.
    3. Select Next to confirm the pipeline partition.
  4. On the Add sample data page, do the following:
    1. Enter or upload sample data for generating previews that show how your pipeline processes data. The sample data must contain accurate examples of the values that you want to extract into JSON fields.
    2. Select Next to confirm the sample data that you want to use for your pipeline.
  5. On the Select a metrics destination page, select the name of the destination that you want to send metrics to.
  6. (Optional) If you selected Splunk Metrics store as your metrics destination, specify the name of the target metrics index where you want to send your metrics.
  7. On the Select a data destination page, select the name of the destination that you want to send logs to.
  8. (Optional) If you selected a Splunk platform destination, you can configure index routing:
    1. Select one of the following options in the expanded destination panel:
      Option Description
      Default The pipeline does not route events to a specific index.


      If the event metadata already specifies an index, then the event is sent to that index. Otherwise, the event is sent to the default index of the Splunk Cloud Platform deployment.

      Specify index for events with no index The pipeline only routes events to your specified index if the event metadata did not already specify an index.
      Specify index for all events The pipeline routes all events to your specified index.
    2. If you selected Specify index for events with no index or Specify index for all events, then from the Index name drop-down list, select the name of the index that you want to send your data to.
      If your desired index is not available in the drop-down list, then confirm that the index is configured to be available to the tenant and then refresh the connection between the tenant and the Splunk Cloud Platform deployment. For detailed instructions, see Make more indexes available to the tenant.
  9. If you're sending data to a Splunk Cloud Platform deployment, be aware that the destination index is determined by a precedence order of configurations. See How does Ingest Processor know which index to send data to? for more information

  10. Select Done to confirm the data destination.
    After you complete the on-screen instructions, the pipeline builder displays the SPL2 statement for your pipeline.
  11. (Optional) To generate a preview of how your pipeline processes data based on the sample data that you provided, select the Preview Pipeline icon (Image of the Preview Pipeline icon). Use the preview results to validate your pipeline configuration.
  12. Select the plus icon (This image shows an icon of a plus sign.) in the Actions section, then select Extract JSON fields from.
  13. Select the field that you want to extract from your data, then select Apply.
  14. To extract multiple fields from your data, repeat steps 10-11.
  15. To save your pipeline, do the following:
    1. Select Save pipeline.
    2. In the Name field, enter a name for your pipeline.
    3. (Optional) In the Description field, enter a description for your pipeline.
    4. Select Save. The pipeline is now listed on the Pipelines page, and you can now apply it, as needed.
  16. To apply this pipeline, do the following:
    1. Navigate to the Pipelines page.
    2. In the row that lists your pipeline, select the Actions icon (Image of the Actions icon), and then select Apply.
    3. Select the pipelines that you want to apply, and then select Save. It can take a few minutes to finish applying your pipeline. During this time, all applied pipelines enter the Pending status.
    4. (Optional) To confirm that the Ingest Processor service has finished applying your pipeline, navigate to the Ingest Processor page and check if all affected Ingest Processors have returned to the Healthy status.

The Ingest Processor can now extract JSON fields as specified in the pipeline configuration.

Last modified on 10 September, 2024
Extract timestamps from event data using Ingest Processor   Generate logs into metrics using Ingest Processor

This documentation applies to the following versions of Splunk Cloud Platform: 9.1.2308, 9.1.2312, 9.2.2403 (latest FedRAMP release), 9.2.2406


Was this topic useful?







You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters