Splunk® Data Stream Processor

Getting Data In



Splunk Data Stream Processor will reach its end of sale on April 3, 2023, and its end of life on February 28, 2025. If you are an existing DSP customer, reach out to your account team for more information.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Format and send events to a DSP data pipeline using the Ingest REST API

To format and send events using the Splunk Ingest REST API, begin by testing whether you can send an event to an ingest endpoint.

Prerequisites

  • The following steps assume that you have SCloud configured. See Authenticate with SCloud for instructions on how to configure SCloud.

Steps

  1. Open a command prompt window or terminal.
  2. Run one of the following SCloud commands to test whether you can send an event to an ingest endpoint.
    • Use this command to test sending data to the /events endpoint:
    If you're using SCloud 1.0.0:
    ./scloud ingest post-events -format raw <<<  'This is a test event.'

    If you're using SCloud 4.0.0:

    ./scloud ingest post-events --format raw <<<  'This is a test event.'
    • Use this command to test sending data to the /metrics endpoint:
    echo '[{"name":"test", "value":1}]' | ./scloud ingest post-metrics

    The ingest post-metrics command doesn't work for SCloud 1.0.0. If you're using SCloud 1.0.0 and want to send data to the /metrics endpoint, see the Metrics example using cURL command section.

Format event data to send to the /events endpoint

Use the Ingest REST API /events endpoint to send event data to your data pipeline over HTTPS in JSON format. The /events endpoint accepts an array of JSON objects. Each JSON object represents a single DSP event.

[
  {
    "body": "Hello, World!",
    "attributes": {
      "message": "Something happened"
    },
    "host": "dataserver.example.com",
    "source": "testapplication",
    "sourcetype": "txt",
    "timestamp": 1533671808138,
    "nanos": 0,
    "id": "2823738566644596",
    "kind": "event"
  },
  {
    "body": "Hello, World2!",
    "attributes": {
      "message": "Something happened"
    },
    "host": "dataserver.example.com",
    "source": "testapplication",
    "sourcetype": "txt",
    "timestamp": 1533671808138,
    "nanos": 0,
    "id": "2518594268716256",
    "kind": "event"
  },
  ...
]
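
If you prefer to test the /events endpoint with cURL rather than SCloud, a request like the following posts a payload of this shape. This is a minimal sketch: the endpoint path is inferred by analogy with the /metrics cURL example later in this topic, so confirm it against your deployment, and substitute your own DSP host and bearer token.

# The /default/ingest/v1beta2/events path below is inferred from the /metrics example in this topic.
curl https://<DSP_HOST>:31000/default/ingest/v1beta2/events \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -X POST \
  -d '[{"body": "Hello, World!", "host": "dataserver.example.com", "source": "testapplication", "sourcetype": "txt"}]'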

Event schema

There are nine keys that can be included in the event schema. Sending an event that includes a field not defined in this schema results in an "INVALID_DATA" error.

Key Required? Description
body Required The event's payload. It can be one of the basic types: a string, bytes, or a number (int32, int64, float, or double). In addition, body can be a list or a map of basic types. The default type of body is a union of all possible types. To pass body as an argument to a downstream function that requires a specific type, first cast body to the appropriate, more specific type. See cast or ucast for an example.
attributes Optional Specifies a JSON object that contains explicit custom fields.
host Optional The host value to assign to the event data. This is typically the hostname of the client from which you are sending data.
source Optional The source value to assign to the event data. For example, if you are sending data from an app you are developing, you can set this key to the name of the app.
sourcetype Optional The sourcetype value to assign to the event data.
timestamp Optional The event time in epoch time format in milliseconds. If this key is missing, a timestamp with the current time is assigned when the event reaches the Ingest REST API service.
nanos Optional Nanoseconds part of the timestamp.
id Optional Unique ID of the event. If it is not specified, the system generates an ID.
kind Optional The value event, to indicate that the record is an event.
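
Because body is the only required key, a minimal valid payload is a one-element array like the following. Keys such as timestamp and id are assigned by the service when omitted.

[
  {
    "body": "This is a minimal test event."
  }
]

If you set timestamp yourself, remember that it is epoch time in milliseconds. On systems with GNU date, for example, date +%s%3N prints the current epoch time in milliseconds.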

Format metrics data to send to the /metrics endpoint

Use the Ingest REST API /metrics endpoint to send metric event data to your data pipeline over HTTPS in JSON format. The /metrics endpoint accepts a list of JSON metric objects. Each JSON object represents a single DSP metric event.


Payload schema = [<JsonMetricObject>, <JsonMetricObject>, ...]  # a list of JsonMetricObject

JsonMetricObject = {
   "body": [<Metric>, <Metric>, ...],
   "timestamp": int64,
   "nanos": int32,
   "source": string,
   "sourcetype": string,
   "host": string,
   "id": string,
   "kind": string,
   "attributes": {
       "defaultDimensions": map[string]string,
       "defaultType": string,
       "defaultUnit": string
   }
}
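
Putting the schema together, here is a minimal sketch of a valid payload. It reuses the cpu.util metric and values from the examples in this topic; as the schema table below notes, only body is required.

[
  {
    "body": [
      { "name": "cpu.util", "value": 45.0 }
    ],
    "timestamp": 1526627123013,
    "sourcetype": "aws:cloudwatch",
    "attributes": {
      "defaultDimensions": { "Region": "us-west-1" },
      "defaultType": "g",
      "defaultUnit": "percent"
    }
  }
]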

Metrics schema

There are nine keys that can be included in the metrics schema. Sending a metric event that includes a field not defined in this schema results in an "INVALID_DATA" error.

Key Required? Description
body Required An array of one or more JSON objects, each following the schema defined below. Each object represents a measurement of a given metric at the time denoted by the parent object's timestamp.


The body field for the /metrics endpoint uses the following schema:

// Example:
// Metric = {
//   "name": "cpu.util",
//   "value": 45.0,
//   "dimensions": {"Server": "nginx", "Region": "us-west-1"},
//   "type": "g",
//   "unit": "percent"
// }
Metric = {
    "name": string,                     // required. Metric name.
    "value": numeric,                   // required. double | float | int | long.
    "dimensions": map[string]string,    // optional. Dimensions for this individual metric.
    "type": string,                     // optional. Metric type, such as "g" for gauge.
    "unit": string                      // optional. Metric unit, such as "percent".
}
attributes Optional A JSON object that follows the schema shown below. If defaultDimensions is set, for example {"Server":"nginx", "Region":"us-west-1"}, individual metrics inherit these dimensions. If a dimension is also given in the body field of an individual metric, the body field dimension takes precedence.


The attributes field for the /metrics endpoint uses the following schema:

"attributes": {
  "defaultDimensions": map[string]string, // optional. String map.
     // For example: {"Server":"nginx", "Region":"us-west-1", ...},
  "defaultType": string, 
     // optional. metric type, by default it is "g" for "gauge"
  "defaultUnit": string   
     // optional. metric unit, by default it is "none"
    }
host Optional The host value to assign to the event data. This is typically the hostname of the client from which you are sending data.
source Optional The source value to assign to the event data. For example, if you are sending data from an app you are developing, you can set this key to the name of the app.
sourcetype Optional The sourcetype value to assign to the event data.
timestamp Optional The event time in epoch time format in milliseconds. If this key is missing, a timestamp with the current time is assigned when the event reaches the Ingest REST API service.
nanos Optional Nanoseconds part of the timestamp.
id Optional Unique ID of the event. If it is not specified, the system generates an ID.
kind Optional The value metric, to indicate that the record is a metric event.
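
As a minimal sketch of the precedence rule described in the attributes row above: if defaultDimensions and an individual metric both set the same dimension, the value from the metric's body wins for that metric. The Region values here are illustrative.

[
  {
    "body": [
      {
        "name": "cpu.util",
        "value": 45.0,
        "dimensions": { "Region": "us-east-1" }   // this value wins for cpu.util
      }
    ],
    "attributes": {
      "defaultDimensions": { "Region": "us-west-1" }
    }
  }
]

The second record in the cURL example in the next section shows the same rule in practice: the cpu.util metric's own InstanceId dimension overrides the InstanceId set in defaultDimensions.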

Metrics example using cURL command

The following example shows a cURL command that sends CPU, memory, and network utilization metrics to the Ingest REST API.

curl  https://<DSP_HOST>:31000/default/ingest/v1beta2/metrics \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -X POST \
  -d '[
        {
          "body": [
            {
              "name": "cpu.util",
              "value": 45.0,
              "unit": "percent"
            },
            {
              "name": "mem.util",
              "value": 20,
              "unit": "gigabytes"
            },
            {
              "name": "net.in",
              "value": 3000,
              "unit": "bytes/second"
            }
          ],
          "sourcetype": "aws:cloudwatch",
          "timestamp": 1526627123013,
          "attributes": {
             "defaultDimensions": {
             "InstanceId": "i-065d598370ac25b90",
             "Region": "us-west-1"
             },
             "defaultType": "g"
          }
        },
        {
          "body": [
            {
              "name": "cpu.util",
              "value": 49.0,
              "unit": "percent",
              "dimensions" : {
                 "InstanceId": "i-428f599604ba25f91"
                 }
            },
            {
              "name": "mem.util",
              "value": 22,
              "unit": "gigabytes"
            },
            {
              "name": "net.in",
              "value": 4000,
              "unit": "bytes/second"
            }
          ],
          "sourcetype": "aws:cloudwatch",
          "timestamp": 1526627123013,
          "attributes": {
             "defaultDimensions": {
                "InstanceId": "i-065d598370ac25b91",
                "Region": "us-west-1"
                },
             "defaultType": "g"
          }
        }
        ...
      ]'

Metrics example using SCloud command

To send data using the ./scloud ingest post-metrics command, format your data in a streaming JSON format where each line is an array of metrics. Unlike with cURL, SCloud sends the payload of the metrics event as the body field, so you don't need to include body in the command.

The ./scloud ingest post-metrics command works with SCloud 4.0.0 but not with SCloud 1.0.0. If you're using SCloud 1.0.0 and want to send data to the /metrics endpoint, see the Metrics example using cURL command section.

The following example shows an SCloud command that sends an array of CPU, memory, and network utilization metrics records to DSP.

echo '[{"name": "cpu.util", "value": 45.0, "unit": "percent", "sourcetype": "aws:cloudwatch", "dimensions": {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"}, "type": "g"}, {"name": "mem.util", "value": 20, "unit": "gigabytes", "dimensions": {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"}, "type": "g"}, {"name": "net.in", "value": 3000, "unit": "bytes/second", "sourcetype": "aws:cloudwatch", "dimensions": {"InstanceId": "i-065d598370ac25b90", "Region": "us-west-1"}, "type": "g"}]' | ./scloud ingest post-metrics --sourcetype "aws:cloudwatch" --timestamp 1526627123013 --default-type "g"

In this example, flags assign values to fields such as sourcetype and timestamp instead of ingesting them through the body of the records. When these fields are included as part of the body, they are dropped from the record and don't show up in the preview. For the full list of flags you can set, run the ./scloud ingest post-metrics --help command.

Quickly send test events using the Ingest REST API

If you want to quickly send multiple test events into your pipeline, one option is to use the demodata.txt file provided in the /examples folder of your DSP working directory.

Prerequisites

Steps

  1. From the Data Management page in the Data Stream Processor UI, select a pipeline with one of the prerequisite source functions.
  2. Click Start Preview.
  3. From your main working DSP directory, navigate to the /examples folder.
    cd examples
  4. Move the demodata.txt file from /examples into your main working directory.
    mv demodata.txt .. 
  5. Navigate back to the main working directory.
    cd ..
  6. Log in to the SCloud CLI with ./scloud login. The password is the same one that you use to log in to the Data Stream Processor. Run sudo ./print-login to print your username and password again.

    Your access token and other metadata are returned. Your access token expires in twelve hours, so log in periodically to refresh it.

  7. Use SCloud to send events from demodata.txt. Use one of the following commands:
    • To send the entire contents of the file to your pipeline, use one of these commands. This can take up to a minute to run completely.
    If you're using SCloud 1.0.0:
    cat demodata.txt | while read -r line; do echo "$line" | ./scloud ingest post-events -host Buttercup -source syslog -sourcetype Unknown; done

    If you're using SCloud 4.0.0:

    cat demodata.txt | while read -r line; do echo "$line" | ./scloud ingest post-events --host Buttercup --source syslog --sourcetype Unknown; done
    • To send a subset of the file to your pipeline, use one of these commands.

    If you're using SCloud 1.0.0:

    head -n <number of lines to read> demodata.txt | while read -r line; do echo "$line" | ./scloud ingest post-events -host Buttercup -source syslog -sourcetype Unknown; done

    If you're using SCloud 4.0.0:

    head -n <number of lines to read> demodata.txt | while read -r line; do echo "$line" | ./scloud ingest post-events --host Buttercup --source syslog --sourcetype Unknown; done