All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator, which has been announced as end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, we will no longer provide support for versions of DSP prior to DSP 1.4.0 after July 1, 2023. We advise all of our customers to upgrade to DSP 1.4.0 to continue receiving full product support from Splunk.
Write thru KV Store
This topic describes how to use the Write thru KV Store function in the Splunk Data Stream Processor.
Description
Writes data to a KV Store collection that you specify. In the SPL2 Pipeline Builder, this function must be used directly after thru.
To use this function, you must first connect to a Splunk Enterprise KV Store. See Connect DSP to a Splunk Enterprise KV Store.
Because this function is a passthrough (thru) function, it cannot be the last function in the pipeline. If it is the last function in the pipeline, then the pipeline fails to validate with error "Sink dataset [kvstore_lookup_sink] does not exist in registry".
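For example, here is a minimal sketch of valid placement. The splunk_firehose source function and the kvstore_lookup_sink arguments follow this topic's conventions, but the argument values are illustrative and the final into clause is a placeholder; substitute the sink function that your pipeline actually uses:

| from splunk_firehose() | thru kvstore_lookup_sink(lookup_dataset: "my_kvstore_lookup", predicate: isnotnull(timestamp), mode:"append", lookup_fields: ["date"], event_fields: ["timestamp"]) | into <your_sink>(...);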
Syntax
The required syntax is in bold.
- **kvstore_lookup_sink <lookup_dataset>**
- **predicate:<boolean-expression>**
- **lookup_fields:<fields>...**
- **event_fields:<fields>...**
- mode=replace | append
Required arguments
- lookup_dataset
- Syntax: <string>
- Description: The name of the connection to your KV Store collection. Before using this function, you must connect to a Splunk Enterprise KV Store. See Connect DSP to a Splunk Enterprise KV Store.
- Example: my_kvstore_lookup
- predicate
- Syntax: expression<boolean>
- Description: An SPL2 expression that returns a Boolean value. Data is written to the KV Store when the predicate evaluates to true. See Predicates in the Search Manual.
- Example: isnotnull(timestamp)
- lookup_fields
- Syntax: <fields>
- Description: The field names in the KV Store to match to the field names in the incoming records. The lookup_fields and event_fields arguments must contain the same number of entries and be listed in the same order for proper matching; see the sketch after this list. If you are using the replace mode, you must explicitly specify the _key field, since that is the primary key in Splunk Enterprise KV collection datasets.
- Example: _key, date, source
- event_fields
- Syntax: <fields>
- Description: The field names in the incoming records to match to the field names in the KV Store. The lookup_fields and event_fields arguments must contain the same number of entries and be listed in the same order for proper matching.
- Example: id, timestamp, source
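As a sketch of that positional pairing (the connection and field names are illustrative), the first entry of lookup_fields matches the first entry of event_fields, and so on. Here _key pairs with id, date with timestamp, and source with source:

...| thru kvstore_lookup_sink(lookup_dataset: "my_kvstore_lookup", predicate: isnotnull(timestamp), mode:"replace", lookup_fields: ["_key", "date", "source"], event_fields: ["id", "timestamp", "source"]) |...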
Optional arguments
- mode
- Syntax: replace or append
- Description: Choose a mode to specify how DSP writes data to the KV Store.
- Example: append
Write mode | Description |
---|---|
replace | In replace mode, incoming data from DSP replaces existing data in the KV Store collection using a defined primary key. The function takes two lists of the same length, lookup_fields and event_fields. event_fields contains field names from the incoming event, and lookup_fields contains the corresponding field names to update in the collection. One of the lookup_fields must be _key, and the corresponding field name in event_fields designates the primary key used to match records. See the SPL2 example below. Because replace mode overwrites rows in the collection that have the specified _key, only a single version of each row exists, making it easy to see the most recent data. |
append | In append mode, the primary key field _key is auto-generated, so incoming data from DSP is appended to the designated KV Store collection. The KV Store's existing rows are not updated. |
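For comparison, here are two illustrative calls (the dataset and field names are placeholders, not part of this example's data). In replace mode, _key in lookup_fields pairs with the record field that identifies the row to overwrite; in append mode, _key is omitted because the KV Store generates it:

...| thru kvstore_lookup_sink(lookup_dataset: "my_kvstore_lookup", predicate: isnotnull(productId), mode:"replace", lookup_fields: ["_key", "price"], event_fields: ["id", "sale_price"]) |...

...| thru kvstore_lookup_sink(lookup_dataset: "my_kvstore_lookup", predicate: isnotnull(productId), mode:"append", lookup_fields: ["price"], event_fields: ["sale_price"]) |...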
Usage
This section contains additional usage information about the Write Thru KV Store function.
Previewing the Write thru KV Store function
Because this function does not do any transformations on incoming streaming data, this function shows the same preview results as the function preceding it.
Performing lookups on large KV stores
When your KV Store collection is extremely large, performance can suffer when your lookups must search through the entire collection to retrieve matching field values. When you are writing data to a Splunk Enterprise KV Store, be mindful of your collection size and growth. If you are seeing performance issues on your lookup function, then your Splunk Enterprise KV Store collection might be reaching capacity and you might need to manually delete old entries from the collection. See Use the REST API to manage KV Store collections and data for information on how to delete records from a collection.
SPL2 examples
1. Replace data in a KV Store collection with data from DSP
This example writes incoming streaming data to the lookup dataset mylookupdataset.
...| thru kvstore_lookup_sink(lookup_dataset: "mylookupdataset", predicate: not(isnull('productId')), mode:"replace", lookup_fields: ["_key", "productId", "product_name", "price"], event_fields: ["id", "productId", "body", "sale_price"]);
Suppose your KV Store collection contains the following data:
_key | productId | product_name | price |
---|---|---|---|
5f3f5454d240a1300236fec1 | DB-SG-G01 | Mediocre Kingdoms | 24.99 |
5f3f5454d240a1300236fec2 | DC-SG-G02 | Dream Crusher | 39.99 |
5f3f5454d240a1300236fec3 | WC-SH-G04 | World of Cheese | 19.99 |
The incoming DSP records look something like this:
id | productId | body | sale_price |
---|---|---|---|
5f3f5454d240a1300236fec1 | DB-SG-G01 | Mediocre Kingdoms | 19.99 |
5f3f5454d240a1300236fec2 | DB-SG-G02 | Dream Crusher | 14.99 |
5f3f5454d240a1300236fec3 | DB-SG-G03 | World of Cheese | 9.99 |
After configuring the Write Thru KV Store function, the data in the KV Store collection is updated with data from DSP.
_key | productId | product_name | price |
---|---|---|---|
5f3f5454d240a1300236fec1 | DB-SG-G01 | Mediocre Kingdoms | 19.99 |
5f3f5454d240a1300236fec2 | DB-SG-G02 | Dream Crusher | 14.99 |
5f3f5454d240a1300236fec3 | DB-SG-G03 | World of Cheese | 9.99 |
2. Add data from DSP to the KV Store collection
This example writes incoming streaming data to the lookup dataset mylookupdataset.
...| thru kvstore_lookup_sink(lookup_dataset: "mylookupdataset", predicate: not(isnull(productId)), mode:"append", lookup_fields: ["productId", "product_name", "price"], event_fields: ["productId", "body", "sale_price"]);
Suppose your KV Store collection contains the following data:
_key | productId | product_name | price |
---|---|---|---|
5f3f5454d240a1300236fec1 | DB-SG-G01 | Mediocre Kingdoms | 24.99 |
5f3f5454d240a1300236fec2 | DC-SG-G02 | Dream Crusher | 39.99 |
5f3f5454d240a1300236fec3 | WC-SH-G04 | World of Cheese | 19.99 |
The incoming DSP records look something like this:
id | productId | body | sale_price |
---|---|---|---|
2518594268716256 | EW-SG-G04 | Escape from Waterworld | 35.00 |
2518594268716257 | FT-SG-G05 | Farm Town | 19.99 |
2518594268716258 | DB-SG-G06 | Puzzle Solver | 4.99 |
After configuring the Write Thru KV Store function, the data in the KV Store collection is now enriched with the incoming streaming data. Note that because _key was not present in the DSP records, the KV Store generated a primary key (_key) value for each new row. In addition, even though id was a field in the DSP records, it was not written to the KV Store collection because it wasn't specified in the event_fields argument.
_key | productId | product_name | price |
---|---|---|---|
5f3f5454d240a1300236fec1 | DB-SG-G01 | Mediocre Kingdoms | 24.99 |
5f3f5454d240a1300236fec2 | DC-SG-G02 | Dream Crusher | 39.99 |
5f3f5454d240a1300236fec3 | WC-SH-G04 | World of Cheese | 19.99 |
5f3f5454d240a1300236fec4 | EW-SG-G04 | Escape from Waterworld | 35.00 |
5f3f5454d240a1300236fec5 | FT-SG-G05 | Farm Town | 19.99 |
5f3f5454d240a1300236fec6 | DB-SG-G06 | Puzzle Solver | 4.99 |
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6