Splunk® Data Stream Processor

Use the Data Stream Processor

On April 3, 2023, Splunk Data Stream Processor reached its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.

All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator, which has been announced end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, we will no longer provide support for versions of DSP prior to DSP 1.4.0 after July 1, 2023. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue to receive full product support from Splunk.

About lookup cache quotas

You can cache the contents of a lookup to improve lookup performance, but there are some limitations that you should be aware of. These limitations apply only when you use the Lookup function, not the Write Thru KV Store function.

The lookup cache is subject to a quota, which is the maximum amount of data that the cache can hold per pipeline. The following table describes the cache quota that applies to each type of lookup.

Lookup type    Default cache quota per pipeline
CSV            50 MiB
KV Store       200 MiB

Although each lookup type has its own cache quota, the percentage of the quota that is used is shared between CSV and KV Store lookups. For example, if 30% of the CSV cache quota is used, then only 70% of the KV Store cache quota remains available. As another example, suppose that you have the following:

  • A pipeline with four lookup functions: two lookups to CSV files and two lookups to KV Stores.
  • CSV files that are 10 MiB and 20 MiB in size.

In this case, you have used 30 MiB of your 50 MiB CSV cache quota, or 60% of that quota. That means that 40% of the KV Store cache quota, or 80 MiB, remains for KV Store lookups (0.4 * 200 MiB = 80 MiB). To stay within the cache quota, the cache_size parameter for each KV Store connection should be 40 MiB. Because you have two KV Store lookups in your pipeline, this adds up to 80 MiB, or 83886080 bytes.
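
The arithmetic in this example can be sketched as a small shell helper. This is only an illustration: the function name remaining_kv_mib and the variable names are hypothetical and are not part of DSP, and the sketch assumes the default quotas from the table above and whole-number sizes in MiB.

```shell
#!/bin/sh
# Default per-pipeline cache quotas from the table above, in MiB.
CSV_QUOTA_MIB=50
KV_QUOTA_MIB=200

# remaining_kv_mib CSV_SIZE_MIB...
# Hypothetical helper: prints the KV Store cache budget (in MiB) that remains
# after the given CSV files have used their share of the CSV quota.
remaining_kv_mib() {
    used_mib=0
    for size_mib in "$@"; do
        used_mib=$((used_mib + size_mib))
    done
    # Multiply before dividing so that integer arithmetic does not truncate early.
    echo $(( (CSV_QUOTA_MIB - used_mib) * KV_QUOTA_MIB / CSV_QUOTA_MIB ))
}

kv_mib=$(remaining_kv_mib 10 20)
echo "KV Store budget: ${kv_mib} MiB ($((kv_mib * 1024 * 1024)) bytes)"
echo "Per lookup with two KV Store lookups: $((kv_mib / 2)) MiB"
```

With CSV files of 10 MiB and 20 MiB, the sketch reports a KV Store budget of 80 MiB (83886080 bytes), or 40 MiB for each of the two KV Store lookups.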

Configure the maximum lookup cache quota

To prevent pipelines that use a lookup from being cancelled, make sure that all lookup results fit into the cache. Two settings control the lookup cache quota.

  • The lookup_quota_max_static_mb setting specifies the maximum cache quota per pipeline for CSV lookups.
  • The lookup_quota_max_cached_bytes setting specifies the maximum cache quota per pipeline for KV Store lookups.

In addition to the two settings above, you may need to update the Kubernetes memory settings to make sure that you have enough memory to support the desired cache quotas.

  • The tm_mem_request setting specifies the minimum amount of memory assigned to the pod.
  • The tm_mem_limit setting specifies the maximum amount of memory assigned to the pod.

Configure the cache quota for CSV lookups

Complete the following steps to increase the cache quota for CSV lookups. This allows you to upload CSV files larger than 50 MiB using the Streams API.

  1. Navigate to the working directory of a DSP controller node.
  2. Configure the cache quota for CSV lookups by running the following command on the command line. The value must be in mebibytes (MiB).
    ./dsp config set streams lookup_quota_max_static_mb=<value>
  3. Since you are increasing the cache quota, you should also increase the minimum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
    ./dsp config set flink tm_mem_request=<value>
  4. Since you are increasing the cache quota, you should also increase the maximum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
    ./dsp config set flink tm_mem_limit=<value>
  5. Deploy your changes.
    ./dsp deploy streams flink
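
As a dry-run sketch of the steps above, the following helper only prints the command sequence for a chosen set of values so that you can review it before running anything. The function name print_csv_quota_commands is hypothetical, and the values 100, 4Gi, and 6Gi are illustrative assumptions, not recommendations; choose values that fit your own cluster.

```shell
#!/bin/sh
# print_csv_quota_commands QUOTA_MIB MEM_REQUEST MEM_LIMIT
# Hypothetical helper: prints (does not run) the commands from the steps above.
#   QUOTA_MIB    CSV cache quota in MiB
#   MEM_REQUEST  minimum pod memory, as a Kubernetes memory size
#   MEM_LIMIT    maximum pod memory, as a Kubernetes memory size
print_csv_quota_commands() {
    printf './dsp config set streams lookup_quota_max_static_mb=%s\n' "$1"
    printf './dsp config set flink tm_mem_request=%s\n' "$2"
    printf './dsp config set flink tm_mem_limit=%s\n' "$3"
    printf './dsp deploy streams flink\n'
}

# Illustrative values only.
print_csv_quota_commands 100 4Gi 6Gi
```

Printing the commands first makes it easier to check that the memory request does not exceed the memory limit before you deploy.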

Even if you increase the CSV cache quota, there is still a maximum file size of 50MB when uploading a CSV file using the UI. If you want to upload a larger CSV file, you must upload it using the Streams API. See Upload a CSV file to the Data Stream Processor to enrich data with a lookup.

Configure the cache quota for KV Store lookups

  1. Navigate to the working directory of a DSP controller node.
  2. Configure the cache quota for KV Store lookups by running the following command on the command line. The value must be in bytes.
    ./dsp config set streams lookup_quota_max_cached_bytes=<value>
  3. Since you are increasing the cache quota, you should also increase the minimum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
    ./dsp config set flink tm_mem_request=<value>
  4. Since you are increasing the cache quota, you should also increase the maximum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
    ./dsp config set flink tm_mem_limit=<value>
  5. Deploy your changes.
    ./dsp deploy streams flink
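
Because lookup_quota_max_cached_bytes takes a value in bytes, it can help to convert from MiB before running the command. The following is a minimal sketch of that conversion. The function names mib_to_bytes and print_kv_quota_command are hypothetical, the 400 MiB figure is only an illustrative assumption, and the helper prints the command rather than running it.

```shell
#!/bin/sh
# mib_to_bytes MIB
# Hypothetical helper: converts a whole number of MiB to bytes (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# print_kv_quota_command QUOTA_MIB
# Hypothetical helper: prints (does not run) the quota command with the value in bytes.
print_kv_quota_command() {
    printf './dsp config set streams lookup_quota_max_cached_bytes=%s\n' "$(mib_to_bytes "$1")"
}

# Illustrative value only: a 400 MiB KV Store cache quota.
print_kv_quota_command 400
```

For 400 MiB, the printed value is 419430400 bytes.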
Last modified on 21 November, 2022

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5

