On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details.
About lookup cache quotas
You can cache the contents of a lookup to improve lookup performance. However, there are some limitations that you should be aware of. These limitations apply only when you are using the Lookup function, not the Write Thru KV Store function.
The lookup cache is subject to a quota, or a maximum amount of data that can be contained, per pipeline. The following table describes the cache quota that applies for each type of lookup.
Lookup type | Default cache quota per pipeline |
---|---|
CSV | 50 MiB |
KV Store | 200 MiB |
Although each lookup type has its own cache quota, the percentage of the quota that can be used is shared between CSV and KV Store lookups. For example, if 30% of the CSV cache quota is used, then only 70% of the KV Store cache quota remains available. As another example, suppose that you have the following:
- A pipeline with four lookup functions: two lookups to CSV files and two lookups to KV Stores.
- CSV files that are 10 MiB and 20 MiB in size.
In this case, you have used 30 MiB of your 50 MiB CSV quota, or 60% of your total quota. That means that 40% of the KV Store cache quota, or 80 MiB, remains for KV Store lookups (0.4 * 200 MiB = 80 MiB). To stay under the cache size limitations, the cache_size parameter for each KV Store connection should be 40 MiB or less. Since you have two KV Store lookups in your pipeline, this adds up to 80 MiB, or 83886080 bytes.
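The shared-percentage arithmetic in the example above can be sketched as a small shell helper. The function name remaining_kv_bytes and its argument are illustrative and not part of the product; the 50 MiB and 200 MiB figures are the default quotas from the table above, so adjust them if you have changed the quotas.

```shell
#!/bin/sh
# Hypothetical helper (not a product command): given the total size of
# cached CSV lookups in MiB, print the remaining KV Store budget in bytes.
CSV_QUOTA_MIB=50     # default CSV quota per pipeline
KV_QUOTA_MIB=200     # default KV Store quota per pipeline

remaining_kv_bytes() {
  csv_used_mib=$1
  # Percentage of the CSV quota already consumed (integer arithmetic).
  used_pct=$(( csv_used_mib * 100 / CSV_QUOTA_MIB ))
  # The same percentage is unavailable to KV Store lookups.
  kv_left_mib=$(( KV_QUOTA_MIB * (100 - used_pct) / 100 ))
  echo $(( kv_left_mib * 1024 * 1024 ))
}

total=$(remaining_kv_bytes 30)   # 10 MiB + 20 MiB of CSV files
echo "KV Store budget: $total bytes"
echo "Per lookup (2 lookups): $(( total / 2 )) bytes"
```

With the figures from the example, the budget is 83886080 bytes in total, or 41943040 bytes (40 MiB) for each of the two KV Store lookups.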
Caching is disabled by default for KV Store lookups. To enable caching for KV Store lookups, see Connect to the Splunk Enterprise KV Store using the Streams API.
Configure the maximum lookup cache quota
To keep pipelines that use a lookup from being cancelled, make sure that all lookup results fit into the cache. Two settings control the lookup cache quota.
- The K8S_SS_REST_LOOKUP_QUOTA_MAX_STATIC_MB setting specifies the maximum cache quota per pipeline for CSV lookups.
- The K8S_SS_REST_LOOKUP_QUOTA_MAX_CACHED_BYTES setting specifies the maximum cache quota per pipeline for KV Store lookups.
In addition to the two settings above, you may need to update the Kubernetes memory settings to make sure that you have enough memory to support the desired cache quotas.
- The K8S_FLINK_TASK_MGR_MEM_REQUEST setting specifies the minimum amount of memory assigned to the pod.
- The K8S_FLINK_TASK_MGR_MEM_LIMIT setting specifies the maximum amount of memory assigned to the pod.
- The K8S_FLINK_TASK_MGR_HEAP_MB setting specifies the heap size for the pod.
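Before changing these three memory settings, it can help to check that the planned values are consistent with each other. The following is a minimal sketch, assuming all three values are expressed as plain integers in MB; the function name check_task_mgr_memory and the sample values are hypothetical. It relies on two general facts rather than anything specific to this product: Kubernetes does not accept a memory request larger than the memory limit, and a JVM heap has to fit inside the memory available to its container.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not a product command).
# Arguments: heap size, memory request, memory limit, all as MB integers.
check_task_mgr_memory() {
  heap_mb=$1; request_mb=$2; limit_mb=$3
  if [ "$request_mb" -gt "$limit_mb" ]; then
    echo "request (${request_mb} MB) exceeds limit (${limit_mb} MB)"
    return 1
  fi
  if [ "$heap_mb" -ge "$limit_mb" ]; then
    echo "heap (${heap_mb} MB) leaves no headroom under limit (${limit_mb} MB)"
    return 1
  fi
  echo "ok"
}

# Illustrative values only; substitute the values you plan to set.
check_task_mgr_memory 3400 4096 6144
```

How much headroom to leave between the heap and the limit depends on your deployment, so this sketch only flags values that cannot work.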
Configure the cache quota for CSV lookups
Do the following steps to increase the cache quota for CSV lookups. This allows you to upload CSV files larger than 50MiB using the Streams API.
- Configure the cache quota for CSV lookups by running the following command from the command line. The value must be in mebibytes (MiB).
./set-config K8S_SS_REST_LOOKUP_QUOTA_MAX_STATIC_MB <value>
- Increase the heap size. Best practice is to increase the heap size by at least 4x the size of the file. For example, if your heap is currently set to 3000 MB (3 GB) and you are increasing the lookup cache quota to 100 MB, set the heap size to at least 3400 MB (3000 + 4 * 100).
./set-config K8S_FLINK_TASK_MGR_HEAP_MB <value>
- Since you are increasing the cache quota, you should also increase the minimum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
./set-config K8S_FLINK_TASK_MGR_MEM_REQUEST <value>
- Since you are increasing the cache quota, you should also increase the maximum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
./set-config K8S_FLINK_TASK_MGR_MEM_LIMIT <value>
- Deploy your changes.
./deploy
Even if you increase the CSV cache quota, there is still a maximum file size of 50 MB when uploading a CSV file using the UI. If you want to upload a larger CSV file, you'll need to upload the CSV file using the Streams API. See Upload a CSV file to the Splunk Data Stream Processor to enrich data with a lookup.
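The heap sizing rule from the CSV steps above can be sketched as follows. The helper name csv_heap_mb is hypothetical, and the starting heap of 3000 MB and new quota of 100 MB are taken from the example in the steps. The script prints the set-config commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helper (not a product command):
# new heap size in MB = current heap + 4 x CSV file size.
csv_heap_mb() {
  current_heap_mb=$1
  csv_file_mb=$2
  echo $(( current_heap_mb + 4 * csv_file_mb ))
}

new_quota_mb=100                                  # planned CSV cache quota
new_heap_mb=$(csv_heap_mb 3000 "$new_quota_mb")   # 3000 + 4 * 100

# Print the commands so that they can be reviewed before running them.
echo "./set-config K8S_SS_REST_LOOKUP_QUOTA_MAX_STATIC_MB $new_quota_mb"
echo "./set-config K8S_FLINK_TASK_MGR_HEAP_MB $new_heap_mb"
echo "./deploy"
```

The memory request and limit are left out of this sketch because suitable values depend on your cluster; set them with the K8S_FLINK_TASK_MGR_MEM_REQUEST and K8S_FLINK_TASK_MGR_MEM_LIMIT commands from the steps before you deploy.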
Configure the cache quota for KV Store lookups
- Configure the cache quota for KV Store lookups by running the following command from the command line. The value must be in bytes.
./set-config K8S_SS_REST_LOOKUP_QUOTA_MAX_CACHED_BYTES <value>
- Increase the heap size. Best practice is to increase the heap size by the same amount that you are increasing the KV Store lookup cache size. For example, if you increased the K8S_SS_REST_LOOKUP_QUOTA_MAX_CACHED_BYTES cache size by 500 MiB, then you should increase the heap size by at least 500 MiB.
./set-config K8S_FLINK_TASK_MGR_HEAP_MB <value>
- Since you are increasing the cache quota, you should also increase the minimum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
./set-config K8S_FLINK_TASK_MGR_MEM_REQUEST <value>
- Since you are increasing the cache quota, you should also increase the maximum amount of memory allocated to a pod accordingly. For a list of accepted memory sizes, see the "Managing Resources for Containers" section in the Kubernetes documentation.
./set-config K8S_FLINK_TASK_MGR_MEM_LIMIT <value>
- Deploy your changes.
./deploy
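Because the KV Store quota is set in bytes while the heap is set in MB, the values for the steps above are easy to get wrong by hand. The following is a minimal sketch of the conversion, assuming the default 200 MiB quota, a current heap of 3000 MB (an assumed figure; substitute your own), and the 500 MiB increase from the example. The helper name mib_to_bytes is hypothetical, and the script treats MB and MiB as interchangeable for the heap, as the example in the steps does.

```shell
#!/bin/sh
# Hypothetical helper (not a product command): convert MiB to bytes.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }

current_quota_mib=200    # default KV Store quota per pipeline
current_heap_mb=3000     # assumed current heap size; substitute your own
increase_mib=500         # planned increase to the KV Store cache quota

new_quota_bytes=$(mib_to_bytes $(( current_quota_mib + increase_mib )))
# Grow the heap by the same amount as the cache quota.
new_heap_mb=$(( current_heap_mb + increase_mib ))

# Print the commands so that they can be reviewed before running them.
echo "./set-config K8S_SS_REST_LOOKUP_QUOTA_MAX_CACHED_BYTES $new_quota_bytes"
echo "./set-config K8S_FLINK_TASK_MGR_HEAP_MB $new_heap_mb"
echo "./deploy"
```

As with CSV lookups, the memory request and limit still need to be raised separately to values that suit your cluster.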
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02