Hash fields using Ingest Processor
Create a pipeline that hashes specific fields in your data. When you hash a field, the Ingest Processor uses the selected hashing algorithm to compute a hash value, or "digest", from the original data values in that field. You can hash fields to obfuscate data so that it is not directly human-readable.
Be aware that hashing alone might not be sufficient for anonymizing sensitive data or meeting compliance guidelines. Refer to your organization's compliance policies for more information.
As a best practice for preventing unwanted data loss, make sure to always have a default destination for your Ingest Processor pipeline. Otherwise, all unprocessed data is dropped.
Supported hashing algorithms
Ingest Processor supports the following hashing algorithms:
Hashing algorithm | Digest size | Example SPL2 |
---|---|---|
MD5 | 128-bit hash value | `$pipeline = \| from $source \| eval <hashed_field> = md5(<original_field>) \| into $destination` |
SHA-1 | 160-bit hash value | `$pipeline = \| from $source \| eval <hashed_field> = sha1(<original_field>) \| into $destination` |
SHA-256 | 256-bit hash value | `$pipeline = \| from $source \| eval <hashed_field> = sha256(<original_field>) \| into $destination` |
SHA-512 | 512-bit hash value | `$pipeline = \| from $source \| eval <hashed_field> = sha512(<original_field>) \| into $destination` |
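To make the digest sizes in the table above concrete, the following minimal sketch computes all four digests outside of Ingest Processor, using Python's standard `hashlib` module. The input value is hypothetical, and the sketch assumes that digests are rendered as hexadecimal strings. It illustrates the algorithms only, not how Ingest Processor is implemented.

```python
import hashlib

# Hypothetical field value; any string would do.
original_value = "alice@example.com"

# Digest sizes in bits, as listed in the table above.
expected_bits = {"md5": 128, "sha1": 160, "sha256": 256, "sha512": 512}

digests = {}
for name in expected_bits:
    # hashlib.new() selects the algorithm by name and hashes the UTF-8 bytes.
    digests[name] = hashlib.new(name, original_value.encode("utf-8")).hexdigest()

for name, digest in digests.items():
    # Each hexadecimal character encodes 4 bits of the digest.
    print(f"{name}: {len(digest) * 4} bits -> {digest}")
```

The same input always produces the same digest, which is why a hashed field can still be used to correlate events even though the original value is no longer readable.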
Prerequisites
Before starting to create a pipeline, confirm that the destination that you want the pipeline to send data to is listed on the Destinations page of your tenant. If your destination is not listed, then you must add that destination to your tenant. See Add or manage destinations for more information.
Steps
Perform the following steps to create a pipeline that hashes an event field:
Create a pipeline
Complete these steps to create a basic pipeline that receives a specific subset of the incoming data and then sends that data to a destination.
- Navigate to the Pipelines page, then select New pipeline and then Ingest Processor pipeline.
- Select Blank pipeline, and then select Next.
- On the Define your pipeline's partition page, do the following:
- Select the plus icon next to Partition, or select the option that matches how you would like to create your partition in the Suggestions section.
- In the Field field, specify the event field that you want the partitioning condition to be based on.
- To specify whether the pipeline includes or excludes the data that meets the criteria, select Keep or Remove.
- In the Operator field, select an operator for the partitioning condition.
- In the Value field, enter the value that your partition should filter by to create the subset.
- Select Apply.
- Once you have defined your partition, select Next.
- (Optional) On the Add sample data page, enter or upload sample data for generating previews that show how your pipeline processes data.
The sample data must be in the same format as the actual data that you want to process. See Getting sample data for previewing data transformations for more information.
- Select Next to confirm the sample data that you want to use for your pipeline.
- On the Select a metrics destination page, select the name of the destination that you want to send metrics to.
- (Optional) If you selected Splunk Metrics store as your metrics destination, specify the name of the target metrics index where you want to send your metrics.
- On the Select a data destination page, select the name of the destination that you want to send logs to.
- (Optional) If you selected a Splunk platform destination, you can configure index routing:
- Select one of the following options in the expanded destinations panel:
Option | Description |
---|---|
Default | The pipeline does not route events to a specific index. If the event metadata already specifies an index, then the event is sent to that index. Otherwise, the event is sent to the default index of the Splunk Cloud Platform deployment. |
Specify index for events with no index | The pipeline only routes events to your specified index if the event metadata did not already specify an index. |
Specify index for all events | The pipeline routes all events to your specified index. |
- If you selected Specify index for events with no index or Specify index for all events, then from the Index name drop-down list, select the name of the index that you want to send your data to.
If your desired index is not available in the drop-down list, then confirm that the index is configured to be available to the tenant and then refresh the connection between the tenant and the Splunk Cloud Platform deployment. For detailed instructions, see Make more indexes available to the tenant.
- Select Done to confirm the data destination.
You can create more conditions for a partition in a pipeline by selecting the plus icon.
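The partitioning condition described in the steps above (a field, an operator, a value, and a Keep or Remove choice) can be pictured as a simple filter. The following sketch is an illustration only: the operator names, the `in_partition` helper, and the sample events are hypothetical and are not taken from the Ingest Processor UI or API.

```python
def in_partition(event, field, operator, value, action="Keep"):
    """Return True if the pipeline should receive this event."""
    # Hypothetical operator names, for illustration only.
    operators = {
        "equals": lambda a, b: a == b,
        "does not equal": lambda a, b: a != b,
    }
    matches = operators[operator](event.get(field), value)
    # Keep: receive events that meet the criteria.
    # Remove: receive every event that does not meet the criteria.
    return matches if action == "Keep" else not matches


# Hypothetical incoming events.
events = [
    {"sourcetype": "auth_logs", "user": "alice"},
    {"sourcetype": "web_logs", "user": "bob"},
]

kept = [e for e in events if in_partition(e, "sourcetype", "equals", "auth_logs")]
removed = [
    e for e in events
    if in_partition(e, "sourcetype", "equals", "auth_logs", action="Remove")
]
```

Here `kept` contains only the `auth_logs` event, while `removed` contains everything except it.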
If you're sending data to a Splunk Cloud Platform deployment, be aware that the destination index is determined by a precedence order of configurations. See How does Ingest Processor know which index to send data to? for more information.
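As a rough illustration of the three index routing options described above, the following sketch returns the index that an event would be sent to under each option. The `resolve_index` function and the index names are hypothetical, and the sketch covers only the pipeline-level option, not the full precedence order of configurations.

```python
def resolve_index(event_index, option, specified_index=None, default_index="main"):
    """Pick the target index for one event under a given routing option.

    event_index is the index already set in the event metadata, or None.
    default_index stands in for the deployment's default index (assumed name).
    """
    if option == "Default":
        # The pipeline does not route; the event's own index or the default wins.
        return event_index or default_index
    if option == "Specify index for events with no index":
        # The specified index is used only when the event has none.
        return event_index or specified_index
    if option == "Specify index for all events":
        # The specified index always wins.
        return specified_index
    raise ValueError(f"Unknown routing option: {option}")


no_index_event = resolve_index(None, "Specify index for events with no index", "security")
has_index_event = resolve_index("web", "Specify index for events with no index", "security")
all_events = resolve_index("web", "Specify index for all events", "security")
```

In this sketch an event that already names the `web` index keeps it under the second option but is rerouted to `security` under the third.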
You now have a simple pipeline that receives a specific subset of the incoming data and sends that data to a destination. In the next section, you'll configure this pipeline to hash an event field.
Configure hashing in your pipeline
During the previous step, you created a basic pipeline that receives a specific subset of data and then sends that data to a destination. The next step is to configure the pipeline to hash fields in the received events.
Be aware that after you hash an event field, the original plain text might still remain in other parts of the event. To hide the plain text, you must remove the field, mask the data, or perform both actions, as needed.
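The following minimal sketch shows why hashing a field is not enough on its own. It assumes a hypothetical event in which the `user` field was extracted from the raw event text stored in `_raw`. The field names and values are illustrative only.

```python
import hashlib

# Hypothetical event: the "user" field was extracted from the raw text.
event = {
    "_raw": "2024-05-01T12:00:00Z login succeeded user=alice",
    "user": "alice",
}

# Hash the extracted field in place (SHA-256 chosen arbitrarily).
event["user"] = hashlib.sha256(event["user"].encode("utf-8")).hexdigest()

# The field now holds a digest, but the raw text is untouched.
plain_text_still_present = "alice" in event["_raw"]
print(plain_text_still_present)  # prints True
```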
- Select the plus icon in the Actions section, then select Compute hash of.
- In the Compute hash of a field dialog box, do the following:
- In the Source field field, specify the field containing the plain text values that you want to compute into hash values.
- Select the hashing algorithm that you want to use to compute the hash values.
- In the Target field field, enter the name of an event field where you want to store the hash values. You can specify an existing event field or the name of a new field that you want to add to your events. If you want to overwrite the original plain text values in the specified Source field with the hash values, then enter the same field name that you specified in the Source field setting.
- When you have completed your configurations, select Apply.
- If the original plain text values still exist in other parts of the event, then configure additional processing actions to remove or mask those values.
- For information about dropping fields from events, see fields command overview in the SPL2 Search Reference.
- For information about masking data values, see Filter and mask data using Ingest Processor.
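The Target field behavior and the follow-up clean-up described in the steps above can be sketched as follows. This is an illustration in Python rather than the pipeline's own logic: the `compute_hash` helper, the field names, and the `user=<redacted>` mask are all hypothetical.

```python
import hashlib
import re


def compute_hash(event, source_field, target_field, algorithm="sha256"):
    """Return a copy of the event with target_field set to the digest of source_field."""
    hashed = dict(event)
    value = str(event[source_field]).encode("utf-8")
    hashed[target_field] = hashlib.new(algorithm, value).hexdigest()
    return hashed


# Hypothetical event with the plain text in both a field and the raw text.
event = {"_raw": "login succeeded user=alice", "user": "alice"}

# Same source and target: the plain text field value is overwritten.
overwritten = compute_hash(event, "user", "user")

# New target: the digest is added and the original field is kept.
added = compute_hash(event, "user", "user_hash")

# Follow-up clean-up: drop the original field and mask the value in the raw text.
cleaned = {key: value for key, value in added.items() if key != "user"}
cleaned["_raw"] = re.sub(r"user=\S+", "user=<redacted>", cleaned["_raw"])
```

After the clean-up, `cleaned` holds only the digest in `user_hash` and a raw text value in which the user name is masked.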
You now have a pipeline that hashes a selected field. In the next section, you'll verify that this pipeline processes data in the way that you expect and save it to be applied to an Ingest Processor.
Preview, save, and apply your pipeline
- (Optional) Select the Preview Pipeline icon to generate a preview that shows what your data looks like when it passes through the pipeline.
- To save your pipeline, do the following:
- Select Save pipeline.
- In the Name field, enter a name for your pipeline.
- (Optional) In the Description field, enter a description for your pipeline.
- Select Save.
The pipeline is now listed on the Pipelines page, and you can apply it as needed.
- To apply this pipeline, do the following:
- Navigate to the Pipelines page.
- In the row that lists your pipeline, select the Actions icon and then select Apply.
- Select the pipeline that you want to apply, and then select Save.
It can take a few minutes for the Ingest Processor service to finish applying your pipeline. During this time, the pipeline enters the Pending Apply status. Once the operation is complete, the Pending Apply status icon stops displaying beside the pipeline. Refresh your browser to check whether the icon is still displayed.
The Ingest Processor can now hash the specified field in the events that it receives.
This documentation applies to the following versions of Splunk Cloud Platform™: 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)