Use controls for sensitive data in Splunk APM
Sensitive data such as email addresses, credit card information, and social security numbers require careful handling to protect users and ensure compliance with industry standards. By default, Splunk APM does not capture any sensitive information. Splunk APM only receives information that is explicitly sent to it. If sensitive information is sent by mistake, however, Splunk APM provides a number of controls to prevent and mitigate leaks.
Sensitive data might fall within the categories of personally identifiable information (PII), customer-identifiable information (CII), cardholder data (CHD), or protected health information (PHI). It is necessary to protect these types of data to ensure compliance with industry requirements such as the Payment Card Industry Data Security Standard (PCI DSS), the Health Insurance Portability and Accountability Act (HIPAA), and the General Data Protection Regulation (GDPR).
Prerequisite
To configure sensitive data in Splunk APM, you must have the admin role.
Remove sensitive data using the Splunk Distribution of OpenTelemetry Collector
The first line of defense for sensitive information is to use the Splunk autoinstrumentation, which never captures sensitive information.
If sensitive data has been sent to Splunk Observability Cloud during manual instrumentation, you can remove it pre-ingest with the Splunk Distribution of the OpenTelemetry Collector. See Control data to ingest using the Collector for more information, and read the following scenario.
Note
A note about dropping spans
Concealing specific values in spans is the best approach to hide sensitive information in spans. It’s not possible to drop entire spans from your OpenTelemetry pipeline.
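As an illustration of concealing specific values rather than dropping spans, the following Python sketch deletes or hashes selected attributes in a dictionary of span attributes. It mirrors the idea behind the delete and hash actions of the Collector's attributes processor, but it is not Collector configuration. The function name, the example attribute values, and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib


def conceal_attributes(attributes, delete_keys=(), hash_keys=()):
    """Return a copy of span attributes with sensitive values concealed.

    Hypothetical helper: keys in delete_keys are removed, and values of
    keys in hash_keys are replaced with a SHA-256 hex digest.
    """
    concealed = {}
    for key, value in attributes.items():
        if key in delete_keys:
            # Drop the attribute entirely.
            continue
        if key in hash_keys:
            # Replace the value with a one-way hash (assumed algorithm).
            value = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        concealed[key] = value
    return concealed


# Example attribute values are made up for illustration.
span_attributes = {
    "user.email": "moira@example.com",
    "credit.card.number": "4111111111111111",
    "http.method": "POST",
}

safe_attributes = conceal_attributes(
    span_attributes,
    delete_keys={"credit.card.number"},
    hash_keys={"user.email"},
)
```

In this sketch the rest of the span stays intact, so the service remains visible while the sensitive values are gone.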
Use visibility filter APIs to block data in Splunk APM
To handle cases in which data is unintentionally sent to Splunk APM and hasn’t been removed in the data ingestion pipeline, users with administrator access can use an API to set visibility filters that block specific span tags in Splunk APM. Blocking a span tag hides the information everywhere in Splunk APM, including in API responses. The hidden information is not purged from Splunk APM until it expires after the 8-day default retention period.
The following are 3 examples of scenarios in which you might want to block this kind of information. Note that these APIs are configurable so that you can protect the span tags that might have leaked data without compromising the overall visibility of your services.
Note
See APM Visibility Filter API in the developer documentation for detailed guidance on using these APIs.
Block a specific span tag for a specific service
If you know that a specific span tag for a service might contain sensitive information, you can hide that span tag and its values everywhere in the Splunk APM UI.
For instance, imagine that Moira has manually instrumented a checkout service in Splunk APM and forgot to block the span tags user.email and credit.card.number in the instrumentation of the service. The following example API call blocks these 2 span tags for the readCartDetails operation in checkoutService.
Request:
POST https://api.<YOUR_REALM>.signalfx.com/v2/apm/visibility-filter
JSON payload:
{
    "description": "Data blocked due to leak on 04/03/21",
    "startTime": "2021-04-03T15:00:00.073876Z",
    "matcher": {
        "sf_service": "checkoutService",
        "sf_operation": "readCartDetails"
    },
    "hiddenTags": ["user.email", "credit.card.number"]
}
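The request and payload above can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the helper name is hypothetical, the token placeholder must be replaced with a token for a user with the admin role, and the use of the X-SF-TOKEN header for authentication is an assumption to confirm against the APM Visibility Filter API documentation. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint pattern taken from the request shown above.
VISIBILITY_FILTER_URL = "https://api.{realm}.signalfx.com/v2/apm/visibility-filter"


def build_visibility_filter_request(realm, token, payload):
    """Build (but do not send) the POST request for a visibility filter."""
    return urllib.request.Request(
        url=VISIBILITY_FILTER_URL.format(realm=realm),
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API token is passed in this header.
            "X-SF-TOKEN": token,
        },
    )


# Same payload as the JSON example above.
payload = {
    "description": "Data blocked due to leak on 04/03/21",
    "startTime": "2021-04-03T15:00:00.073876Z",
    "matcher": {
        "sf_service": "checkoutService",
        "sf_operation": "readCartDetails",
    },
    "hiddenTags": ["user.email", "credit.card.number"],
}

request = build_visibility_filter_request("us1", "<YOUR_API_TOKEN>", payload)
# To send the request, pass it to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it possible to review the URL and body before blocking any tags.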
Note
About realms
A realm is a self-contained deployment of Splunk Observability Cloud in which your organization is hosted. Different realms have different API endpoints. For example, the endpoint for sending data in the us1 realm is https://ingest.us1.signalfx.com, while the endpoint for sending data in the eu0 realm is https://ingest.eu0.signalfx.com.
When you see a placeholder realm name in the documentation, such as <YOUR_REALM>, replace it with your actual realm name. To find your realm name, open the navigation menu in Splunk Observability Cloud, select , and select your username. Locate the realm name in the Organizations section. If you don’t include the realm name when specifying an endpoint, Splunk Observability Cloud defaults to the us0 realm.
Remove sensitive information from database queries
To provide analytics for database queries, Splunk APM captures SQL statements, or queries, from the span in which each call happens. Database queries or statements might contain sensitive information.
After removing identifiable information using the attributes processor of the Collector, Splunk APM replaces all dynamic elements with the ? character, a procedure also known as normalization.
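To make the idea of normalization concrete, the following Python sketch replaces string and numeric literals in a SQL statement with the ? character. It is an illustration only and not the algorithm Splunk APM uses. The function name, the regular expression, and the sample query are assumptions made for this example.

```python
import re

# Matches single-quoted string literals (with '' escapes) or standalone
# numbers. Digits inside identifiers such as table1 are left untouched.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize_sql(statement):
    """Replace literal values in a SQL statement with the ? character."""
    return _LITERAL.sub("?", statement)


query = "SELECT * FROM users WHERE email = 'moira@example.com' AND id = 42"
normalized = normalize_sql(query)
# normalized is "SELECT * FROM users WHERE email = ? AND id = ?"
```

A simple pattern like this one misses values embedded in comments, unusual quoting styles, or free-form text, which is one way sensitive information can survive normalization.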
The following screenshot shows normalized database queries in Database Query Performance with replaced dynamic elements:
Because the db.statement attribute for SQL databases and the db.operation attribute for NoSQL databases might still contain sensitive information after normalization, use visibility filters to hide that information in Splunk APM. See Use visibility filter APIs to block data in Splunk APM for more information.
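A visibility filter payload for the database attributes might look like the following Python sketch. It reuses the payload shape from the visibility filter example earlier on this page. The helper name, the service and operation names, the description, and the start time are hypothetical values chosen for illustration, so check the exact fields against the APM Visibility Filter API documentation.

```python
import json


def db_visibility_filter(service, operation, start_time, description):
    """Build a visibility filter payload that hides database attributes."""
    return {
        "description": description,
        "startTime": start_time,
        "matcher": {
            "sf_service": service,
            "sf_operation": operation,
        },
        # Attributes that might still hold sensitive data after normalization.
        "hiddenTags": ["db.statement", "db.operation"],
    }


# Hypothetical service, operation, and timestamp.
db_payload = db_visibility_filter(
    service="checkoutService",
    operation="readCartDetails",
    start_time="2021-04-03T15:00:00.073876Z",
    description="Hide database statements that might contain leaked data",
)

print(json.dumps(db_payload, indent=2))
```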
Note
To turn off database query normalization, see Turn off database query normalization.