Recover from a Disaster with Cross-region Disaster Recovery (Early Access)

Cross-region disaster recovery service level agreements and limitations

Cross-region disaster recovery is in the Early Access release phase. In the Early Access release phase, Splunk products might have limitations on customer access, features, maturity, and regional availability. Additionally, its documentation might receive frequent updates, or be incomplete or incorrect. For additional information on Early Access, contact your Splunk representative.

Splunk agrees to the following service level agreements (SLAs) and service limitations when you implement cross-region disaster recovery in your Splunk Cloud Platform environment.

Service level agreements

When you implement the service in your Splunk Cloud Platform environment, Splunk agrees to the following SLAs with respect to data availability, service continuity, and resumption for your environment.

SLA metric: Service recovery time objective (RTO)
What it means: The maximum delay between when Splunk calls a qualified regional disaster event and the restoration of service at 100% infrastructure capacity following the disaster event. Because of individual index bucket repair operations, some data that was ingested prior to a failover might remain unavailable for searching beyond the RTO.
Value: 120 minutes (2 hours)

SLA metric: Recovery point objective (RPO) for data ingestion
What it means: The maximum amount of time for which the ingestion and indexing system acknowledges that data might not be available on the secondary site following an unplanned failover. Splunk replicates and makes available 99.99% of the data it ingests during this timeframe. In most scenarios, any data that is not replicated is recoverable once the primary cloud service provider (CSP) region is operational again after the disaster ends.
Value: 15 minutes

SLA metric: RPO for available knowledge objects
What it means: The maximum amount of time for which the knowledge objects that are written to the search tier might be lost following an unplanned failover. The search tier can include search heads or App Key Value Store collections. See the Service limitations section later in this topic for a complete list of the knowledge objects that Splunk replicates to a secondary CSP region.
Value: 120 minutes (2 hours)
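
As a rough illustration of what the ingestion RPO can mean in practice, the following Python sketch estimates how much data falls inside one 15-minute RPO window and how much of it might not yet be replicated at failover time. The steady ingest rate, the example volume of 960 GB per day, and the function name are assumptions for illustration only. Reading the 99.99% figure as a per-window fraction is also an interpretation, not a statement of how Splunk measures the SLA.

```python
# Hypothetical sketch: exposure arithmetic based on the SLA values above.
INGEST_RPO_MINUTES = 15        # RPO for data ingestion
REPLICATED_FRACTION = 0.9999   # 99.99% of data ingested during the RPO window


def rpo_window_exposure_gb(daily_ingest_gb: float) -> dict:
    """Estimate volumes for one RPO window, assuming a steady ingest rate."""
    per_minute_gb = daily_ingest_gb / (24 * 60)
    window_gb = per_minute_gb * INGEST_RPO_MINUTES
    # Portion of the window that might not be replicated at failover time.
    unreplicated_gb = window_gb * (1 - REPLICATED_FRACTION)
    return {"window_gb": window_gb, "unreplicated_gb": unreplicated_gb}


# Assumed example volume: 960 GB ingested per day at a constant rate.
exposure = rpo_window_exposure_gb(daily_ingest_gb=960.0)
print(f"Data in one RPO window: {exposure['window_gb']:.3f} GB")
print(f"Possibly unreplicated:  {exposure['unreplicated_gb']:.6f} GB")
```

Under these assumptions, one RPO window holds about 10 GB, of which roughly 0.001 GB (about 1 MB) might not be replicated when an unplanned failover occurs. Real ingest rates are rarely constant, so treat the result as an order-of-magnitude estimate.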

Service limitations

The following service limitations exist for environments that implement cross-region disaster recovery.

No support for other Splunk products and services

Cross-region disaster recovery is for Splunk Cloud Platform and its supported configurations only. There is no support for other Splunk products and services, including but not limited to the following:

  • Splunk Enterprise
  • Splunk forwarders of any kind
  • Splunk Application Performance Monitoring
  • Splunk Infrastructure Monitoring
  • Splunk Log Observer
  • Splunk On-Call
  • Splunk Real User Monitoring
  • Splunk Security Orchestration, Automation, and Response (SOAR)

No support for certain Splunk features

There is no support for using the following features with cross-region disaster recovery. If you implement the service in your environment, it does not replicate or recover any of these features.

  • Dynamic Data Self Storage (DDSS)
  • Federated Search S3 (Simple Storage Service)
  • Edge Processor
  • Business Analytics (BA)
  • Threat Intelligence Management (TIM)
  • Splunk Log Observer Connect
  • The simultaneous indexing and forwarding of data using the indexAndForward configuration setting

No support for replication of certain knowledge objects

There is no support for replicating the following types of search objects for the purposes of cross-region disaster recovery.

  • Search artifacts. The service does not replicate the search results of searches that you have previously run. If your environment has dashboards that use results from scheduled searches, either run the searches again or wait for their next scheduled run time to preserve the updated search data.
  • In-progress searches at failover time. Splunk does not replicate the state of searches that are in progress at the time a failover occurs.
  • Triggered alerts and suppression counters. While Splunk does replicate the definitions of both alerts and suppression counters, it does not replicate alerts and suppression counters that have already triggered. On subsequent search runs in the secondary environment after a failover, alerts continue to trigger, and suppression counters get reinitialized.

No support for environment availability in the case of an outage in the secondary cloud service provider region

The service provides resiliency in the scenario of a primary region outage. There is no support for either a regional outage in the secondary region or multiple regional outages.

Other limitations

  • There is no support for the service during an upgrade of your Splunk Cloud Platform environment.
  • Splunk Virtual Compute (SVC) usage data from the primary environment is not available while the environment is in a failed-over state. While your environment operates on the secondary region, you can view the SVC usage data from that region.
  • Admin Config Service (ACS) functionality might not be available while your environment is in a failed-over state. If you need to perform self-service administration activities while your Splunk Cloud Platform environment is failed over, create a support ticket instead of using ACS APIs.
  • If you use Enterprise Managed Encryption Keys (EMEK), Splunk encrypts data that comes from search heads using encryption keys that it generates, rather than your encryption keys.
  • If you use Dynamic Data Active Archive (DDAA) in your Splunk Cloud Platform environment, Splunk does not replicate any existing archived data to the secondary region. Replication of archived data begins after you implement the service in your environment. Archive data that has been replicated is available on the secondary region after a failover.
  • During normal operations, you can search internal system logs on the secondary region, but updates to the logs on the secondary region occur only once a week.
  • During a failover, real-time searches restart once a week.
  • If you create new indexes in your Splunk Cloud Platform environment, wait at least 30 minutes after you create the index before you send data to it. If you begin sending data to your indexes sooner, that data might end up in the "last chance" index instead of the index you specify. The "last chance" index is the index of last resort that Splunk configures for Splunk Cloud Platform instances.
  • After Splunk fails over your Splunk Cloud Platform environment, it enforces the maximum size limits for indexes. Any historical data that is outside of the index size limits is deleted, with the oldest data deleted first. Where possible, confirm that the sizes you configured for your indexes meet your use case requirements.
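
The 30-minute wait for newly created indexes described in the list above can be enforced in a data onboarding script. The following Python sketch is one minimal way to do that, assuming your own tooling records when each index was created. The function names and the example timestamps are hypothetical and are not part of any Splunk API.

```python
# Hypothetical sketch: gate data sends on the 30-minute wait for new indexes.
from datetime import datetime, timedelta, timezone
from typing import Optional

INDEX_WAIT_PERIOD = timedelta(minutes=30)  # minimum wait stated in the limitation


def earliest_safe_send_time(index_created_at: datetime) -> datetime:
    """Return the earliest time at which data should be sent to a new index."""
    return index_created_at + INDEX_WAIT_PERIOD


def is_safe_to_send(index_created_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once at least 30 minutes have passed since index creation."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= earliest_safe_send_time(index_created_at)


# Assumed example: an index created at 09:00 UTC.
created = datetime(2024, 6, 13, 9, 0, tzinfo=timezone.utc)
too_early = is_safe_to_send(created, now=datetime(2024, 6, 13, 9, 20, tzinfo=timezone.utc))
on_time = is_safe_to_send(created, now=datetime(2024, 6, 13, 9, 30, tzinfo=timezone.utc))
print(too_early, on_time)
```

A check like this only reduces the chance that early data lands in the "last chance" index. It does not confirm that the index is ready on the secondary region.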
Last modified on 13 June, 2024
This documentation applies to the following versions of Splunk Cloud Platform: 9.2.2403 (latest FedRAMP release), 9.2.2406

