
Pilot rollout phase part 2: Initial pilot rollout for Splunk Infrastructure Monitoring 🔗

After completing Pilot rollout phase part 1: Plan your pilot rollout, you are ready for pilot rollout phase part 2. During this part of the pilot, focus on onboarding your pilot teams to Splunk Infrastructure Monitoring. This part of the implementation prepares you to monitor critical solutions and to derive business value from custom metrics.

To onboard Infrastructure Monitoring, complete the following tasks:

  1. Launch Infrastructure Monitoring applications

  2. Understand OTel sizing requirements

  3. Complete advanced configurations for the collector (for example, token as a secret, Kubernetes distribution)

  4. Create custom dashboards using charts based on ingested metrics

  5. Configure detectors and alerts for specific metric conditions

  6. Review metric names and ingested data

  7. Add Splunk Observability Cloud to your CI/CD pipeline

  8. Create custom templates for detectors or alerts

  9. Prepare for automation using the REST API

  10. Automate using Terraform

  11. Finalize framework and adoption protocol

Note

Work closely with your Splunk Sales Engineer or Splunk Customer Success Manager throughout your onboarding process. They can help you fine-tune your Splunk Observability Cloud journey and provide best practices, training, and workshop advice.

Launch Infrastructure Monitoring applications 🔗

  1. For each of the participating teams, identify which services you want to ingest data from.

  2. Install the OpenTelemetry (OTel) agent.

  3. Configure the receivers and pipeline for these services. This creates the default dashboards and detectors for services such as databases, message buses, and the OS platform.

After you set up these dashboards and detectors, the pilot teams can observe their application data in the built-in dashboards and create their own custom dashboards.
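
For example, a minimal agent configuration that adds a service receiver to the metrics pipeline might look like the following sketch. The hostmetrics and postgresql receivers, the endpoint, the credentials, and the environment variable names are illustrative assumptions; use the receivers that match your own services and the configuration syntax of your collector distribution and version.

```yaml
# Sketch of a configuration.yaml fragment: service receivers feeding a metrics pipeline.
# Receiver names, endpoints, and credentials below are examples only.
receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      memory:
      filesystem:
  postgresql:                       # example service receiver
    endpoint: localhost:5432
    username: otel_monitor          # hypothetical monitoring user
    password: ${env:POSTGRES_PASSWORD}

exporters:
  signalfx:
    access_token: ${SPLUNK_ACCESS_TOKEN}
    realm: ${SPLUNK_REALM}

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, postgresql]
      exporters: [signalfx]
```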

Understand OTel sizing requirements 🔗

Before you start scaling up your use of the OTel agents, consider the OTel sizing guidelines. For details about the sizing guidelines, see Sizing and scaling. Sizing is especially important on platforms such as Kubernetes, where load can grow suddenly as services autoscale. Ensure that the OTel agents are allocated sufficient memory and CPU for a smooth rollout.
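
If you deploy the collector on Kubernetes with the Splunk OpenTelemetry Collector Helm chart, you can reserve CPU and memory for the agent pods in your values file. The following is only a sketch: the key names assume the Helm chart layout and the numbers are placeholders, so take the actual figures from the sizing guidelines and your own load tests.

```yaml
# values.yaml sketch: resource limits for the agent pods (numbers are placeholders).
agent:
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
```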

Complete advanced configurations for the collector 🔗

As you get ready to roll out to your first pilot teams, further secure the Splunk OpenTelemetry Collector. For details, see Security guidelines, permissions, and dependencies. For example, store your access token as a secret, or use other methods to keep tokens and credentials outside the OTel agent's configuration.yaml file.
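
For example, instead of hard-coding the access token in configuration.yaml, you can reference an environment variable and, on Kubernetes, populate that variable from a secret. The variable and secret names below are placeholders, and the exact substitution syntax depends on your collector version.

```yaml
# configuration.yaml sketch: the token is read from the environment, not stored in the file.
exporters:
  signalfx:
    access_token: ${SPLUNK_ACCESS_TOKEN}
    realm: ${SPLUNK_REALM}
```

```yaml
# Kubernetes pod spec sketch: inject the token from a secret (names are hypothetical).
env:
  - name: SPLUNK_ACCESS_TOKEN
    valueFrom:
      secretKeyRef:
        name: splunk-otel-access-token
        key: access-token
```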

Create custom dashboards using charts based on ingested metrics 🔗

As metrics data is sent to Splunk Observability Cloud, start creating custom dashboards that combine metrics from different tools and services. See the following resources:

Configure detectors and alerts for specific metric conditions 🔗

As with the custom dashboards, onboard the pilot teams with the prepackaged AutoDetect detectors. Ensure that your teams understand how to develop detectors for their own use cases, whether by adapting existing detectors or creating new ones. See the following resources:

Review metric names and ingested data 🔗

After your initial onboarding of metrics data, review the names and volume of the metrics each team is ingesting. Make sure the ingested data matches the agreed naming convention for dimensions and properties. If needed, correct the names and types of the dimensions that teams ingest into Splunk Infrastructure Monitoring.

Ensure that teams follow the naming convention set up for metrics. Consistent names speed up the development of charts and alerts and let you create alerts that can detect conditions across a whole range of hosts and nodes.
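
One way to enforce the agreed dimensions at ingest time is to stamp them onto metrics in the collector pipeline, for example with the resource processor. The following sketch is illustrative: the attribute keys and values are assumptions, so replace them with the names your convention defines.

```yaml
# Sketch: add standard dimensions to all metrics before they are exported.
processors:
  resource/standard-dims:
    attributes:
      - key: deployment.environment
        value: pilot
        action: upsert
      - key: team                  # hypothetical dimension from your naming convention
        value: payments
        action: upsert

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [resource/standard-dims]
      exporters: [signalfx]
```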

Add Splunk Observability Cloud to your CI/CD pipeline 🔗

You should have already deployed exporters and pipelines for the OpenTelemetry agents. At this point, you are ready to add services to your pipeline. Teams that are familiar with tools such as Ansible, Chef, or Puppet can reuse their exporter and pipeline templates to deploy and configure the OpenTelemetry agents.
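
For example, with Ansible you might render the agent configuration from a template and restart the collector whenever the exporter or pipeline definition changes. This is a sketch under assumptions: the template name is hypothetical, and the config path and service name reflect a typical Linux installation of the Splunk Distribution of the OpenTelemetry Collector, so adjust them to your environment. Chef and Puppet equivalents follow the same pattern.

```yaml
# tasks/main.yml sketch: deploy a templated collector configuration.
- name: Render the collector configuration from the team's template
  ansible.builtin.template:
    src: agent_config.yaml.j2               # hypothetical template containing your exporters and pipelines
    dest: /etc/otel/collector/agent_config.yaml
  notify: Restart Splunk OpenTelemetry Collector

# handlers/main.yml sketch
- name: Restart Splunk OpenTelemetry Collector
  ansible.builtin.service:
    name: splunk-otel-collector
    state: restarted
```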

You can also use the upstream OpenTelemetry Collector Contrib project, send data using the REST APIs, and send metrics using client libraries.

Create custom templates for detectors or alerts 🔗

Create custom templates for detectors and alerts to unify the detectors that users across your teams create. Templates prevent duplicate work for detectors with similar alerting requirements. You can also deploy templates using Terraform. For more information about the signalfx_detector resource for Terraform, see https://registry.terraform.io/providers/splunk-terraform/signalfx/latest/docs/resources/detector on the Terraform Registry.

Prepare for automation using the REST API 🔗

Familiarize yourself with the REST API functions available for Splunk Observability Cloud. For example, you can use the REST API to extract charts, dashboards, or detectors from Splunk Observability Cloud. Most commonly, you use the REST API to send historical metric time series (MTS) data to Splunk Observability Cloud, for example to correct previously ingested MTS data.

As a best practice, build the templates necessary to onboard the remaining teams.

  • For details about the Splunk Observability Cloud REST API, see the Observability API Reference.

  • For details about using the Splunk Observability Cloud API to extract charts, see Charts API.

  • For details about using the Splunk Observability Cloud API to extract dashboards, see Dashboards API.

  • For details about using the Splunk Observability Cloud API to extract detectors, see Detectors API.

Automate using Terraform 🔗

You can automate a large number of deployments using Terraform. The Terraform provider uses the Splunk Observability Cloud REST API.

Use Terraform to help set up integrations with cloud providers, dashboards, and alerts. You can also use Terraform to add customized charts and alerts for newly onboarded teams.

To migrate existing dashboard groups, dashboards, and detectors to Terraform, you can use a Python script. See the Export dashboards script on GitHub.

Finalize framework and adoption protocol 🔗

As you onboard more teams to Splunk Observability Cloud, hold regular review sessions to incorporate what you learned from previous onboardings. Review feedback from the initially onboarded teams, and use the resources available to your organization, including your Splunk Observability Cloud Sales Engineer and Professional Services. These resources can help you apply best practices and roll out faster.

Next step 🔗

Next, begin your initial pilot rollout for Splunk Application Performance Monitoring. See Pilot rollout phase part 3: Initial pilot rollout for Splunk Application Performance Monitoring.

This page was last updated on May 10, 2024.