
Nagios

The Splunk Distribution of OpenTelemetry Collector uses the Smart Agent receiver with the Nagios monitor type to run existing Nagios status check scripts. The Collector takes the role of the Nagios Remote Plugin Executor (NRPE) or the Simple Network Management Protocol (SNMP) exec directive: it runs the check command and sends the state of the check, which is determined by the exit code of the command.

This integration is similar to the telegraf/exec monitor configured with dataFormat: nagios, with the following exceptions:

  • It does not retrieve perfdata metrics. This integration only retrieves the state of the script for alerting purposes.

  • It overrides the state when the exit code is 0 but the output string starts with warn, crit, or unkn (not case-sensitive).

This integration adds more context to the status check state by using events. In addition to the state metric, it also sends an event that includes the output and the error caught from the command execution.

This integration can make troubleshooting more efficient: when a check reports an abnormal state, you can see what is happening from within Splunk Observability Cloud, without connecting to your Linux or Windows machine. It also lets you create a dashboard that is familiar to Nagios users.

Note

The last sent event is cached in memory and compared to new events to avoid repeatedly sending the same event for each collection interval. Restarting the OTel Collector erases its cache, so the most recently sent event is sent again upon restart. If your check "normally" produces a different output for each run (for example, the uptime check), you can set ignoreStdOut: true to exclude the output from the comparison.
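As a sketch of the situation described in this note, the following receiver runs a check whose output changes on every run and excludes stdout from the event comparison. The receiver name smartagent/nagios_uptime, the plugin path, and the service value are illustrative assumptions; ignoreStdOut is the option listed under Configuration settings.

```yaml
receivers:
  smartagent/nagios_uptime:   # hypothetical receiver name
    type: nagios
    # Hypothetical plugin path; point this at your own uptime check script.
    command: "/usr/lib/nagios/plugins/check_uptime"
    service: uptime           # illustrative service label
    # Exclude stdout from the comparison with the last cached event,
    # because this check prints a different value on every run.
    ignoreStdOut: true
```

With this setting, a change in the command output alone does not count as a difference from the cached event.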

This integration is available on Kubernetes, Linux, and Windows.

Benefits

After you configure the integration, you can access these features:

Installation

Follow these steps to deploy this integration:

  1. Deploy the Splunk Distribution of OpenTelemetry Collector to your host or container platform.

  2. Configure the monitor, as described in the Configuration section.

  3. Restart the Splunk Distribution of OpenTelemetry Collector.

Configuration

To use this integration of a Smart Agent monitor with the Collector:

  1. Include the Smart Agent receiver in your configuration file.

  2. Add the monitor type to the Collector configuration, both in the receiver and pipelines sections.

Example

To activate this integration, add the following to your Collector configuration:

receivers:
  smartagent/nagios:
    type: nagios
    command: <command>
    service: <service>
    timeout: 7 # In seconds. The default is 9.
    ... # Additional config

Next, add the monitor to the service.pipelines.metrics.receivers section of your configuration file:

service:
  pipelines:
    metrics:
      receivers: [smartagent/nagios]

Event-sending functionality

This monitor includes event-sending functionality to let you post your own custom events to Observability Cloud. For example, you can send your own custom event when you deploy a new version of your software or update other parts of your infrastructure. You can then view these events in the Observability Cloud user interface (UI).

Make monitors with event-sending functionality members of a logs pipeline that uses a SignalFx exporter to make the event submission requests. Use a Resource Detection processor to ensure that host identity and other useful information is made available as event dimensions.

For example:

service:
  pipelines:
    logs:
      receivers:
        - smartagent/<receiver>
      # Adds the Resource Detection processor to the logs pipeline.
      processors:
        - resourcedetection
      exporters:
        - signalfx
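Putting the pieces together, a minimal sketch of a configuration that sends both the state metric and the events for a Nagios check might look like the following. The processor and exporter settings are assumptions for illustration: the resourcedetection processor is shown with only the system detector, and the signalfx exporter reads its token and realm from the SPLUNK_ACCESS_TOKEN and SPLUNK_REALM environment variables. Adapt them to your existing Collector configuration.

```yaml
receivers:
  smartagent/nagios:
    type: nagios
    command: <command>
    service: <service>

processors:
  # Assumed minimal setup: detect host attributes from the local system.
  resourcedetection:
    detectors: [system]

exporters:
  # Assumed exporter settings; reuse the ones already in your configuration.
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    realm: "${SPLUNK_REALM}"

service:
  pipelines:
    # The metrics pipeline carries the state of the check.
    metrics:
      receivers: [smartagent/nagios]
      processors: [resourcedetection]
      exporters: [signalfx]
    # The logs pipeline carries the events with the command output and errors.
    logs:
      receivers: [smartagent/nagios]
      processors: [resourcedetection]
      exporters: [signalfx]
```

The same receiver appears in both pipelines, so the check is defined once while its metric and its events take separate paths to the exporter.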

Configuration settings

The following table shows the configuration options for this monitor:

| Option | Required | Type | Description |
| --- | --- | --- | --- |
| command | yes | string | The command to execute, with any arguments. For example: "LC_ALL=\"en_US.utf8\" /usr/lib/nagios/plugins/check_ntp_time -H pool.ntp.typhon.net -w 0.5 -c 1" |
| service | yes | string | Corresponds to the Nagios service column and lets you aggregate all instances of the same service (when calling the same check script with different arguments). |
| timeout | no | integer | The maximum execution time allowed, in seconds, before sending SIGKILL (default: 9). |
| ignoreStdOut | no | bool | If false, a new event is sent when a change is detected on stdout compared to the last event (default: false). |
| ignoreStdErr | no | bool | If false, a new event is sent when a change is detected on stderr compared to the last event (default: false). |
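To illustrate how the service option groups instances of the same check, the following sketch runs one check script with two different arguments under a single service value. The receiver names, the check_disk plugin path, and the thresholds are assumptions for illustration only.

```yaml
receivers:
  smartagent/nagios_disk_root:   # hypothetical receiver name
    type: nagios
    command: "/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /"
    service: disk                # same service value for both instances
  smartagent/nagios_disk_var:    # hypothetical receiver name
    type: nagios
    command: "/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /var"
    service: disk
    timeout: 5                   # optional; the default is 9 seconds

service:
  pipelines:
    metrics:
      receivers: [smartagent/nagios_disk_root, smartagent/nagios_disk_var]
```

Because both receivers share service: disk, their states can be aggregated as one service even though the commands differ.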

Metrics

The following metrics are available for this integration:

Notes

  • To learn more about the metric types available in Observability Cloud, see Metric types.

  • In host-based subscription plans, default metrics are those metrics included in host-based subscriptions in Observability Cloud, such as host, container, or bundled metrics. Custom metrics are not provided by default and might be subject to charges. See Metric categories for more information.

  • In MTS-based subscription plans, all metrics are custom.

  • To add additional metrics, see how to configure extraMetrics in Add additional metrics.

Troubleshooting

If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.

Available to Splunk Observability Cloud customers

Available to prospective customers and free trial users

  • Ask a question and get answers through community support at Splunk Answers.

  • Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.

To learn about even more support options, see Splunk Customer Success.