Cassandra
The Splunk Distribution of OpenTelemetry Collector uses the Smart Agent receiver with the Cassandra monitor type to monitor Cassandra.
This integration is only available on Kubernetes and Linux.
Benefits
After you configure the integration, you can access these features:
View metrics. You can create your own custom dashboards, and most monitors provide built-in dashboards as well. For information about dashboards, see View dashboards in Splunk Observability Cloud.
View a data-driven visualization of the physical servers, virtual machines, AWS instances, and other resources in your environment that are visible to Infrastructure Monitoring. For information about navigators, see Use navigators in Splunk Infrastructure Monitoring.
Access the Metric Finder and search for metrics sent by the monitor. For information, see Search the Metric Finder and Metadata Catalog.
Installation
Follow these steps to deploy this integration:
Deploy the Splunk Distribution of OpenTelemetry Collector to your host or container platform.
Configure the monitor, as described in the Configuration section.
Restart the Splunk Distribution of OpenTelemetry Collector.
Configuration
To use this integration of a Smart Agent monitor with the Collector:
Include the Smart Agent receiver in your configuration file.
Add the monitor type to the Collector configuration, both in the receiver and pipelines sections.
See how to Use Smart Agent monitors with the Collector.
See how to set up the Smart Agent receiver.
For a list of common configuration options, refer to Common configuration settings for monitors.
Learn more about the Collector at Get started: Understand and use the Collector.
Example
To activate this integration, add the following to your Collector configuration:
receivers:
  smartagent/cassandra:
    type: collectd/cassandra
    ... # Additional config
Next, add the monitor to the service.pipelines.metrics.receivers section of your configuration file:
service:
  pipelines:
    metrics:
      receivers: [smartagent/cassandra]
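Putting both snippets together, a minimal end-to-end configuration might look like the following sketch. The host and port values are placeholders for your Cassandra node's JMX endpoint, and exporters and other pipeline components are omitted:

receivers:
  smartagent/cassandra:
    type: collectd/cassandra
    host: 127.0.0.1   # placeholder: the Cassandra node to monitor
    port: 7199        # placeholder: the JMX port (7199 is the Cassandra default)
service:
  pipelines:
    metrics:
      receivers: [smartagent/cassandra]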
Configuration settings
The following table shows the configuration options for this integration:
Option | Required | Type | Description |
---|---|---|---|
`host` | yes | `string` | Use this string to specify the host to connect to. JMX must be configured for remote access and accessible from the Collector. |
`port` | yes | `integer` | Use this value to specify the JMX connection port, not the RMI port, of the application. |
`name` | no | `string` | |
`serviceName` | no | `string` | Use this string to identify the service type in the Splunk Observability Cloud UI. |
`serviceURL` | no | `string` | The JMX connection string, rendered as a Go template with access to the other values in this configuration. The default value is `service:jmx:rmi:///jndi/rmi://{{.Host}}:{{.Port}}/jmxrmi`. |
`instancePrefix` | no | `string` | This value prefixes the generated plugin instance. |
`username` | no | `string` | Use this string to specify the user name to authenticate to the server. |
`password` | no | `string` | Use this value to specify the password for the user name. |
`customDimensions` | no | `map of strings` | Key-value pairs of custom dimensions added at the connection level. |
`mBeansToCollect` | no | `list of strings` | A list of the MBeans defined in `mBeanDefinitions` to collect. If not provided, all defined MBeans are collected. |
`mBeansToOmit` | no | `list of strings` | A list of the MBeans to omit. Useful when only a few MBeans need to be omitted from the default list. |
`mBeanDefinitions` | no | `map of objects` | Specifies how to map JMX MBean values to metrics. Definitions you provide here add to the default Cassandra mappings. |
The `mBeanDefinitions` configuration option has the following fields:
Option | Required | Type | Description |
---|---|---|---|
`objectName` | no | `string` | Sets the pattern used to retrieve MBeans from the MBeanServer. If more than one MBean is returned, use the `instanceFrom` option to make the identifiers unique. |
`instancePrefix` | no | `string` | This value prefixes the generated plugin instance. |
`instanceFrom` | no | `list of strings` | Combines the values of the given MBean object name properties to make the plugin instance unique. |
`values` | no | `list of objects` | Maps one or more attributes of the MBean to metrics. |
`dimensions` | no | `list of strings` | |
The `values` configuration option has the following fields:
Option | Required | Type | Description |
---|---|---|---|
`type` | no | `string` | Sets the data set used to handle the values of the MBean attribute. |
`table` | no | `bool` | Set this to `true` if the returned attribute is a composite type. The keys within the composite type are then appended to the type instance. |
`instancePrefix` | no | `string` | Works like the option of the same name directly beneath the MBean block, but sets the type instance instead. |
`instanceFrom` | no | `list of strings` | Works like the option of the same name directly beneath the MBean block, but sets the type instance instead. |
`attribute` | no | `string` | Sets the name of the attribute from which to read the value. You can access the keys of composite types by using a dot to concatenate the key name to the attribute name. |
`attributes` | no | `list of strings` | The plural form of `attribute`. Use this option to derive multiple metrics from a single MBean. |
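To illustrate how these fields fit together, the following sketch maps a hypothetical MBean to a gauge metric. The threading key, object name, and attribute are examples only, not part of the default Cassandra mappings:

receivers:
  smartagent/cassandra:
    type: collectd/cassandra
    host: 127.0.0.1                           # placeholder host
    port: 7199                                # placeholder JMX port
    mBeanDefinitions:
      threading:                              # example key for this MBean definition
        objectName: java.lang:type=Threading  # pattern used to query the MBeanServer
        values:
          - type: gauge                       # data set used to handle the value
            table: false                      # the attribute is not a composite type
            instancePrefix: jvm.threads.count # sets the type instance
            attribute: ThreadCount            # MBean attribute to read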
Metrics
The following metrics are available for this integration:
Name | Description | Category | Type |
---|---|---|---|
gauge.jvm.threads.count | Number of JVM threads | Default | gauge |
gauge.loaded_classes | Number of classes loaded in the JVM | Default | gauge |
invocations | Total number of garbage collection events | Default | cumulative |
jmx_memory.committed | Amount of memory guaranteed to be available in bytes | Default | gauge |
jmx_memory.max | Maximum amount of memory that can be used in bytes | Default | gauge |
jmx_memory.used | Current memory usage in bytes | Default | gauge |
total_time_in_ms.collection_time | Amount of time spent garbage collecting in milliseconds | Default | cumulative |
jmx_memory.init | Amount of initial memory at startup in bytes | Default | gauge |
counter.cassandra.ClientRequest.RangeSlice.Latency.Count | Count of range slice operations since server start. If this value is increasing across the cluster, then the cluster is too small for the application range slice load. If this value is increasing for a single server in a cluster, then the problem is likely specific to that node, such as hardware issues or an unbalanced request load. | Default | cumulative |
counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count | The total number of microseconds elapsed in servicing range slice requests. | Custom | cumulative |
counter.cassandra.ClientRequest.RangeSlice.Timeouts.Count | Count of range slice timeouts since server start. This typically indicates a server overload condition. If this value is increasing across the cluster, then the cluster is too small for the application range slice load. If this value is increasing for a single server in a cluster, then the problem is likely specific to that node, such as hardware issues or an unbalanced request load. | Default | cumulative |
counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count | Count of range slice unavailables since server start. A non-zero value means that insufficient replicas were available to fulfill a range slice request at the requested consistency level. This typically means that one or more nodes are down. To fix this condition, restart or remove the down nodes. | Default | cumulative |
counter.cassandra.ClientRequest.Read.Latency.Count | Count of read operations since server start. | Default | cumulative |
counter.cassandra.ClientRequest.Read.TotalLatency.Count | The total number of microseconds elapsed in servicing client read requests. Divide it by counter.cassandra.ClientRequest.Read.Latency.Count to obtain the average latency per read operation. | Custom | cumulative |
counter.cassandra.ClientRequest.CASRead.Latency.Count | Count of transactional read operations since server start. | Custom | cumulative |
counter.cassandra.ClientRequest.CASRead.TotalLatency.Count | The total number of microseconds elapsed in servicing client transactional read requests. Divide it by counter.cassandra.ClientRequest.CASRead.Latency.Count to obtain the average latency per transactional read operation. | Custom | cumulative |
counter.cassandra.ClientRequest.Read.Timeouts.Count | Count of read timeouts since server start. This typically indicates a server overload condition. If this value is increasing across the cluster, then the cluster is too small for the application read load. If this value is increasing for a single server in a cluster, then the problem is likely specific to that node, such as hardware issues or an unbalanced request load. | Default | cumulative |
counter.cassandra.ClientRequest.Read.Unavailables.Count | Count of read unavailables since server start. A non-zero value means that insufficient replicas were available to fulfill a read request at the requested consistency level. This typically means that one or more nodes are down. To fix this condition, restart or remove the down nodes. | Default | cumulative |
counter.cassandra.ClientRequest.Write.Latency.Count | Count of write operations since server start. | Default | cumulative |
counter.cassandra.ClientRequest.Write.TotalLatency.Count | The total number of microseconds elapsed in servicing client write requests. Divide it by counter.cassandra.ClientRequest.Write.Latency.Count to obtain the average latency per write operation. | Custom | cumulative |
counter.cassandra.ClientRequest.CASWrite.Latency.Count | Count of transactional write operations since server start. | Custom | cumulative |
counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count | The total number of microseconds elapsed in servicing client transactional write requests. Divide it by counter.cassandra.ClientRequest.CASWrite.Latency.Count to obtain the average latency per transactional write operation. | Custom | cumulative |
counter.cassandra.ClientRequest.Write.Timeouts.Count | Count of write timeouts since server start. This typically indicates a server overload condition. If this value is increasing across the cluster, then the cluster is too small for the application write load. If this value is increasing for a single server in a cluster, then the problem is likely specific to that node, such as hardware issues or an unbalanced request load. | Default | cumulative |
counter.cassandra.ClientRequest.Write.Unavailables.Count | Count of write unavailables since server start. A non-zero value means that insufficient replicas were available to fulfill a write request at the requested consistency level. This typically means that one or more nodes are down. To fix this condition, restart or remove the down nodes. | Default | cumulative |
counter.cassandra.Compaction.TotalCompactionsCompleted.Count | Number of compaction operations since node start. If this value does not increase steadily over time then the node may be experiencing problems completing compaction operations. | Custom | cumulative |
gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile | 50th percentile (median) of Cassandra range slice latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Custom | gauge |
gauge.cassandra.ClientRequest.RangeSlice.Latency.99thPercentile | 99th percentile of Cassandra range slice latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Default | gauge |
gauge.cassandra.ClientRequest.RangeSlice.Latency.Max | Maximum Cassandra range slice latency. | Custom | gauge |
gauge.cassandra.ClientRequest.Read.Latency.50thPercentile | 50th percentile (median) of Cassandra read latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Default | gauge |
gauge.cassandra.ClientRequest.Read.Latency.99thPercentile | 99th percentile of Cassandra read latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Default | gauge |
gauge.cassandra.ClientRequest.Read.Latency.Max | Maximum Cassandra read latency. | Default | gauge |
gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile | 50th percentile (median) of Cassandra transactional read latency. | Custom | gauge |
gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile | 99th percentile of Cassandra transactional read latency. | Custom | gauge |
gauge.cassandra.ClientRequest.CASRead.Latency.Max | Maximum Cassandra transactional read latency. | Custom | gauge |
gauge.cassandra.ClientRequest.Write.Latency.50thPercentile | 50th percentile (median) of Cassandra write latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Default | gauge |
gauge.cassandra.ClientRequest.Write.Latency.99thPercentile | 99th percentile of Cassandra write latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. | Default | gauge |
gauge.cassandra.ClientRequest.Write.Latency.Max | Maximum Cassandra write latency. | Default | gauge |
gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile | 50th percentile (median) of Cassandra transactional write latency. | Custom | gauge |
gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile | 99th percentile of Cassandra transactional write latency. | Custom | gauge |
gauge.cassandra.ClientRequest.CASWrite.Latency.Max | Maximum Cassandra transactional write latency. | Custom | gauge |
gauge.cassandra.Compaction.PendingTasks.Value | Number of compaction operations waiting to run. If this value is continually increasing then the node may be experiencing problems completing compaction operations. | Default | gauge |
counter.cassandra.Storage.Exceptions.Count | Number of internal exceptions caught. Under normal conditions, this value should be zero. | Custom | cumulative |
counter.cassandra.Storage.Load.Count | Storage used for Cassandra data in bytes. Use this metric to see how much storage a Cassandra node uses for data. The value of this metric is influenced by the amount of data stored on the node and by ongoing compaction and repair activity. | Default | cumulative |
counter.cassandra.Storage.TotalHints.Count | Total hints since node start. Indicates that write operations cannot be delivered to a node, usually because a node is down. If this value is increasing and all nodes are up then there may be some connectivity issue between nodes in the cluster. | Custom | cumulative |
counter.cassandra.Storage.TotalHintsInProgress.Count | Total pending hints. Indicates that write operations cannot be delivered to a node, usually because a node is down. If this value is increasing and all nodes are up then there may be some connectivity issue between nodes in the cluster. | Default | cumulative |
Notes
To learn more about the metric types available in Splunk Observability Cloud, see Metric types.
In host-based subscription plans, default metrics are those metrics included in host-based subscriptions in Splunk Observability Cloud, such as host, container, or bundled metrics. Custom metrics are not provided by default and might be subject to charges. See Metric categories for more information.
In MTS-based subscription plans, all metrics are custom.
To add additional metrics, see how to configure extraMetrics in Add additional metrics.
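For example, to report some of the custom metrics from the table above, you could list them under extraMetrics in the receiver configuration. This is a sketch; host and port are placeholders, and the metric names are taken from the metrics table:

receivers:
  smartagent/cassandra:
    type: collectd/cassandra
    host: 127.0.0.1   # placeholder host
    port: 7199        # placeholder JMX port
    extraMetrics:     # activate custom metrics by name
      - counter.cassandra.ClientRequest.CASRead.Latency.Count
      - gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile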
Troubleshooting
If you are a Splunk Observability Cloud customer and are not able to see your data in Splunk Observability Cloud, you can get help in the following ways.
Available to Splunk Observability Cloud customers
Submit a case in the Splunk Support Portal.
Contact Splunk Support.
Available to prospective customers and free trial users
Ask a question and get answers through community support at Splunk Answers.
Join the Splunk #observability user group Slack channel to communicate with customers, partners, and Splunk employees worldwide. To join, see Chat groups in the Get Started with Splunk Community manual.