SignalFlow analytics language

The heart of Splunk Infrastructure Monitoring is the SignalFlow analytics engine that runs computations written in a Python-like language.

Stream objects and metric time series

The main construct of the SignalFlow query language is the stream object, which produces timestamped values organized along dimensions. Raw metric time series data is streamed to analytics jobs, and the queries and computations specified through SignalFlow produce streams, for example, statistics computed across a population or over time. Streams are local to a particular analytics query or computation, and several distinct jobs may query for the same underlying metric time series data. Detectors evaluate conditions involving one or more streams, typically comparisons between streams over a period of time. For example, the condition for a detector could be “disk utilization is greater than 80% for 90% of 10 minutes” or “average database latency is above 5 seconds and the number of database calls is at least 20% of the one-day average.”
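The first detector condition above can be sketched in plain Python. This is only an illustration of the "greater than 80% for 90% of 10 minutes" logic, not SignalFlow itself; the function name `condition_holds`, its parameters, and the sample data are all hypothetical, and the sketch assumes one datapoint per minute.

```python
from typing import Sequence

def condition_holds(values: Sequence[float], threshold: float, min_fraction: float) -> bool:
    """Return True if at least `min_fraction` of `values` exceed `threshold`."""
    if not values:
        return False
    above = sum(1 for v in values if v > threshold)
    return above / len(values) >= min_fraction

# Ten one-minute datapoints of disk utilization in percent (hypothetical data).
window = [85, 91, 88, 79, 92, 95, 83, 87, 90, 86]

# 9 of the 10 datapoints are above 80, so the 90% requirement is met.
fires = condition_holds(window, threshold=80, min_fraction=0.9)
print(fires)
```

A real detector evaluates this kind of condition continuously as new datapoints arrive; the sketch checks a single window only.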

Create custom analytics

If you want to create custom charts and detector analytics, see Analyze Data Using SignalFlow in the Splunk Developer Guide. You can also run SignalFlow programs directly. For more information, see the SignalFlow API in the Splunk Developer Guide.

Aggregations and transformations

Many of the built-in analytical functions can perform computations on time series in charts and detectors in two ways: aggregations and transformations.

  • Aggregations operate across all of the datapoints at a single instant in time, for example, the mean CPU utilization across a group of 5 servers at time t, t+1, t+2, and so on. The output of an aggregation is a single metric time series (MTS), where each datapoint represents the aggregation of all the input datapoints at that point in time.

    An additional option, Group By, is available for aggregations. If a group‑by field is specified, MTS sharing values for properties named in the group-by criterion are aggregated together. For example, you can compute the average CPU load grouped by AWS instance type; specify the Mean function as an aggregation, and set AWS instance type as the group‑by criterion. The output will show 1 MTS per AWS instance type.

  • Transformations operate in parallel on each MTS over a window of time and yield one output time series for each input time series. For example, the average CPU utilization for 5 servers over a moving window of one day will display five MTS; each output value will be the moving average for that MTS over the previous 24 hours.

    The two types of transformations available, moving window and calendar window, are discussed below. For various examples of how to use transformation analytics in charts, see Gain insights through chart analytics.
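The difference in output shape between the two kinds of computation can be illustrated with a short Python sketch. It is not SignalFlow: the helper names `aggregate_mean` and `moving_mean`, the server names, and the instance types are hypothetical, and the sketch assumes every series has a value at every timestamp.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input: CPU values per (server, AWS instance type), aligned by timestamp.
series = {
    ("server-1", "m5.large"): [10.0, 20.0, 30.0, 40.0],
    ("server-2", "m5.large"): [30.0, 40.0, 50.0, 60.0],
    ("server-3", "c5.xlarge"): [50.0, 50.0, 50.0, 50.0],
}

def aggregate_mean(series, group_by_index=None):
    """Aggregation: combine series at each timestamp, one output series per group."""
    groups = defaultdict(list)
    for key, values in series.items():
        group = key[group_by_index] if group_by_index is not None else "all"
        groups[group].append(values)
    # zip(*rows) walks the timestamps; mean() combines the series at each one.
    return {g: [mean(col) for col in zip(*rows)] for g, rows in groups.items()}

def moving_mean(series, window):
    """Transformation: trailing mean over up to `window` points, one output per input."""
    return {
        key: [mean(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]
        for key, values in series.items()
    }

overall = aggregate_mean(series)                    # 1 output series
by_type = aggregate_mean(series, group_by_index=1)  # 1 output series per instance type
smoothed = moving_mean(series, window=2)            # 3 output series, one per server
```

The aggregation collapses three inputs into one series (or one per group), while the transformation keeps three series and smooths each one independently.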

Moving window transformations

In the moving window transformation chart, each line represents the mean CPU utilization across 4 servers. The grey line (plot A) represents the mean value for each datapoint over the preceding minute, because the chart resolution is 1 minute, as shown next to the chart title. The magenta lines (plot B) represent the mean value for each datapoint over the preceding hour (a moving window), because 1 hour is specified in the function.

This image shows a moving window transformation chart. There are two CPU utilization functions shown in the chart.

Calendar window transformations

The Sum, Mean, Maximum, and Minimum functions let you specify a calendar window for a transformation, instead of a moving window. In the following chart, the magenta line (left Y-axis) shows the sum of all transactions over a moving window of 1 week (7 days). The green line (right Y-axis) shows the sum of the transactions over a calendar week, including partial values calculated throughout the week. The values increase over a week, then reset at the beginning of the following week.

This image shows a calendar window transformation chart.
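The contrast between the two lines in the chart can be sketched in Python. This is a simplified model, not SignalFlow: the function names are hypothetical, the data is one made-up transaction per day, and the week is assumed to start on Monday.

```python
from datetime import date, timedelta

def moving_sum(values, window):
    """Trailing sum over up to `window` most recent points."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]

def calendar_week_sum(days, values):
    """Running sum that resets at the start of each week (Monday assumed)."""
    out, running = [], 0
    for day, value in zip(days, values):
        if day.weekday() == 0:  # Monday: a new cycle begins, so the sum resets
            running = 0
        running += value
        out.append(running)
    return out

start = date(2024, 1, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]
values = [1] * 14  # one transaction per day (hypothetical data)

moving = moving_sum(values, window=7)
calendar = calendar_week_sum(days, values)
```

Once the moving window is full, the moving sum stays level at 7, while the calendar sum climbs from 1 to 7 during each week and then resets.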

When you add a function with a calendar window to a plot, and the current time window is narrower than the cycle length that you specified on the function, the chart's time range changes to a default so that at least one cycle can be displayed. Also, any dashboard override for time range is removed. A message is displayed to inform you of this optimization; if you don’t accept the optimization, you may need to modify the time range manually to see the data you expect.

For a chart to show a value at the end of every calendar cycle, the cycle length must be a multiple of the resolution. For more information, see resolution. Some cycle lengths are fixed; a week is always 7 days. Others are variable; a month can be 28, 29, 30, or 31 days.

However, for some combinations of time range and chart display resolution (the density selector at the right side of the override bar), it may not be possible to use a resolution that guarantees a chart will show values perfectly aligned with cycle boundaries. For example, if a resolution of 1 day would result in more datapoints than can be shown on a chart, a resolution of 2 days may have to be used. This means that plotted values cannot line up with the end of a month that has 29 or 31 days, because neither value is a multiple of the 2‑day resolution. Such a situation is visually indicated by the resolution pill on a chart turning orange and showing a message in the on-hover tooltip. You can resolve this issue by changing the display resolution, viewing a narrower time range in the chart, or both.
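The alignment rule is simple arithmetic: a cycle boundary can coincide with a plotted datapoint only if the cycle length is a whole multiple of the resolution. A minimal Python sketch, with a hypothetical helper name, checks this for each possible month length at a 2-day resolution.

```python
def aligns_with_cycle(cycle_days: int, resolution_days: int) -> bool:
    """True if a cycle of `cycle_days` ends exactly on a resolution boundary."""
    return cycle_days % resolution_days == 0

# Month lengths checked against a 2-day resolution.
alignment = {length: aligns_with_cycle(length, 2) for length in (28, 29, 30, 31)}
print(alignment)
```

Months of 28 and 30 days align with a 2-day resolution; months of 29 and 31 days do not, which is the case the orange resolution pill flags.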

When using calendar time windows with transformations, the chart cannot have a resolution finer than 1 hour. When specifying a calendar time window, you have a few options you can set.

Cycle length and start

These options are self-explanatory. Cycle length options include hour, day, week, month, and quarter.

For most cycle length options, you can specify a starting point. For example, for a cycle length of a quarter, you could specify that the first quarter starts in February instead of the default of January. The one exception is an hourly cycle length; hourly cycles always start at the top of the hour (minute 0).
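To illustrate how a shifted starting point changes quarter boundaries, here is a small Python sketch. The function `quarter_of` and its parameter `first_quarter_start` are hypothetical names for illustration, not product settings.

```python
def quarter_of(month: int, first_quarter_start: int = 1) -> int:
    """Return the quarter number (1-4) for a calendar month (1-12).

    `first_quarter_start` is the month in which the first quarter begins.
    """
    if not 1 <= month <= 12 or not 1 <= first_quarter_start <= 12:
        raise ValueError("months must be in the range 1-12")
    return ((month - first_quarter_start) % 12) // 3 + 1

# Default: quarters start in January, so April opens the second quarter.
default_april = quarter_of(4)

# Quarters starting in February: February through April form the first
# quarter, and January falls at the end of the fourth.
shifted_april = quarter_of(4, first_quarter_start=2)
shifted_january = quarter_of(1, first_quarter_start=2)
```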

Calendar time zone

For calendar windows, you must specify a calendar time zone. All calendar window functions in a chart use the same calendar time zone.

The time zone that you specify here is a per-chart (or per-detector) option that is independent of the visualization time zone that is set in your user profile. The time zone you set for a calendar window is used to determine the exact beginning and end of your chosen calendar window cycles.

For example, January in America/Los Angeles starts at a different instant than January in Asia/Tokyo. If Infrastructure Monitoring receives a datapoint with a timestamp shortly before midnight UTC at the end of December 31, the calendar time zone determines whether that datapoint should count towards the calculation for December (which it would for Los Angeles) or the calculation for January (which it would for Tokyo).
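The December-versus-January example can be reproduced with Python's standard `datetime` module. To keep the sketch independent of time zone database files, it uses fixed offsets as an assumption: UTC-8 for Los Angeles in winter and UTC+9 for Tokyo. The function name `calendar_month` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Fixed offsets standing in for real time zones (assumption for this sketch).
LOS_ANGELES_WINTER = timezone(timedelta(hours=-8))
TOKYO = timezone(timedelta(hours=9))

def calendar_month(timestamp_utc: datetime, calendar_tz: timezone) -> Tuple[int, int]:
    """Return the (year, month) cycle a datapoint falls into for a calendar time zone."""
    local = timestamp_utc.astimezone(calendar_tz)
    return (local.year, local.month)

# A datapoint stamped 23:30 UTC on December 31 (hypothetical).
point = datetime(2023, 12, 31, 23, 30, tzinfo=timezone.utc)

in_los_angeles = calendar_month(point, LOS_ANGELES_WINTER)  # still December locally
in_tokyo = calendar_month(point, TOKYO)                     # already January locally
```

The same datapoint is counted in a different monthly cycle depending only on the calendar time zone chosen for the chart or detector.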

The first time a calendar window function is specified on any plot in a chart, the visualization timezone from your profile is suggested as the value to use for the calendar timezone. However, you can select any time zone you need. The value set here can be viewed and changed in any calendar window function in the chart, as well as in the chart options tab.

Hide partial values

This setting lets you optimize the output of a calendar window function, based on whether you are interested only in the final values calculated at the ends of cycles (such as the sum of requests served every day) or final as well as partial values calculated during a cycle (such as the sum of requests served so far today). For example, if you have a cycle length of 1 day, hiding partial values means that you will only see one value for each day; you won’t see how values change during the course of the day.

Note that deselecting this option has no effect when cycle length is 1 hour, because a chart using calendar windows cannot have a resolution finer than 1 hour.

In the following chart, hiding partial values (the magenta bars) provides a better overview of how values actually compare on a day-to-day basis. Not hiding partial values (the green lines) shows how the mean changes during the course of each day.

The value you see at the start of each cycle represents the final value for the previous cycle. The magenta column at 12:00am February 15 (on the far left) represents the mean of the values seen over February 14, the column at 12:00am February 16 represents the mean of the values for February 15, and so on.

This image shows a chart with hidden partial values.
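The two output modes can be approximated in Python as follows. This is a simplified model under stated assumptions, not the product's implementation: the function name `calendar_day_mean` and the sample data are hypothetical, a final value is emitted only once a datapoint from a later day arrives, and a datapoint at exactly midnight is treated as belonging to the new day.

```python
from datetime import datetime, time, timedelta
from statistics import mean

def calendar_day_mean(points, hide_partial_values):
    """points: (datetime, value) pairs sorted by time. Returns (datetime, mean) pairs."""
    out = []
    current_day, bucket = None, []
    for ts, value in points:
        if ts.date() != current_day:
            if bucket and hide_partial_values:
                # Final value for the finished day, stamped at the start of the next day.
                next_midnight = datetime.combine(current_day + timedelta(days=1), time.min)
                out.append((next_midnight, mean(bucket)))
            current_day, bucket = ts.date(), []
        bucket.append(value)
        if not hide_partial_values:
            # Partial value: the mean of the current day so far.
            out.append((ts, mean(bucket)))
    return out

points = [
    (datetime(2024, 2, 14, 6), 10.0),
    (datetime(2024, 2, 14, 18), 30.0),
    (datetime(2024, 2, 15, 6), 50.0),
    (datetime(2024, 2, 15, 18), 70.0),
    (datetime(2024, 2, 16, 6), 0.0),
]
final_only = calendar_day_mean(points, hide_partial_values=True)
with_partials = calendar_day_mean(points, hide_partial_values=False)
```

With partial values hidden, the output has one value per completed day, placed at 12:00am of the following day; otherwise every input datapoint yields the running mean for its day.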

Tip

Single value charts can be useful for visualizing calculations such as the maximum latency reported in the current day so far. To properly display these numbers, deselect “Hide partial values”.

Timeshift

If you select timeshift, which is available only when partial values are hidden, then the value from the end of a previous cycle will be output at the end of every calendar cycle. For example, if your cycle length is Month and you timeshift by one cycle, the datapoint at April 30 will represent the value from March 31, the datapoint at May 31 will represent the value from April 30, and so on.

Note that this timeshift option is aware that cycles such as months have variable lengths (for example, March has more days than February), and it shifts correctly to the end of a previous interval. By contrast, the standalone timeshift analytics function performs a fixed-width shift, such as 30 days. For more information, see Use the Timeshift function to understand trends. If you want to shift to the ends of previous monthly or quarterly cycles, use the timeshift option available within a calendar window transformation.
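The difference between a calendar-aware shift and a fixed-width shift can be shown with Python's standard library. The helper `end_of_previous_month` is a hypothetical name used only for this illustration.

```python
import calendar
from datetime import date, timedelta

def end_of_previous_month(cycle_end: date, cycles: int = 1) -> date:
    """Return the last day of the month `cycles` months before `cycle_end`'s month."""
    month_index = cycle_end.year * 12 + (cycle_end.month - 1) - cycles
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]  # handles 28, 29, 30, and 31 days
    return date(year, month, last_day)

may_end = date(2024, 5, 31)

calendar_shift = end_of_previous_month(may_end)  # end of April
fixed_shift = may_end - timedelta(days=30)       # 30 days earlier, which is May 1
```

The calendar-aware shift lands on the previous cycle boundary regardless of month length, while the fixed 30-day shift from May 31 stays inside May.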

A typical use of timeshift is to create a column chart that includes a plot with a calendar transformation that is not timeshifted, then clone that plot and add the timeshift option. This allows you to compare, say, the average value seen for a metric over the current week with the average seen over the previous week.

Other functions

In addition to functions that provide aggregations and transformations, Infrastructure Monitoring offers functions such as Count, which counts the number of MTS that have values; Top and Bottom, which show the N highest or lowest values; and Exclude, which provides the ability to filter time series by value, rather than by source.

As with other analytical functions, these functions can be used in concert with others to produce more sophisticated computations. For example, Exclude can be used with Sum to achieve a result akin to the sumif() function found in popular spreadsheet applications.
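The sumif-like combination can be sketched in Python. The helpers below only mimic the idea of filtering by value and then summing; their names, signatures, and the sample data are hypothetical and do not reflect the actual options of the Exclude or Sum functions.

```python
def exclude(latest_values, should_exclude):
    """Keep only the series whose current value does not match the exclusion test."""
    return {name: v for name, v in latest_values.items() if not should_exclude(v)}

def sum_values(latest_values):
    """Aggregate the remaining series into a single value."""
    return sum(latest_values.values())

# Current error counts per host (hypothetical data).
errors_per_host = {"host-a": 2, "host-b": 40, "host-c": 0, "host-d": 15}

# Exclude hosts reporting fewer than 10 errors, then sum what remains.
kept = exclude(errors_per_host, lambda v: v < 10)
total = sum_values(kept)
```

The filter is applied to values rather than to host names, which is what distinguishes this from filtering by source.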

For a detailed explanation of each function, see analytics-ref.

Expressions

SignalFlow lets you create expressions that refer to preceding computations as variables. For example, you can calculate a ratio of HTTP response codes received that are 200 to those that are 4xx or 5xx.
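The response-code ratio can be sketched in Python, with lists standing in for streams. The variable names `A`, `B`, and `C` and the sample counts are hypothetical, and returning `None` when the denominator is zero is a choice made for this sketch, not a statement about how SignalFlow handles division by zero.

```python
def ratio(numerator, denominator):
    """Pointwise numerator / denominator; None where the denominator is zero."""
    return [a / b if b else None for a, b in zip(numerator, denominator)]

# Counts per interval (hypothetical data).
A = [200, 180, 150]     # responses with code 200
errors_4xx = [10, 0, 5]
errors_5xx = [10, 0, 10]

# B refers to the two preceding series; C refers to A and B as variables.
B = [x + y for x, y in zip(errors_4xx, errors_5xx)]
C = ratio(A, B)
```

Each later line refers to earlier results by name, which is the pattern the expression feature provides.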