Splunk® Enterprise

User Manual

Splunk version 4.x reached its End of Life on October 1, 2013. This documentation does not apply to the most recent version of Splunk.

Use reporting commands

You can add reporting commands directly to a search string to help produce reports and summarize search results.

A reporting command primer

This subsection covers the major categories of reporting commands and provides examples of how they can be used in a search.

The primary reporting commands are:

  • chart: used to create charts that can display any series of data that you want to plot. You can decide what field is tracked on the x-axis of the chart.
  • timechart: used to create "trend over time" reports, which means that _time is always the x-axis.
  • top: generates charts that display the most common values of a field.
  • rare: creates charts that display the least common values of a field.
  • stats, eventstats, and streamstats: generate reports that display summary statistics.
  • associate, correlate, and diff: create reports that enable you to see associations, correlations, and differences between fields in your data.

Note: As you'll see in the following examples, you always place your reporting commands after your search commands, linking them with a pipe operator ("|").

chart, timechart, stats, eventstats, and streamstats are all designed to work in conjunction with statistical functions. The list of available statistical functions includes:

  • count, distinct count
  • mean, median, mode
  • min, max, range, percentiles
  • standard deviation, variance
  • sum
  • first occurrence, last occurrence

To find more information about statistical functions and how they're used, see "Functions for stats, chart, and timechart" in the Search Reference Manual. Some statistical functions only work with the timechart command.

Note: All searches with reporting commands generate specific structures of data. The different chart types available in Splunk require these data structures to be set up in particular ways. For example, not all searches that enable the generation of bar, column, line, and area charts also enable the generation of pie charts. Read the "Chart data structure requirements" subtopic of the "Chart gallery" topic in this manual to learn more.

Creating time-based charts

Use the timechart reporting command to create useful charts that display statistical trends over time, with time plotted on the x-axis of the chart. You can optionally split data by another field, meaning that each distinct value of the "split by" field is a separate series in the chart. Typically these reports are formatted as line or area charts, but they can also be column charts.

For example, this report uses internal Splunk log data to visualize the average indexing throughput of Splunk processes over time, measured in events per second (instantaneous_eps) and broken out by processor:

index=_internal "group=thruput" | timechart avg(instantaneous_eps) by processor
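To make the shape of a timechart result concrete, here is a minimal Python sketch of the idea: events are grouped into fixed time buckets, and the statistic is computed once per bucket for each value of the "split by" field. This is an illustration only, not how Splunk implements timechart. The function name timechart_avg, the 60-second span, and the sample events are all assumptions made up for this example.

```python
from collections import defaultdict

def timechart_avg(events, value_field, by_field, span_seconds):
    """Average value_field per (time bucket, by_field value) pair."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, event count]
    for ev in events:
        # Snap the timestamp down to the start of its bucket.
        bucket = ev["_time"] - ev["_time"] % span_seconds
        key = (bucket, ev[by_field])
        sums[key][0] += ev[value_field]
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

# Hypothetical events; _time is in seconds for simplicity.
events = [
    {"_time": 0, "processor": "indexer", "instantaneous_eps": 10.0},
    {"_time": 20, "processor": "indexer", "instantaneous_eps": 30.0},
    {"_time": 40, "processor": "typing", "instantaneous_eps": 5.0},
    {"_time": 70, "processor": "indexer", "instantaneous_eps": 8.0},
]

series = timechart_avg(events, "instantaneous_eps", "processor", 60)
for (bucket, processor), avg in sorted(series.items()):
    print(bucket, processor, avg)
```

Each distinct processor value becomes its own series, and each bucket start time becomes a point on the x-axis.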

Creating charts that are not (necessarily) time-based

Use the chart reporting command to create charts that can display any series of data. Unlike the timechart command, charts created with the chart command use an arbitrary field as the x-axis. You use the over keyword to determine what field takes the x-axis.

Note: The over keyword is specific to the chart command. You won't use it with timechart, for example, because the _time default field is already being used as the x-axis.

For example, the following report uses web access data to show you the count of unique visitors for each weekday. The dc (distinct count) function counts the distinct clientip values:

index=sampledata sourcetype=access* | chart dc(clientip) over date_wday

You can optionally split data by another field, meaning that each distinct value of the "split by" field is a separate series in the chart. If your search includes a "split by" clause, place the over clause before the "split by" clause.

The following report generates a chart showing the sum of kilobytes processed by each clientip within a given timeframe, split by host. The finished chart shows the kb value on the y-axis and clientip on the x-axis. The kb value is broken out by host. You might want to use the Report Builder to format this report as a stacked bar chart.

index=sampledata sourcetype=access* | chart sum(kb) over clientip by host
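The result of a chart search with both an over clause and a "split by" clause can be pictured as a two-way table: one row per value of the over field, one column per value of the "split by" field. The following Python sketch models that layout under stated assumptions. The function name chart_sum and the sample events are hypothetical, and the code is not Splunk's implementation.

```python
from collections import defaultdict

def chart_sum(events, value_field, over_field, by_field):
    """Sum value_field into a table keyed by over_field, then by_field."""
    table = defaultdict(lambda: defaultdict(float))
    for ev in events:
        table[ev[over_field]][ev[by_field]] += ev[value_field]
    # Convert the nested defaultdicts to plain dicts for easy inspection.
    return {row: dict(cols) for row, cols in table.items()}

# Hypothetical web access events.
events = [
    {"clientip": "10.0.0.1", "host": "www1", "kb": 2.0},
    {"clientip": "10.0.0.1", "host": "www1", "kb": 3.0},
    {"clientip": "10.0.0.1", "host": "www2", "kb": 1.5},
    {"clientip": "10.0.0.2", "host": "www2", "kb": 4.0},
]

table = chart_sum(events, "kb", "clientip", "host")
for clientip, per_host in sorted(table.items()):
    print(clientip, per_host)
```

In a stacked bar chart, each row of this table would be one bar and each column one segment of the stack.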

Another example: say you want to create a stacked bar chart that splits out the HTTP and HTTPS requests hitting your servers. To do this, you would first create ssl_type, a search-time field extraction that contains the inbound port number or the incoming URL request, assuming that is logged. The finished search would look like this, where sourcetype=whatever stands in for your own source type:

sourcetype=whatever | chart count over ssl_type

Again, you can use the Report Builder to format the results as a stacked bar chart.

Visualizing the highs and lows

Use the top and rare reporting commands to create charts that display the most and least common values.

This set of commands generates a report that sorts through firewall information to show you a list of the top 100 destination ports used by your system:

index=sampledata | top limit=100 dst_port

This search, on the other hand, uses the same set of firewall data to generate a report that shows you the source ports with the lowest number of denials. If you don't specify a limit, top and rare display ten values by default.

index=sampledata action=Deny | rare src_port

A more complex example of the top command

Say you're indexing an alert log from a monitoring system, and you have two fields:

  • msg is the message, such as CPU at 100%.
  • mc_host is the host that generates the message, such as log01.

How do you get a report that displays the top msg and the values of mc_host that sent them, so you get a table like this:

Messages by mc_host
CPU at 100%
Log File Alert

To do this, set up a search that finds the top message per mc_host (using limit=1 to only return one) and then sort by the message count in descending order:

source="mcevent.csv" | top limit=1 msg by mc_host | sort -count
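The logic of that search can be sketched in Python as two steps: count msg values separately for each mc_host and keep the most common one, then order the resulting rows by count. This is a rough model for illustration, not Splunk's implementation. The function name top_by and the sample alert events are assumptions.

```python
from collections import Counter, defaultdict

def top_by(events, field, by_field, limit=10):
    """Return the `limit` most common values of `field` for each by_field value."""
    groups = defaultdict(Counter)
    for ev in events:
        groups[ev[by_field]][ev[field]] += 1
    rows = []
    for group, counter in groups.items():
        for value, count in counter.most_common(limit):
            rows.append({by_field: group, field: value, "count": count})
    return rows

# Hypothetical alert log events.
events = [
    {"mc_host": "log01", "msg": "CPU at 100%"},
    {"mc_host": "log01", "msg": "CPU at 100%"},
    {"mc_host": "log01", "msg": "CPU at 100%"},
    {"mc_host": "log01", "msg": "Log File Alert"},
    {"mc_host": "log02", "msg": "Log File Alert"},
    {"mc_host": "log02", "msg": "Log File Alert"},
]

rows = top_by(events, "msg", "mc_host", limit=1)   # like: top limit=1 msg by mc_host
rows.sort(key=lambda r: r["count"], reverse=True)  # like: sort -count
for row in rows:
    print(row)
```

With this sample data, each host contributes exactly one row, and the host with the most frequent top message comes first.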

Create reports that display summary statistics

Use the stats and eventstats reporting commands to generate reports that display summary statistics related to a field.

To fully utilize the stats command, you need to include a "split by" clause. For example, the following report won't provide much information:

sourcetype=access_combined | stats avg(kbps)

It gives you the average of kbps for all events with a sourcetype of access_combined--a single value. The resulting column chart contains only one column.

But if you break out the report with a split by field, Splunk generates a report that breaks down the statistics by that field. The following report generates a column chart that sorts through the access_combined logs to get the average throughput (kbps), broken out by host:

sourcetype=access_combined | stats avg(kbps) by host

Here's a slightly more sophisticated example of the stats command, in a report that shows you the CPU utilization of Splunk processes sorted in descending order:

index=_internal "group=pipeline" | stats sum(cpu_seconds) by processor | sort sum(cpu_seconds) desc

The eventstats command works like the stats command, except that it adds the aggregation results inline to each event, and adds only the aggregations that are pertinent to that event.

You specify the field name for the eventstats results by adding the as argument. So the by-host example above could be restated with "avgkbps" being the name of the new field that contains the results of the eventstats avg(kbps) operation:

sourcetype=access_combined | eventstats avg(kbps) as avgkbps by host

When you run this set of commands, Splunk adds a new avgkbps field to each sourcetype=access_combined event that includes the kbps field. The value of avgkbps is the average kbps across all events that share that event's host value.

In addition, Splunk uses that set of commands to generate a chart displaying the average kbps for all events with a sourcetype of access_combined, broken out by host.
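The difference between stats and eventstats is easiest to see side by side. In the following Python sketch, stats_avg collapses the events into one value per host, while eventstats_avg keeps every event and attaches the matching per-host average to it. The function names and the sample events are hypothetical, and the code is a simplified model rather than Splunk's implementation.

```python
from collections import defaultdict

def stats_avg(events, field, by_field):
    """Like `stats avg(field) by by_field`: one result per by_field value."""
    totals = defaultdict(lambda: [0.0, 0])  # by value -> [sum, count]
    for ev in events:
        if field in ev:
            totals[ev[by_field]][0] += ev[field]
            totals[ev[by_field]][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}

def eventstats_avg(events, field, by_field, as_field):
    """Like `eventstats avg(field) as as_field by by_field`: events are kept,
    and each event that has `field` gains the average for its group."""
    averages = stats_avg(events, field, by_field)
    annotated = []
    for ev in events:
        new_ev = dict(ev)  # copy so the input events are left unchanged
        if field in ev:
            new_ev[as_field] = averages[ev[by_field]]
        annotated.append(new_ev)
    return annotated

# Hypothetical access_combined events; the last one has no kbps field.
events = [
    {"host": "www1", "kbps": 10.0},
    {"host": "www1", "kbps": 30.0},
    {"host": "www2", "kbps": 5.0},
    {"host": "www2"},
]

summary = stats_avg(events, "kbps", "host")
annotated = eventstats_avg(events, "kbps", "host", "avgkbps")
print(summary)
for ev in annotated:
    print(ev)
```

Because the per-host average sits alongside each event's own kbps value, later commands in a search can compare the two, for example to find events that are above average for their host.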

Look for associations, statistical correlations, and differences in search results

Use the associate, correlate and diff commands to find associations, similarities and differences among field values in your search results.

The associate reporting command identifies events that are associated with each other through field/field value pairs. For example, if one event has a referer_domain of "http://www.google.com/" and another event has a referer_domain with the same URL value, then they are associated.

You can "tune" the results gained by the associate command with the supcnt, supfreq, and improv arguments. For more information about these arguments see the Associate page in the Search Reference.

For example, this report searches the access sourcetypes and returns associations whose reference field/field-value pair appears at least three times:

sourcetype=access* | associate supcnt=3

The correlate reporting command calculates the statistical correlation between fields. It uses the cocur operation to calculate the percentage of times that two fields exist in the same set of results.

The following report searches across all events where eventtype=goodaccess and calculates the co-occurrence correlation between all of the fields in those events:

eventtype=goodaccess | correlate type=cocur

Use the diff reporting command to compare the differences between two search results. By default it compares the raw text of the search results you select, unless you use the attribute argument to focus on specific field attributes.

For example, this report looks at the 44th and 45th events returned in the search and compares their IP address values:

eventtype=goodaccess | diff pos1=44 pos2=45 attribute=ip

This documentation applies to the following versions of Splunk® Enterprise: 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7