Returns information about the buckets in the specified index. If you are using Splunk Enterprise, this command helps you understand where your data resides so you can optimize disk usage as required. Searches on an indexer cluster return results from the primary buckets and replicated copies on other peer nodes.
The Splunk index is the repository for data ingested by Splunk software. As incoming data is indexed and transformed into events, Splunk software creates files of rawdata and metadata (index files). The files reside in sets of directories organized by age. These directories are called buckets.
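The only required syntax is | dbinspect. All arguments are optional.
- | dbinspect
- [index=<wc-string>]...
- [<span> | <timeformat>]
- [corruptonly=<bool>]
- [cached=<bool>]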
index
- Syntax: index=<wc-string>...
- Description: Specifies the name of an index to inspect. You can specify more than one index. You can use an asterisk ( * ) as a wildcard in the index name, for both internal and non-internal indexes.
- Default: The default index, which is typically main.
<span>
- Syntax: span=<int> | span=<int><timescale>
- Description: Specifies the span length of the bucket. If you use a timescale unit (second, minute, hour, day, month, or subseconds), the span is used as a time range. If not, the span is an absolute bucket "length".
- When you invoke the dbinspect command with a bucket span, a table of the spans of each bucket is returned. When span is not specified, information about the buckets in the index is returned. See Information returned when no span is specified.
<timeformat>
- Syntax: timeformat=<string>
- Description: Sets the time format for the modTime field.
corruptonly
- Syntax: corruptonly=<bool>
- Description: Checks each bucket for corruption and displays only the corrupted buckets. A bucket is corrupt when some of the files in the bucket, such as tsidx files, are incorrect or missing. A corrupt bucket might return incorrect data or render the bucket unsearchable. In most cases the software auto-repairs corrupt buckets.
- When you specify corruptonly=true, each bucket is checked and the following informational message appears.
INFO: The "corruptonly" option will check each of the specified buckets. This search might be slow and will take time.
- Not supported on Splunk SmartStore indexes.
- Default: false
cached
- Syntax: cached=<bool>
- Description: If set to true, the dbinspect command gets the statistics from the bucket's manifest. If set to false, the dbinspect command examines the bucket itself. For SmartStore buckets, cached=false examines an indexer's local copy of the bucket, whereas cached=true examines the bucket's manifest, which contains information about the canonical version of the bucket that resides in the remote store. For more information, see Troubleshoot SmartStore in Managing Indexers and Clusters of Indexers.
- Default: For non-SmartStore indexes, the default is false. For SmartStore indexes, the default is true.
Time scale units
These are options for specifying a timescale as the bucket span.
- Syntax: <sec> | <min> | <hr> | <day> | <month> | <subseconds>
- Description: Time scale units.
- <sec>: s | sec | secs | second | seconds. Time scale in seconds.
- <min>: m | min | mins | minute | minutes. Time scale in minutes.
- <hr>: h | hr | hrs | hour | hours. Time scale in hours.
- <day>: d | day | days. Time scale in days.
- <month>: mon | month | months. Time scale in months.
- <subseconds>: us | ms | cs | ds. Time scale in microseconds (us), milliseconds (ms), centiseconds (cs), or deciseconds (ds).
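As an illustration of a span that uses a timescale unit, the following search is a minimal sketch that requests one-hour spans. The _internal index and the value 1h are chosen only for this example. Based on the unit spellings listed above, span=1hr or span=1hour should be accepted in the same way.

```spl
| dbinspect index=_internal span=1h
```

Because a span is specified, the result is a table of bucket spans rather than the per-bucket information that is returned when span is omitted.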
Information returned when no span is specified
When you invoke the dbinspect command without the span argument, the following information about the buckets in the index is returned.
|Field name|Description|
|bucketId|A string comprised of <index>~<id>~<guId>, where the delimiters are tilde characters.|
|endEpoch|The timestamp for the last event in the bucket, which is the time-edge of the bucket furthest towards the future. The timestamp is expressed as the number of seconds from the UNIX epoch.|
|eventCount|The number of events in the bucket.|
|guId|The globally unique identifier (GUID) of the server that hosts the index. This is relevant for index replication.|
|hostCount|The number of unique hosts in the bucket.|
|id|The local ID number of the bucket, generated on the indexer on which the bucket originated.|
|index|The name of the index specified in your search. You can specify index=* to inspect all of the indexes, and the index field varies accordingly.|
|modTime|The timestamp for the last time the bucket was modified or updated, in a format specified by the timeformat argument.|
|path|The location of the bucket. The naming convention for the bucket path varies slightly, depending on whether the bucket rolled to warm while its indexer was functioning as a cluster peer. Non-clustered buckets use db_<newest_time>_<oldest_time>_<localid>. Clustered original bucket copies use db_<newest_time>_<oldest_time>_<localid>_<guid>. Clustered replicated bucket copies use rb_<newest_time>_<oldest_time>_<localid>_<guid>.|
|rawSize|The volume in bytes of the raw data files in each bucket. This value represents the volume before compression and the addition of index files.|
|sizeOnDiskMB|The size in MB of disk space that the bucket takes up, expressed as a floating point number. This value represents the volume of the compressed raw data files and the index files.|
|sourceCount|The number of unique sources in the bucket.|
|sourceTypeCount|The number of unique sourcetypes in the bucket.|
|splunk_server|The name of the Splunk server that hosts the index in a distributed environment.|
|startEpoch|The timestamp for the first event in the bucket (the time-edge of the bucket furthest towards the past), in number of seconds from the UNIX epoch.|
|state|Specifies whether the bucket is hot, warm, or cold.|
|tsidxState|Specifies whether each bucket contains full-size or reduced tsidx files. If the value of this field in the results is full, the tsidx files are full-size. If the value is mini, the tsidx files are reduced.|
|corruptReason|Specifies the reason why the bucket is corrupt. The corruptReason field appears only when corruptonly=true.|
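Because the first-event and last-event timestamps are returned as seconds from the UNIX epoch, it can help to render them in a readable form. The following search is a sketch, not a definitive recipe. It assumes that the output fields are named bucketId, state, eventCount, startEpoch, and endEpoch, and it uses the standard strftime eval function. The output names firstEvent and lastEvent are hypothetical names introduced for this example.

```spl
| dbinspect index=_internal
| eval firstEvent=strftime(startEpoch, "%Y-%m-%d %H:%M:%S")
| eval lastEvent=strftime(endEpoch, "%Y-%m-%d %H:%M:%S")
| table bucketId, state, eventCount, firstEvent, lastEvent
```

The table command at the end only limits the display to a few columns. Remove it to see all of the returned fields.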
The dbinspect command is a generating command. See Command types.
Generating commands use a leading pipe character and should be the first command in a search.
Accessing data and security
If no data is returned from the index that you specify with the dbinspect command, it is possible that you do not have the authorization to access that index. The ability to access data in the Splunk indexes is controlled by the authorizations given to each role. See Use access control to secure Splunk data in Securing Splunk Enterprise.
Non-searchable bucket copies
For hot non-searchable bucket copies on target peers, tsidx and other metadata files are not maintained. Because accurate information cannot be reported, the following fields show NULL:
- eventCount
- hostCount
- sourceCount
- sourceTypeCount
- startEpoch
- endEpoch
1. CLI use of the dbinspect command
Display a chart with the span size of 1 day, using the command line interface (CLI).
myLaptop $ splunk search "| dbinspect index=_internal span=1d"
_time hot-3 warm-1 warm-2
--------------------------- ----- ------ ------
2015-01-17 00:00:00.000 PST 0
2015-01-17 14:56:39.000 PST 0
2015-02-19 00:00:00.000 PST 0 1
2015-02-20 00:00:00.000 PST 2 1
2. Default dbinspect output
Default dbinspect output for a local _internal index.
| dbinspect index=_internal
This screen shot does not display all of the columns in the output table. On your computer, scroll to the right to see the other columns.
3. Check for corrupt buckets
Use the corruptonly argument to display information about corrupted buckets, instead of information about all buckets. The output fields that display are the same with or without the corruptonly argument.
| dbinspect index=_internal corruptonly=true
4. Count the number of buckets for each Splunk server
Use this search to verify that the Splunk servers in your distributed environment are included in the dbinspect command results. The search counts the number of buckets for each server.
| dbinspect index=_internal | stats count by splunk_server
5. Find the index size of buckets in GB
Use dbinspect to find the index size of buckets in GB. For current numbers, run this search over a recent time range.
| dbinspect index=_internal | eval GB=sizeOnDiskMB/1024 | stats sum(GB)
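The same arithmetic can be applied per index. The following variant is a sketch that sums sizeOnDiskMB for each index first and then converts the total to GB, rounded to two decimal places. It assumes that the output field holding the index name is called index. The names MB and GB are chosen for this example.

```spl
| dbinspect index=*
| stats sum(sizeOnDiskMB) AS MB by index
| eval GB=round(MB/1024, 2)
```

Summing before dividing converts one value per index instead of one value per bucket. The total is the same either way.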
6. Determine whether a bucket is reduced
Run the following dbinspect search command:
| dbinspect index=_internal
If the value of the tsidxState field for each bucket is full, the tsidx files are full-size. If the value is mini, the tsidx files are reduced.
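When an index has many buckets, a summary can be easier to read than the per-bucket rows. As a minimal sketch, the following search counts the buckets in each tsidxState value, using the standard stats command:

```spl
| dbinspect index=_internal | stats count by tsidxState
```

A result row for mini indicates how many buckets in the index have reduced tsidx files.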
This documentation applies to the following versions of Splunk® Enterprise: 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.1.0, 9.1.1, 9.1.2