Removes subsequent duplicate results that match the specified criteria.
dedup [<N>] <field-list> [keepevents=<bool>] [keepempty=<bool>] [consecutive=<bool>] [sortby <sort-by-clause>]
- Syntax: <string> <string> ...
- Description: A list of field names.
- Syntax: consecutive=<bool>
- Description: Specify whether to only remove duplicate events that are consecutive (true). Defaults to false.
- Syntax: keepempty=<bool>
- Description: If an event contains a null value for one or more of the specified fields, the event is either retained (true) or discarded (false). Defaults to false.
- Syntax: keepevents=<bool>
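- Description: When true, keeps all events but removes the values of the specified fields from every event after the first one that contains a given combination of values. Defaults to false.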
- Syntax: <int>
- Description: The number of events to keep (N > 0) for each combination of values of the specified field(s). If the first non-option argument is a number, it is interpreted as N.
- Syntax: ( - | + ) <sort-field>
- Description: List of fields to sort by and their order, descending ( - ) or ascending ( + ).
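The interaction of N, consecutive, keepempty, and keepevents can be sketched in Python. This is a toy model of the semantics described above, run over a list of dicts; it is not Splunk's implementation. The function name, the sample events, and the choice that an event with a missing field breaks a consecutive run are all assumptions made for illustration.

```python
from collections import defaultdict

def dedup(events, fields, n=1, keepempty=False, keepevents=False, consecutive=False):
    """Toy model of dedup over a list of dicts (hypothetical helper)."""
    seen = defaultdict(int)   # combination of values -> events seen so far
    prev_key, run = None, 0   # state used only when consecutive=True
    out = []
    for event in events:
        if any(event.get(f) is None for f in fields):
            # A specified field is null: keep or discard per keepempty.
            if keepempty:
                out.append(dict(event))
            prev_key, run = None, 0  # assumption: a null event ends a run
            continue
        key = tuple(event[f] for f in fields)
        if consecutive:
            # Count only back-to-back repeats of the same combination.
            run = run + 1 if key == prev_key else 1
            prev_key = key
            count = run
        else:
            seen[key] += 1
            count = seen[key]
        if count <= n:
            out.append(dict(event))
        elif keepevents:
            # Keep the event, but drop the values of the dedup fields.
            out.append({k: v for k, v in event.items() if k not in fields})
    return out

# Hypothetical sample data, in the order the search returns it.
events = [
    {"host": "a", "msg": 1},
    {"host": "a", "msg": 2},
    {"host": "b", "msg": 3},
    {"host": "a", "msg": 4},
    {"msg": 5},
]
```

With the defaults, only the first event per host survives. With consecutive=true, the fourth event also survives because it does not directly follow another event with the same host value.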
Sort field options
- Syntax: <field> | auto(<field>) | str(<field>) | ip(<field>) | num(<field>)
- Description: Options for sort-field.
- Syntax: <string>
- Description: The name of the field to sort.
- Syntax: auto(<field>)
- Description: Determine automatically how to sort the field's values.
- Syntax: ip(<field>)
- Description: Interpret the field's values as an IP address.
- Syntax: num(<field>)
- Description: Treat the field's values as numbers.
- Syntax: str(<field>)
- Description: Order the field's values lexicographically.
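The difference between the sort-field options is easiest to see on values that order differently as strings, numbers, and IP addresses. The following Python sketch is only an illustration: the helper names are hypothetical, and the rule used for auto (try numbers, then IP addresses, then fall back to strings) is an assumption, since the rule Splunk applies is not described here.

```python
import ipaddress

# One converter per sort-field option; each turns a raw value into a sort key.
CONVERTERS = {
    "num": float,                                     # num(<field>)
    "ip": lambda v: int(ipaddress.ip_address(v)),     # ip(<field>)
    "str": str,                                       # str(<field>)
}

def pick_auto(values):
    """Assumed auto rule: the first mode that can parse every value wins."""
    for mode in ("num", "ip"):
        try:
            for v in values:
                CONVERTERS[mode](v)
            return mode
        except ValueError:
            continue
    return "str"

def sort_values(values, mode="auto", descending=False):
    """Sort raw field values the way the chosen option would interpret them."""
    if mode == "auto":
        mode = pick_auto(values)
    return sorted(values, key=CONVERTERS[mode], reverse=descending)
```

For example, sort_values(["10", "9", "100"], "str") orders "10" before "9", while the "num" mode orders 9 before 10. The same effect appears with IP addresses: as strings, "10.0.0.10" sorts before "10.0.0.9".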
The dedup command lets you specify the number of duplicate events to keep based on the values of a field. The event returned for the dedup field is the first event found (the most recent in time). If you specify a number, dedup interprets it as N, the count of duplicate events to keep. If you don't specify a number, N is assumed to be 1: dedup keeps only the first occurring event and removes all subsequent duplicates.
The dedup command also lets you sort by a list of fields. This removes all the duplicates and then sorts the results based on the specified sort-by fields. Note that sorting only has an effect if your search returns multiple results. The other options let you specify other criteria; for example, you may want to keep all events but, for events with duplicate values, remove those values instead of the entire event.
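The order of operations described above (remove duplicates first, then sort what remains) can be modeled with a short Python sketch. It assumes the events arrive most recent first and uses a hypothetical helper name and sample data; it is an illustration of the description, not Splunk's code.

```python
def dedup_then_sort(events, field, sort_field, descending=False):
    """Keep the first event per field value, then sort the survivors."""
    seen = set()
    kept = []
    for event in events:
        value = event.get(field)
        if value is None or value in seen:
            continue  # null value or duplicate: drop the event
        seen.add(value)
        kept.append(event)
    # Sorting happens after deduplication, so it only reorders the survivors.
    return sorted(kept, key=lambda e: e[sort_field], reverse=descending)

# Hypothetical events, most recent first, as a search would return them.
events = [
    {"_time": 30, "source": "a.log"},
    {"_time": 20, "source": "b.log"},
    {"_time": 10, "source": "a.log"},
]
result = dedup_then_sort(events, "source", "_time")
```

Here the event kept for a.log is the one with _time 30, because it was found first. Sorting in ascending order then places the b.log event ahead of it; it does not change which a.log event was kept.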
Note: We do not recommend that you run the dedup command against the _raw field if you are searching over a large volume of data. Doing this causes Splunk to keep a map of each unique _raw value seen, which will impact your search performance. This is expected behavior.
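To see why _raw is a costly dedup field, consider how many distinct values have to be remembered. The Python sketch below counts the entries a "seen" structure would hold for a low-cardinality field versus _raw. The helper and the generated events are hypothetical, and Splunk's internal data structure is only assumed to grow with the number of unique values, as the note describes.

```python
def seen_map_size(events, field):
    """Number of distinct values a dedup on this field has to remember."""
    seen = set()
    for event in events:
        seen.add(event[field])
    return len(seen)

# Hypothetical data: 1000 events spread over 3 hosts, each with unique raw text.
events = [
    {"host": f"web{i % 3}", "_raw": f"event number {i}"}
    for i in range(1000)
]

host_entries = seen_map_size(events, "host")
raw_entries = seen_map_size(events, "_raw")
```

In this sample, a dedup on host tracks 3 values, while a dedup on _raw tracks one entry per event.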
Example 1: Remove duplicates of results with the same 'host' value.
... | dedup host
Example 2: Remove duplicates of results with the same 'source' value and sort the events by the '_time' field in ascending order.
... | dedup source sortby +_time
Example 3: Remove duplicates of results with the same 'source' value and sort the events by the '_size' field in descending order.
... | dedup source sortby -_size
Example 4: For events that have the same 'source' value, keep the first 3 that occur and remove all subsequent events.
... | dedup 3 source
Example 5: For events that have the same 'source' AND 'host' values, keep the first 3 that occur and remove all subsequent events.
... | dedup 3 source host
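The difference between Examples 4 and 5 is the key used to decide what counts as a duplicate: one field versus a combination of fields. The following Python sketch works this through on made-up events; the helper name and the data are hypothetical, and the code only models the behavior described in the examples.

```python
from collections import Counter

def dedup_n(events, fields, n):
    """Keep the first n events for each combination of the given fields."""
    counts = Counter()
    kept = []
    for event in events:
        key = tuple(event[f] for f in fields)
        counts[key] += 1
        if counts[key] <= n:
            kept.append(event)
    return kept

# Hypothetical events: one source, two hosts, in search-result order.
hosts = ["h1", "h1", "h2", "h1", "h2", "h1"]
events = [{"source": "a.log", "host": h} for h in hosts]

by_source = dedup_n(events, ["source"], 3)                # like: dedup 3 source
by_source_host = dedup_n(events, ["source", "host"], 3)   # like: dedup 3 source host
```

Keyed on source alone, only the first 3 of the 6 events are kept. Keyed on source and host together, each host gets its own allowance of 3, so 5 events are kept (three for h1, two for h2).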
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 6.0