Search Reference

 


dedup

NOTE - Splunk version 4.x reached its End of Life on October 1, 2013. Please see the migration information.

Synopsis

Removes subsequent duplicate results that match the specified criteria.

Syntax

dedup [<N>] <field-list> [keepevents=<bool>] [keepempty=<bool>] [consecutive=<bool>] [sortby <sort-by-clause>]

Required arguments

<field-list>
Syntax: <string> <string> ...
Description: A list of field names.

Optional arguments

consecutive
Syntax: consecutive=<bool>
Description: If true, removes only duplicate events that are consecutive. Defaults to false.
keepempty
Syntax: keepempty=<bool>
Description: Controls what happens to events in which one or more of the specified fields has a null value. If true, those events are retained; if false, they are discarded. Defaults to false.
keepevents
Syntax: keepevents=<bool>
Description: If true, keeps all events but, for events with duplicate values, removes the values of the specified fields instead of the entire event. Defaults to false.
<N>
Syntax: <int>
Description: The number of events to keep (N > 0) for each combination of values for the specified fields. If the first non-option argument is a number, it is interpreted as N.
<sort-by-clause>
Syntax: ( - | + ) <sort-field>
Description: List of fields to sort by and their order, descending ( - ) or ascending ( + ).

Sort field options

<sort-field>
Syntax: <field> | auto(<field>) | str(<field>) | ip(<field>) | num(<field>)
Description: Options for sort-field.
<field>
Syntax: <string>
Description: The name of the field to sort.
auto
Syntax: auto(<field>)
Description: Determine automatically how to sort the field's values.
ip
Syntax: ip(<field>)
Description: Interpret the field's values as an IP address.
num
Syntax: num(<field>)
Description: Treat the field's values as numbers.
str
Syntax: str(<field>)
Description: Order the field's values lexicographically.
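
As a brief illustration of the sort-field options, the following sketch follows the syntax above to request a numeric sort in descending order. The field name bytes is hypothetical and is assumed to hold numeric values in your events.

```
... | dedup source sortby -num(bytes)
```

Wrapping the same field in str() instead would order its values lexicographically, in which the string 10 comes before the string 9.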

Description

The dedup command lets you specify the number of duplicate events to keep, based on the values of one or more fields. For each combination of values, the event returned is the first event found (the most recent in time). If you specify a number, dedup interprets it as N, the count of duplicate events to keep. If you don't specify a number, N is assumed to be 1: dedup keeps only the first event found and removes all subsequent duplicates.
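
To make the effect of N concrete, here is a small worked illustration. It assumes a hypothetical result set of five events, listed in the order the search returns them, whose host values are a, b, a, a, b. The events and values are invented for this example.

```
... | dedup 2 host
```

With N set to 2, this search keeps events 1, 2, 3, and 5 and removes event 4, which is the third event with host=a. If the number is omitted, N is 1 and only events 1 and 2 remain.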

The dedup command also lets you sort by a list of fields. Duplicates are removed first, and the remaining results are then sorted by the specified sort-by fields. Sorting has an effect only if your search returns multiple results. The other options let you specify additional criteria. For example, you may want to keep all events but, for events with duplicate values, remove those values instead of the entire event.

Note: We do not recommend running the dedup command against the _raw field if you are searching over a large volume of data. Doing so causes Splunk to keep a map of each unique _raw value it has seen, which degrades search performance. This is expected behavior.

Examples

Example 1: Remove duplicates of results with the same 'host' value.

... | dedup host

Example 2: Remove duplicates of results with the same 'source' value and sort the events by the '_time' field in ascending order.

... | dedup source sortby +_time

Example 3: Remove duplicates of results with the same 'source' value and sort the events by the '_size' field in descending order.

... | dedup source sortby -_size

Example 4: For events that have the same 'source' value, keep the first 3 that occur and remove all subsequent events.

... | dedup 3 source

Example 5: For events that have the same 'source' AND 'host' values, keep the first 3 that occur and remove all subsequent events.

... | dedup 3 source host
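
The optional arguments can be used in the same way. The following sketches are built from the syntax and argument descriptions above; they are illustrative and have not been checked against every Splunk version listed for this topic.

Example 6: Remove only consecutive duplicates of results with the same 'source' AND 'host' values.

```
... | dedup source host consecutive=true
```

Example 7: Remove duplicates of results with the same 'host' value, but retain events that have no 'host' value.

```
... | dedup host keepempty=true
```

Example 8: Keep all events, but remove the 'source' value from events whose 'source' value duplicates that of an earlier event.

```
... | dedup source keepevents=true
```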

See also

uniq

This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8, 4.2, 4.2.1, 4.2.2, 4.2.3, 4.2.4, 4.2.5, 4.3, 4.3.1, 4.3.2, 4.3.3, 4.3.4, 4.3.5, 4.3.6, 4.3.7, 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 6.0, 6.0.1, 6.0.2, 6.0.3.

