
Remove indexes and indexed data
You can remove indexed data or even entire indexes from the indexer. These are the main options:
- Delete events from subsequent searches.
- Remove all data from one or more indexes.
- Remove or disable an entire index.
- Delete older data, based on a retirement policy.
Caution: Removing data is irreversible. If you want to get your data back once you've removed data using any of the techniques described in this topic, you must re-index the applicable data sources.
Delete events from subsequent searches
The Splunk search language provides the special command delete to delete event data from subsequent searches. Before using delete, read this section carefully.
Note: You cannot run the delete command during a real-time search; you cannot delete events as they come in. If you try to use delete during a real-time search, Splunk Enterprise will display an error.
Who can delete?
The delete command can only be run by a user with the "delete_by_keyword" capability. By default, Splunk Enterprise ships with a special role, "can_delete", that has this capability (and no others). The admin role does not have this capability by default. We recommend that you create a dedicated user that you log in as only when you intend to delete index data.
For more information, refer to Add and edit roles in Securing Splunk Enterprise.
How to delete
First run a search that returns the events you want deleted. Make sure that this search returns only the events you want to delete, and no other events. Once you're certain of that, you can pipe the results of the search to the delete command.
For example, if you want to remove the events you've indexed from a source called /fflanda/incoming/cheese.log so that they no longer appear in searches, do the following:
1. Disable or remove that source so that it no longer gets indexed.
2. Search for events from that source in your index:
source="/fflanda/incoming/cheese.log"
3. Look at the results to confirm that this is the data you want to delete.
4. Once you've confirmed that this is the data you want to delete, pipe the search to delete:
source="/fflanda/incoming/cheese.log" | delete
See the page about the delete command in the Search Reference Manual for more examples.
Note: When running Splunk on Windows, substitute the forward slashes (/) in the examples with backslashes (\).
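The steps above depend on the delete search matching exactly the search you previewed. As a minimal sketch, the hypothetical helper build_searches (not part of Splunk) prints both search strings from a single source path, so the filter you inspect and the filter you pipe to delete cannot drift apart:

```shell
#!/bin/sh
# Sketch only: build_searches is a hypothetical helper, not a Splunk command.
# It prints the preview search and the matching delete search for one source.
build_searches() {
    src=$1
    # Steps 2 and 3: the search whose results you inspect first.
    printf 'source="%s"\n' "$src"
    # Step 4: the same search, piped to delete.
    printf 'source="%s" | delete\n' "$src"
}

build_searches /fflanda/incoming/cheese.log
```

Run the first search, review its results, and only then run the second one as a user that holds the "delete_by_keyword" capability.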
Piping a search to the delete command marks all the events returned by that search so that subsequent searches do not return them. No user (even with admin permissions) will be able to see this data when searching.
Note: Piping to delete does not reclaim disk space. The data is not actually removed from the index; it is just invisible to searches.
The delete command does not update the metadata of the events, so any metadata searches will still include the events although they are not searchable. The main All indexed data dashboard will still show event counts for the deleted sources, hosts, or sourcetypes.
The delete operation and indexer clusters
In the normal course of index replication, the effects of a delete operation get quickly propagated across all bucket copies in the cluster, typically within a few seconds or minutes, depending on the cluster load and amount of data and buckets affected by the delete operation. During this propagation interval, a search can return results that have already been deleted.
Also, if a peer that had primary bucket copies at the time of the delete operation goes down before all the results have been propagated, some of the deletes will be lost. In that case, you must rerun the operation after the primary copies from the downed peer have been reassigned.
Remove data from one or all indexes
To delete indexed data permanently from your disk, use the CLI clean command. This command completely deletes the data in one or all indexes, depending on whether you provide an <index_name> argument. Typically, you run clean before re-indexing all your data.
Note: The clean command does not work on clustered indexes.
How to use the clean command
Here are the main ways to use the clean command:
- To access the help page for clean, type:
splunk help clean
- To permanently remove event data from all indexes, type:
splunk clean eventdata
- To permanently remove event data from a single index, type:
splunk clean eventdata -index <index_name>
where <index_name> is the name of the targeted index.
- Add the -f parameter to force clean to skip its confirmation prompts.
Important: You must stop the indexer before you run the clean command.
Note: In pre-5.0 versions of Splunk Enterprise, running the clean command caused the indexer to reset the next bucket ID value for the index to 0. Starting with version 5.0, this is no longer the case. So, if the latest bucket ID was 3, after you run clean, the next bucket ID will be 4, not 0. For more information on bucket naming conventions and the bucket ID, see What the index directories look like.
Examples
This example removes event data from all indexes:
splunk stop
splunk clean eventdata
This example removes event data from the _internal index and forces Splunk to skip the confirmation prompt:
splunk stop
splunk clean eventdata -index _internal -f
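The stop-then-clean sequence can be wrapped in a small script. The following is a sketch under stated assumptions, not an official procedure: SPLUNK_BIN, DRY_RUN, run, and clean_index are names invented for this example, and the final splunk start step assumes you want the indexer running again once the clean finishes. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: stop the indexer, clean one index, then start the indexer again.
# SPLUNK_BIN and DRY_RUN are assumptions made for this example.
SPLUNK_BIN=${SPLUNK_BIN:-splunk}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands instead of running them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

clean_index() {
    index=$1
    # The indexer must be stopped before clean runs.
    run "$SPLUNK_BIN" stop || return 1
    # -f skips the confirmation prompt, so double-check the index name.
    run "$SPLUNK_BIN" clean eventdata -index "$index" -f || return 1
    # Assumption: restart the indexer afterward.
    run "$SPLUNK_BIN" start
}

clean_index _internal
```

Set DRY_RUN=0 to execute the commands for real. Running them as the same operating system user that normally runs Splunk avoids the ownership problem described in the comments on this topic.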
Remove an index entirely
To remove an index entirely (and not just the data contained in it), use the CLI command remove index:
splunk remove index <index_name>
This command deletes the index's data directories and removes the index's stanza from indexes.conf.
Before running the command, look through all inputs.conf files (on your indexer and on any forwarders sending data to the indexer) and make sure that none of the stanzas are directing data to the index you plan to delete. In other words, if you want to delete an index called "nogood", make sure the following attribute/value pair does not appear in any of your input stanzas: index=nogood. Once the index has been deleted, the indexer will discard any data still being sent to that index.
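One way to do this check is to search every inputs.conf file under a configuration directory for the attribute/value pair. The sketch below assumes a POSIX shell with a grep that supports -H; find_index_refs is a hypothetical helper, and the directory you pass in (for example, the etc directory of your Splunk installation) is your choice. It has to be run separately on the indexer and on each forwarder.

```shell
#!/bin/sh
# Sketch: list inputs.conf lines that still route data to a given index.
# find_index_refs is a hypothetical helper, not a Splunk command.
find_index_refs() {
    index=$1
    conf_root=$2
    # Match "index = <name>" on its own line, with optional spaces around "=".
    find "$conf_root" -name inputs.conf -exec \
        grep -Hn -E "^[[:space:]]*index[[:space:]]*=[[:space:]]*${index}[[:space:]]*\$" {} +
}

# Example call (the path is an assumption about your installation):
#   find_index_refs nogood "$SPLUNK_HOME/etc"
```

Each output line has the form file:line:text. No output means that no inputs.conf file under that directory targets the index.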
When you run remove index, it first warns you if any of the inputs on the indexer (but not on any forwarders) are still configured to send data to the specified index. You'll see a message like this:
03-28-2012 23:59:22.973 -0700 WARN IndexAdminHandler - Events from the following 3 inputs will now be discarded, since they had targeted index=zzz:
03-28-2012 23:59:22.973 -0700 WARN IndexAdminHandler - type: monitor, id: /home/v/syslog-avg-1000-lines
03-28-2012 23:59:22.973 -0700 WARN IndexAdminHandler - type: monitor, id: /mnt/kickstart/internal/fermi
03-28-2012 23:59:22.973 -0700 WARN IndexAdminHandler - type: monitor, id: /mnt/kickstart/internal/flights
You can run remove index while splunkd is running. You do not need to restart splunkd after the command completes.
The index deletion process is ordinarily fast, but the duration depends on several factors:
- The amount of data being deleted.
- Whether you are currently performing heavy writes to other indexes on the same disk.
- Whether you have a large number of small .tsidx files in the index you're deleting.
Disable an index without deleting it
Use the disable index CLI command to disable an index without deleting it:
splunk disable index <index_name>
Unlike the remove index command, disable index does not delete index data, and it is reversible (with the enable index command). However, once an index is disabled, splunkd will no longer accept data targeted at it.
You can also disable an index in Splunk Web. To do this, navigate to Settings > Indexes and click Disable to the right of the index you want to disable.
Delete older data based on retirement policy
When data in an index reaches a specified age or when the index grows to a specified size, it rolls to the "frozen" state, at which point the indexer deletes it from the index. Just before deleting the data, the indexer can move it to an archive, depending on how you configure your retirement policy.
For more information, see Set a retirement and archiving policy.
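As a rough illustration of the age-based part of a retirement policy, the sketch below converts a retention period in days into seconds and prints an indexes.conf stanza. It assumes the age limit is controlled by the frozenTimePeriodInSecs setting, which is described in the linked topic rather than in this one; retention_stanza is a hypothetical helper, and the index name and the 90-day period are examples only.

```shell
#!/bin/sh
# Sketch: print an indexes.conf stanza for an age-based retirement policy.
# retention_stanza is a hypothetical helper; frozenTimePeriodInSecs is
# assumed to be the setting that controls when data rolls to frozen.
retention_stanza() {
    index=$1
    days=$2
    secs=$((days * 86400))   # 86400 seconds per day
    printf '[%s]\nfrozenTimePeriodInSecs = %s\n' "$index" "$secs"
}

retention_stanza nogood 90
```

For 90 days this prints a stanza with frozenTimePeriodInSecs = 7776000. Review the output and the linked topic before copying anything into a live indexes.conf, because data that rolls to frozen is deleted unless you have configured archiving.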
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12
Comments
If you run the above command "splunk clean eventdata -index " as root but your Splunk instance runs as a non-root user (for example, a splunk user), Splunk won't restart by itself. The crash log shows a permission error on the index directory. It is a little surprising at first that a delete command changes permissions. Either run the command as the splunk user or, afterward, give ownership of the files and directories back to the splunk user from the root account, which makes Splunk happy again.
It seems that the clean eventdata command cleans a single index even without the -index parameter. For example, the command
./splunk clean eventdata test
cleans the test index after a warning.
An alternative to "Remove data from one or all indexes" could be to temporarily change the retirement policy for the index you want to clean. This approach would also work in a distributed environment.