Migrate existing data on an indexer cluster to SmartStore
You can migrate the existing data on your indexer cluster from local storage to the remote store.
This procedure describes how to migrate all the indexes on the indexer cluster to SmartStore. You can modify the procedure if you only want to migrate some of the indexes. Indexers support a mixed environment of SmartStore and non-SmartStore indexes.
Because this process requires the cluster to upload large amounts of data, it can take a long time to complete and can have a significant impact on concurrent indexing and searching.
You cannot revert an index to non-SmartStore after you migrate it to SmartStore.
Migrate data
Perform the migration operation in two phases:
- Test the SmartStore configurations and remote connectivity on a standalone test instance.
- Run the migration by applying the configurations to your production indexer cluster.
Prerequisites
- Read:
- Documentation provided by the vendor of the remote storage service that you are using
- If the cluster was once migrated from single-site to multisite, you must convert any pre-existing, single-site buckets to follow the multisite replication and search policies. To do so, change the constrain_singlesite_buckets setting in the master's server.conf file to "false" and restart the master node. See Configure the master to convert existing buckets to multisite.
- The cluster should be on the smaller side, with a maximum of 20 indexers. If you want to migrate a larger cluster, consult Splunk Professional Services.
- Be aware of these configuration issues:
- The value of the path setting for each remote volume stanza must be unique to the indexer cluster. You can share remote volumes only among indexes within a single cluster. In other words, if indexes on one cluster use a particular remote volume, no index on any other cluster or standalone indexer can use the same remote volume.
- You must set all SmartStore indexes in an indexer cluster to use repfactor = auto.
- Leave maxDataSize at its default value of "auto" (750MB) for each SmartStore index.
- The coldPath setting for each SmartStore index requires a value, even though the setting is ignored except in the case of migrated indexes.
- The thawedPath setting for each SmartStore index requires a value, even though the setting is ignored.
- Reconfigure the cluster as necessary to conform with the lists of unsupported features, current restrictions, and incompatible settings:
- Current restrictions on SmartStore use. Regarding the requirement that replication factor and search factor be equal, you can make this change post-migration.
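The per-index configuration rules listed in the prerequisites can be checked mechanically before you apply anything to the cluster. The following is a minimal sketch, not a Splunk tool: it assumes that indexes.conf can be read as a simple INI-style file, that settings under [default] are inherited by every index stanza, and that an index counts as a SmartStore index when it has an effective remotePath. The function name check_smartstore_indexes and the sample stanzas are hypothetical. Setting names are compared case-insensitively.

```python
import configparser


def check_smartstore_indexes(conf_text):
    """Return (index_name, problem) tuples for SmartStore index stanzas."""
    # Only "=" separates keys from values; values such as
    # "volume:remote_store/..." contain colons that must not be split on.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.read_string(conf_text)
    # Assumption: [default] settings apply to every index stanza.
    defaults = dict(parser["default"]) if parser.has_section("default") else {}
    problems = []
    for stanza in parser.sections():
        if stanza == "default" or stanza.startswith("volume:"):
            continue
        # configparser lowercases setting names by default.
        settings = {**defaults, **dict(parser[stanza])}
        if "remotepath" not in settings:
            continue  # not a SmartStore index, so the rules do not apply
        if settings.get("repfactor", "").lower() != "auto":
            problems.append((stanza, "repfactor is not set to auto"))
        if settings.get("maxdatasize", "auto").lower() != "auto":
            problems.append((stanza, "maxDataSize is not auto"))
        if not settings.get("coldpath"):
            problems.append((stanza, "coldPath has no value"))
        if not settings.get("thawedpath"):
            problems.append((stanza, "thawedPath has no value"))
    return problems


# Hypothetical sample configuration, used only to exercise the check.
SAMPLE = """
[default]
remotePath = volume:remote_store/$_index_name
repFactor = auto

[volume:remote_store]
storageType = remote
path = s3://mybucket/some/path

[good_index]
homePath = $SPLUNK_DB/good_index/db
coldPath = $SPLUNK_DB/good_index/colddb
thawedPath = $SPLUNK_DB/good_index/thaweddb

[bad_index]
homePath = $SPLUNK_DB/bad_index/db
repFactor = 0
maxDataSize = auto_high_volume
"""

if __name__ == "__main__":
    for index_name, problem in check_smartstore_indexes(SAMPLE):
        print(f"{index_name}: {problem}")
```

The sketch cannot verify that a remote volume path is unique across clusters, because that requires knowledge of every other deployment that uses the same remote store.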
1. Test the configuration on a standalone instance
The purpose of performing the test on a standalone instance is to:
- test remote store connectivity.
- validate the configuration.
For production purposes, you must run SmartStore on an indexer cluster. However, for limited testing purposes such as those described here, you can run SmartStore on a non-clustered, standalone indexer.
Steps
- Ensure that you have met all prerequisites relevant to this test setup. In particular, read:
- Understand SmartStore security strategies and prepare to implement them as necessary during the deployment process. See SmartStore security strategies.
- Install a new Splunk Enterprise instance. For information on how to install Splunk Enterprise, read the Installation Manual.
- Edit indexes.conf in $SPLUNK_HOME/etc/system/local to specify the SmartStore settings for your indexes. These should be the same group of settings that you intend to use later on your production deployment.
Here is an example of how to configure the SmartStore indexes on an instance, using an S3 remote object store. In this example, all indexes are SmartStore-enabled and use a single remote storage volume, and the remote volume is given the name "remote_store". In addition, the example configures one new index, "cs_index".
[default]
# Configure all indexes to use the SmartStore remote volume called
# "remote_store".
# Note: If you want only some of your indexes to use SmartStore,
# place this setting under the individual stanzas for each of the
# SmartStore indexes, rather than here.
remotePath = volume:remote_store/$_index_name

# Configure the remote volume
[volume:remote_store]
storageType = remote

# On the next line, the volume's path setting points to the remote storage location
# where indexes reside. Each SmartStore index resides directly below the location
# specified by the path setting. The <scheme> identifies a supported remote
# storage system type, such as S3. The <remote-location-specifier> is a
# string specific to the remote storage system that specifies the location
# of the indexes inside the remote system.
# This is an S3 example: "path = s3://mybucket/some/path".
path = <scheme>://<remote-location-specifier>

# The following S3 settings are required only if you're using the access and secret
# keys. They are not needed if you are using AWS IAM roles.
remote.s3.access_key = <S3 access key>
remote.s3.secret_key = <S3 secret key>
remote.s3.endpoint = https:|http://<S3 host>

# Configure a test index.
# Here is the configuration for an example "cs_index" index.
[cs_index]
homePath = $SPLUNK_DB/cs_index/db
thawedPath = $SPLUNK_DB/cs_index/thaweddb
coldPath = $SPLUNK_DB/cs_index/colddb
For details on these settings, see Configure SmartStore. Also see indexes.conf.spec in the Admin Manual.
- Restart the instance.
- Test the deployment:
- To confirm remote storage access, run this command on the instance:
splunk cmd splunkd rfs -- ls --starts-with volume:remote_store
This command recursively lists any files that are present in the remote store. It is recommended that you first place a sample text file in the remote store. If you see the file when you run the command, you have access to the remote store.
- Send some data to the instance and wait for buckets to roll. If you don't want to wait for buckets to roll naturally, you can manually roll some buckets:
splunk _internal call /data/indexes/<index_name>/roll-hot-buckets -auth <admin>:<password>
Look for warm buckets being uploaded to remote storage.
- At this point, you should be able to run normal searches against this data. In most cases, you will not be transferring any data from the remote storage, because the data will already be in the local cache. To validate data fetching from remote storage, do the following:
- Evict a bucket from the cache, using this REST endpoint:
services/admin/cacheman/<cid>/evict
where <cid> is bid|<bucketId>|. For example: "bid|taktaklog~0~7D76564B-AA17-488A-BAF2-5353EA0E9CE5|"
- Run a search that requires data from the evicted bucket.
The instance must now transfer the bucket from remote storage to run the search. After running the search, you can check that the bucket has reappeared in the cache.
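The cache ID used by the evict endpoint contains "|" characters, which are easy to mangle when you paste the ID into a URL. The following sketch only builds the endpoint path from a bucket ID. It assumes that your HTTP client expects the cache ID to be percent-encoded, which this page does not state, and the function name evict_endpoint is hypothetical. It does not send a request.

```python
from urllib.parse import quote


def evict_endpoint(bucket_id):
    """Build the cache manager evict path for a bucket ID."""
    # The cache ID is the bucket ID wrapped as bid|<bucketId>|.
    cid = f"bid|{bucket_id}|"
    # Assumption: the "|" characters must be percent-encoded in the URL path.
    return f"services/admin/cacheman/{quote(cid, safe='')}/evict"


if __name__ == "__main__":
    # Bucket ID taken from the example in the steps above.
    print(evict_endpoint("taktaklog~0~7D76564B-AA17-488A-BAF2-5353EA0E9CE5"))
```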
2. Run the migration on the indexer cluster
In this procedure, you configure your cluster for SmartStore. The goal of the procedure is to migrate all existing warm and cold buckets on all indexes to SmartStore. Going forward, all new warm buckets will also reside in SmartStore.
The migration process takes a while to complete. If you have a large amount of data, it can take a long while. Expect some degradation of indexing and search performance during the migration. For that reason, it is best to schedule the migration for a time when your indexers will be relatively idle.
Steps
- Ensure that you have met the prerequisites. In particular, read:
- Understand SmartStore security strategies and prepare to implement them as necessary during the deployment process. See SmartStore security strategies.
- Upgrade all cluster nodes (master, peer nodes, search heads) to the latest version of Splunk Enterprise. See Upgrade an indexer cluster.
- Confirm that there are no bucket fixup tasks in progress or pending. Go to the Master Dashboard, click on the Indexes tab, and then click on the Bucket Status button to confirm.
- Run splunk enable maintenance-mode on the master. To confirm that the master is in maintenance mode, run splunk show maintenance-mode.
- Stop all the peer nodes. When bringing down the peers, use the splunk stop command, not splunk offline.
- On the master node, edit the existing $SPLUNK_HOME/etc/master-apps/_cluster/local/indexes.conf file to make the following additions.
Do not replace the existing indexes.conf file, because you need to keep its current settings, such as its index-definition settings. Instead, merge these additional settings into the existing file. Be sure to remove any other copies of these settings from the file.
- Specify the SmartStore index global and volume settings. Assuming that you have already tested these settings on your standalone instance, you can simply copy the settings over from your standalone instance. For example:
[default]
# Configure all indexes to use the SmartStore remote volume called
# "remote_store".
# Note: If you want only some of your indexes to use SmartStore,
# place this setting under the individual stanzas for each of the
# SmartStore indexes, rather than here.
remotePath = volume:remote_store/$_index_name

# Configure the remote volume
[volume:remote_store]
storageType = remote

# On the next line, the volume's path setting points to the remote storage location
# where indexes reside. Each SmartStore index resides directly below the location
# specified by the path setting. The <scheme> identifies a supported remote
# storage system type, such as S3. The <remote-location-specifier> is a
# string specific to the remote storage system that specifies the location
# of the indexes inside the remote system.
# This is an S3 example: "path = s3://mybucket/some/path".
path = <scheme>://<remote-location-specifier>

# The following S3 settings are required only if you're using the access and secret
# keys. They are not needed if you are using AWS IAM roles.
remote.s3.access_key = <S3 access key>
remote.s3.secret_key = <S3 secret key>
remote.s3.endpoint = https:|http://<S3 host>
- Configure the maxGlobalDataSizeMB and frozenTimePeriodInSecs settings, as necessary, to ensure that the cluster will follow your desired freezing behavior, post-migration. See Configure data retention for SmartStore indexes.
This step is extremely important, to avoid unwanted bucket freezing and possible data loss. SmartStore bucket-freezing behavior and settings are different from the non-SmartStore behavior and settings.
- On the master node, edit $SPLUNK_HOME/etc/master-apps/_cluster/local/server.conf to make any necessary changes to the SmartStore-related server.conf settings on the peer nodes. In particular, configure the cache size to fit the needs of your deployment. See Configure the SmartStore cache manager.
- On the master node, run:
splunk apply cluster-bundle --answer-yes
- Start all the peer nodes. Wait briefly for the peer nodes to download the configuration bundle with the SmartStore settings. To view the status of the configuration bundle process, you can run the splunk show cluster-bundle-status command, described in Update common peer configurations and apps.
- Run splunk disable maintenance-mode on the master. To confirm that the master is not in maintenance mode, run splunk show maintenance-mode.
- Wait briefly for the peer nodes to begin uploading their warm and cold buckets to the remote store.
Cold buckets use the cold path as their cache location, post-migration.
Migrated cold buckets are otherwise functionally equivalent to warm buckets. The cache manager manages the migrated cold buckets in the same way that it manages warm buckets. The only difference is that the cold buckets will be fetched into the cold path location, rather than the home path location.
- To confirm remote storage access across the indexer cluster, run this command from one of the peer nodes:
splunk cmd splunkd rfs -- ls --starts-with volume:remote_store
This command recursively lists any files that are present in the remote store. It should show that the cluster is starting to upload warm buckets to the remote store. If necessary, wait a little while for the first uploads to occur.
- On the master node, make any necessary changes to ensure that the indexer cluster's replication factor and search factor use the same values, for example, 3/3.
- Test SmartStore functionality. At this point, you should be able to run normal searches against this data. In the majority of cases, you will not be transferring any data from the remote storage, because the data will already be in the local cache. To validate data fetching from remote storage, do the following:
- On one of the peer nodes, look for a fully populated bucket, containing both tsidx files and the rawdata file.
- Evict the bucket from the cache, using this REST endpoint:
services/admin/cacheman/<cid>/evict
where <cid> is bid|<bucketId>|. For example: "bid|taktaklog~0~7D76564B-AA17-488A-BAF2-5353EA0E9CE5|"
- Run a search locally on the peer node. The search must be one that requires data from the evicted bucket.
The peer must now transfer the bucket from remote storage to run the search. After running the search, you can check that the bucket has reappeared in the cache.
If you need to restart the cluster during migration, upon restart, migration will continue from where it left off.
Refrain from rebalancing data or removing excess buckets until you have run the SmartStore-enabled cluster successfully for a while. In particular, run these operations only after you have set the replication factor and search factor to use equal values and the cluster has performed any related bucket fixup.
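Because rebalancing and excess-bucket removal should wait until the replication factor and search factor are equal, it can help to confirm the values on the master first. This is a minimal sketch, assuming a single-site cluster whose master sets both replication_factor and search_factor explicitly in the [clustering] stanza of server.conf. The function name factors_match and the sample values are hypothetical, and multisite settings are not handled.

```python
import configparser


def factors_match(server_conf_text):
    """Return (equal, replication_factor, search_factor) from server.conf text."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.read_string(server_conf_text)
    # Assumption: both settings are present in the [clustering] stanza.
    clustering = parser["clustering"]
    replication_factor = int(clustering["replication_factor"])
    search_factor = int(clustering["search_factor"])
    return replication_factor == search_factor, replication_factor, search_factor


# Hypothetical master configuration using the 3/3 example from the steps above.
SAMPLE_SERVER_CONF = """
[clustering]
mode = master
replication_factor = 3
search_factor = 3
"""

if __name__ == "__main__":
    equal, rf, sf = factors_match(SAMPLE_SERVER_CONF)
    print(f"replication_factor={rf} search_factor={sf} equal={equal}")
```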
Monitor the migration process
You can run an endpoint from the master node to determine the status of the migration:
$ splunk search "|rest /services/admin/cacheman/_metrics |fields splunk_server migration.*" -auth admin:passwd
The endpoint returns data on the migration, which you can use to determine how far along in the process each of the peers is. In this example, peer1 is on its 8th job, out of a total of 35, so the peer's migration is about 20-25% complete. The start_epoch field tells you when the migration began, allowing you to extrapolate an approximate completion time:
splunk_server    migration.current_job  migration.start_epoch  migration.status  migration.total_jobs
---------------  ---------------------  ---------------------  ----------------  --------------------
cluster1-master                                                not_started
peer1.ajax.com   8                      1484942186             running           35
peer2.ajax.com   7                      1484942190             running           37
peer2.ajax.com   5                      1484942194             running           36
Once migration.status reaches "finished" on all peers, the migration is finished, and current_job will match total_jobs.
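The extrapolation described above can be written out as a short calculation. This is a rough sketch, assuming that every upload job takes about the same time and that current_job approximates the number of jobs already done. The function name estimate_migration and the "now" timestamp (one hour after start_epoch) are hypothetical and used only for illustration.

```python
def estimate_migration(current_job, total_jobs, start_epoch, now_epoch):
    """Return (fraction_done, jobs_remaining, estimated_finish_epoch)."""
    fraction_done = current_job / total_jobs
    jobs_remaining = total_jobs - current_job
    if current_job == 0:
        # No progress yet, so there is nothing to extrapolate from.
        return fraction_done, jobs_remaining, None
    elapsed = now_epoch - start_epoch
    # Linear extrapolation: total time = elapsed time scaled by total/current.
    estimated_finish = start_epoch + elapsed * total_jobs / current_job
    return fraction_done, jobs_remaining, estimated_finish


if __name__ == "__main__":
    # peer1 values from the sample output; the current time is assumed.
    start = 1484942186
    fraction, remaining, finish = estimate_migration(8, 35, start, start + 3600)
    print(f"{fraction:.0%} complete, {remaining} jobs remaining")
    print(f"estimated finish epoch: {finish:.0f}")
```

For peer1, 8 of 35 jobs is roughly 23 percent, which is consistent with the 20-25% figure above. The finish time is only as good as the equal-duration assumption, since bucket sizes can vary.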
Note: If any peer restarts during migration, its migration information is lost, and this endpoint cannot be used to check status of that peer, although the migration will, in fact, resume. The peer's reported status will remain "not_started" even after migration resumes.
Instead, you can run the following endpoint on the restarted peer:
"|rest /services/admin/cacheman |search cm:bucket.stable=0 |stats count"
The count equals the number of upload jobs remaining, where an upload job represents a single bucket to be uploaded. In other words, it corresponds to (total_jobs - current_job) from the earlier endpoint. The count decrements to zero as migration continues.
You can also use the monitoring console to monitor migration progress. See Troubleshoot with the monitoring console.
This documentation applies to the following versions of Splunk® Enterprise: 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10