Map Amazon Security Lake federated indexes to AWS Glue tables
This topic covers the Set up federated indexes step of the workflow for creating an Amazon Security Lake federated provider. You cannot follow this step until you complete the steps that precede it in the federated provider setup workflow. See the checklist of tasks to set up Federated Analytics.
After you set up the data lake indexes for your Amazon Security Lake federated provider, you review the set of federated indexes that Splunk software generates for you and adjust them as necessary.
Unlike data lake indexes, federated indexes do not themselves contain data, and you cannot ingest data into them. You use federated indexes when you run federated searches over remote Amazon Security Lake datasets. Each federated index maps to one of the AWS Glue tables that you listed in the Define provider step, and each of those AWS Glue tables represents a remote dataset in your Amazon Security Lake account. You invoke federated indexes in your federated searches to tell Splunk software which remote Amazon Security Lake dataset you intend to search.
The Splunk platform creates federated indexes on the search head of your Splunk Cloud Platform deployment. You can set up role-based permissions for federated indexes in the same way that you set up permissions for ordinary indexes. See Give your users role-based access control of data lake indexes and federated indexes.
In this task, you do these things:
- Review the provided federated index names and change them if necessary.
- Review the time partition settings and make sure they match the AWS source version of the AWS Glue table to which the federated index is mapped.
- Remove federated indexes that you do not need.
Prerequisites
- In the Create subscriber step of the Add a new federated provider workflow, you must have created a new subscriber for federated search access in your Amazon Security Lake account, and you must have added its Resource Share Name and Resource Share ARN fields to the federated provider definition. See Create the Amazon Security Lake subscriber for federated search access.
- In the Define provider step of the Add a new federated provider workflow, you must have obtained the AWS Glue database and AWS Glue tables from the resource share that was created in the AWS Resource Access Manager when you created the Amazon Security Lake subscriber for federated search access. Splunk software uses the AWS Glue database and AWS Glue tables to generate the federated indexes. See Obtain the AWS Glue database and tables.
Steps
- On your Splunk Cloud Platform deployment, in Splunk Web, at the Set up federated indexes step of the Add a new federated provider workflow, you see a list of federated indexes. Splunk software generates a federated index for each AWS Glue table you provide for the Define provider step of this workflow. Using the following table, adjust the settings for each federated index in the list.
Setting | Description | Default value |
---|---|---|
Federated index name | Optionally change the name of the federated index. Federated index names can contain only letters, numbers, underscores, and hyphens, must begin with a letter or number, and cannot be more than 2,048 characters in length. Each federated index name must be unique. Do not change the Federated index name without first examining it to verify the AWS source version of the Amazon Security Lake dataset to which the federated index is mapped. To understand why the AWS source version of the Amazon Security Lake dataset is relevant, see Optimize federated searches of Amazon Security Lake datasets by identifying time partition settings. | By default, each federated index created for a Federated Analytics federated provider has a name that combines the federated provider name with the AWS Glue table name that the federated index is mapped to. In addition, federated index names in Federated Analytics federated providers are prefixed with dlf_ and suffixed with _index. For example, a federated index for a federated provider named Buttercup ASL Provider that is mapped to an AWS Glue table named Buttercup_Glue_Table_One_2_0 would have the default name dlf_buttercup_asl_provider_buttercup_glue_table_one_2_0_index. |
AWS Glue Table | Select the AWS Glue table dataset to which this federated index is mapped. Only AWS Glue tables that you supplied in the Define provider step but which are not already represented in the federated index list are available. This field is editable only when the number of listed federated indexes is lower than the number of AWS Glue tables that you supplied in the Define provider step. | No default |
Time partition settings | Time partition settings improve search performance and reduce search cost. Verify that the time partition settings match the AWS source version of the AWS Glue table dataset to which the federated index is mapped. See Optimize federated searches of Amazon Security Lake datasets by identifying time partition settings. | Defaults to time partition settings for AWS Glue table datasets that follow the AWS source version 2 data schema (OCSF 1.1.0). |
- (Optional) Select the trash can icon to remove a federated index that you do not need. When you remove an index, the AWS Glue table field becomes editable for other federated indexes. You cannot give the same AWS Glue table value to two or more federated indexes.
- Select Save to save the federated index configuration and complete the federated provider setup.
- (Optional) Give your users access to the federated index. To run searches over the remote dataset to which the federated index maps, your users must have access permissions for the federated index. See Give your users role-based access control of data lake indexes and federated indexes.
Map to OCSF category is not available in this version of Federated Analytics.
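The naming rules and the default naming pattern described in this task can be checked offline before you rename indexes in Splunk Web. The following Python sketch is illustrative only and is not part of Splunk software. The function names are hypothetical, and the normalization it applies to the provider and table names (lowercasing, and replacing spaces with underscores) is an assumption inferred from the single Buttercup example in this topic. It also assumes that uniqueness is checked on the exact name.

```python
import re

MAX_NAME_LENGTH = 2048

# Letters, numbers, underscores, and hyphens only; first character must be
# a letter or a number.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def default_federated_index_name(provider_name: str, glue_table: str) -> str:
    """Approximate the documented default: dlf_<provider>_<table>_index.

    Assumption: names are lowercased and whitespace runs become underscores.
    """
    provider = re.sub(r"\s+", "_", provider_name.strip().lower())
    table = glue_table.strip().lower()
    return f"dlf_{provider}_{table}_index"


def name_problems(name: str) -> list[str]:
    """Return the documented naming restrictions that the name violates."""
    problems = []
    if not _NAME_PATTERN.fullmatch(name):
        problems.append(
            "must contain only letters, numbers, underscores, and hyphens, "
            "and must begin with a letter or number"
        )
    if len(name) > MAX_NAME_LENGTH:
        problems.append("must not be longer than 2,048 characters")
    return problems


def duplicate_names(names: list[str]) -> set[str]:
    """Return names that appear more than once (exact comparison)."""
    seen, duplicates = set(), set()
    for name in names:
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return duplicates


if __name__ == "__main__":
    name = default_federated_index_name(
        "Buttercup ASL Provider", "Buttercup_Glue_Table_One_2_0"
    )
    print(name)
    print(name_problems(name))
    print(name_problems("_starts_with_underscore"))
```

If you rename an index, consider keeping the 1_0 or 2_0 marker in the new name so that the AWS source version stays visible.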
Optimize federated searches of Amazon Security Lake datasets by defining time partition settings
Partitioning is an organization strategy for large datasets that makes it possible for you to search them efficiently. When you partition your data, you make searches faster by grouping similar table rows together, and a typical way to do that is to group rows together by their timestamps. Amazon Security Lake data is automatically partitioned by date.
All of the federated indexes that Federated Analytics generates for its federated providers have a default set of time partition settings that apply to Amazon Security Lake datasets formatted in the AWS source version 2 data schema (OCSF 1.1.0). If your federated index is associated with an Amazon Security Lake dataset that is formatted in the AWS source version 1 data schema (OCSF 1.0.0-rc.2), you must change the time partition settings for that federated index. The following table displays the time partition settings that are required for each AWS source version.
AWS source version | Time field | Time format | Data type |
---|---|---|---|
1 | eventDay | %Y%m%d | string |
2 (default) | time_dt | %ST | timestamp |
For more information about AWS source versions 1 and 2, see Security Lake queries in the Amazon Security Lake User Guide.
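To illustrate why date partitioning helps, the following Python sketch lists the eventDay partition values that a search over a given date range would need to read in an AWS source version 1 dataset. It shows the general idea of partition pruning only. It does not describe how Splunk software or Amazon Security Lake implement it, and the function name is hypothetical. The only details taken from this topic are the eventDay field and its %Y%m%d format.

```python
from datetime import date, datetime, timedelta

# Time format listed in this topic for the eventDay field (source version 1).
EVENT_DAY_FORMAT = "%Y%m%d"


def event_day_partitions(start: date, end: date) -> list[str]:
    """Return one eventDay value per day from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    day_count = (end - start).days + 1
    return [
        (start + timedelta(days=offset)).strftime(EVENT_DAY_FORMAT)
        for offset in range(day_count)
    ]


if __name__ == "__main__":
    # A four-day search window touches only four day partitions,
    # no matter how many days of data the dataset holds.
    partitions = event_day_partitions(date(2024, 2, 27), date(2024, 3, 1))
    print(partitions)

    # The string values parse back to dates with the same format.
    print(datetime.strptime(partitions[0], EVENT_DAY_FORMAT).date())
```

A search whose time range maps to a handful of partition values can skip every other partition, which is the reason that mismatched time partition settings tend to cost both time and scanned data.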
Steps
- Determine the AWS source version data schema of the AWS Glue table dataset that your federated index is mapped to by inspecting the auto-generated Federated index name of the federated index. The AWS source version is at or near the end of the federated index name. You will see 1_0 for the AWS source version 1 data schema or 2_0 for the AWS source version 2 data schema.
- Select Edit.
- If your federated index maps to a dataset that follows AWS source version 1, change the values of the Time field, Time format, and Data type time partition settings to eventDay, %Y%m%d, and string, respectively.
If your federated index maps to a dataset that follows AWS source version 2, make no change to the time partition settings.
- Select the Time zone that applies to your time partition fields. In most cases, GMT (Greenwich Mean Time) should be sufficient.
- Select Save to save the time partition setting configuration for the federated index.
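If you have many federated indexes to review, the name inspection in the first step can be sketched as a small script. The following Python example is a heuristic, not a Splunk API. It assumes that the index still has its auto-generated name, that the 1_0 or 2_0 marker sits at the end of the name or directly before the default _index suffix, and that the marker is preceded by an underscore. The function name and dictionary keys are hypothetical. The setting values are copied as shown in the time partition settings table.

```python
import re
from typing import Optional

# Values copied from the time partition settings table in this topic.
# The dictionary keys are illustrative labels, not Splunk setting names.
PARTITION_SETTINGS = {
    1: {"time_field": "eventDay", "time_format": "%Y%m%d", "data_type": "string"},
    2: {"time_field": "time_dt", "time_format": "%ST", "data_type": "timestamp"},
}

# Matches _1_0 or _2_0 at the end of the name, optionally followed by the
# default _index suffix.
_VERSION_MARKER = re.compile(r"_([12])_0(?:_index)?$")


def guess_source_version(index_name: str) -> Optional[int]:
    """Guess the AWS source version from an auto-generated index name.

    Returns None when no marker is found, for example after a rename.
    """
    match = _VERSION_MARKER.search(index_name.lower())
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for name in (
        "dlf_buttercup_asl_provider_buttercup_glue_table_one_2_0_index",
        "dlf_buttercup_asl_provider_buttercup_glue_table_two_1_0_index",
        "my_renamed_index",
    ):
        version = guess_source_version(name)
        settings = PARTITION_SETTINGS.get(version)
        print(name, version, settings)
```

When the function returns None, the name gives no hint, and you need to confirm the source version of the AWS Glue table in your Amazon Security Lake account instead.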
Run federated searches over your AWS Glue table datasets
After you confirm the setup of your federated indexes, you can use the sdselect command to search the AWS Glue datasets to which those federated indexes map. See sdselect command overview.
Other users can run sdselect searches when you grant them role-based access control of one or more of these federated indexes. See Give your users role-based access control of data lake indexes and federated indexes.
After you create your federated provider, you might experience a delay of up to 15 minutes before you can successfully run an sdselect search, as related AWS resources are created and accepted in the AWS account to which your Splunk Cloud Platform belongs.
For an overview of your options when searching with Federated Analytics, see Run Federated Analytics searches.
Deactivate a federated index
You can deactivate a federated index that you want to turn off without deleting it. For example, you might deactivate federated indexes when your data scanning entitlements are depleted, to prevent unintentional usage.
To reactivate a deactivated federated index, follow these same steps, but select Activate instead of Deactivate in the final step.
Prerequisites
- A role on your Splunk Cloud Platform deployment that has the admin_all_objects capability.
- A federated index for Federated Analytics that you want to deactivate.
Steps
- On your Splunk Cloud Platform deployment, in Splunk Web, select Settings, then Federated search.
- On the Federated index tab, identify an activated federated index that you want to deactivate.
- Select Deactivate for the index you want to deactivate.
Delete a federated index
You can delete a federated index that maps to an AWS Glue table that you no longer need to search.
Prerequisites
- A role on your Splunk Cloud Platform deployment that has the admin_all_objects capability.
- A federated index for Federated Analytics that you want to delete.
Steps
- On your Splunk Cloud Platform deployment, in Splunk Web, select Settings, then Federated search.
- On the Federated index tab, identify a federated index that you want to delete.
- Select Delete for the index you want to delete.
You can replace deleted federated indexes for an existing Federated Analytics federated provider by editing the provider from the Providers tab, or by selecting Add federated index on the Indexes tab.
This documentation applies to the following versions of Splunk Cloud Platform™: 9.3.2408