Define an Amazon S3 federated provider
To set up federated search for Amazon S3 on your Splunk Cloud Platform deployment, you must define one or more federated providers for that deployment. A federated provider definition gives your Splunk Cloud Platform deployment the means to establish a connection with a specific AWS account and search over specific datasets in that AWS account.
In this task you do these things:
- Name your Amazon S3 federated provider definition and provide essential Amazon S3 account information.
- Provide necessary AWS Glue information, such as a list of the AWS Glue Data Catalog tables you intend to use for your searches, and the name of the AWS Glue Data Catalog database that all of the listed AWS Glue tables belong to.
- Provide the Amazon S3 locations for the data you intend to search.
- If you use server-side encryption through the AWS Key Management Service to encrypt your Amazon S3 data, provide the AWS KMS key Amazon resource names (ARNs) for your encrypted Amazon S3 buckets.
- Generate resource policies for your Amazon S3 locations, your AWS Glue Data Catalog, and your AWS KMS keys, and update your Amazon accounts with those policies.
General prerequisites
- You must have the following things:
- A role on your Splunk Cloud Platform deployment with the admin_all_objects capability.
- An AWS account and an AWS IAM role with permissions that let you attach and modify policies for Amazon S3 locations and an AWS Glue Data Catalog. Contact your AWS administrator for assistance with permissions.
- Turn on token authentication for your Splunk Cloud Platform deployment. See Enable or disable token authentication in Securing Splunk Cloud Platform.
AWS Glue table prerequisites
Gather information for each AWS Glue table that you are associating with the Amazon S3 federated provider definition. For details, see Create an AWS Glue Data Catalog table.
- Obtain the values of the Name, Database, and Location fields for each AWS Glue table you are adding to the definition. To find this information for a specific table, open the AWS Glue console, and select Tables.
- Each AWS Glue table you add to a specific Amazon S3 federated provider definition must do the following things:
- Share the same AWS Glue database.
- Reference an Amazon S3 location path that contains file objects that share the same file type and compression type.
- Have a column for each field in the Amazon S3 dataset contained by the Amazon S3 location path.
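The shared-database prerequisite above can be checked mechanically once you have the table metadata. The following is a minimal sketch, not part of Splunk software. It assumes a dictionary shaped like an AWS Glue GetTables API response, where each TableList entry carries Name, DatabaseName, and StorageDescriptor.Location. The function names and sample values are hypothetical.

```python
def summarize_tables(get_tables_response):
    """Return (name, database, location) for each table in a GetTables-style response."""
    rows = []
    for table in get_tables_response.get("TableList", []):
        location = table.get("StorageDescriptor", {}).get("Location", "")
        rows.append((table["Name"], table.get("DatabaseName", ""), location))
    return rows


def share_one_database(rows):
    """True when every table row names the same AWS Glue database."""
    return len({database for _, database, _ in rows}) == 1


# Sample data for illustration only; real values come from your AWS Glue account.
sample_response = {
    "TableList": [
        {"Name": "table_csv", "DatabaseName": "a_sample_db",
         "StorageDescriptor": {"Location": "s3://bucket1/path1/my_csv_data"}},
        {"Name": "table_json", "DatabaseName": "a_sample_db",
         "StorageDescriptor": {"Location": "s3://bucket1/path1/my_json_data"}},
    ]
}

rows = summarize_tables(sample_response)
print(rows, share_one_database(rows))
```

The three values in each row correspond to the Name, Database, and Location fields that you enter into the federated provider definition.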
Steps
- On your Splunk Cloud Platform deployment, in Splunk Web, select Settings, then Federated search.
- On the Federated Providers tab, select Add federated provider.
- Select the Amazon S3 federated provider type and select Next.
- Specify the following settings for your Amazon S3 federated provider:
- Federated provider name
- AWS account ID
- AWS region
- Glue Data Catalog database
- Glue Data Catalog tables
- Amazon S3 locations
- AWS KMS key ARNs (optional)
- Select Generate policy to create the following things:
- A Glue Data Catalog resource policy.
- A separate Amazon S3 bucket policy for each Amazon S3 bucket discovered in the location paths listed in Amazon S3 locations.
- An AWS KMS key policy, if you filled out the AWS KMS key ARNs field.
- Copy and paste the Glue Data Catalog resource policy into your AWS Glue account. See Update your Glue Data Catalog resource policy.
- Copy and paste each Amazon S3 bucket policy into the bucket policies for the affected Amazon S3 buckets. See Update your Amazon S3 bucket policies.
- If Splunk software generates an AWS KMS key policy, copy and paste the policy into the key policies for the AWS KMS keys listed in AWS KMS key ARNs. See Update your AWS KMS Key policies.
- Select Warning and consent.
- Select Create provider to create your federated provider definition.
Splunk software verifies whether your deployment has sufficient cross-account permissions to search Amazon S3 accounts.
- If your permissions are sufficient, you receive a success message and can go on to create the federated indexes you use in your federated searches. See Map a federated index to an AWS Glue table dataset.
- If your permissions are insufficient, an error message appears with an Update Amazon S3 permissions button. You can attempt to set permissions yourself by selecting Update Amazon S3 permissions. See Manually update deployment permissions.
Shorten location paths to capture AWS Glue tables
You can shorten location paths to represent multiple AWS Glue tables. For example, you can have 2 AWS Glue tables at the following location paths:
Location | AWS Glue table name |
---|---|
s3://bucket1/path1/my_csv_data | table_csv |
s3://bucket1/path1/my_json_data | table_json |
You can provide each path separately in the Amazon S3 locations list.
Alternatively, you can add a single shortened location to the Amazon S3 locations list that captures both table_csv and table_json:
s3://bucket1/path1/
You can also use a wildcard ( * ) to capture both AWS Glue tables:
s3://bucket1/path1/my*
Use wildcards only at the end of Amazon S3 locations.
Amazon S3 paths that terminate in a file object are not valid for Splunk federated providers. Use only locations that end in an Amazon S3 folder.
For example, this is an invalid location path because it ends in a file object: s3://bucket1/path1/my_csv_data/data.csv
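The shortening rules in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Splunk software matches paths internally. It assumes a listed location acts as a plain prefix and that a trailing wildcard is dropped before the comparison. The function names are hypothetical. The sketch does not detect paths that end in a file object, because that cannot be determined from the path string alone.

```python
def is_valid_location(location):
    """Check two rules from this section: s3:// scheme, wildcard only at the end."""
    if not location.startswith("s3://"):
        return False
    # A wildcard is allowed only as the final character.
    return "*" not in location[:-1]


def covers(location, table_path):
    """True when a listed location captures an AWS Glue table's location path.

    Assumption: the location is treated as a prefix, and a trailing *
    is removed before the prefix comparison.
    """
    if not is_valid_location(location):
        return False
    prefix = location[:-1] if location.endswith("*") else location
    return table_path.startswith(prefix)


print(covers("s3://bucket1/path1/", "s3://bucket1/path1/my_csv_data"))
print(covers("s3://bucket1/path1/my*", "s3://bucket1/path1/my_json_data"))
```

With the two example tables, both the shortened location and the wildcard location cover table_csv and table_json under this prefix assumption.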
Specify Amazon S3 Provider settings
Specifying Amazon S3 provider settings is part of the Define an Amazon S3 federated provider task.
All of the following Amazon S3 provider settings are required except for AWS KMS key ARNs, which is optional. None of the settings have default values.
Federated provider name
Enter a unique name for the federated provider. The provider name can contain only alphanumeric characters, underscores, and hyphens.
AWS account ID
Enter the 12-digit ID for your AWS account.
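If you prepare provider settings in a script before entering them, you can check the naming rule for Federated provider name and the 12-digit rule for AWS account ID with simple patterns. This is a minimal sketch. It assumes that "alphanumeric" means ASCII letters and digits, and the function names are hypothetical.

```python
import re

# Assumption: "alphanumeric" is limited to ASCII letters and digits.
PROVIDER_NAME = re.compile(r"[A-Za-z0-9_-]+")
ACCOUNT_ID = re.compile(r"[0-9]{12}")


def is_valid_provider_name(name):
    """True when the name uses only letters, digits, underscores, and hyphens."""
    return PROVIDER_NAME.fullmatch(name) is not None


def is_valid_account_id(account_id):
    """True for a 12-digit ID; keep the ID as a string so leading zeros survive."""
    return ACCOUNT_ID.fullmatch(account_id) is not None


print(is_valid_provider_name("my_s3-provider1"), is_valid_account_id("012345678901"))
```

Treating the account ID as a string rather than an integer avoids silently dropping leading zeros.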
AWS region
The Splunk Cloud Platform attempts to automatically identify the AWS region of your deployment. If the Splunk Cloud Platform is successful, it populates the AWS region setting with the appropriate value.
The AWS region of the Glue Data Catalog database for this federated provider must match the AWS region for your Splunk Cloud Platform deployment.
You cannot set this field on your own.
Glue Data Catalog database
Enter the name of the AWS Glue database that contains the AWS Glue tables listed in Glue Data Catalog tables. AWS Glue database names can contain only lowercase letters, numbers, underscores, and hyphens.
The Glue Data Catalog database setting has the following restrictions:
- The AWS Glue database specified by Glue Data Catalog database must contain all of the AWS Glue tables listed in Glue Data Catalog tables.
- The Glue Data Catalog database must have the same AWS region as your Splunk Cloud Platform deployment.
- A federated provider definition can have only 1 AWS Glue database name.
Check the Tables page in the AWS Glue console to see the Database assignments for individual AWS Glue tables.
Glue Data Catalog tables
Enter 1 or more AWS Glue tables that you want to associate with this federated provider. Separate AWS Glue table names with commas. AWS Glue table names can contain only lowercase letters, numbers, underscores, and hyphens.
All AWS Glue tables listed in Glue Data Catalog tables must meet these requirements:
- Belong to the AWS Glue database specified by Glue Data Catalog database.
- Reference an Amazon S3 location path listed in Amazon S3 locations.
To get AWS Glue table names, check the Tables page in the AWS Glue console. AWS Glue table names appear in the Names column.
For more information about creating AWS Glue tables, see Create an AWS Glue Data Catalog table.
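The naming rule for AWS Glue databases and tables lends itself to a quick pre-check. The following sketch splits a comma-separated Glue Data Catalog tables value and reports names that break the rule. The function names are hypothetical, and trimming whitespace around each name is an assumption rather than documented Splunk behavior.

```python
import re

# Lowercase letters, numbers, underscores, and hyphens only.
GLUE_NAME = re.compile(r"[a-z0-9_-]+")


def parse_table_list(field_value):
    """Split a comma-separated field value into names, trimming whitespace."""
    return [name.strip() for name in field_value.split(",") if name.strip()]


def invalid_glue_names(names):
    """Return the names that contain characters outside the allowed set."""
    return [name for name in names if GLUE_NAME.fullmatch(name) is None]


tables = parse_table_list("table_csv, table_json")
print(tables, invalid_glue_names(tables))
```

The same pattern applies to the Glue Data Catalog database name, which follows the same character rule.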
Amazon S3 locations
Amazon S3 locations are file paths in Amazon S3 that contain the data that you want to search.
Enter 1 or more Amazon S3 location paths. Separate location paths with commas. Amazon S3 locations can contain only alphanumeric characters and the following special characters: /!=_.*'():
You can shorten locations so that they encompass multiple paths. See Shorten location paths to capture AWS Glue tables.
Each AWS Glue table you list in Glue Data Catalog tables must be represented by a location in Amazon S3 locations.
An AWS Glue table can be associated with only 1 location path. A single location path can be specified in multiple unrelated AWS Glue table definitions.
If you do not know the Amazon S3 location paths for the AWS Glue tables you have listed, check the Tables page in the AWS Glue console. Amazon S3 location paths appear in the Locations column.
AWS KMS key ARNs
(Optional) Do you use the AWS Key Management Service to apply server-side encryption (SSE-KMS) to the data stored in your Amazon S3 buckets or the metadata in your AWS Glue Data Catalog?
If you do, enter into AWS KMS key ARNs the Amazon resource names (ARNs) for the AWS KMS keys that encrypt data in your Amazon S3 buckets or metadata in your AWS Glue Data Catalog.
Federated search for Amazon S3 supports only customer-managed AWS KMS keys. In addition, each KMS key ARN you provide in this field must belong to the AWS account you specify with the AWS account ID setting.
For more information about AWS KMS keys, see AWS KMS concepts in the AWS Key Management Service Developer Guide.
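Because each AWS KMS key ARN must belong to the account given in AWS account ID, it can help to compare the two before you save the provider. This sketch relies on the standard key ARN layout, arn:partition:kms:region:account-id:key/key-id. The function names and the sample ARNs are hypothetical.

```python
def kms_key_account(arn):
    """Return the account ID embedded in a KMS key ARN, or None if the ARN is malformed."""
    parts = arn.split(":", 5)
    if len(parts) != 6:
        return None
    prefix, _partition, service, _region, account, resource = parts
    if prefix != "arn" or service != "kms" or not resource.startswith("key/"):
        return None
    return account


def keys_outside_account(arns, account_id):
    """Return the ARNs that are malformed or that belong to a different account."""
    return [arn for arn in arns if kms_key_account(arn) != account_id]


# Placeholder ARNs for illustration only.
sample_arns = [
    "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
    "arn:aws:kms:us-west-2:444455556666:key/abcd",
]
print(keys_outside_account(sample_arns, "111122223333"))
```

The sketch cannot tell whether a key is customer-managed. Confirm that in the AWS KMS console.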
Get AWS KMS key ARNs for your Amazon S3 buckets
To get the AWS KMS key ARNs for your Amazon S3 buckets, go to your Amazon S3 console and review the buckets that are associated with the locations you have listed in Amazon S3 locations for this provider. Follow these steps to obtain an AWS KMS key ARN that is associated with an Amazon S3 bucket:
- In the Amazon S3 console, select the bucket name to open its record.
- Select the bucket Properties tab.
- Inspect the Default encryption section. If you find an AWS KMS key ARN in that section, copy it.
- Paste the copied ARN into the AWS KMS key ARNs field in your federated provider definition.
If you copy multiple ARNs into AWS KMS key ARNs, separate them with comma characters.
If additional AWS KMS keys encrypt data within the bucket, you must provide their key ARNs within the AWS KMS key ARNs list.
Get the AWS KMS key ARN for your AWS Glue Data Catalog
Follow these steps to get the AWS KMS key ARN for your AWS Glue Data Catalog:
- In the AWS Glue console, select Catalog settings in the left-hand navigation bar.
- Under Encryption options, if you find an AWS KMS key ARN in the AWS KMS key for metadata encryption field, copy it.
- Paste the copied ARN into the AWS KMS key ARNs field in your federated provider definition.
If AWS KMS key ARNs contains a list of other ARNs when you add the Data Catalog AWS KMS key ARN, make sure you separate the new ARN from the others with a comma character.
Update your Glue Data Catalog resource policy
Updating your Glue Data Catalog resource policy is part of the Define an Amazon S3 federated provider task.
When you select Generate policy, Splunk software generates a Glue Data Catalog resource policy. Copy and paste this Glue Data Catalog resource policy into your AWS Glue account, appending it to other Glue Data Catalog resource policies if any exist.
- Select Copy for the Glue Data Catalog resource policy to save a copy of the policy to your clipboard.
Here is an example of a Glue Data Catalog resource policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSplunkAccessToAWSGlueDataCatalog",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<AWS-account-ID>:role/<stack-name>"
        ]
      },
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTables",
        "glue:GetTable",
        "glue:GetPartitions",
        "glue:GetPartition",
        "glue:BatchGetPartition"
      ],
      "Resource": [
        "arn:aws:glue:us-west-2:<AWS-account-ID>:catalog",
        "arn:aws:glue:us-west-2:<AWS-account-ID>:database/a_sample_db",
        "arn:aws:glue:us-west-2:<AWS-account-ID>:table/a_sample_db/a_table"
      ]
    }
  ]
}
Each Splunk Cloud Platform deployment is identified by its stack-name, which is the prefix of the deployment's URL. For example, if your deployment's URL is https://buttercupgames.splunkcloud.com, the stack-name is buttercupgames.
- In the AWS Glue console, navigate to the Data Catalog settings page.
- Paste the saved Glue Data Catalog resource policy into the Permissions field. If other Glue data catalog resource policies already exist, append your new policy to the end of the list. Separate the policies with commas.
Resolve security warnings, errors, general warnings, and suggestions before you save your policy.
- Select Save to save the Glue Data Catalog resource policy update.
For more information about updating Glue Data Catalog resource policies, search on "Granting cross-account access" in the AWS Glue Developer Guide.
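When a resource policy already exists, one way to read the append step is that the generated Statement entries join the existing Statement list inside a single policy document. The following sketch shows that merge with Python's json module. It is an assumption about how to combine the documents, not a Splunk or AWS tool, and the function name is hypothetical. Review the merged result in the AWS Glue console before you save it.

```python
import json


def append_statements(existing_policy_json, generated_policy_json):
    """Return a policy whose Statement list holds the existing and generated entries."""
    existing = json.loads(existing_policy_json)
    generated = json.loads(generated_policy_json)

    def as_list(value):
        # A policy may hold a single statement object instead of a list.
        return value if isinstance(value, list) else [value]

    merged = dict(existing)
    merged["Statement"] = (
        as_list(existing.get("Statement", [])) + as_list(generated.get("Statement", []))
    )
    return json.dumps(merged, indent=2)


# Shortened placeholder policies for illustration only.
existing = '{"Version": "2012-10-17", "Statement": [{"Sid": "Existing", "Effect": "Allow"}]}'
generated = (
    '{"Version": "2012-10-17", "Statement": '
    '[{"Sid": "AllowSplunkAccessToAWSGlueDataCatalog", "Effect": "Allow"}]}'
)
print(append_statements(existing, generated))
```

The merged document keeps one Version field and one Statement list, so the existing statements remain in place ahead of the generated one.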
Update your Amazon S3 bucket policies
Updating your Amazon S3 bucket policies is part of the Define an Amazon S3 federated provider task.
When you select Generate policy, Splunk software generates an Amazon S3 bucket policy for each Amazon S3 bucket referenced in Amazon S3 locations for the federated provider. Update your Amazon S3 account with these generated policies.
- Select Copy for the Amazon S3 bucket policy to save a copy of the policy to your clipboard.
Here is an example of an Amazon S3 bucket policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<AWS-account-ID>:role/<stack-name>"
        ]
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject*"
      ],
      "Sid": "AllowSplunkAccessTo a-sample-aws-s3-bucket",
      "Resource": [
        "arn:aws:s3:::a-sample-aws-s3-bucket",
        "arn:aws:s3:::a-sample-aws-s3-bucket/a-table/*"
      ]
    }
  ]
}
Each Splunk Cloud Platform deployment is identified by its stack-name, which is the prefix of the deployment's URL. For example, if your deployment's URL is https://buttercupgames.splunkcloud.com, the stack-name is buttercupgames.
- After you copy the policy, go to the Amazon S3 console and navigate to the Buckets page.
- Select the Name of the bucket for the policy you have copied.
- Select the Permissions tab for the bucket.
- Select Edit for the Bucket policy.
- Select Add new statement.
- Paste the saved bucket policy into the bucket policy edit window. If there are already policies there, append your bucket policy to the existing set of policies, separating policies with commas.
Resolve security warnings, errors, general warnings, and suggestions before you save your policy.
- Select Save changes to save your policy update.
Repeat these steps for each Amazon S3 bucket policy that Splunk software generates for your federated provider.
For more information about updating Amazon S3 bucket policies, search on "Using bucket policies" in the Amazon Simple Storage Service User Guide.
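Because Splunk software generates one bucket policy per bucket, it can be useful to know in advance which buckets your Amazon S3 locations reference. This sketch derives the distinct bucket names from a comma-separated locations value. The function names are hypothetical, and the parsing assumes every location has the form s3://bucket/path.

```python
def bucket_of(location):
    """Return the bucket name from an s3://bucket/path location."""
    if not location.startswith("s3://"):
        raise ValueError(f"not an Amazon S3 location: {location}")
    return location[len("s3://"):].split("/", 1)[0]


def distinct_buckets(locations_field):
    """List each bucket once, in the order it first appears."""
    buckets = []
    for location in locations_field.split(","):
        location = location.strip()
        if not location:
            continue
        bucket = bucket_of(location)
        if bucket not in buckets:
            buckets.append(bucket)
    return buckets


print(distinct_buckets(
    "s3://bucket1/path1/my_csv_data, s3://bucket1/path1/my_json_data, s3://bucket2/logs/*"
))
```

Each name in the result corresponds to one bucket whose policy you update by following the steps in this section.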
Update your AWS KMS Key policies
Updating your AWS KMS key policies is part of the Define an Amazon S3 federated provider task. You must update your AWS KMS key policies only if you use SSE-KMS to encrypt data in your Amazon S3 buckets or your AWS Glue Data Catalog and have filled out the AWS KMS key ARNs field for the federated provider.
When you select Generate policy, Splunk software generates an AWS KMS key policy. To allow Splunk to search your SSE-KMS-encrypted Amazon S3 data, copy and paste this AWS KMS key policy into the accounts for your AWS KMS keys, appending it to other AWS KMS key policies if any exist.
- Start by selecting Copy for the AWS KMS key policy to save a copy of the policy to your clipboard.
Here is an example of an AWS KMS key policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUseOfTheKey",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<AWS-account-ID>:role/<stack-name>"
        ]
      },
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "*"
    }
  ]
}
Each Splunk Cloud Platform deployment is identified by its stack-name, which is the prefix of the deployment's URL. For example, if your deployment's URL is https://buttercupgames.splunkcloud.com, the stack-name is buttercupgames.
- In the Amazon S3 console, navigate to the Buckets page.
- Select the Name of a bucket that corresponds to a listed AWS KMS key ARN.
- Select the Properties tab for the bucket.
- Select the Encryption key ARN in the Default encryption section to open the Key ID page for the key in the Key Management Service.
- In the Key policy section, select Edit.
- Append your saved AWS KMS key policy to the existing key policy, separating policies with commas. When you append the policy, do not copy in the Version and Statement header fields if they already exist in the policy.
Make sure to resolve security warnings, errors, general warnings, and suggestions before you save your policy.
- Select Save changes to save the key policy update.
Repeat these steps for each AWS KMS key ARN listed in AWS KMS key ARNs for your federated provider. For more information, see Changing a key policy in the AWS Key Management Service Developer Guide.
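After you save a key policy, you might want to confirm that it still contains a statement that lets the Splunk role decrypt data. The following sketch performs a literal check on a key policy document. It is deliberately simplistic: it ignores wildcards, Deny statements, and conditions, so it is not a substitute for AWS policy evaluation. The function name and the role ARN are hypothetical placeholders.

```python
import json


def _as_list(value):
    """Wrap a single value in a list; policy fields may be a value or a list."""
    return value if isinstance(value, list) else [value]


def allows_decrypt(policy_json, role_arn):
    """True when an Allow statement names role_arn and lists kms:Decrypt literally."""
    policy = json.loads(policy_json)
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        principals = _as_list(principal.get("AWS", [])) if isinstance(principal, dict) else []
        actions = _as_list(statement.get("Action", []))
        if role_arn in principals and "kms:Decrypt" in actions:
            return True
    return False


# Placeholder policy and role ARN for illustration only.
key_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowUseOfTheKey",
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam::111122223333:role/buttercupgames"]},
        "Action": ["kms:Decrypt"],
        "Resource": "*",
    }],
})
print(allows_decrypt(key_policy, "arn:aws:iam::111122223333:role/buttercupgames"))
```

A False result from this check means only that no literal match was found. The policy might still grant access through a wildcard action or a broader principal.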
Manually update deployment permissions
This step is part of the Define an Amazon S3 federated provider task. Take this step only when you get an error message that asks you to manually update your deployment permissions.
Before you can run federated searches over an Amazon S3 account from your Splunk Cloud Platform deployment, Splunk software needs to set up cross-account permissions for your deployment. Without these permissions, you cannot run Amazon S3 federated searches.
Splunk software verifies whether your deployment has correct cross-account permissions whenever you attempt to create or update a federated provider. If it detects that your deployment has missing or incorrect cross-account permissions, it attempts to set them up. If that attempt fails, you can try to manually set the cross-account permissions by selecting the Update Amazon S3 permissions button.
There are two ways to access the Update Amazon S3 permissions button.
- When you create or update a federated provider, Splunk software displays an error message with the Update Amazon S3 permissions button if its attempt to automatically set up cross-account permissions fails. Select the button to reattempt to set the permissions.
- At any time you can access the button from the Federated Providers tab, which you can get to by selecting Settings, then Federated Search. Select Update Amazon S3 permissions to open a dialog box that contains the Update Amazon S3 permissions button.
If you select Update Amazon S3 permissions and your permissions are not restored, contact your Splunk Support representative.
This documentation applies to the following versions of Splunk Cloud Platform™: 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)