Create a DSP connection to get metadata from AWS
The Amazon Metadata connector is planned for deprecation. See the Release Notes for more information.
To get metadata from resources and infrastructure in Amazon Web Services (AWS) into a data pipeline in Splunk Data Stream Processor (DSP), you must first create a connection using the Amazon Metadata connector. In the connection settings, provide your Identity and Access Management (IAM) user credentials so that DSP can access your data, and schedule a data collection job to specify how frequently DSP retrieves the data. You can then use the connection in the Amazon Metadata source function to get metadata from AWS into a DSP pipeline.
Prerequisites
Before you can create the Amazon Metadata connection, you must have the following:
- An IAM user with the necessary permissions for each API that you want to collect data from.
  - See the IAM user permissions section on this page for the complete list of permissions that you need to collect data from all supported AWS APIs.
  - See How AWS metadata is collected for information about the specific permissions required for each supported AWS API.
- The access key ID and secret access key for the IAM user.
If you don't have an IAM user with the necessary permissions, ask your AWS administrator for assistance.
IAM user permissions
If you want to collect data from all the AWS APIs that the Amazon Metadata connector supports, make sure that your IAM user has the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeKeyPairs",
        "ec2:DescribeReservedInstances",
        "ec2:DescribeSnapshots",
        "ec2:DescribeVolumes",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeImages",
        "ec2:DescribeAddresses",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeTags",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticloadbalancing:DescribeTargetGroups",
        "elasticloadbalancing:DescribeInstanceHealth",
        "ec2:DescribeVpcs",
        "ec2:DescribeSubnets",
        "ec2:DescribeNetworkAcls",
        "cloudfront:ListDistributions",
        "rds:DescribeDBInstances",
        "lambda:ListFunctions",
        "s3:ListAllMyBuckets",
        "iam:GetAccountPasswordPolicy",
        "iam:GetAccessKeyLastUsed",
        "iam:ListUsers",
        "iam:ListAccessKeys",
        "eks:DescribeCluster",
        "eks:ListClusters",
        "route53domains:ListDomains",
        "route53domains:GetDomainDetail",
        "route53domains:ListTagsForDomain",
        "acm:DescribeCertificate",
        "acm:ListCertificates",
        "acm:ListTagsForCertificate",
        "route53:ListTrafficPolicyInstances",
        "route53:ListTagsForResource",
        "route53:ListHostedZones",
        "route53:GetHostedZone",
        "route53:ListTagsForResource",
        "route53:ListTrafficPolicies",
        "route53:GetTrafficPolicy",
        "route53:ListTagsForResource",
        "ecr:DescribeRepositories",
        "ecr:DescribeRepositories",
        "ecr:DescribeImages",
        "ecs:ListClusters",
        "ecs:ListContainerInstances",
        "ecs:DescribeContainerInstances",
        "ecs:ListClusters",
        "ecs:ListTasks",
        "ecs:DescribeTasks",
        "ecs:ListClusters",
        "ecs:ListServices",
        "ecs:DescribeServices",
        "ecs:ListClusters",
        "ecs:DescribeClusters",
        "elasticfilesystem:DescribeFileSystems",
        "dynamodb:ListTables",
        "dynamodb:DescribeTable",
        "dynamodb:ListGlobalTables",
        "dynamodb:DescribeGlobalTable",
        "waf:ListWebACLs",
        "waf:GetWebACL",
        "logs:DescribeLogGroups",
        "logs:ListTagsLogGroup",
        "logs:GetLogGroupFields"
      ],
      "Resource": "*"
    }
  ]
}
If you want to collect data from a subset of the supported AWS APIs, you only need to add the permissions for those particular APIs.
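As a rough illustration of a reduced policy, the following Python sketch builds a policy document from an explicit list of actions and removes repeated entries. The helper name `build_policy` is hypothetical, and the two example actions are simply taken from the full policy on this page. Confirm which actions each API actually requires in How AWS metadata is collected before using a policy like this.

```python
import json


def build_policy(actions):
    """Build an IAM policy document that allows the given actions on all resources.

    Hypothetical helper for illustration only. Repeated actions are
    collapsed, because listing an action twice has no extra effect.
    """
    unique_actions = sorted(set(actions))
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": unique_actions,
                "Resource": "*",
            }
        ],
    }


# Example: actions copied from the full policy on this page. Whether they
# are sufficient for a given API is an assumption that you need to verify.
subset_policy = build_policy(
    [
        "lambda:ListFunctions",
        "s3:ListAllMyBuckets",
        "lambda:ListFunctions",  # duplicate, removed by build_policy
    ]
)

policy_json = json.dumps(subset_policy, indent=2)
print(policy_json)
```

The printed JSON can be pasted into the IAM policy editor by your AWS administrator. Sorting the actions is only a convenience that makes two policies easier to compare.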
Steps
- From the Data Stream Processor page, click Data Management and then select the Connections tab.
- Click Create New Connection.
- Select Amazon Metadata and then click Next.
- Complete the following fields:
  Field: Description
  - Connection Name: A unique name for your connection.
  - Access Key ID: The access key ID for your IAM user.
  - Secret Access Key: The secret access key for your IAM user.
  - Region API Groups: A list of groups that indicate which combinations of regions and APIs the connector collects data from. For each group that you want to define, click Add Group and select the appropriate values from the following drop-down lists:
    - Regions: A list of regions that you want to collect data from.
    - APIs (Optional): If you don't want to collect data from all the supported APIs, type a list of the specific APIs that you want to collect data from.
  - Scheduled: This parameter is on by default, indicating that jobs run automatically. Toggle this parameter off to stop the scheduled job from automatically running. Jobs that are currently running are not affected.
  - Schedule: The time-based job schedule that determines when the connector executes jobs for collecting data. Select a predefined value or write a custom CRON schedule. All CRON schedules are based on UTC.
  - Workers: The number of workers you want to use to collect data.

  Any credentials that you upload are transmitted securely by HTTPS, encrypted, and securely stored in a secrets manager.
- Click Save.
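To make the Region API Groups setting more concrete, the following Python sketch expands a list of groups into the region and API combinations that they cover. It is a mental model only, not how the connector is implemented. The dictionary layout, the `expand_groups` helper, and the API identifiers in `SUPPORTED_APIS` are all hypothetical placeholders. The one behavior taken from this page is that a group with no APIs listed covers all supported APIs.

```python
def expand_groups(groups, supported_apis):
    """Return the set of (region, api) pairs covered by the given groups.

    Hypothetical helper for illustration only. A group without an
    "apis" entry, or with an empty one, covers every supported API.
    """
    pairs = set()
    for group in groups:
        apis = group.get("apis") or supported_apis
        for region in group["regions"]:
            for api in apis:
                pairs.add((region, api))
    return pairs


# Placeholder API identifiers. The real names are listed in the product UI.
SUPPORTED_APIS = ["ec2_instances", "lambda_functions", "s3_buckets"]

groups = [
    # Group 1: only one API, in two regions.
    {"regions": ["us-east-1", "us-west-2"], "apis": ["ec2_instances"]},
    # Group 2: no APIs listed, so all supported APIs in one region.
    {"regions": ["eu-west-1"]},
]

pairs = expand_groups(groups, SUPPORTED_APIS)
for region, api in sorted(pairs):
    print(region, api)
```

Defining separate groups this way lets you collect a narrow set of APIs in some regions and everything in others, which can also reduce the IAM permissions that the connection needs.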
If you're editing a connection that's being used by an active pipeline, you must reactivate that pipeline after making your changes. When you reactivate a pipeline, you must select where you want to resume data ingestion. See Using activation checkpoints to activate your pipeline in the Use the Data Stream Processor manual for more information.
You can now use your connection in an Amazon Metadata source function at the start of your data pipeline to get metadata from AWS. For instructions on how to build a data pipeline, see the Building a pipeline chapter in the Use the Data Stream Processor manual. For information about the source function, see Get data from Amazon Metadata in the Function Reference manual.
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5