Connecting AWS metadata sources to your DSP pipeline
The Amazon Metadata connector is planned for deprecation. See the Release Notes for more information.
When creating a data pipeline in Splunk Data Stream Processor, you can connect to various Amazon Web Services (AWS) APIs and use them as data sources. These APIs provide access to events that contain metadata from the resources and infrastructure in AWS. You can get AWS metadata into a pipeline, transform the metadata as needed, and then send the transformed metadata out from the pipeline to a destination of your choosing.
To connect to AWS APIs as data sources, you must complete the following tasks:
- Create a connection that allows DSP to access your AWS metadata. See Create a DSP connection to get metadata from AWS.
- Create a pipeline that starts with the Amazon Metadata source function. See the Building a pipeline chapter in the Use the Data Stream Processor manual for instructions on how to build a data pipeline.
- Configure the Amazon Metadata source function to use your Amazon Metadata connection. See Get data from Amazon Metadata in the Function Reference manual.
When you activate the pipeline, the source function starts collecting events from the supported AWS APIs. Each event is received into the pipeline as a record.
If your data fails to get into DSP, check the connection settings. Make sure that you entered the correct access key ID and secret access key for your Identity and Access Management (IAM) user, and that you selected the appropriate regions and APIs for your AWS resources. DSP doesn't validate the credentials that you enter.
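When troubleshooting access problems, it can help to confirm that the policy attached to your IAM user grants the permissions that the connector needs for each API you selected. The following Python sketch builds an IAM policy document from a small subset of the permissions listed in the table under "How AWS metadata is collected". The `REQUIRED_PERMISSIONS` mapping and the `build_policy` function are hypothetical names used for illustration, and `"Resource": "*"` is a simplifying assumption that you might want to narrow for your environment.

```python
import json

# Subset of the permission table in this topic; extend it with the APIs
# that you selected in the connection settings.
REQUIRED_PERMISSIONS = {
    "ec2_instances": ["ec2:DescribeInstances"],
    "s3_buckets": ["s3:ListAllMyBuckets"],
    "eks_clusters": ["eks:DescribeCluster", "eks:ListClusters"],
    "iam_users": [
        "iam:ListUsers",
        "iam:ListAccessKeys",
        "iam:GetAccessKeyLastUsed",
        "iam:GetAccountPasswordPolicy",
    ],
}


def build_policy(enabled_apis):
    """Return an IAM policy document allowing the actions for the given APIs."""
    # Collect the actions for every selected API, de-duplicated and sorted.
    actions = sorted(
        {action for api in enabled_apis for action in REQUIRED_PERMISSIONS[api]}
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: "*" is used for simplicity; restrict it if you can.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(["ec2_instances", "eks_clusters"]), indent=2))
```

You can compare the generated `Action` list against the policy that is attached to your IAM user to spot missing permissions.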
How AWS metadata is collected
The source function collects data according to the job schedule that you specified in the connection settings. See Scheduled data collection jobs for more information, including a list of the limitations that apply to all scheduled data collection jobs.
The Amazon Metadata connector calls AWS APIs in the AWS regions specified in the connection settings to collect resource status and infrastructure information. The data that is included in each record varies depending on the specific AWS API that the event comes from.
The connector supports the AWS APIs described in the following table. This table includes details about the AWS permissions required to get data from each API, and describes the data that is included in the records from the given API.
AWS API | AWS Permission | Source | Source type | Body |
---|---|---|---|---|
ec2_instances | ec2:DescribeInstances | <region>:ec2:describeInstances | aws:ec2:instance | All attributes of ec2.Instance and OwnerID of ec2.Reservation |
ec2_key_pairs | ec2:DescribeKeyPairs | <region>:ec2:describeKeyPairs | aws:ec2:keyPair | All attributes of ec2.KeyPairInfo |
ec2_reserved_instances | ec2:DescribeReservedInstances | <region>:ec2:describeReservedInstances | aws:ec2:reservedInstances | All attributes of ec2.ReservedInstances |
ebs_snapshots | ec2:DescribeSnapshots | <region>:ec2:describeSnapshots | aws:ec2:snapshot | All attributes of ec2.Snapshot |
ec2_volumes | ec2:DescribeVolumes | <region>:ec2:describeVolumes | aws:ec2:volume | All attributes of ec2.Volume |
ec2_security_groups | ec2:DescribeSecurityGroups | <region>:ec2:describeSecurityGroups | aws:ec2:securityGroup | All attributes of ec2.SecurityGroup |
ec2_images | ec2:DescribeImages | <region>:ec2:describeImages | aws:ec2:image | All attributes of ec2.Image |
ec2_addresses | ec2:DescribeAddresses | <region>:ec2:describeAddresses | aws:ec2:address | All attributes of ec2.Address |
classic_load_balancers | elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeInstanceHealth | <region>:elb:describeLoadBalancers | aws:elb:loadBalancer | All attributes of elb.LoadBalancerDescription; Tags: All attributes of elb.Tags; Instances: All attributes of elb.InstanceState |
application_load_balancers | elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeListeners, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeTargetHealth, elasticloadbalancing:DescribeTargetGroups | <region>:elbv2:describeLoadBalancers | aws:elbv2:loadBalancer | All attributes of elbv2.LoadBalancer; Listeners: All attributes of elbv2.Listeners; Tags: All attributes of elbv2.Tags; TargetGroups: All attributes of elbv2.TargetGroup and elbv2.TargetHealth |
vpcs | ec2:DescribeVpcs | <region>:ec2:describeVpcs | aws:ec2:vpc | All attributes of ec2.Vpc |
vpc_subnets | ec2:DescribeSubnets | <region>:ec2:describeSubnets | aws:ec2:subnet | All attributes of ec2.Subnet |
vpc_network_acls | ec2:DescribeNetworkAcls | <region>:ec2:describeNetworkAcls | aws:ec2:networkAcl | All attributes of ec2.NetworkAcl |
cloudfront_distributions | cloudfront:ListDistributions | <region>:cloudfront:listDistributions | aws:cloudfront:distribution | All attributes of cloudfront.DistributionSummary |
rds_instances | rds:DescribeDBInstances | <region>:rds:describeDBInstances | aws:rds:dbInstance | All attributes of rds.DBInstance |
lambda_functions | lambda:ListFunctions | <region>:lambda:listFunctions | aws:lambda:function | All attributes of lambda.FunctionConfiguration |
s3_buckets | s3:ListAllMyBuckets | <region>:s3:listBuckets | aws:s3:bucket | All attributes of s3.Bucket |
iam_users | iam:ListUsers, iam:ListAccessKeys, iam:GetAccessKeyLastUsed, iam:GetAccountPasswordPolicy | <region>:iam:listUsers | aws:iam:user | All attributes of iam.User; AccessKey: All attributes of iam.AccessKeyMetadata; AccessKey.AccessKeyLastUsed: All attributes of iam.AccessKeyLastUsed; PasswordPolicy: All attributes of iam.PasswordPolicy |
eks_clusters | eks:DescribeCluster, eks:ListClusters | <region>:eks:describeCluster | aws:eks:cluster | All attributes of EKS.ListClusters and EKS.DescribeCluster |
route53_domains | route53domains:ListDomains, route53domains:GetDomainDetail, route53domains:ListTagsForDomain (optional) | <region>:route53Domains:getDomainDetail | aws:route53Domains:domain | All attributes of Route53Domain.ListDomains and Route53Domain.GetDomainDetail |
acm_certificates | acm:DescribeCertificate, acm:ListCertificates, acm:ListTagsForCertificate (optional) | <region>:acm:describeCertificate | aws:acm:certificate | All attributes of ACM.ListCertificates, acm.DescribeCertificate, and acm.ListTagsForCertificate |
route53_traffic_policy_instances | route53:ListTrafficPolicyInstances, route53:ListTagsForResource (optional) | <region>:route53:listTrafficPolicyInstances | aws:route53:trafficPolicyInstance | All attributes of Route53.ListTrafficPolicyInstances and route53.ListTagsForResource |
route53_hosted_zones | route53:ListHostedZones, route53:GetHostedZone, route53:ListTagsForResource (optional) | <region>:route53:getHostedZone | aws:route53:hostedZone | All attributes of Route53.ListHostedZones, Route53.GetHostedZone, and route53.ListTagsForResource |
route53_traffic_policies | route53:ListTrafficPolicies, route53:GetTrafficPolicy, route53:ListTagsForResource (optional) | <region>:route53:getTrafficPolicy | aws:route53:trafficPolicy | All attributes of Route53.ListTrafficPolicies, Route53.GetTrafficPolicy, and route53.ListTagsForResource |
ecr_repositories | ecr:DescribeRepositories | <region>:ecr:describeRepositories | aws:ecr:repository | All attributes of ECR.DescribeRepositories |
ecr_images | ecr:DescribeRepositories, ecr:DescribeImages | <region>:ecr:describeImages | aws:ecr:image | All attributes of ECR.DescribeRepositories and ECR.DescribeImages |
ecs_container_instances | ecs:ListClusters, ecs:ListContainerInstances, ecs:DescribeContainerInstances | <region>:ecs:describeContainerInstances | aws:ecs:containerInstance | All attributes of ECS.ListClusters, ECS.ListContainerInstances, and ECS.DescribeContainerInstances |
ecs_tasks | ecs:ListClusters, ecs:ListTasks, ecs:DescribeTasks | <region>:ecs:describeTasks | aws:ecs:task | All attributes of ECS.ListClusters, ECS.ListTasks, and ECS.DescribeTasks |
ecs_services | ecs:ListClusters, ecs:ListServices, ecs:DescribeServices | <region>:ecs:describeServices | aws:ecs:service | All attributes of ECS.ListClusters, ECS.ListServices, and ECS.DescribeServices |
ecs_clusters | ecs:ListClusters, ecs:DescribeClusters | <region>:ecs:describeClusters | aws:ecs:cluster | All attributes of ECS.ListClusters and ECS.DescribeClusters |
efs_file_systems | elasticfilesystem:DescribeFileSystems | <region>:efs:describeFileSystems | aws:efs:fileSystem | All attributes of EFS.DescribeFileSystems |
dynamodb_tables | dynamodb:ListTables, dynamodb:DescribeTable | <region>:dynamoDB:describeTable | aws:dynamoDB:table | All attributes of DynamoDB.ListTables and DynamoDB.DescribeTable |
dynamodb_global_tables | dynamodb:ListGlobalTables, dynamodb:DescribeGlobalTable | <region>:dynamoDB:describeGlobalTable | aws:dynamoDB:globalTable | All attributes of DynamoDB.ListGlobalTables and DynamoDB.DescribeGlobalTable |
waf_web_acls | waf:ListWebACLs, waf:GetWebACL | <region>:waf:getWebACL | aws:waf:webACL | All attributes of Waf.ListWebACLs and Waf.GetWebACL |
cloudwatchlogs_log_groups | logs:DescribeLogGroups, logs:ListTagsLogGroup (optional), logs:GetLogGroupFields (optional) | <region>:cloudwatchlogs:describeLogGroups | aws:cloudwatchlogs:logGroup | All attributes of CloudWatchLogs.DescribeLogGroups, CloudWatchLogs.ListTagsLogGroup, and CloudWatchLogs.GetLogGroupFields |
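
The Source values in the table follow the pattern `<region>:<service>:<operation>`. If you process these records outside of DSP, for example in a downstream script, you might want to split that value into its parts. The following Python sketch shows one way to do so. The `SourceParts` type and the `parse_source` function are hypothetical names used for illustration, and the sketch assumes that you already have the source value as a plain string.

```python
from typing import NamedTuple


class SourceParts(NamedTuple):
    """The three colon-separated parts of a Source value."""
    region: str
    service: str
    operation: str


def parse_source(source: str) -> SourceParts:
    """Split '<region>:<service>:<operation>' into its three parts."""
    parts = source.split(":")
    # Reject values that don't have exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected source format: {source!r}")
    return SourceParts(*parts)


if __name__ == "__main__":
    parsed = parse_source("us-west-2:ec2:describeInstances")
    print(parsed.region, parsed.service, parsed.operation)
```

Raising an error on an unexpected format, rather than guessing, keeps records from other sources from being misread as AWS metadata.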
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1