Splunk® Data Stream Processor

Connect to Data Sources and Destinations with DSP

On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see the Upgrade the Splunk Data Stream Processor topic.

Connecting AWS metadata sources to your DSP pipeline

The Amazon Metadata connector is planned for deprecation. See the Release Notes for more information.

When creating a data pipeline in Splunk Data Stream Processor, you can connect to various Amazon Web Services (AWS) APIs and use them as data sources. These APIs provide access to events that contain metadata from the resources and infrastructure in AWS. You can get AWS metadata into a pipeline, transform the metadata as needed, and then send the transformed metadata out from the pipeline to a destination of your choosing.

To connect to AWS APIs as data sources, you must complete the following tasks:

  1. Create a connection that allows DSP to access your AWS metadata. See Create a DSP connection to get metadata from AWS.
  2. Create a pipeline that starts with the Amazon Metadata source function. See the Building a pipeline chapter in the Use the Data Stream Processor manual for instructions on how to build a data pipeline.
  3. Configure the Amazon Metadata source function to use your Amazon metadata connection. See Get data from Amazon Metadata in the Function Reference manual.

When you activate the pipeline, the source function starts collecting events from the supported AWS APIs. Each event is received into the pipeline as a record.

If your data fails to get into DSP, check the connection settings to make sure you have the correct access key ID and secret access key for your Identity and Access Management (IAM) user, as well as the appropriate regions and APIs for your AWS resources. DSP does not check whether the credentials that you enter are valid.
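Because the connection is saved without validation, it can help to sanity-check the values before entering them. The following is a minimal sketch, not part of DSP: the `ConnectionSettings` class, its field names, and the `find_problems` helper are all hypothetical, and `KNOWN_APIS` holds only a few of the API names from the table in this topic.

```python
from dataclasses import dataclass, field
from typing import List

# A few API names copied from the table in this topic; extend as needed.
KNOWN_APIS = {"ec2_instances", "ec2_volumes", "s3_buckets", "iam_users", "lambda_functions"}


@dataclass
class ConnectionSettings:
    """Hypothetical container; these field names are not DSP's own."""
    access_key_id: str
    secret_access_key: str
    regions: List[str] = field(default_factory=list)
    apis: List[str] = field(default_factory=list)


def find_problems(settings: ConnectionSettings) -> List[str]:
    """Return readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    if not settings.access_key_id.strip():
        problems.append("access key ID is empty")
    if not settings.secret_access_key.strip():
        problems.append("secret access key is empty")
    if not settings.regions:
        problems.append("no AWS regions selected")
    if not settings.apis:
        problems.append("no AWS APIs selected")
    # Catch misspelled API names, such as a missing trailing "s".
    unknown = sorted(set(settings.apis) - KNOWN_APIS)
    if unknown:
        problems.append("unrecognized API names: " + ", ".join(unknown))
    return problems


# Example: an empty key ID and a misspelled API name are both reported.
draft = ConnectionSettings("", "example-secret", ["us-west-2"], ["ec2_instances", "ec2_instance"])
for problem in find_problems(draft):
    print(problem)
```

This sketch checks only that values are present and that API names are recognized. Confirming that the keys are actually valid requires a call to AWS itself, which the sketch does not attempt.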

How AWS metadata is collected

The source function collects data according to the job schedule that you specified in the connection settings. See Scheduled data collection jobs for more information, including a list of the limitations that apply to all scheduled data collection jobs.

The Amazon Metadata connector uses AWS regions and AWS APIs to collect resource status and infrastructure information. The data that is included in each record varies depending on the specific AWS API that the event comes from.
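Because the record contents vary by API, a consumer of these records often needs to tell which API produced a given record. The Source column in the table in this topic follows the pattern `<region>:<service>:<operation>`. The following minimal sketch splits such a value outside of DSP; the `parse_source` helper and the `us-west-2` sample value are illustrative assumptions, not connector output.

```python
from typing import NamedTuple


class SourceParts(NamedTuple):
    region: str
    service: str
    operation: str


def parse_source(source: str) -> SourceParts:
    """Split a '<region>:<service>:<operation>' source string into its parts."""
    parts = source.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected source format: {source!r}")
    return SourceParts(*parts)


# Assumed sample value; the region shown here is only an example.
parts = parse_source("us-west-2:ec2:describeInstances")
print(parts.region, parts.service, parts.operation)
```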

The connector supports the AWS APIs described in the following table. This table includes details about the AWS permissions required to get data from each API, and describes the data that is included in the records from the given API.

| AWS API | AWS Permission | Source | Source type | Body |
| --- | --- | --- | --- | --- |
| ec2_instances | ec2:DescribeInstances | <region>:ec2:describeInstances | aws:ec2:instance | All attributes of ec2.Instance and OwnerID of ec2.Reservation |
| ec2_key_pairs | ec2:DescribeKeyPairs | <region>:ec2:describeKeyPairs | aws:ec2:keyPair | All attributes of ec2.KeyPairInfo |
| ec2_reserved_instances | ec2:DescribeReservedInstances | <region>:ec2:describeReservedInstances | aws:ec2:reservedInstances | All attributes of ec2.ReservedInstances |
| ebs_snapshots | ec2:DescribeSnapshots | <region>:ec2:describeSnapshots | aws:ec2:snapshot | All attributes of ec2.Snapshot |
| ec2_volumes | ec2:DescribeVolumes | <region>:ec2:describeVolumes | aws:ec2:volume | All attributes of ec2.Volume |
| ec2_security_groups | ec2:DescribeSecurityGroups | <region>:ec2:describeSecurityGroups | aws:ec2:securityGroup | All attributes of ec2.SecurityGroup |
| ec2_images | ec2:DescribeImages | <region>:ec2:describeImages | aws:ec2:image | All attributes of ec2.Image |
| ec2_addresses | ec2:DescribeAddresses | <region>:ec2:describeAddresses | aws:ec2:address | All attributes of ec2.Address |
| classic_load_balancers | elasticloadbalancing:DescribeLoadBalancers | <region>:elb:describeLoadBalancers | aws:elb:loadBalancer | All attributes of elb.LoadBalancerDescription; Tags: All attributes of elb.Tags; Instances: All attributes of elb.InstanceState |
| application_load_balancers | elasticloadbalancing:DescribeLoadBalancers | <region>:elbv2:describeLoadBalancers | aws:elbv2:loadBalancer | All attributes of elbv2.LoadBalancer; Listeners: All attributes of elbv2.Listeners; Tags: All attributes of elbv2.Tags; TargetGroups: All attributes of elbv2.TargetGroup and elbv2.TargetHealth |
| vpcs | ec2:DescribeVpcs | <region>:ec2:describeVpcs | aws:ec2:vpc | All attributes of ec2.Vpc |
| vpc_subnets | ec2:DescribeSubnets | <region>:ec2:describeSubnets | aws:ec2:subnet | All attributes of ec2.Subnet |
| vpc_network_acls | ec2:DescribeNetworkAcls | <region>:ec2:describeNetworkAcls | aws:ec2:networkAcl | All attributes of ec2.NetworkAcl |
| cloudfront_distributions | cloudfront:ListDistributions | <region>:cloudfront:listDistributions | aws:cloudfront:distribution | All attributes of cloudfront.DistributionSummary |
| rds_instances | rds:DescribeDBInstances | <region>:rds:describeDBInstances | aws:rds:dbInstance | All attributes of rds.DBInstance |
| lambda_functions | lambda:ListFunctions | <region>:lambda:listFunctions | aws:lambda:function | All attributes of lambda.FunctionConfiguration |
| s3_buckets | s3:ListAllMyBuckets | <region>:s3:listBuckets | aws:s3:bucket | All attributes of s3.Bucket |
| iam_users | iam:ListUsers; iam:GetAccessKeyLastUsed; iam:GetAccountPasswordPolicy | <region>:iam:listUsers | aws:iam:user | All attributes of iam.User; AccessKey: All attributes of iam.AccessKeyMetadata; AccessKey.AccessKeyLastUsed: All attributes of iam.AccessKeyLastUsed; PasswordPolicy: All attributes of iam.PasswordPolicy |
| eks_clusters | eks:DescribeCluster | <region>:eks:describeCluster | aws:eks:cluster | All attributes of EKS.ListClusters and EKS.DescribeCluster |
| route53_domains | route53domains:ListDomains; route53domains:ListTagsForDomain (optional) | <region>:route53Domains:getDomainDetail | aws:route53Domains:domain | All attributes of Route53Domain.ListDomains and Route53Domain.GetDomainDetail |
| acm_certificates | acm:DescribeCertificate; acm:ListTagsForCertificate (optional) | <region>:acm:describeCertificate | aws:acm:certificate | All attributes of ACM.ListCertificates, acm.DescribeCertificate, and acm.ListTagsForCertificate |
| route53_traffic_policy_instances | route53:ListTrafficPolicyInstances; route53:ListTagsForResource (optional) | <region>:route53:listTrafficPolicyInstances | aws:route53:trafficPolicyInstance | All attributes of Route53.ListTrafficPolicyInstances and route53.ListTagsForResource |
| route53_hosted_zones | route53:ListHostedZones; route53:ListTagsForResource (optional) | <region>:route53:getHostedZone | aws:route53:hostedZone | All attributes of Route53.ListHostedZones, Route53.GetHostedZone, and route53.ListTagsForResource |
| route53_traffic_policies | route53:ListTrafficPolicies; route53:ListTagsForResource (optional) | <region>:route53:getTrafficPolicy | aws:route53:trafficPolicy | All attributes of Route53.ListTrafficPolicies, Route53.GetTrafficPolicy, and route53.ListTagsForResource |
| ecr_repositories | ecr:DescribeRepositories | <region>:ecr:describeRepositories | aws:ecr:repository | All attributes of ECR.DescribeRepositories |
| ecr_images | ecr:DescribeRepositories | <region>:ecr:describeImages | aws:ecr:image | All attributes of ECR.DescribeRepositories and ECR.DescribeImages |
| ecs_container_instances | ecs:ListClusters | <region>:ecs:describeContainerInstances | aws:ecs:containerInstance | All attributes of ECS.ListClusters, ECS.ListContainerInstances, and ECS.DescribeContainerInstances |
| ecs_tasks | ecs:ListClusters | <region>:ecs:describeTasks | aws:ecs:task | All attributes of ECS.ListClusters, ECS.ListTasks, and ECS.DescribeTasks |
| ecs_services | ecs:ListClusters | <region>:ecs:describeServices | aws:ecs:service | All attributes of ECS.ListClusters, ECS.ListServices, and ECS.DescribeServices |
| ecs_clusters | ecs:ListClusters | <region>:ecs:describeClusters | aws:ecs:cluster | All attributes of ECS.ListClusters and ECS.DescribeClusters |
| efs_file_systems | elasticfilesystem:DescribeFileSystems | <region>:efs:describeFileSystems | aws:efs:fileSystem | All attributes of EFS.DescribeFileSystems |
| dynamodb_tables | dynamodb:ListTables | <region>:dynamoDB:describeTable | aws:dynamoDB:table | All attributes of DynamoDB.ListTables and DynamoDB.DescribeTable |
| dynamodb_global_tables | dynamodb:ListGlobalTables | <region>:dynamoDB:describeGlobalTable | aws:dynamoDB:globalTable | All attributes of DynamoDB.ListGlobalTables and DynamoDB.DescribeGlobalTable |
| waf_web_acls | waf:ListWebACLs | <region>:waf:getWebACL | aws:waf:webACL | All attributes of Waf.ListWebACLs and Waf.GetWebACL |
| cloudwatchlogs_log_groups | logs:DescribeLogGroups; logs:ListTagsLogGroup (optional); logs:GetLogGroupFields (optional) | <region>:cloudwatchlogs:describeLogGroups | aws:cloudwatchlogs:logGroup | All attributes of CloudWatchLogs.DescribeLogGroups, CloudWatchLogs.ListTagsLogGroup, and CloudWatchLogs.GetLogGroupFields |
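
The permissions in the table above can be collected into an IAM policy for the user whose keys the connection uses. The following is a minimal sketch under stated assumptions: the `PERMISSIONS` mapping copies only a few rows from the table, the `build_policy` helper is hypothetical, and `"Resource": "*"` is an assumption that you might want to narrow. The sketch includes only the permissions that the table lists.

```python
import json

# A few rows copied from the table above; add the rows for the APIs you collect.
PERMISSIONS = {
    "ec2_instances": ["ec2:DescribeInstances"],
    "rds_instances": ["rds:DescribeDBInstances"],
    "lambda_functions": ["lambda:ListFunctions"],
    "s3_buckets": ["s3:ListAllMyBuckets"],
    "iam_users": [
        "iam:ListUsers",
        "iam:GetAccessKeyLastUsed",
        "iam:GetAccountPasswordPolicy",
    ],
}


def build_policy(apis):
    """Build an IAM policy document that allows the actions for the given APIs."""
    unknown = [api for api in apis if api not in PERMISSIONS]
    if unknown:
        raise ValueError("no permissions recorded for: " + ", ".join(unknown))
    # Deduplicate and sort so that the output is stable.
    actions = sorted({action for api in apis for action in PERMISSIONS[api]})
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Resource "*" is an assumption made for this sketch.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


print(json.dumps(build_policy(["ec2_instances", "iam_users"]), indent=2))
```

Generating the policy from the same list of APIs that you select in the connection keeps the two in step, so a newly selected API is less likely to fail for lack of a permission.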
Last modified on 29 March, 2022

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5, 1.3.0, 1.3.1
