Splunk® Data Stream Processor

Connect to Data Sources and Destinations with DSP



On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor will reach their end of support date. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see the Upgrade the Splunk Data Stream Processor topic.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Create a DSP connection to get metadata from AWS

The Amazon Metadata connector is planned for deprecation. See the Release Notes for more information.

To get metadata from resources and infrastructure in Amazon Web Services (AWS) into a data pipeline in Splunk Data Stream Processor (DSP), you must first create a connection using the Amazon Metadata connector. In the connection settings, provide your Identity and Access Management (IAM) user credentials so that DSP can access your data, and schedule a data collection job to specify how frequently DSP retrieves the data. You can then use the connection in the Amazon Metadata source function to get metadata from AWS into a DSP pipeline.

Prerequisites

Before you can create the Amazon Metadata connection, you must have the following:

  • An IAM user with the necessary permissions for each API that you want to collect data from.
    • See the IAM user permissions section on this page for the complete list of permissions that you need to collect data from all supported AWS APIs.
    • See How AWS metadata is collected for information about the specific permissions required for each supported AWS API.
  • The access key ID and secret access key for the IAM user.

If you don't have an IAM user with the necessary permissions, ask your AWS administrator for assistance.
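
If you have IAM administrator access and need to generate the key pair yourself, you can create one programmatically. The following is a minimal sketch using the AWS SDK for Python (boto3), which is not required by this procedure; the user name dsp-metadata-reader is a hypothetical placeholder.

import boto3

# Create an access key pair for the IAM user that DSP will use.
# "dsp-metadata-reader" is a hypothetical user name; substitute your own.
iam = boto3.client("iam")
response = iam.create_access_key(UserName="dsp-metadata-reader")

# The secret access key is returned only once, at creation time, so save
# both values somewhere safe before discarding this response.
print(response["AccessKey"]["AccessKeyId"])
print(response["AccessKey"]["SecretAccessKey"])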

IAM user permissions

If you want to collect data from all the AWS APIs that the Amazon Metadata connector supports, make sure that your IAM user has the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeKeyPairs",
                "ec2:DescribeReservedInstances",
                "ec2:DescribeSnapshots",
                "ec2:DescribeVolumes",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeImages",
                "ec2:DescribeAddresses",
                "elasticloadbalancing:DescribeLoadBalancers",
                "elasticloadbalancing:DescribeListeners",
                "elasticloadbalancing:DescribeTags",
                "elasticloadbalancing:DescribeTargetHealth",
                "elasticloadbalancing:DescribeTargetGroups",
                "elasticloadbalancing:DescribeInstanceHealth",
                "ec2:DescribeVpcs",
                "ec2:DescribeSubnets",
                "ec2:DescribeNetworkAcls",
                "cloudfront:ListDistributions",
                "rds:DescribeDBInstances",
                "lambda:ListFunctions",
                "s3:ListAllMyBuckets",
                "iam:GetAccountPasswordPolicy",
                "iam:GetAccessKeyLastUsed",
                "iam:ListUsers",
                "iam:ListAccessKeys",
                "eks:DescribeCluster",
                "eks:ListClusters",
                "route53domains:ListDomains",
                "route53domains:GetDomainDetail",
                "route53domains:ListTagsForDomain",
                "acm:DescribeCertificate",
                "acm:ListCertificates",
                "acm:ListTagsForCertificate",
                "route53:ListTrafficPolicyInstances",
                "route53:ListTagsForResource",
                "route53:ListHostedZones",
                "route53:GetHostedZone",
                "route53:ListTagsForResource",
                "route53:ListTrafficPolicies",
                "route53:GetTrafficPolicy",
                "route53:ListTagsForResource",
                "ecr:DescribeRepositories",
                "ecr:DescribeRepositories",
                "ecr:DescribeImages",
                "ecs:ListClusters",
                "ecs:ListContainerInstances",
                "ecs:DescribeContainerInstances",
                "ecs:ListClusters",
                "ecs:ListTasks",
                "ecs:DescribeTasks",
                "ecs:ListClusters",
                "ecs:ListServices",
                "ecs:DescribeServices",
                "ecs:ListClusters",
                "ecs:DescribeClusters",
                "elasticfilesystem:DescribeFileSystems",
                "dynamodb:ListTables",
                "dynamodb:DescribeTable",
                "dynamodb:ListGlobalTables",
                "dynamodb:DescribeGlobalTable",
                "waf:ListWebACLs",
                "waf:GetWebACL",
                "logs:DescribeLogGroups",
                "logs:ListTagsLogGroup",
                "logs:GetLogGroupFields"
            ],
            "Resource": "*"
        }
    ]
}

If you want to collect data from a subset of the supported AWS APIs, you only need to add the permissions for those particular APIs.
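
If you manage IAM programmatically, one way to grant these permissions is to attach the policy above as an inline user policy. The following is a minimal sketch using boto3, not a required step; the user name dsp-metadata-reader and policy name DSPAmazonMetadataRead are hypothetical placeholders, and the Action list is abbreviated.

import json
import boto3

iam = boto3.client("iam")

# Abbreviated policy document; paste in the full Action list from above,
# or only the actions for the APIs that you plan to collect from.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "s3:ListAllMyBuckets",
            ],
            "Resource": "*",
        }
    ],
}

# Attach the policy inline to the IAM user that DSP authenticates as.
iam.put_user_policy(
    UserName="dsp-metadata-reader",
    PolicyName="DSPAmazonMetadataRead",
    PolicyDocument=json.dumps(policy),
)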

Steps

  1. From the Data Stream Processor page, click Data Management and then select the Connections tab.
  2. Click Create New Connection.
  3. Select Amazon Metadata and then click Next.
  4. Complete the following fields:
    • Connection Name: A unique name for your connection.
    • Access Key ID: The access key ID for your IAM user.
    • Secret Access Key: The secret access key for your IAM user.
    • Region API Groups: A list of groups that indicate which combinations of regions and APIs the connector collects data from. For each group that you want to define, click Add Group and select the appropriate values from the following drop-down lists:
      • Regions: A list of regions that you want to collect data from.
      • APIs (Optional): If you don't want to collect data from all the supported APIs, select the specific APIs that you want to collect data from.
    • Scheduled: This parameter is on by default, indicating that jobs run automatically. Toggle this parameter off to stop the scheduled job from running automatically. Jobs that are currently running are not affected.
    • Schedule: The time-based job schedule that determines when the connector runs data collection jobs. Select a predefined value or write a custom CRON schedule. All CRON schedules are based on UTC. For an example, see the sketch after these steps.
    • Workers: The number of workers that you want to use to collect data.

    Any credentials that you upload are transmitted securely over HTTPS, encrypted, and stored in a secrets manager.

  5. Click Save.

    If you're editing a connection that's being used by an active pipeline, you must reactivate that pipeline after making your changes. When you reactivate a pipeline, you must select where you want to resume data ingestion. See Using activation checkpoints to activate your pipeline in the Use the Data Stream Processor manual for more information.
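
As an example of a custom CRON schedule, and assuming the Schedule field accepts standard five-field cron expressions (minute, hour, day of month, month, day of week), a schedule of 0 */6 * * * runs a collection job at minute 0 of every sixth hour, UTC. If you want to preview run times before saving, the following sketch uses the third-party croniter Python package, which is not part of DSP, to print the next few scheduled times:

from datetime import datetime, timezone
from croniter import croniter  # pip install croniter

# "0 */6 * * *" = minute 0 of every 6th hour. DSP evaluates CRON schedules in UTC.
schedule = croniter("0 */6 * * *", datetime.now(timezone.utc))
for _ in range(3):
    print(schedule.get_next(datetime))  # next three collection times, in UTC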

You can now use your connection in an Amazon Metadata source function at the start of your data pipeline to get metadata from AWS. For instructions on how to build a data pipeline, see the Building a pipeline chapter in the Use the Data Stream Processor manual. For information about the source function, see Get data from Amazon Metadata in the Function Reference manual.
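
Before activating a pipeline that uses this connection, you can optionally confirm that the key pair works and that the IAM user holds at least one of the required permissions. The following is a minimal sketch using boto3; the placeholder credentials and the us-east-1 region are hypothetical, and the calls shown are ordinary AWS API calls rather than part of DSP:

import boto3

# Build a session from the same credentials you entered in the connection.
session = boto3.Session(
    aws_access_key_id="AKIA...",      # your access key ID
    aws_secret_access_key="...",      # your secret access key
    region_name="us-east-1",          # one of the regions you collect from
)

# Confirms that the credentials are valid and prints the IAM user's ARN.
print(session.client("sts").get_caller_identity()["Arn"])

# Raises an AccessDenied error if ec2:DescribeInstances is missing.
session.client("ec2").describe_instances(MaxResults=5)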

Last modified on 29 March, 2022

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5

