Splunk® Validated Architectures

Getting Microsoft Azure data into the Splunk platform

Introduction

Splunk offers several ways of getting Microsoft Azure resource data into Splunk Cloud. Each ingestion type and path comes with its own trade-offs in scaling, support, security, performance, management, and cost.

When choosing an ingestion option for a resource in your organization, weigh these trade-offs, because some Azure resource types can be ingested in more than one way.

As a general rule, Data Manager is the recommended method of data ingestion for Splunk Cloud customers for supported data sources where available. Data Manager greatly reduces the time to configure cloud data sources from hours to minutes, while providing a centralized data ingestion management, monitoring and troubleshooting experience.

Throughout this document we will discuss the different architectures to help you choose the best solution for your use case.

Architecture diagram

The following diagram represents the different data ingestion paths for Microsoft Azure.

Microsoft Azure diagram

This diagram shows the Azure resource data sources and their respective ingestion methods.

Note: M365 (Microsoft Office Suite) is not part of the Microsoft Azure Cloud Platform. However, because it is a Microsoft cloud service, it is also represented here.

This document covers the Splunk Add-ons for Azure and Event Streaming via the HTTP Event Collector (HEC). For additional information on the HTTP Event Collector and Splunk Add-ons for Azure, please refer to the following documentation:

Push vs Pull

Data Availability

Microsoft makes Azure data available for ingestion into Splunk in three main ways.

Storage Accounts: A Storage Account is a container that holds Azure storage services such as Blobs, Files, Queues, and Tables, along with the properties that govern them.

Event Hubs: Event Hubs is a fully managed, real-time data ingestion and streaming service from Microsoft that can scale to millions of events per second.

REST APIs: A REST API is the HTTP interface that a web service exposes so that users and applications can interact with it, typically to create, retrieve, or update resources.

Push vs Pull

There are two main ways of ingesting Microsoft Azure data: Push and Pull.

Push

  • Benefits
    • Increased scalability
    • Near real-time compared to pull
    • Utilizes Azure services, requiring less management and maintenance overhead.
  • Limitations
    • Customization requires developer knowledge in one of the supported programming languages (C#, Java, JavaScript, PowerShell, or Python).
    • Cannot be used for all sourcetypes

Pull

  • Benefits
    • Lower configuration overhead through Splunk technical add-ons (TAs)
  • Limitations
    • Runs on a schedule which can introduce delays
    • At higher volumes, scaling will require additional hosts and can become more complicated.
    • Cannot be used for all sourcetypes
    • Using an Inputs Data Manager (IDM) or a standalone ad-hoc search head creates a single point of failure.
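The scheduling delay listed under the Pull limitations can be illustrated with a minimal sketch. This is not the add-ons' actual code: `fake_fetch`, the `SOURCE` list, and the in-memory checkpoint are hypothetical stand-ins for an Azure API call and a persisted checkpoint. The sketch only shows why an event waits until the next scheduled run before it is collected.

```python
from typing import Dict, List, Tuple

Event = Dict[str, object]

# Hypothetical source: each event is stamped with the second it was produced.
SOURCE: List[Event] = [{"time": t, "msg": f"event-{t}"} for t in (5, 20, 65, 110)]


def fake_fetch(since: int, now: int) -> List[Event]:
    """Stand-in for an API call: return events newer than the checkpoint."""
    return [e for e in SOURCE if since < e["time"] <= now]


def pull_once(checkpoint: int, now: int) -> Tuple[List[Event], int]:
    """Run one scheduled pull and advance the checkpoint past what was read."""
    events = fake_fetch(checkpoint, now)
    if events:
        checkpoint = max(int(e["time"]) for e in events)
    return events, checkpoint


def run_schedule(interval: int, end: int) -> Tuple[int, List[int]]:
    """Pull every `interval` seconds and record how long each event waited."""
    checkpoint = 0
    delays: List[int] = []
    for now in range(interval, end + 1, interval):
        events, checkpoint = pull_once(checkpoint, now)
        delays.extend(now - int(e["time"]) for e in events)
    return checkpoint, delays


final_checkpoint, delays = run_schedule(interval=60, end=120)
```

With a 60-second interval, an event produced just after a run waits almost the full interval before it is collected. That wait is the delay a push path avoids.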

Pull method: Use technical add-ons (TAs), namely the Splunk Add-on for Microsoft Cloud Services and the Splunk Add-on for Microsoft Azure, to pull data from a variety of Microsoft cloud services using Event Hubs, Azure Service Management APIs, and Azure Storage APIs.

Microsoft Azure GDI Push diagram

Push method: Use the HTTP Event Collector (HEC) as the entry point to push data from the Azure data sources to Splunk (Cloud or Enterprise) over HTTP/S. A common approach is to use an Azure Function that pushes events to Splunk via HEC. This method gets the closest to real time.
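As a rough illustration of the push path, the sketch below builds (but does not send) an HEC request using only the Python standard library. It assumes the standard HEC event endpoint (`/services/collector/event`) and the `Authorization: Splunk <token>` header. The host name, token, and sourcetype are placeholders, not values from this document.

```python
import json
import urllib.request


def build_hec_request(base_url: str, token: str, event: dict,
                      sourcetype: str) -> urllib.request.Request:
    """Build a POST request carrying one event for the HTTP Event Collector."""
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_hec_request(
    "https://hec.example.com",                # placeholder HEC host
    "00000000-0000-0000-0000-000000000000",   # placeholder HEC token
    {"operationName": "example"},             # illustrative event body
    "azure:example",                          # hypothetical sourcetype
)
# Sending is left out on purpose; urllib.request.urlopen(req) would post it.
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test before pointing it at a real HEC endpoint.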

Microsoft Azure GDI Push diagram

More information about source types using HEC can be found in HTTP Event Collector Source Types Azure.

Azure Functions for Splunk

This repository contains available Azure Functions to integrate Microsoft data with Splunk. Azure Functions can be triggered by certain events like an event arriving on an Event Hub, a blob written to a storage account, a Microsoft Teams call concluding, etc. The functions in this repository respond to these events and route data to Splunk accordingly.

Three Azure Functions

  • Microsoft Teams
  • Azure Event Hubs
  • Azure Storage

Microsoft Azure GDI Push diagram

The following high-level architecture diagram covers both methods.

Microsoft Azure Push Pull diagram

In addition, events arriving on an Azure Event Hub can trigger serverless Azure Functions, which can then further process the raw events in near real time.
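The kind of processing such a function might do can be sketched as follows. The sketch assumes that the Event Hub message body is JSON with a top-level `records` array, which is the envelope Azure Monitor is understood to use when streaming logs. It also assumes that HEC accepts several event objects in one request body. The function name and sourcetype are hypothetical, and this is not the code from the Azure Functions for Splunk repository.

```python
import json


def records_to_hec_batch(message_body: str, sourcetype: str) -> str:
    """Split one Event Hub message into one HEC event object per record."""
    body = json.loads(message_body)
    # Assumption: records live under "records"; otherwise treat the whole
    # message as a single record.
    records = body.get("records", [body]) if isinstance(body, dict) else [body]
    lines = [
        json.dumps({"event": record, "sourcetype": sourcetype}, sort_keys=True)
        for record in records
    ]
    return "\n".join(lines)


message = json.dumps({"records": [{"category": "A"}, {"category": "B"}]})
batch = records_to_hec_batch(message, "azure:example")  # hypothetical sourcetype
```

Splitting the envelope before sending means each Azure record becomes its own searchable event instead of one large JSON blob.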

Hybrid push and pull method

The other option to ingest Microsoft Azure data is by leveraging Data Manager for Splunk Cloud. Data Manager provides a simple, modern and automated experience of getting data in for Splunk Cloud administrators, and reduces the time it takes to configure data collection (from hours/days to minutes). Data Manager automates the initial data pipeline setup and configuration. It also allows Splunk admins to manage the pipeline health from an intuitive UI.

The diagram below highlights the Azure data sources, but Data Manager supports all three main cloud service providers (CSPs): Azure, AWS, and GCP. It is a good way to centralize data onboarding and troubleshooting for cloud data sources in a single pane of glass.

Data Manager high level diagram.

Data Manager is currently available only for Splunk Cloud Platform environments running on AWS. For Microsoft Azure Data Manager integration, there is an auto-generated ARM template that is used for the Azure configuration.

Data Manager source types

For more information on how to configure Data Manager for Microsoft Azure, please follow the guidance on the link below.

Data Manager for Azure Data-Onboarding

Supported data sources

Add-on: Splunk Add-on for Microsoft Cloud Services
Inputs/actions:
  • Azure Storage Blob (Example: NSG Flow Logs)
  • Azure Storage Table
  • Azure Audit
  • Azure Resource
  • Event Hub
  • Metrics
  • Azure KQL Log Analytics
  • Azure Consumption (Billing)
Documentation: Splunkbase; Splunk Documentation; Splunking Azure: NSG Flow Logs (Option 1)

Add-on: Splunk Add-on for Microsoft Azure
Inputs/actions:
  • Azure Active Directory Sign-ins
  • Azure Active Directory Users
  • Azure Active Directory Groups
  • Azure Active Directory Audit
  • Azure Active Directory Risk Detection
  • Azure Active Directory Devices
  • Security Center (now Microsoft Defender for Cloud)
Documentation: Splunkbase

Add-on: Splunk Add-on for Microsoft Office 365
Inputs/actions:

Management Activity

  • Audit Azure Active Directory
  • Audit Exchange
  • Audit Sharepoint
  • Audit General
  • DLP All

Service Health & Communications

  • Service Health
  • Service Update Messages

Mailbox

  • Mailbox Usage Detail
  • Mailbox Usage Mailbox Counts

Microsoft 365

  • Microsoft 365 Groups Activity Detail
  • Microsoft 365 Services User Counts

One Drive

  • One Drive Activity User Counts
  • One Drive Usage Account Detail
  • One Drive Usage Storage

SharePoint

  • SharePoint Site Usage Detail
  • SharePoint Site Usage File Counts

Teams

  • Teams User Activity Counts
  • Teams User Activity User Details

Yammer

  • Yammer Group Activity Detail
  • Yammer Group Activity Group Counts

Audit Logs

  • Audit Logs Sign-ins

Cloud Application Security

  • Policies
  • Alerts
  • Cloud Discovery
  • Entities
  • Files

Message Trace

Documentation: Splunkbase

HTTP Event Collector Source Types

The following source types can be sent to Splunk Cloud through the HTTP Event Collector.

Input/action: Active Directory
Documentation: Sending AD Logs

Input/action: Diagnostic Logs
Documentation: Sending Diagnostic Logs

Input/action: Azure Monitor Metrics
Documentation: Sending Azure Metrics

Input/action: Activity Logs
Sources:
  • AuditLogs
  • SignInLogs
  • NonInteractiveUserSignInLogs
  • ServicePrincipalSignInLogs
  • ManagedIdentitySignInLogs
  • ProvisioningLogs
  • ADFSSignInLogs (Active Directory Federation Services, ADFS)
  • RiskyUsers
  • UserRiskEvents
  • RiskyServicePrincipals
  • ServicePrincipalRiskEvents
  • EnrichedOffice365AuditLogs
  • MicrosoftGraphActivityLogs
  • NetworkAccessTrafficLogs
Documentation: Sending Azure Activity Logs

Input/action: NSG Flow Log Data
Documentation: Azure functions for sending Azure Storage data to a Splunk HTTP Event Collector (preferred method); Splunking Azure: NSG Flow Logs (Option 2); Sending NSG Flow Logs

Application Insights
Network Watcher


Helpful Links


Note: Microsoft Azure uses diagnostics settings to define data export and destination rules. Each resource to be monitored must have a diagnostic setting. Diagnostic settings can be defined using the Azure portal, PowerShell, Azure CLI, diagnostics settings Resource Manager templates, REST API, or an Azure Policy.
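To make the shape of a diagnostic setting more concrete, the sketch below assembles a request body that routes selected log categories to an Event Hub. The property names (`eventHubAuthorizationRuleId`, `eventHubName`, `logs`) reflect the Azure Monitor diagnostic settings schema as commonly documented, but treat them as assumptions and verify them against the current Azure API reference. The resource ID, hub name, and function name are placeholders.

```python
from typing import Dict, List


def diagnostic_setting_body(event_hub_rule_id: str, event_hub_name: str,
                            categories: List[str]) -> Dict[str, object]:
    """Assemble a diagnostic setting body that exports logs to an Event Hub."""
    return {
        "properties": {
            # Assumed property names; check the Azure API reference.
            "eventHubAuthorizationRuleId": event_hub_rule_id,
            "eventHubName": event_hub_name,
            "logs": [{"category": c, "enabled": True} for c in categories],
        }
    }


body = diagnostic_setting_body(
    "<event-hub-authorization-rule-resource-id>",  # placeholder resource ID
    "splunk-logs",                                 # hypothetical Event Hub name
    ["AuditLogs", "SignInLogs"],                   # categories vary by resource
)
```

The same body could be submitted through any of the mechanisms listed in the note above. Which log categories are valid depends on the resource type being monitored.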

Refer to the detailed diagram below for source type data flow.

Event Hubs

Microsoft Event Hubs serve as a data streaming platform and underpin several of the ingestion methods described in this document. When using Event Hubs, use multiple partitions to spread peak event volume across them.
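The effect of partitioning can be sketched with a simple hash-based assignment. This is an illustration only and is not the hashing that Event Hubs performs internally. The `pick_partition` helper, the key format, and the partition count of 4 are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def pick_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index deterministically (illustrative hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Spread 1000 hypothetical resource keys over 4 partitions.
counts = Counter(pick_partition(f"resource-{i}", 4) for i in range(1000))
```

Because the same key always maps to the same partition, events for one resource stay in order, while different resources spread across partitions and share the peak load.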

Security

Securing data ingestion into Splunk Cloud involves two entities:

  • Security of Microsoft Azure
  • Security of Splunk Cloud

Both Microsoft Azure and Splunk Cloud support SAML single sign-on (SSO), LDAP, and role-based access control (RBAC), and a Zero Trust approach is recommended for both. Do not treat a request as trusted simply because it originates behind a firewall. Instead, grant each user only the access needed to perform their job, following the principle of least privilege.

Security starts with Zero Trust and with policies that put it into practice.

Splunk Cloud's role-based access control (RBAC) includes the six (6) roles in the table below, which help define user capabilities in a secure way.

Role                        Capabilities
Apps                        Manage apps, also has some admin capabilities
Power                       Edit all shared objects and alerts, tag events
User                        Create, edit, and run searches. Can also edit their own preferences
Can_delete                  Can delete by keyword
Sc_admin (Cloud-specific)   Create users and roles
Tokens_auth                 Configure token-based authorization

Splunk Cloud offers two types of encryption:

  • Encryption in transit uses industry standard SSL/TLS 1.2+ encryption. This is used by forwarders and user sessions.
  • Encryption at rest uses AES 256-bit advanced encryption. This service is available as a premium service enhancement.
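On the client side, the TLS 1.2+ requirement for encryption in transit can be enforced explicitly. The sketch below uses Python's standard `ssl` module to build a context that refuses anything older than TLS 1.2. The function name is hypothetical, and how the context is used (for example, passed to an HTTPS client that posts to HEC) is left to the caller.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Create a client TLS context that rejects protocol versions below 1.2."""
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_context()
```

Starting from `ssl.create_default_context()` keeps certificate and hostname verification enabled, so the sketch only tightens the protocol floor and does not loosen any other check.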

Splunk Cloud Platform uses AWS KMS (Key Management Service), a fully managed service that helps create and manage encryption keys. Splunk is responsible for the overall management of all the keys, including their creation, rotation, and revocation. Splunk Cloud Platform also offers Enterprise Managed Encryption Keys (EMEK) as an option for encryption at rest, which lets you bring your own primary encryption key.

Splunk Cloud Platform supports compliance with the following regulations and standards:

  • SOC 2 Type 2
  • ISO 27001
  • PCI
  • HIPAA
  • FedRAMP

You can read more about how Splunk products support compliance regulations here.

Last modified on 28 March, 2024

This documentation applies to the following versions of Splunk® Validated Architectures: current

