Splunk® Supported Add-ons

Splunk Add-on for Microsoft Office 365


Troubleshoot the Splunk Add-on for Microsoft Office 365

General troubleshooting

For troubleshooting tips that you can apply to all add-ons, see Troubleshoot add-ons in Splunk Add-ons. For additional resources, see Support and resource links for add-ons in Splunk Add-ons.

Cannot ingest data after configuring a new application and tenant

The Splunk Add-on for Microsoft Office 365 requires Application permission to read the service health, activity data, and DLP policy events. Make sure these permissions are selected, saved and then granted within the Office 365 Management Activity API configuration on Azure Active Directory.

  1. Navigate to the Enable Access pane in the Microsoft Azure Active Directory application configuration UI.
  2. Set the Application permissions.
    • Read service health information for your organization
    • Read activity data for your organization
    • (Optional) Read DLP policy events including detected sensitive data

      Accessing DLP policy events requires an additional Microsoft Azure Active Directory subscription. If you are unable to ingest DLP policy events, make sure you have the correct Microsoft Azure Active Directory subscription. Refer to the Microsoft Azure Active Directory documentation for more information.

  3. Click Save after you change permissions.
  4. Click Grant permissions to finish applying the permission changes.

Cannot ingest Message Trace data after configuring a new application and tenant

HTTP Request error: 401 Client Error

This error can indicate a missing permission. The Splunk Add-on for Microsoft Office 365 requires the ReportingWebService.Read.All permission to collect Message Trace data. Verify that this permission is selected, saved, and then granted within the Office 365 Management Activity API configuration on Azure Active Directory.


Certificate verify failed (_ssl.c:741) error message

If you create a new input, you might receive the following error message:

certificate verify failed (_ssl.c:741)

Perform the following steps to resolve the error:

  1. Navigate to $SPLUNK_HOME/etc/auth/cacert.pem and open the cacert.pem file with a text editor.
  2. Copy the text from your deployment's proxy server certificate, and paste it into the cacert.pem file.
  3. Save your changes.
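
The copy-and-paste steps above can also be scripted. The following is a minimal Python sketch, not part of the add-on: the function name append_certificate and the .bak backup file are assumptions made for illustration. It appends a PEM certificate to a CA bundle only if the certificate is not already present, and keeps a copy of the original bundle.

```python
from pathlib import Path


def append_certificate(bundle_path, cert_path):
    """Append the PEM certificate in cert_path to bundle_path if it is missing.

    Returns True if the bundle was modified, False if the certificate
    was already present. (Hypothetical helper, for illustration only.)
    """
    bundle = Path(bundle_path)
    cert_text = Path(cert_path).read_text().strip()
    if "-----BEGIN CERTIFICATE-----" not in cert_text:
        raise ValueError("cert_path does not look like a PEM certificate")

    existing = bundle.read_text()
    if cert_text in existing:
        return False  # Already in the bundle; nothing to do.

    # Keep a backup of the original bundle before changing it.
    Path(str(bundle) + ".bak").write_text(existing)

    # Make sure the new certificate starts on its own line.
    separator = "" if (not existing or existing.endswith("\n")) else "\n"
    bundle.write_text(existing + separator + cert_text + "\n")
    return True
```

You would call it with the path to cacert.pem under $SPLUNK_HOME/etc/auth and the path to a file that holds your proxy server certificate in PEM format.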


Data collection stops working - HTTP errors

The Client Secrets in your Microsoft Azure deployment can rotate on a predefined schedule, according to your organization's security requirements. If the secret is not updated in the Splunk Add-on for Microsoft Office 365, data collection will stop. You may see HTTP Error 401 - Unauthorized or HTTP Error 500 - Internal Server Error in the logs.

  1. Navigate to the Splunk Web home screen.
  2. Click on Splunk Add-on for Microsoft Office 365 in the left navigation banner.
  3. Click on the Tenant tab.
  4. Select the Tenant that needs an updated Client Secret and click Edit.
  5. Select Change and update the Client Secret.
  6. Click Update to save the changes.

Audit events are delayed or missing

As the number of events in your deployment increases, the Splunk Add-on for Microsoft Office 365 might not be able to return all events in one query before the next query executes, so events from the previous query can be delayed or even missed. One possible root cause is the number of threads available to collect the necessary data sets. If events are being queued, increase the number of threads in increments of four until all events are returned in one query.

  1. Navigate to $SPLUNK_HOME/etc/apps/splunk_ta_o365/local, and create an inputs.conf file, if it does not already exist.
  2. Add the following stanza to the $SPLUNK_HOME/etc/apps/splunk_ta_o365/local/inputs.conf file.
    [splunk_ta_o365_management_activity]
    interval = 300
    disabled = 0
    sourcetype = o365:management:activity
    number_of_threads = 4
    
  3. Increase the number of threads in increments of 4. The maximum number of threads is 64.
  4. Restart Splunk.
  5. Test to see if all events are being returned:

    index=_internal sourcetype="splunk:ta:o365:log" message="Ingesting content success." | eval content_time = strptime(content_id, "%Y%m%d%H%M%S") | chart count by content_time span=600

    You can add a filter on the data_input field to narrow down the search for a particular data input:

    index=_internal sourcetype="splunk:ta:o365:log" message="Ingesting content success." data_input=my_test_input | eval content_time = strptime(content_id, "%Y%m%d%H%M%S") | chart count by content_time span=600

    Change my_test_input to the data input name you would like to check.
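
The tuning loop in step 3 can be sketched as follows. This is an illustrative Python helper, not part of the add-on: the names next_thread_count and tuning_schedule are hypothetical, and only the increment of 4 and the maximum of 64 come from the steps above.

```python
MAX_THREADS = 64  # Maximum number of threads stated in the steps above.
STEP = 4          # Recommended increment.


def next_thread_count(current):
    """Return the next number_of_threads value to try, capped at MAX_THREADS."""
    if current < 1:
        raise ValueError("number_of_threads must be positive")
    return min(current + STEP, MAX_THREADS)


def tuning_schedule(start=4):
    """List every value to try, in order, from start up to MAX_THREADS."""
    values = [start]
    while values[-1] < MAX_THREADS:
        values.append(next_thread_count(values[-1]))
    return values
```

After each change to number_of_threads, restart Splunk and rerun the test search before moving to the next value in the schedule.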

You could also deploy the Splunk Add-on for Microsoft Office 365 as a tuned standalone add-on to capture Microsoft Azure Active Directory audit events separately from Service Events and Service Messages.

Data ingestion stops on Debian or Ubuntu Linux Server

Splunk Enterprise launches modular inputs under a shell process on Debian or Ubuntu Linux Server and this can block new modular input instances. If you are running the add-on with Debian or Ubuntu Linux Server, set the option start_by_shell = false in each stanza of inputs.conf.

  1. Navigate to $SPLUNK_HOME/etc/apps/splunk_ta_o365/local, and create an inputs.conf file, if it does not already exist.
  2. Add the following stanzas to the $SPLUNK_HOME/etc/apps/splunk_ta_o365/local/inputs.conf file:
    [splunk_ta_o365_management_activity]
    interval = 300
    disabled = 0
    sourcetype = o365:management:activity
    number_of_threads = 4
    start_by_shell = false
    
    [splunk_ta_o365_service_status]
    interval = 1800
    disabled = 0
    sourcetype = o365:service:status
    start_by_shell = false
    
    [splunk_ta_o365_service_message]
    interval = 600
    disabled = 0
    sourcetype = o365:service:message
    start_by_shell = false
    
  3. Restart Splunk.
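
If inputs.conf already contains many stanzas, the setting can be applied to all of them with a short script. The following Python sketch is an assumption-laden illustration, not a Splunk tool: the function name set_start_by_shell is hypothetical, and it relies on the standard configparser module, which handles simple key = value stanzas like the ones above but drops comments and might not handle every Splunk .conf construct. Back up the file before you run anything like this.

```python
import configparser


def set_start_by_shell(conf_path):
    """Set start_by_shell = false in every stanza of the given inputs.conf.

    Returns the names of the stanzas that were changed.
    (Hypothetical helper; comments in the file are not preserved.)
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # Preserve the case of setting names.
    parser.read(conf_path)

    updated = []
    for stanza in parser.sections():
        if parser.get(stanza, "start_by_shell", fallback=None) != "false":
            parser.set(stanza, "start_by_shell", "false")
            updated.append(stanza)

    # Rewrite the file with the updated stanzas.
    with open(conf_path, "w") as handle:
        parser.write(handle)
    return updated
```

As with the manual steps, restart Splunk after the file changes.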

Data collection hangs while calling the Office 365 management API

While calling the Office 365 management API, you might receive the following error message in your logs:

ReadTimeout: HTTPSConnectionPool(host='manage.office.com', port=443): Read timed out. (read timeout=60)

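This error indicates that the modular input hung during data collection. To resolve it, configure the request_timeout parameter in inputs.conf to allow more time than the 60-second read timeout shown in the error.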

Data ingestion stops for management activity

Data collection for the management activity input might stop with the following message in your error logs:

message="failed to get error code" body="{\"Message\":\"Authorization has been denied for this request.\"}"

To resolve this issue, configure the token_refresh_window parameter in inputs.conf. The value is the number of seconds before the token's expiration time at which the token is refreshed. The valid range is 400 to 3600 seconds. For more information, see the inputs.conf.spec file in the README directory for this add-on.
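
To make the meaning of the refresh window concrete, the following Python sketch shows the general idea of refreshing a token a fixed number of seconds before it expires. It illustrates the concept only and is not the add-on's implementation: the function names are hypothetical, and only the 400 to 3600 second range comes from the text above.

```python
MIN_WINDOW = 400   # Seconds; lower bound stated above.
MAX_WINDOW = 3600  # Seconds; upper bound stated above.


def validate_refresh_window(window):
    """Return window unchanged if it is in the documented range, else raise."""
    if not MIN_WINDOW <= window <= MAX_WINDOW:
        raise ValueError(
            f"token_refresh_window must be between {MIN_WINDOW} and {MAX_WINDOW}"
        )
    return window


def should_refresh(now, expires_at, window):
    """Return True once now falls within window seconds of token expiry.

    now and expires_at are timestamps in seconds (for example, epoch time).
    """
    validate_refresh_window(window)
    return now >= expires_at - window
```

With a window of 600 and a token that expires at time 5000, should_refresh returns False at time 1000 and True from time 4400 onward. A larger window refreshes the token earlier, which leaves more margin for long-running requests.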

Data duplication issues when fetching multiple content URLs

The Microsoft Office 365 Management Activity API (the source of o365:management:activity data) is not like typical event services and does not forward actual events. The API is a front end to an at-least-once delivery message bus, and returns lists of URLs pointing to data, not unique events. With each call to this API, API clients (like the Splunk software) retrieve new events by time. However, the at-least-once nature of the API means that clients are sometimes instructed to process the same set of data more than once.

This API design from Microsoft helps ensure that neither internal nor external failures in processing result in lost events. A consequence of this assurance is the occasional duplication of events whenever there is any doubt about the delivery of a message. This API design is highly scalable, as it does not require consistency or checkpoints from the O365 API.

Modular inputs have the ability to manage checkpoints such as counters and last queried time. However, for the sake of performance, modular inputs in this add-on are stateless and do not retain data from previous calls, so they cannot determine whether the current or a prior thread has been given the same content by value or by key/identifier. This design is intentional, in order to minimize the overhead of high volume interfaces.

Typically, these duplicate events have minimal impact on most use cases, but they can affect some aggregate (threshold) or anomaly detection use cases. If duplicate events impact your use case significantly, the best practice is one of the following:

  • Raise a request with Microsoft for any possible enhancements to the API design.
  • Build a message-format compatible webhook, using Azure Functions, other serverless technologies, or any of the available API gateway solutions, that checks for duplicate events by maintaining a history of the messages sent by the API over a period of time. This solution can also send data to Splunk through the HTTP Event Collector (HEC).
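
The idea of maintaining a history of messages over a period of time can be sketched as a small time-windowed filter. The following Python example is an illustration under stated assumptions, not a supported solution: the class and function names are hypothetical, the "Id" field name and the 24-hour default retention are assumptions, and the history is held in memory, whereas a serverless function would typically need an external store.

```python
import time
from collections import OrderedDict


class RecentIdFilter:
    """Remember event IDs for ttl_seconds and flag repeats within that window."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self.ttl = ttl_seconds  # Retention period (assumed default: 24 hours).
        self.clock = clock      # Injectable clock, which makes testing easier.
        self._seen = OrderedDict()  # Event ID -> time first seen, oldest first.

    def _expire(self, now):
        """Drop IDs that were first seen ttl seconds ago or earlier."""
        while self._seen:
            _, first_seen = next(iter(self._seen.items()))
            if now - first_seen < self.ttl:
                break
            self._seen.popitem(last=False)

    def is_new(self, event_id):
        """Return True the first time an ID is seen within the window."""
        now = self.clock()
        self._expire(now)
        if event_id in self._seen:
            return False
        self._seen[event_id] = now
        return True


def drop_duplicates(events, id_filter, id_field="Id"):
    """Return only the events whose ID has not been seen recently."""
    return [event for event in events if id_filter.is_new(event[id_field])]
```

A webhook built this way would pass each batch of events through drop_duplicates before forwarding the remainder to HEC. The retention period trades memory for coverage: a longer window catches duplicates that arrive further apart but keeps more IDs in the history.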

Service health information is not getting ingested

If your service health information is not getting ingested, check to see if you are using the ServiceHealth.Read.All API from Office 365 Management APIs, or the ServiceHealth.Read.All API in Microsoft Graph.

The ServiceHealth.Read.All API from Office 365 Management APIs was retired by Microsoft on December 17, 2021. Use the ServiceHealth.Read.All API in Microsoft Graph instead.

If upgrading to version 3.0.0 or later, disable ServiceHealth.Read.All in Office 365 Management APIs, and enable ServiceHealth.Read.All in Microsoft Graph.

Input page not showing any configured inputs with "Unexpected Error" shown on UI

If the Inputs page shows "Unexpected Error" instead of your configured inputs, check for the following causes:

  • An input was configured without a value in the "Content Type" field.
    1. Go to "Settings" -> "Data Inputs".
    2. Find the configured Office 365 add-on inputs that do not have a "Content Type" value.
    3. Delete those inputs from the same "Data Inputs" UI.
    4. Reconfigure your inputs using the Configure Inputs for the Splunk Add-on for Microsoft Office 365 topic in this manual.
  • An input was configured from the "Settings" -> "Data Inputs" UI. Inputs created this way skip the validations that the add-on performs, which results in the above error.
    1. Delete those inputs from the same "Data Inputs" UI.
    2. Reconfigure your inputs using the Configure Inputs for the Splunk Add-on for Microsoft Office 365 topic in this manual.

Data ingestion stops for cloud application security input

Data collection for the cloud application security input might stop with the following message in your error logs:

message="Error retrieving Cloud Application Security messages." exception=401:{"detail":"Invalid token header. Token string should not contain spaces."}

One possible cause of this error is that the upgrade steps for migrating to version 4.1.0 or higher were not followed correctly. For more information, see the upgrade topic in this manual.

Duplicate events for Cloud App Security and Management Activity

Problem

You encounter duplicate events for Cloud App Security and Management Activity data ingestion.

Possible solution

After you upgrade the Splunk Add-on for Microsoft Office 365 to version 4.1.0, a change in checkpoint logic might cause your Splunk platform deployment to receive duplicate events for a maximum of 7 days. Duplicate events stop being ingested after 7 days. During this period, you might observe a rise in your deployment's memory and CPU usage.

Last modified on 03 November, 2022

This documentation applies to the following versions of Splunk® Supported Add-ons: released