Troubleshoot the Splunk Add-on for Microsoft Office 365
General troubleshooting
For troubleshooting tips that you can apply to all add-ons, see Troubleshoot add-ons in Splunk Add-ons. For additional resources, see Support and resource links for add-ons in Splunk Add-ons.
Cannot ingest data after configuring a new application and tenant
The Splunk Add-on for Microsoft Office 365 requires Application permission to read the service health, activity data, and DLP policy events. Make sure these permissions are selected, saved and then granted within the Office 365 Management Activity API configuration on Azure Active Directory.
- Navigate to the Enable Access pane in the Microsoft Azure Active Directory application configuration UI.
- Set the following Application permissions:
- Read service health information for your organization
- Read activity data for your organization
- (Optional) Read DLP policy events including detected sensitive data
Accessing DLP policy events requires an additional Microsoft Azure Active Directory subscription. If you are unable to ingest DLP policy events, make sure you have the correct Microsoft Azure Active Directory subscription. Refer to the Microsoft Azure Active Directory documentation for more information.
- Click Save after you change permissions.
- Click Grant permissions to finish applying the permission changes.
Cannot ingest Message Trace data after configuring a new application and tenant
HTTP Request error: 401 Client Error
To collect Message Trace data, the Splunk Add-on for Microsoft Office 365 requires the ReportingWebService.Read.All permission. Verify that this permission is selected, saved, and then granted within the Office 365 Management Activity API configuration on Azure Active Directory.
Certificate verify failed (_ssl.c:741) error message
If you create a new input, you might receive the following error message:
certificate verify failed (_ssl.c:741)
Perform the following steps to resolve the error:
- Navigate to $SPLUNK_HOME/etc/auth/cacert.pem and open the cacert.pem file with a text editor.
- Copy the text from your deployment's proxy server certificate, and paste it into the cacert.pem file.
- Save your changes.
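The copy-and-paste steps above can also be scripted. The following is a minimal sketch, not an official tool: the function name append_certificate is hypothetical, it assumes the proxy certificate is available as a PEM file, and it skips the append if the same certificate text is already present. Back up cacert.pem before modifying it.

```python
from pathlib import Path


def append_certificate(bundle_path, cert_path):
    """Append a PEM certificate to a CA bundle unless it is already there.

    Returns True if the bundle was modified, False otherwise.
    Both paths are supplied by the caller; nothing is hard-coded.
    """
    bundle = Path(bundle_path)
    cert_text = Path(cert_path).read_text().strip()
    if "-----BEGIN CERTIFICATE-----" not in cert_text:
        raise ValueError("certificate file does not look like PEM")

    existing = bundle.read_text() if bundle.exists() else ""
    if cert_text in existing:
        return False  # already appended, nothing to do

    # Make sure the new certificate starts on its own line.
    separator = "" if (not existing or existing.endswith("\n")) else "\n"
    bundle.write_text(existing + separator + cert_text + "\n")
    return True
```

For example, you could call append_certificate with the path to $SPLUNK_HOME/etc/auth/cacert.pem expanded for your deployment and the path to your exported proxy certificate.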
SSL Cert Errors
The Splunk Add-on for Microsoft Office 365 supports HTTP proxies only and does not work with an HTTPS proxy. Make sure the proxy configured in the add-on is of the HTTP type.
If you see the error ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1106), check the following:
- Check that an HTTPS proxy is not set at the Splunk level (for example, in splunk-launch.conf) or at the system level (https_proxy/http_proxy environment variables).
If there is a CERTIFICATE_VERIFY_FAILED error, make sure the required proxy server certificates and vendor-specific certificates are appended to the following available file paths:
- $SPLUNK_HOME/etc/apps/splunk_ta_o365/lib/certifi/cacert.pem
- $SPLUNK_HOME/lib/python3.7/site-packages/certifi/cacert.pem
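To check the system-level proxy variables mentioned above, a short script can list what is set in the environment. This is only a sketch: the helper names are hypothetical, it inspects environment variables only (not splunk-launch.conf), and it treats a proxy URL that begins with https:// as an HTTPS proxy, which is an assumption about how your proxy is declared.

```python
import os

# Variable names checked; both lowercase and uppercase forms are common.
PROXY_VARS = ("https_proxy", "http_proxy", "HTTPS_PROXY", "HTTP_PROXY")


def find_proxy_settings(environ=None):
    """Return the proxy-related environment variables that are set."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in PROXY_VARS if env.get(name)}


def https_scheme_proxies(settings):
    """Return the names of variables whose proxy URL uses https://."""
    return sorted(
        name for name, url in settings.items()
        if url.lower().startswith("https://")
    )
```

Run find_proxy_settings() as the same operating system user that runs Splunk, because the environment can differ between users.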
Data collection stops working - HTTP errors
The Client Secrets in your Microsoft Azure deployment can rotate on a predefined schedule, according to your organization's security requirements. If the secret is not updated in the Splunk Add-on for Microsoft Office 365, data collection stops. You might see HTTP Error 401 - Unauthorized or HTTP Error 500 - Internal Server Error in the logs. To update the Client Secret in the add-on:
- Navigate to the Splunk Web home screen.
- Click on Splunk Add-on for Microsoft Office 365 in the left navigation banner.
- Click on the Configuration > Tenant tab.
- Select the Tenant that needs an updated Client Secret and click Edit.
- Select Change and update the Client Secret.
- Click Update to save the changes.
Audit events are delayed or missing
As the number of events in your deployment increases, the Splunk Add-on for Microsoft Office 365 might not be able to return all events in one query before the next query executes, and events from the previous query might be delayed or even missed. One root cause can be the number of threads that are available to collect the necessary data sets. If events are being queued, you can increase the number of threads in increments of four until all events are returned in one query.
- Navigate to $SPLUNK_HOME/etc/apps/splunk_ta_o365/local, and create an inputs.conf file, if it does not already exist.
- Add the following stanza to the $SPLUNK_HOME/etc/apps/splunk_ta_o365/local/inputs.conf file.
  [splunk_ta_o365_management_activity]
  interval = 300
  disabled = 0
  sourcetype = o365:management:activity
  number_of_threads = 4
- Increase the number of threads in increments of 4. The maximum number of threads is 64.
- Restart Splunk.
- Test to see if all events are being returned:
index=_internal sourcetype="splunk:ta:o365:log" message="Ingesting content success." | eval content_time = strptime(content_id, "%Y%m%d%H%M%S") | chart count by content_time span=600
You can add a filter on the data_input field to narrow down the search for a particular data input:
index=_internal sourcetype="splunk:ta:o365:log" message="Ingesting content success." data_input=my_test_input | eval content_time = strptime(content_id, "%Y%m%d%H%M%S") | chart count by content_time span=600
Change my_test_input to the data input name you would like to check.
Increase the thread count gradually, and stop when additional threads no longer improve performance. Avoid a high thread count unless the system has high specifications and you observe a performance improvement as you add threads.
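The tuning loop described above (start at a value, add four threads at a time, never exceed 64) can be written down as a small helper. This is an illustrative sketch only; the function names are hypothetical, and the add-on itself does not provide them.

```python
MAX_THREADS = 64  # maximum stated in this topic
STEP = 4          # recommended increment


def next_thread_count(current):
    """Return the next number_of_threads value to try, capped at 64."""
    if current < 1:
        raise ValueError("thread count must be positive")
    return min(current + STEP, MAX_THREADS)


def tuning_schedule(start=4):
    """List the thread counts to try in order, from start up to the cap."""
    counts = [start]
    while counts[-1] < MAX_THREADS:
        counts.append(next_thread_count(counts[-1]))
    return counts
```

After each change to number_of_threads, restart Splunk and rerun the test search before moving to the next value in the schedule.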
You could also deploy the Splunk Add-on for Microsoft Office 365 as a tuned standalone add-on to capture Microsoft Azure Active Directory audit events separately from Service Events and Service Messages.
Data ingestion stops on Debian or Ubuntu Linux Server
Splunk Enterprise launches modular inputs under a shell process on Debian or Ubuntu Linux Server, and this can block new modular input instances. If you are running the add-on on Debian or Ubuntu Linux Server, set the option start_by_shell = false in each stanza of inputs.conf.
- Navigate to $SPLUNK_HOME/etc/apps/splunk_ta_o365/local, and create an inputs.conf file, if it does not already exist.
- Add the following stanzas to the $SPLUNK_HOME/etc/apps/splunk_ta_o365/local/inputs.conf file:
  [splunk_ta_o365_management_activity]
  interval = 300
  disabled = 0
  sourcetype = o365:management:activity
  number_of_threads = 4
  start_by_shell = false

  [splunk_ta_o365_service_status]
  interval = 1800
  disabled = 0
  sourcetype = o365:service:status
  start_by_shell = false

  [splunk_ta_o365_service_message]
  interval = 600
  disabled = 0
  sourcetype = o365:service:message
  start_by_shell = false
- Restart Splunk.
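If your local inputs.conf already contains several stanzas, setting start_by_shell = false in each one by hand is easy to get wrong. The following sketch uses Python's standard configparser module to add the option to every stanza in a block of configuration text. It assumes the file contains only simple [stanza] and key = value lines; configparser drops comments when it writes the result, so review the output before replacing a real file. The function name is hypothetical.

```python
import configparser
import io


def set_start_by_shell(conf_text, value="false"):
    """Return conf_text with start_by_shell set in every stanza."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    for stanza in parser.sections():
        parser.set(stanza, "start_by_shell", value)

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Read the current contents of inputs.conf, pass the text to set_start_by_shell, and write the result back only after checking it.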
Data collection hangs while calling the Office 365 management API
While calling the Office 365 management API, you receive the following error message in your logs.
ReadTimeout: HTTPSConnectionPool(host='manage.office.com', port=443): Read timed out. (read timeout=60)
This error indicates that the modular input hung during data collection. To resolve it, configure the request_timeout parameter in inputs.conf.
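When you review the logs, it can help to extract the host and the timeout value from each ReadTimeout message, for example to compare the reported read timeout before and after you change request_timeout. The sketch below parses the message format shown above with a regular expression. The function name is hypothetical, and the assumption that the logged read timeout reflects the configured request_timeout is not confirmed by this topic.

```python
import re

# Matches the ReadTimeout message format shown in this topic.
READ_TIMEOUT_RE = re.compile(
    r"ReadTimeout: HTTPSConnectionPool\(host='(?P<host>[^']+)', "
    r"port=(?P<port>\d+)\): Read timed out\. "
    r"\(read timeout=(?P<timeout>\d+(?:\.\d+)?)\)"
)


def parse_read_timeout(line):
    """Return host, port, and timeout from a ReadTimeout log line.

    Returns None if the line does not contain a ReadTimeout message.
    """
    match = READ_TIMEOUT_RE.search(line)
    if match is None:
        return None
    return {
        "host": match["host"],
        "port": int(match["port"]),
        "timeout": float(match["timeout"]),
    }
```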
Data ingestion stops for management activity
Data collection for the management activity input stops, and you receive the following message in your error logs:
message="failed to get error code" body="{\"Message\":\"Authorization has been denied for this request.\"}"
Configure the token_refresh_window parameter in inputs.conf. Enter the number of seconds before the token's expiration time when the token should be refreshed. The range for the parameter is from 400 seconds to 3600 seconds. See the inputs.conf.spec file in the README directory for this add-on for more information.
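Because token_refresh_window accepts only values from 400 to 3600 seconds, you might validate a candidate value before writing it to inputs.conf. This is a minimal sketch with a hypothetical function name; it checks only the range stated above and knows nothing else about the add-on's own validation.

```python
MIN_WINDOW = 400   # lower bound in seconds, from this topic
MAX_WINDOW = 3600  # upper bound in seconds, from this topic


def validate_token_refresh_window(value):
    """Return value as an int if it is within the documented range.

    Raises ValueError for non-numeric or out-of-range input.
    """
    seconds = int(value)  # int() raises ValueError for non-numeric text
    if not MIN_WINDOW <= seconds <= MAX_WINDOW:
        raise ValueError(
            f"token_refresh_window must be between {MIN_WINDOW} and "
            f"{MAX_WINDOW} seconds, got {seconds}"
        )
    return seconds
```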
Data duplication issues when fetching multiple content URLs
Microsoft's o365:management:activity API is not like typical event services and does not forward actual events. The API is a front end to an at-least-once delivery message bus, and it returns lists of URLs pointing to data rather than unique events. With each call to this API, the API clients (like the Splunk software) retrieve new events by time. However, the at-least-once nature of the API means that clients can be instructed to process the same set of data more than once.
This API design from Microsoft provides assurance that failures in the process, whether internal or external, do not cause lost events. A consequence of this design assurance is the occasional duplication of events whenever there is any doubt about the delivery of a message. This API design is highly scalable, as it does not require consistency or checkpoints from the O365 API.
Modular inputs have the ability to manage checkpoints such as counters and last queried time. However, for the sake of performance, modular inputs in this add-on are stateless and do not retain data from previous calls, so they cannot determine whether the current or a prior thread has been given the same content by value or by key/identifier. This design is intentional, in order to minimize the overhead of high-volume interfaces.
Typically, these duplicate events have minimal impact on most use cases, but they can affect some aggregate (threshold) or anomaly detection use cases. If the duplicates significantly affect your use case, the best practice is one of the following: raise a request with Microsoft for possible enhancements to the API design, or build a message-format compatible webhook using Azure Functions, other serverless technologies, or any of the available API gateway solutions. The webhook can check for duplicate events by maintaining a history of the messages sent by the API over a period of time, and it can also send data to Splunk through the HTTP Event Collector (HEC).
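The history-based duplicate check suggested above can be sketched as a small in-memory filter that remembers event identifiers for a limited time. This is an illustration of the idea, not part of the add-on: the class and function names are hypothetical, the one-hour retention default is arbitrary, and the use of an "Id" field as the unique key of each event is an assumption you should verify against your own data. A production webhook would also need persistent storage so that the history survives restarts.

```python
import time
from collections import OrderedDict


class RecentIdFilter:
    """Remember event IDs for ttl_seconds and report repeats."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        # event id -> time first seen, in insertion (oldest first) order
        self._seen = OrderedDict()

    def _evict_expired(self, now):
        # Entries are ordered by first-seen time, so stop at the first
        # entry that is still inside the retention window.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._ttl:
                break
            self._seen.popitem(last=False)

    def is_new(self, event_id):
        """Return True the first time an ID is seen within the window."""
        now = self._clock()
        self._evict_expired(now)
        if event_id in self._seen:
            return False
        self._seen[event_id] = now
        return True


def drop_duplicates(events, id_filter, id_field="Id"):
    """Return only the events whose ID has not been seen recently."""
    return [event for event in events if id_filter.is_new(event[id_field])]
```

Events that pass the filter could then be forwarded to the HTTP Event Collector. Choose a retention window long enough to cover the period over which you observe the API redelivering content.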
Service health information is not getting ingested
If your service health information is not getting ingested, check to see if you are using the ServiceHealth.Read.All API from Office 365 Management APIs, or the ServiceHealth.Read.All API in Microsoft Graph.
The ServiceHealth.Read.All API from Office 365 Management APIs was retired by Microsoft on December 17, 2021. Use the ServiceHealth.Read.All API in Microsoft Graph instead.
If upgrading to version 3.0.0 or later, disable ServiceHealth.Read.All in Office 365 Management APIs, and enable ServiceHealth.Read.All in Microsoft Graph.
Input page not showing any configured inputs with "Unexpected Error" shown on UI
The "Unexpected Error" message on the Inputs page can appear in the following cases:
- An input was configured without a value selected in the "Content Type" field. To find and fix these inputs:
  - Go to "Settings" -> "Data Inputs".
  - Find the configured Office 365 add-on inputs that do not have a "Content Type" value.
  - Delete those inputs from the same "Data Inputs" UI.
  - Reconfigure your inputs using the Configure Inputs for the Splunk Add-on for Microsoft Office 365 topic in this manual.
- An input was configured from the "Settings" -> "Data Inputs" UI. Inputs created this way skip the validations handled by the add-on, which results in this error. To fix these inputs:
  - Delete those inputs from the same "Data Inputs" UI.
  - Reconfigure your inputs using the Configure Inputs for the Splunk Add-on for Microsoft Office 365 topic in this manual.
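If you have file system access, you can also look for inputs with no content type by scanning the add-on's local inputs.conf. The sketch below is an assumption-heavy illustration: the function name is hypothetical, the stanza prefix and the option name content_type are passed in by the caller and must be checked against the inputs.conf.spec file for your add-on version, and the parser handles only simple [stanza] and key = value lines.

```python
import configparser


def inputs_missing_option(conf_text, stanza_prefix, option):
    """Return stanza names with the prefix that lack a value for option."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    return [
        stanza
        for stanza in parser.sections()
        if stanza.startswith(stanza_prefix)
        and not parser.get(stanza, option, fallback="").strip()
    ]
```

The script only reports candidates. Delete and reconfigure the inputs through the UI as described above.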
Data ingestion stops for cloud application security input
Data collection for the cloud application security input stops, and you receive the following message in your error logs:
message="Error retrieving Cloud Application Security messages." exception=401:{"detail":"Invalid token header. Token string should not contain spaces."}
One possible cause of this error is that the upgrade steps for migrating to version 4.1.0 or higher were not followed correctly. For more information, see the upgrade topic in this manual.
Duplicate events for Cloud App Security and Management Activity
Problem
You encounter duplicate events for Cloud App Security and Management Activity data ingestion.
Possible solution
After upgrading the Splunk Add-on for Microsoft Office 365 to version 4.1.0, due to a change in checkpoint logic, your Splunk platform deployment might receive duplicate events for a maximum of 7 days. Duplicate events will stop ingesting after 7 days. You may observe a rise in the usage of your deployment's memory/CPU resources.
This documentation applies to the following versions of Splunk® Supported Add-ons: released