Splunk Cloud Platform

Federated Search

Run Federated Analytics searches

After you define your Federated Analytics federated provider, you can use your data lake indexes and federated indexes to run searches of your Amazon Security Lake data. The method you choose depends on the age of the data you want to review.

Age of data | Location of data | Type of search run
Newer data (up to 31 days old, depending on the retention period set for each data lake index) | In data lake indexes on your Splunk Cloud Platform deployment, freshly ingested from your Amazon Security Lake account. | Standard searches that reference data lake indexes, using standard Splunk Search Processing Language (SPL).
Older data (whatever exceeds the retention period of your data lake indexes) | In your remote Amazon Security Lake datasets. | Searches that use the sdselect command to reference federated indexes. Each federated index maps to a specific remote dataset in your Amazon Security Lake account.

Search recent Amazon Security Lake data in your data lake indexes

When you set up a Federated Analytics federated provider, you define one or more data lake indexes that ingest data from your Amazon Security Lake account on an ongoing basis. All ingested Amazon Security Lake data follows the Open Cybersecurity Schema Framework (OCSF) format, and each index you define has filters that ensure that it ingests only data conforming to a specific OCSF category, such as Application Activity, Findings, or Identity & Access.

You can run a search over one or more data lake indexes by referencing the indexes in your search and then using the same SPL you would use for any other Splunk search. All data lake index names begin with dl_ by default. If you keep this naming convention, you can search all of your data lake indexes at once by putting index=dl_* in your search string.

Because data lake indexes contain fresh, local data, they are ideal for scheduled searches and alerts that run on a frequent basis.

Here is an example of a simple threat-hunting search of a single data lake index.

index=dl_network_activity_index AND _time >= 1716415000 AND _time <= 1716415000 + <time_window_in_seconds> AND traffic.bytes > 66666 AND src_endpoint.port = 6666 AND connection_info.protocol_num = 6 AND traffic.packets = 66 | table traffic.bytes, src_endpoint.ip, dst_endpoint.ip
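The <time_window_in_seconds> placeholder in the search above has to be replaced with a concrete number before the search runs. The following Python sketch shows one way to compute the two _time bounds, which are in epoch seconds in this example. The function name data_lake_time_filter is hypothetical and is not part of any Splunk API; the sketch only produces text that you paste into a search.

```python
def data_lake_time_filter(start_epoch_s: int, window_s: int) -> str:
    """Return the _time portion of a data lake index search.

    Hypothetical helper: it only formats text, it does not talk to Splunk.
    Both bounds are epoch seconds, matching the example search.
    """
    if window_s < 0:
        raise ValueError("window_s must not be negative")
    end_epoch_s = start_epoch_s + window_s
    return f"_time >= {start_epoch_s} AND _time <= {end_epoch_s}"


# One-hour window starting at the timestamp used in the example search.
time_filter = data_lake_time_filter(1716415000, 3600)
search = f"index=dl_network_activity_index AND {time_filter}"
print(search)
```

The remaining field filters and the table command from the example can be appended to the resulting string unchanged.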

Each data lake index has a data retention period that you define as part of the federated provider setup. A retention period can be at most 31 days, counted back from the time you run your search. To search Amazon Security Lake data with timestamps older than the retention periods of your data lake indexes, run a federated search of the remote datasets in your Amazon Security Lake account.
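The retention rule can be expressed as a small decision function. This is a minimal Python sketch, assuming you know the retention period you configured for the index; the function name choose_search_type and its return strings are hypothetical, and the sketch does not cover the separate case of data that was never ingested into a data lake index.

```python
SECONDS_PER_DAY = 86400
MAX_RETENTION_DAYS = 31  # upper bound on retention stated in this topic


def choose_search_type(event_epoch_s: int, now_epoch_s: int, retention_days: int) -> str:
    """Pick the kind of index to search for an event of a given age.

    Hypothetical helper for illustration only. Times are epoch seconds.
    """
    if not 0 < retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 31")
    age_s = now_epoch_s - event_epoch_s
    if age_s <= retention_days * SECONDS_PER_DAY:
        return "data lake index"   # still within local retention
    return "federated index"       # only available in the remote dataset


# An event that is 10 days old at search time.
now = 1716415000 + 10 * SECONDS_PER_DAY
print(choose_search_type(1716415000, now, retention_days=31))
print(choose_search_type(1716415000, now, retention_days=7))
```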

See the Search Manual and the Search Reference if you need help with SPL.

Run federated searches over remote datasets in your Amazon Security Lake account

Run federated searches over your remote Amazon Security Lake (ASL) datasets, where they reside in Amazon S3, in either of these cases: you need ASL data that you are not ingesting into your data lake indexes, or you need ASL data with timestamps older than your data lake index retention periods.

Federated searches over remote Amazon Security Lake datasets are best suited for ad hoc threat hunting searches that you run on an infrequent basis, due to the performance and cost limitations of such searches.

When you write a federated search you must do the following things:

  • Begin your search with the sdselect command. In federated searches, the sdselect command does most of the work.
  • Invoke a federated index that you defined for your federated provider. Each federated index maps to a specific remote dataset in your Amazon Security Lake account. The sdselect command does not support searching multiple federated indexes with a single search.

Here is an example of a simple threat-hunting federated search of the dataset represented by a single federated index.

| sdselect strict=t traffic.bytes, src_endpoint.ip, dst_endpoint.ip FROM dlf_buttercup_fa_all_asl_data_tables_index WHERE time >= 1716415000000 AND time <= 1716415000000 + <time_window_in_seconds> * 1000 AND (eventDay="20240522" OR eventDay="20240523") AND category_name="Network Activity" AND traffic.bytes > 66666 AND src_endpoint.port = 6666 AND connection_info.protocol_num = 6 AND traffic.packets = 66
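In the example above, the time bounds are in epoch milliseconds and the eventDay values cover every calendar day that the window touches. The following Python sketch computes both from a start time and a window length. It assumes that eventDay is a YYYYMMDD string based on UTC days, which matches the example values but is not confirmed by this topic; the function names are hypothetical, and the code only formats text for the WHERE clause.

```python
from datetime import datetime, timedelta, timezone


def event_days(start_epoch_s: int, window_s: int) -> list[str]:
    """List every UTC calendar day (YYYYMMDD) touched by the window.

    Assumption: eventDay partitions follow UTC days.
    """
    first = datetime.fromtimestamp(start_epoch_s, tz=timezone.utc).date()
    last = datetime.fromtimestamp(start_epoch_s + window_s, tz=timezone.utc).date()
    days = []
    day = first
    while day <= last:
        days.append(day.strftime("%Y%m%d"))
        day += timedelta(days=1)
    return days


def where_time_clause(start_epoch_s: int, window_s: int) -> str:
    """Build the time and eventDay part of an sdselect WHERE clause."""
    start_ms = start_epoch_s * 1000               # seconds to milliseconds
    end_ms = (start_epoch_s + window_s) * 1000
    day_filter = " OR ".join(f'eventDay="{d}"' for d in event_days(start_epoch_s, window_s))
    return f"time >= {start_ms} AND time <= {end_ms} AND ({day_filter})"


# Three-hour window that starts late on 22 May 2024 (UTC) and crosses midnight.
print(where_time_clause(1716415000, 10800))
```

With a three-hour window, the start time from the example falls on 22 May 2024 and the end time on 23 May 2024 in UTC, so both eventDay values appear in the generated clause.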

For more information about using sdselect to run federated searches, see sdselect command overview.

Last modified on 14 October, 2024

This documentation applies to the following versions of Splunk Cloud Platform: 9.3.2408

