Dataset extension
Dataset extension creates a search, report, dataset, or other object that is built upon a reference to an existing dataset. This reference means that the object always refers to the original dataset for its foundational data. If the definition of the original dataset changes, those changes are passed down to any datasets that extend it.
Dataset extension is not the same as dataset cloning. When you clone a dataset, you create a distinct, individual dataset that is identical to the original dataset but not otherwise connected to it. When you extend a dataset, you create a dataset, report, dashboard panel, or alert that is bound to the original dataset through its reference to that dataset.
Example of extending a dataset as a report
For example, say you have a dataset named Alpha. If you select Explore > Investigate in Search on the Datasets listing page for the Alpha dataset, you go to the Search view and run a search that displays the contents of Alpha. This search string uses the from command to reference Alpha. You can optionally modify the search string with additional Splunk Search Processing Language (SPL).
If you save this search string as a report named Beta, it still has the reference back to Alpha. If someone decides to make a change to Alpha, that change cascades down to the Beta report. This change might cause problems in the Beta report.
For example, you might modify the search string of the Beta report with lookups and eval expressions that use fields passed down from the Alpha dataset in their definitions. If someone deletes those fields from the Alpha dataset, those lookups and eval expressions break in the Beta report, because they require fields that no longer exist.
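The breakage described above can be sketched in a few lines of Python. This is a toy model, not Splunk code: the field names, the `beta_report` dictionary, and the `missing_fields` helper are all hypothetical and exist only to show why a report that references Alpha depends on the fields that Alpha exposes.

```python
# Hypothetical model: the fields that the Alpha dataset currently exposes.
alpha_fields = {"clientip", "productId", "price", "quantity"}

# The Beta report keeps a reference to Alpha. Its extra SPL (lookups, eval
# expressions) needs some of the fields that Alpha passes down.
beta_report = {
    "extends": "Alpha",
    "required_fields": {"productId", "price", "quantity"},
}

def missing_fields(report, parent_fields):
    """Return the fields the report needs that the parent no longer provides."""
    return sorted(report["required_fields"] - parent_fields)

# While Alpha is unchanged, Beta has everything it needs.
print(missing_fields(beta_report, alpha_fields))             # []

# Someone deletes the price field from Alpha; Beta now has a broken dependency.
print(missing_fields(beta_report, alpha_fields - {"price"}))  # ['price']
```

Because Beta stores only a reference to Alpha, nothing in Beta changes when Alpha is edited. The problem shows up only when Beta runs and the fields it needs are missing.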
Dataset extension chains
You can extend any dataset as a table dataset. This means that you can have chains of extended datasets. For example, you can extend dataset Alpha as dataset Beta, and then extend dataset Beta as dataset Gamma, and so on. Any change to Alpha propagates down through the other datasets in the chain.
You can trace a dataset extension chain from the end of the chain, but not from the start. To use the example in the preceding paragraph, if you are on dataset Gamma, you can see that it extends Beta, which in turn extends Alpha. But if you view Alpha, you have no way of knowing which datasets were extended from it.
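The one-directional nature of an extension chain can be illustrated with a small Python sketch. It assumes a hypothetical `extends` mapping in which each dataset records only the dataset it directly extends. This is an illustration of the idea, not a description of how the Splunk platform stores datasets.

```python
# Hypothetical registry: each dataset records only what it extends directly.
extends = {"Alpha": None, "Beta": "Alpha", "Gamma": "Beta"}

def ancestors(name):
    """Follow the stored references upward from a dataset to the start of its chain."""
    chain = []
    parent = extends[name]
    while parent is not None:
        chain.append(parent)
        parent = extends[parent]
    return chain

def direct_children(name):
    """No reverse reference is stored, so every dataset has to be scanned."""
    return sorted(child for child, parent in extends.items() if parent == name)

print(ancestors("Gamma"))        # ['Beta', 'Alpha']
print(ancestors("Alpha"))        # []
print(direct_children("Alpha"))  # ['Beta']
```

Walking upward from Gamma needs only the references that Gamma and Beta already hold. Answering the reverse question for Alpha works here only because the toy registry is fully visible. Alpha itself holds no record of Beta or Gamma.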
Learn which datasets a dataset extends
Locate the dataset in the Datasets listing page and expand its row. If it extends one or more datasets, you will find an Extends line item with the extended datasets listed from top to bottom. For example, the following image shows the detailed information for Gamma, showing that it extends Alpha and Beta.
You can also find this information on the viewing page for a dataset. Click More Info to see what datasets the dataset that you are viewing extends.
Use a naming convention for extended datasets
When you are working with a dataset, it is difficult to know what datasets are extended from it.
You can manage this by using a naming convention to indicate when a dataset is extended from another. For example, if you extend a dataset from dataset Alpha, you can name it Alpha.Beta. Later, if you extend two datasets from Alpha.Beta, you can name those datasets Alpha.Beta.Gamma and Alpha.Beta.Epsilon. This naming methodology is similar to that of data model datasets, where the dataset name indicates where it lives in a greater hierarchy of data model datasets. The following image shows the relationship between Alpha and the datasets extended from it.
When you extend a dataset, you can update the description of the original dataset to indicate that it has been extended. Identify only the knowledge objects that have been directly extended from it, not the full extension chain, if one exists. Add a sentence like this to the dataset description: "This dataset has been extended as a table dataset named <dataset_name> and a report named <report_name>."
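If you follow the dotted naming convention described in this section, the names themselves carry the extension hierarchy. The following Python sketch shows how such names could be read. The `lineage` and `extended_from` helpers are hypothetical, and the sketch assumes that dots appear in dataset names only as separators in this convention.

```python
def lineage(name):
    """Split a dotted name such as 'Alpha.Beta.Gamma' into the chain it encodes."""
    parts = name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

def extended_from(root, names):
    """List the names that, by the convention, sit anywhere below root."""
    prefix = root + "."
    return sorted(n for n in names if n.startswith(prefix))

names = ["Alpha", "Alpha.Beta", "Alpha.Beta.Gamma", "Alpha.Beta.Epsilon"]

print(lineage("Alpha.Beta.Gamma"))
# ['Alpha', 'Alpha.Beta', 'Alpha.Beta.Gamma']

print(extended_from("Alpha.Beta", names))
# ['Alpha.Beta.Epsilon', 'Alpha.Beta.Gamma']
```

The convention works only as long as everyone who extends a dataset applies it. The names are a hint for people, and nothing in the sketch verifies that a dataset really extends the parent its name suggests.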
The from command
Dataset extension is facilitated by the from command, whether you extend a dataset by opening it in the Search view or through Table Views.
When you open a dataset in the Search view, you see a search string that uses the from command to retrieve data from that dataset. For example, say you have a dataset named Buttercup_Games_Purchases. If, while on the Datasets listing page, you click Explore in Search for that dataset, the Splunk platform takes you to the Search view, where you see this search string:
| from datamodel:"Buttercup_Games_Purchases"
You can extend any dataset as a table dataset. When you do this, Table Views uses the from command in the background. Click the SPL toggle in the command history sidebar to see how Table Views uses the from command.
For more information, see from in the Search Reference.
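As a rough illustration of how a search string built on the from command is put together, the following Python sketch composes one from a dataset name and optional additional SPL. The `from_search` helper and the `price` and `quantity` fields are hypothetical. Only the `| from datamodel:"<dataset_name>"` shape is taken from the Buttercup_Games_Purchases example in this topic, and the sketch makes no claim about the exact SPL that Table Views generates.

```python
def from_search(dataset_name, *extra_spl):
    """Compose a search string that references a dataset through the from command."""
    base = f'| from datamodel:"{dataset_name}"'
    # Each additional SPL stage is appended after a pipe, as in the Search view.
    return " ".join([base, *(f"| {stage}" for stage in extra_spl)])

# The unmodified reference to the dataset.
print(from_search("Buttercup_Games_Purchases"))

# The same reference followed by an eval stage that uses hypothetical fields.
print(from_search("Buttercup_Games_Purchases", "eval total = price * quantity"))
```

In both cases the string still begins with the reference to the original dataset, which is what binds a saved report or table to that dataset.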
Extension and table acceleration
If you want to accelerate a table that extends other tables, it needs to be shared with you, and the tables it extends must be shared with you as well. Acceleration can be applied only to datasets that use purely streaming commands.
You will not see acceleration benefits when you use from to extend an accelerated table.
You cannot accelerate a table that is extended from a lookup table file or lookup definition, because lookup dataset extension is not a streaming operation.
See Accelerate tables.
This documentation applies to the following versions of Splunk® Enterprise: 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.2.0, 9.2.1, 9.2.2, 9.3.0