Define dataset fields
This topic explains how to add and edit the fields of data model datasets. Dataset fields provide the set of fields that your Pivot users work with when they define and generate pivot reports.
Fields can be present within the dataset, or they can be derived and added to the dataset through the use of lookups and eval expressions.
You use the Data Model Editor to create and manage dataset fields. It enables you to:
- Create new fields.
- Update or delete existing fields that aren't inherited.
- Override certain settings for inherited fields.
You can also use the Data Model Editor to build out data model dataset hierarchies, define datasets (by providing constraints, search strings, or transaction definitions), rename datasets, and delete datasets. For more information about using the Data Model Editor to perform these tasks, see Design data models.
This topic does not cover the concepts behind dataset fields in detail. If you have not worked with data model fields before, review the topic About data models.
For information about creating and managing new data models, see Manage data models. In addition to creating new data models with the Data Models management page, that topic also shows you how to manage data model permissions and acceleration.
Data model dataset field types
There are five types of data model dataset fields.
Auto-extracted
A field extracted by the Splunk software at index time or search time. You can only add auto-extracted fields to root datasets. Child datasets can inherit them, but they cannot add new auto-extracted fields of their own. Auto-extracted fields divide into three groups.
Group | Definition |
---|---|
Fields added by automatic key value field extraction | These are fields that the Splunk software extracts automatically, like uri or version. This group includes fields indexed through structured data inputs, such as fields extracted from the headers of indexed CSV files. See Extract fields from files with structured data in Getting Data In. |
Fields added by knowledge objects | These are fields added to search results by field extractions, automatic lookups, and calculated fields that are configured in props.conf. |
Fields that you have manually added | You can manually add fields to the auto-extracted fields list. They might be rare fields that you do not currently see in the dataset, but may appear in it at some point in the future. This set of fields can include fields added to the dataset by generating commands such as inputcsv or dbinspect. |
Eval Expression
A field derived from an eval expression that you enter in the field definition. Eval expressions often involve one or more extracted fields.
Lookup
A field that is added to the events in the dataset with the help of a lookup that you configure in the field definition. Lookups add fields from external data sources such as CSV files and scripts. When you define a lookup field you can use any lookup object in your system and associate it with any other field that has already been associated with that same dataset.
See About lookups.
Regular Expression
This field type is extracted from the dataset event data using a regular expression that you provide in the field definition. A regular expression field definition can use a regular expression that extracts multiple fields; each field will appear in the dataset field list as a separate regular expression field.
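The way one regular expression can yield several fields can be sketched in Python. This is an illustration only, not the Data Model Editor's implementation: the event text, the pattern, and the field names method, uri_path, and status are all hypothetical. Python writes named capture groups as (?P<name>...), while Splunk regular expressions write them as (?<name>...).

```python
import re

# Hypothetical raw event and pattern, invented for this illustration.
EVENT = "GET /cart/checkout status=200"
PATTERN = re.compile(
    r"(?P<method>[A-Z]+) (?P<uri_path>\S+) status=(?P<status>\d+)"
)


def extract_fields(raw: str) -> dict:
    """Return one entry per named capture group, or an empty dict on no match."""
    match = PATTERN.search(raw)
    return match.groupdict() if match else {}


# Each named group becomes its own field, analogous to one regular
# expression definition producing several separate dataset fields.
fields = extract_fields(EVENT)
for name, value in fields.items():
    print(f"{name} = {value}")
```

In this sketch a single pattern produces three fields, which mirrors how each extracted field is listed separately in the dataset field list.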
Geo IP
A specific type of lookup that adds geographical fields, such as latitude, longitude, country, and city, to events in the dataset that have valid IP address fields. This is useful for map-related visualizations.
See Design data models.
Field categories
The Data Model Editor groups data model dataset fields into three categories.
Category | Definition |
---|---|
Inherited | All datasets have at least a few inherited fields. Child datasets inherit fields from their parent dataset, and these inherited fields always appear in the Inherited category. Root event, search, and transaction datasets also have default fields that are categorized as inherited. |
Extracted | Any auto-extracted field that you add to a dataset is listed in the "Extracted" field category. |
Calculated | The Splunk software derives calculated fields through a calculation, lookup definition, or field-matching regular expression. When you add Eval Expression, Regular Expression, Lookup, and Geo IP field types to a dataset, they all appear in this field category. |
Field order and field chaining
The Data Model Editor lets you rearrange the order of fields. This is useful when you have a set of fields that must be processed in a specific order, because fields are processed in sequence from the top of the list to the bottom.
For example, you can design an Eval Expression field that uses the values of two auto-extracted fields. Extracted fields precede calculated fields, so in this case the fields would be processed in the correct order without any work on your part. But you might also use the eval expression field as input for a lookup field. Because Eval Expression fields and Lookup fields are both categorized as calculated fields by the Data Model Editor, you would want to make sure that you order the calculated field list so that the Eval Expression field appears above the Lookup field.
So the order of these fields would be:
- Auto Extracted Field 1
- Auto Extracted Field 2
- Eval Expression Field (calculates a field with the values of the two Auto-Extracted fields)
- Lookup Field (uses the Eval Expression field as an input field)
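The effect of ordering on a field chain can be modeled with a short Python sketch. It is a toy model of top-to-bottom processing, not a description of how the Splunk software evaluates fields. The field names status_key and status_description, the lookup table, and the None result for a misordered chain are all assumptions made for illustration.

```python
# Hypothetical lookup table keyed on the value of a derived field.
STATUS_LOOKUP = {"500:error": "server failure", "200:ok": "success"}


def apply_fields(event, field_defs):
    """Apply calculated field definitions in list order, top to bottom."""
    result = dict(event)
    for name, compute in field_defs:
        result[name] = compute(result)
    return result


# Two auto-extracted fields are already present on the event.
event = {"status": "500", "outcome": "error"}

field_defs = [
    # Eval-style field: combines the two auto-extracted fields.
    ("status_key", lambda e: f"{e['status']}:{e['outcome']}"),
    # Lookup-style field: uses the derived field above as its input.
    ("status_description", lambda e: STATUS_LOOKUP.get(e.get("status_key"))),
]

chained = apply_fields(event, field_defs)
# With the order reversed, the lookup runs before its input exists.
misordered = apply_fields(event, list(reversed(field_defs)))

print(chained["status_description"])
print(misordered["status_description"])
```

In this model the lookup only finds a value when the eval-style field is listed above it, which is the reason to check the order of the calculated field list.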
All dataset fields are shown and optional by default.
- A shown field is visible and available to Pivot users when they are in the context of the dataset to which the field belongs. For example, say the url field is marked as shown for the HTTP Requests dataset. When a user enters Pivot and selects the HTTP Requests dataset, they can use the url field when they define a pivot report.
- An optional field is not required to be present in every event in the dataset. This means that the dataset can contain many events that do not have the field.
You can change these settings to hidden and required, respectively. When you do this the field will be marked as hidden and/or required in the dataset field list.
- A hidden field is not displayed to Pivot users when they select the dataset in a Pivot context. They will be unable to use it for the purpose of Pivot report definition.
- This setting lets you expose different subsets of fields for each dataset in your data model, even if all of the datasets inherit the same set of fields from a single parent dataset. This helps to ensure that your Pivot users only engage with fields that make sense in the context of the dataset they selected.
- You can hide fields that are being added to the dataset only to define another field (see "Field order and field chaining," above). There may be no need for your Pivot users to engage with the first fields in a field chain.
- A required field must appear in every event represented by the dataset. This filters out any event that does not have the field. In effect this is another type of constraint on top of any formal constraints you've associated with the dataset.
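The filtering effect of a required field can be sketched in Python. This is a simplified model that assumes events are plain dictionaries. The helper filter_required and the sample events are hypothetical and do not correspond to any Splunk API.

```python
def filter_required(events, required_fields):
    """Keep only the events that contain every required field."""
    return [
        event
        for event in events
        if all(field in event for field in required_fields)
    ]


# Hypothetical events; the second one lacks ip_address.
events = [
    {"ip_address": "10.0.0.1", "url": "/home"},
    {"url": "/login"},
    {"ip_address": "10.0.0.2"},
]

# Marking ip_address as required acts like an extra constraint.
kept = filter_required(events, ["ip_address"])
print(len(kept), "of", len(events), "events remain")
```

An optional field corresponds to leaving the field out of required_fields, in which case every event is kept.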
These field settings are specific to each dataset in your data model. This means you can have the ip_address field set to Required in a parent dataset but still set as optional in the child datasets that descend from that parent dataset. Even if all of the datasets in a data model have the same fields (meaning the fields are set in the topmost root dataset and then simply inherited by all the other datasets in the hierarchy), the fields that are marked hidden or required can be different from dataset to dataset in that data model.
Note: There is one exception to your ability to provide different "shown/hidden" and "optional/required" settings for the same field across different datasets in a data model. You cannot update these settings for inherited fields that are categorized as "Calculated" fields in the parent dataset in which they first appear. For this kind of field you can only change the setting by updating the field in that parent dataset. Your changes are then replicated through the child datasets that descend from that parent dataset.
You can set these values for extracted and calculated fields when you first define them. You can also edit field names or types after they've been defined.
- Click Override for a field in the Inherited category or Edit for a field in the Extracted and Calculated categories.
- Change the value of the Flag field to the appropriate value.
- Click Save to save your changes.
With the Bulk Edit list you can change the "shown/hidden" and "optional/required" values for multiple fields at once.
- Select the fields you want to edit.
- Click Bulk Edit and select either Optional, Required, Hidden, or Shown.
- If you select either Required or Hidden, the selected fields update to display that status. You cannot update these values for inherited fields that are categorized as calculated fields in the parent dataset in which they first appear. See the Note above for more information.
Enter or update field names and types
The Data Model Editor lets you give fields in the Extracted and Calculated categories a display Name of your choice. It also lets you determine the Type for such fields, even in cases where a Type value has been automatically assigned to the field.
Splunk software automatically assigns a type to auto-extracted fields. If an auto-extracted field's Type value is assigned incorrectly, you can provide the correct one. For example, based on available values for an auto-extracted field, Splunk software may decide it is a Number type field when you know that it is in fact a String type. You can change the Type value to String if this is the case.
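A short Python sketch shows why a field whose values all look numeric might still be better treated as a String. The field name zip_code and its values are hypothetical, and the check below is not the logic that the Splunk software uses to assign types.

```python
# Hypothetical values of an auto-extracted field named zip_code.
values = ["02134", "10001", "94105"]

# Every value consists of digits, so the field looks like a Number.
looks_numeric = all(value.isdigit() for value in values)

# Treating the values as numbers drops the leading zero.
as_numbers = [int(value) for value in values]
round_tripped = [str(number) for number in as_numbers]

print(looks_numeric)
print(round_tripped[0], "versus original", values[0])
```

In a case like this, the digits are an identifier rather than a quantity, so String is the more accurate type.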
Changing the display Name of an auto-extracted field does not change how the associated field is named in the index. It only renames the field in the context of this data model.
- Click Edit for the field whose Name or Type you would like to update.
- Update the Name or change the Type. Name values cannot contain asterisk characters.
- Click Save to save your changes.
Use the Bulk Edit list to give multiple fields the same Type value.
- Select the fields you want to edit.
- Click Bulk Edit and select either Boolean, IPv4, Number, or String.
You cannot change the Type value for inherited fields. If you select any inherited fields the Type values in the Bulk Edit list will be unavailable.
All of the selected fields should have their Type value updated to the value you choose.
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408