Field alias behavior change
When you upgrade to version 7.2.4+ of Splunk Enterprise, the behavior of certain field alias configurations changes.
A field alias is a way of setting up an alternate name for a field. You can then use that alternate name to search for events that contain that field. Ideally, you should be able to define multiple aliases for a single field, but each alias you define should apply only to one source field. Additionally, when you apply a field alias configuration to a search, the expectation is that the source field is in the events, but the alias field is not in the events.
This issue involves events that already include the alias field, but which are missing the source field or have no value for the source field.
Before the 7.2.4 field alias fix
In versions of Splunk Enterprise previous to 7.2.4, when you applied a field alias configuration to events that had the alias field but not the source field, no changes were made to those events. The alias fields were allowed to stay.
This behavior is an erroneous application of the field alias concept. It allows users to have alias field values that do not correspond to source fields.
Example of the old field alias behavior
Here are four events of sample log data. This is what they look like before we apply field alias processing to them. Each of these events has a sourcetype of st1 and a source of example.log.
02-19-2019 19:46:56.122 user=jessica id=123456789 uid=1241 message=this is just a simple example
02-19-2019 19:46:11.342 user=joe id=123777789 message=this is just a simple example
02-18-2019 11:12:56.854 uid=7788 message=this is just a simple example
02-18-2019 11:11:25.478 user=adam message=this is just a simple example
Now, say you want to apply this pair of props.conf field alias configurations to that set of events.
[st1]
FIELDALIAS-class1 = uid AS user

[source::example.log]
FIELDALIAS-class2 = id AS user
With this pair of configurations, events that share a sourcetype of st1 and a source of example.log have the user field aliased to two different source fields: uid and id. These colliding configurations are problematic because field aliases are supposed to reference only one source field at a time.
In addition, you know that user, the alias field, already exists in the events. If your field alias configurations say that the value of user should match a value of either uid or id, but the user field in the event already has a value of jessica, how does the search head resolve this? It replaces the user field value with one of the source field values, according to lexicographical sort order logic.
But the real issue here is with the fourth event, where the alias field exists, but no source field exists. The pre-7.2.4 rules allowed the alias field to stay in an event when there were no source fields in the event.
Here is what the sample events look like after field alias processing with the pre-7.2.4 rules.
02-19-2019 19:46:56.122 user=123456789 id=123456789 uid=1241 message=this is just a simple example
02-19-2019 19:46:11.342 user=123777789 id=123777789 message=this is just a simple example
02-18-2019 11:12:56.854 user=7788 uid=7788 message=this is just a simple example
02-18-2019 11:11:25.478 user=adam message=this is just a simple example
This results in the overwriting of user values in the first two events. The search head resolves the conflict between id and uid in the first event by selecting id.
The search head resolves collisions between two or more AS configurations by applying each of the FIELDALIAS class names in lexicographical sort order. It uses the last class that it applies. So in the first event of this example, it first applies class1 and then applies class2. Because class2 is the last class applied, the user field takes on the value of the id field.
The third event gets a new field. Before processing, it had a source field, but no alias field. After processing it has an alias field with the value of the source field. This is how field aliases are supposed to work.
But the fourth event has an alias field without a source field. After the field alias configuration is applied, the alias field should not appear in events that do not have the corresponding source fields. The logic set by the configuration is not consistent.
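The pre-7.2.4 processing described in this example can be sketched as a small Python model. This is only an illustration, not Splunk code: the function name apply_aliases_legacy and the dictionary representation of events and configurations are assumptions made for the sketch, and the message field is omitted for brevity.

```python
# Toy model of pre-7.2.4 field alias processing; not Splunk source code.

def apply_aliases_legacy(event, aliases):
    """Apply {class_name: (source_field, alias_field)} to one event dict.

    Classes are applied in lexicographical order of class name, so the
    last applied class whose source field is present wins. An alias field
    is left untouched when its source field is missing (the old behavior).
    """
    result = dict(event)
    for class_name in sorted(aliases):
        source, alias = aliases[class_name]
        if source in result:
            result[alias] = result[source]
    return result


ALIASES = {
    "class1": ("uid", "user"),  # FIELDALIAS-class1 = uid AS user
    "class2": ("id", "user"),   # FIELDALIAS-class2 = id AS user
}

# The four sample events, reduced to the fields that matter here.
EVENTS = [
    {"user": "jessica", "id": "123456789", "uid": "1241"},
    {"user": "joe", "id": "123777789"},
    {"uid": "7788"},
    {"user": "adam"},
]

processed = [apply_aliases_legacy(event, ALIASES) for event in EVENTS]
for event in processed:
    print(event.get("user"))
```

In this model the fourth event keeps user=adam even though neither uid nor id is present, which is the inconsistency that the 7.2.4 fix addresses.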
After the 7.2.4 field alias fix
In Splunk Enterprise 7.2.4, this bug was fixed. The fix changed the behavior when you apply a field alias configuration to an event where the alias field is already present but no source fields exist. This table explains how the behavior has changed and why.
Event contains source field? | Event contains alias field? | Before 7.2.4, what happened after application of the field alias configuration to the event? | After 7.2.4, what happens after application of the field alias configuration to the event? | Why does this happen? |
---|---|---|---|---|
Yes | Yes | The search head replaces the value of the alias field with the value of the source field. | The search head replaces the value of the alias field with the value of the source field. | If a source field and its alias field are both present in an event, they should share the value of the source field. |
No | Yes | Nothing. The search head does not change or remove the alias field. | The search head removes the alias field from the event. | If a source field is missing from an event, its alias field should not be present in that event. |
Yes | No | The search head adds an alias to the event. It is given the value of the source field in the event. | The search head adds an alias to the event. It is given the value of the source field in the event. | This is the ideal environment for a field alias. It enables you to search for events using a field name that is an alias for a source field that is already present in those events. |
No | No | Nothing happens. The search head leaves the event unchanged. | Nothing happens. The search head leaves the event unchanged. | If the event does not have the source field, and the alias field is also not present, there is no need to change the event. |
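The four rows of the table can be expressed as a short Python model. This is a sketch under stated assumptions rather than Splunk's implementation: apply_alias and its corrected flag are hypothetical names, events are modeled as dictionaries, and a missing key stands in for a field that is absent from the event.

```python
# Toy model of the behavior table; an illustration, not Splunk source code.

def apply_alias(event, source, alias, corrected=True):
    """Return a copy of `event` after applying `source AS alias`.

    corrected=False models the pre-7.2.4 behavior, corrected=True the
    behavior from 7.2.4 onward.
    """
    result = dict(event)
    if source in result:
        # Source present: the alias takes the source value in both versions.
        result[alias] = result[source]
    elif corrected:
        # Source missing: 7.2.4+ removes a pre-existing alias field.
        result.pop(alias, None)
    return result


CASES = {
    "source and alias": {"uid": "1241", "user": "jessica"},
    "alias only": {"user": "adam"},
    "source only": {"uid": "7788"},
    "neither": {"message": "hello"},
}

for label, event in CASES.items():
    before = apply_alias(event, "uid", "user", corrected=False)
    after = apply_alias(event, "uid", "user", corrected=True)
    print(f"{label}: pre-7.2.4 user={before.get('user')}, 7.2.4+ user={after.get('user')}")
```

Only the "alias only" case differs between the two modes, which matches the second row of the table.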
Example of the 7.2.4 fix
This example shows you how the 7.2.4 fix changed the results of some searches. Say you start with the same two field alias configurations:
[st1]
FIELDALIAS-class1 = uid AS user

[source::example.log]
FIELDALIAS-class2 = id AS user
You apply those configurations to the same four events as the preceding examples. Here are the results:
02-19-2019 19:46:56.122 user=123456789 id=123456789 uid=1241 message=this is just a simple example
02-19-2019 19:46:11.342 user=123777789 id=123777789 message=this is just a simple example
02-18-2019 11:12:56.854 user=7788 uid=7788 message=this is just a simple example
02-18-2019 11:11:25.478 message=this is just a simple example
As you can see, the difference in this set of events is that the user field is removed from the fourth event. It is removed because it is an alias field and there is no source field in the event.
The introduction of ASNEW in 7.2.4
Version 7.2.4 of the Splunk platform introduced the ASNEW field alias configuration. ASNEW allows you to combine field aliases without overriding or removing values.
For example, say you have a search that runs over events that include the dst field, and you want to apply the following props.conf field alias configuration to it:
[sv1]
FIELDALIAS-classx = src AS dst
In the case of events that already have dst, you want the field and its values to be undisturbed by the field alias processing. You do not want dst to be removed, and you do not want the value of dst to be altered. In this case you must change the configuration from AS to ASNEW:
[sv1]
FIELDALIAS-classx = src ASNEW dst
When you apply this configuration, the search head passes over instances of dst that are already present in your events. It does not remove them or overwrite them.
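The difference between AS and ASNEW can be sketched with a small Python model. This is an illustration only: the function apply_alias, its mode parameter, and the sample field values are assumptions made for the sketch, and the AS branch models the 7.2.4+ behavior.

```python
# Toy comparison of AS (7.2.4+ behavior) and ASNEW; not Splunk source code.

def apply_alias(event, source, alias, mode="AS"):
    """Return a copy of `event` after applying `source <mode> alias`."""
    result = dict(event)
    if mode == "ASNEW":
        # ASNEW only writes the alias when it is not already in the event.
        if alias not in result and source in result:
            result[alias] = result[source]
        return result
    # AS: the alias mirrors the source, and is removed if the source is missing.
    if source in result:
        result[alias] = result[source]
    else:
        result.pop(alias, None)
    return result


# Hypothetical events for illustration.
both = {"src": "10.0.0.1", "dst": "10.0.0.9"}
dst_only = {"dst": "10.0.0.9"}
src_only = {"src": "10.0.0.1"}

for event in (both, dst_only, src_only):
    as_result = apply_alias(event, "src", "dst", mode="AS")
    asnew_result = apply_alias(event, "src", "dst", mode="ASNEW")
    print(event, "AS ->", as_result.get("dst"), "| ASNEW ->", asnew_result.get("dst"))
```

In this model an existing dst value survives under ASNEW in every case, while AS overwrites it when src is present and removes it when src is missing.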
If you use Splunk Web
If you are a Splunk Cloud Platform user, or if you simply prefer to manage your field aliases through the Settings pages in Splunk Web, you can use the Overwrite field values setting to determine how alias fields are treated when they are already present in events at the time that field alias processing takes place.
Select Overwrite field values to make a field alias use the corrected field alias behavior. With this setting, field alias processing ensures that the alias fields in the event share the values of their corresponding source fields, and removes alias fields whose source fields are missing. See the table in After the 7.2.4 field alias fix for more information.
When Overwrite field values is not selected, the field alias uses the uncorrected behavior, which means that the alias field is not changed or removed if it exists in an event without a source field when field alias processing takes place.
When you create a new field alias, Overwrite field values is not selected by default.
See Create field aliases in Splunk Web for more information about the workflow for field alias creation with the Settings pages.
Using calculated fields to apply an alias field to multiple source fields
Calculated fields provide a more versatile method for applying an alias field to multiple source fields. Use eval functions such as coalesce to determine the order in which colliding source fields are applied to your alias fields.
Calculated fields that use functions like mvappend and mvdedup also enable you to deal with situations where your field alias configuration collides with a field extraction. For example, say you have this combination of a field alias configuration and a field extraction configuration:
[st1]
FIELDALIAS-class1 = uid AS user

[source::example.log]
EXTRACT-class2 = 123(?<user>[0-9]+)789
During a search, the EXTRACT-class2 configuration extracts user field values for events with a source of example.log. Later in the search pipeline, the FIELDALIAS-class1 configuration applies a field alias to events with a source type of st1. FIELDALIAS-class1 gives the user field the same value as uid even when uid is null. As a result, events with a source of example.log and a source type of st1 have the extracted value of the user field overwritten by the contents of the uid field.
Here is the same set of results used earlier in this topic, after they have been processed through these two configurations:
02-19-2019 19:46:56.122 user=1241 id=123456789 uid=1241 message=this is just a simple example
02-19-2019 19:46:11.342 id=123777789 message=this is just a simple example
02-18-2019 11:12:56.854 user=7788 uid=7788 message=this is just a simple example
02-18-2019 11:11:25.478 message=this is just a simple example
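The interaction between the extraction and the alias can be traced with a rough Python model. Several things here are assumptions made for the sketch: the function search_time_fields is hypothetical, the simple key=value parser is a stand-in for automatic field discovery (it captures only the first word of message), and the named group uses Python's (?P<user>...) syntax instead of the (?<user>...) form in props.conf.

```python
import re

# Rough model of extraction followed by the 7.2.4+ alias; not Splunk source code.
EXTRACT_CLASS2 = re.compile(r"123(?P<user>[0-9]+)789")
KEY_VALUE = re.compile(r"(\w+)=(\S+)")

RAW_EVENTS = [
    "02-19-2019 19:46:56.122 user=jessica id=123456789 uid=1241 message=this is just a simple example",
    "02-19-2019 19:46:11.342 user=joe id=123777789 message=this is just a simple example",
    "02-18-2019 11:12:56.854 uid=7788 message=this is just a simple example",
    "02-18-2019 11:11:25.478 user=adam message=this is just a simple example",
]


def search_time_fields(raw):
    """Return the fields of one raw event after extraction and aliasing."""
    fields = {}
    match = EXTRACT_CLASS2.search(raw)
    if match:
        fields["user"] = match.group("user")
    # Simplified key=value discovery that keeps already extracted values.
    for key, value in KEY_VALUE.findall(raw):
        fields.setdefault(key, value)
    # FIELDALIAS-class1 = uid AS user, with the corrected behavior.
    if "uid" in fields:
        fields["user"] = fields["uid"]
    else:
        fields.pop("user", None)
    return fields


users = [search_time_fields(raw).get("user") for raw in RAW_EVENTS]
print(users)
```

In this model the regular expression yields 456 and 777 for the first two events, but neither value survives aliasing: the first is overwritten by uid, and the second is removed because uid is missing.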
This configuration is fine if you intend for the extracted value of user to be overwritten. But if that is not the case, one of the following three calculated field configurations would be a better choice than the FIELDALIAS-class1 configuration, depending on the effect you are trying to achieve:
Calculated field configuration | What it does |
---|---|
EVAL-user = coalesce(user, uid) | Retains only one field value. Prioritizes the extracted value over the aliased value. |
EVAL-user = mvappend(user, uid) | Maintains both the extracted and aliased values. Could lead to duplicated values. |
EVAL-user = mvdedup(mvappend(user, uid)) | Maintains both the extracted and aliased values. No duplicates. |
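To compare the three configurations side by side, the eval functions can be approximated in Python. These are simplified stand-ins written for this sketch, not the Splunk implementations: None models a null field, a Python list models a multivalue field, and the sample value pairs are illustrative.

```python
# Simplified Python stand-ins for coalesce, mvappend, and mvdedup.

def coalesce(*values):
    """Return the first value that is not null."""
    for value in values:
        if value is not None:
            return value
    return None


def mvappend(*values):
    """Combine all non-null values into one multivalue list."""
    combined = []
    for value in values:
        if value is None:
            continue
        combined.extend(value if isinstance(value, list) else [value])
    return combined or None


def mvdedup(values):
    """Remove duplicates from a multivalue list, keeping first occurrences."""
    if values is None:
        return None
    unique = []
    for value in values:
        if value not in unique:
            unique.append(value)
    return unique


# Illustrative (extracted user, uid) pairs.
PAIRS = [("456", "1241"), ("777", None), (None, "7788"), ("7788", "7788")]

for user, uid in PAIRS:
    print(
        (user, uid),
        "coalesce:", coalesce(user, uid),
        "mvappend:", mvappend(user, uid),
        "mvdedup:", mvdedup(mvappend(user, uid)),
    )
```

The last pair shows the case the table warns about: mvappend alone keeps the same value twice, while wrapping it in mvdedup leaves a single value.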
This documentation applies to the following versions of Splunk® Enterprise: 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2, 9.4.0