Custom data types
Preview features are provided by Splunk to you "as is" without any warranties, maintenance and support, or service level commitments. Splunk makes this preview feature available in its sole discretion and may discontinue it at any time. Use of preview features is subject to the Splunk General Terms.
Data types define the characteristics of the data. With custom data types, you can specify a set of complex characteristics that define the shape of your data.
You can define your own data types by using either the built-in data types or other custom data types.
There are several advantages to defining your own data types:
- Type checking
- Prevents operations that expect a certain data type from being used with other, invalid data types. It's important to always use the right data type. Using the wrong data type can have a significant impact on search performance and storage. See Additional examples.
- Auto-completion
- Data types that are known to the system can take advantage of auto-completion features in a user interface (UI).
- Identify data structures
- One of the key advantages of using custom data types is the ability to identify data structures that don't match the SPL2 built-in data types.
Supported data types
SPL2 uses a rich data type system, including primitive types and derived types:
- Primitive data types are used by themselves or as the basis for other data types.
- Derived types are created from one or more primitive or other derived data types.
This table describes the data types supported in SPL2:
Type | Description |
---|---|
Primitive types | A set of common data types, such as string, integer, long, and Boolean. All of the primitive types are built into SPL2. For the complete list, see Built-in data types. |
Array types | A derived data type that defines a sequence of homogeneous values. Example: type fullnames = string[] |
Stream types | A derived data type that defines a continuous, unending sequence of values. |
Structured types | A derived data type that groups different data types into a single data type. The built-in object data type is an example of a structured type. See Built-in data types. Example: type person = { firstname:string, surname:string, age:int } |
Union types | A derived data type that specifies more than one valid data type. A union type can map to any of the specified types. Example: type ipaddress = ipv4 \| ipv6 |
Declaring data types
Data type declarations always start with the word type, followed by the type name and the type expression. For example:
- type <type-name> = <type-expression>
The type names follow the same conventions as field names:
- Type names must begin with a-z, A-Z, or the underscore ( _ ) character.
- Type names must be composed of a-z, A-Z, 0-9, and the underscore ( _ ) character.
- If the name begins with or contains any other character, the type name must be enclosed in single quotation marks ( ' ).
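As a sketch of these naming rules, the following declarations use hypothetical type names that do not appear elsewhere in this topic. The first two names need no quotation marks. The last two do, because one contains a dash and the other begins with a digit. The trailing semicolons assume that the declarations are placed together in one module.

```spl2
// Hypothetical names, for illustration only.
type _raw_ids = string[];
type host2 = { hostname:string };
type 'game-name' = string[];
type '2fa_code' = string[];
```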
The type expression depends on the data type that you are declaring.
For example, the following union data type declaration uses an expression that includes alternative data types. The expression declares that the number data type can be an integer, a long, a float, or a double.
type number = int | long | float | double
Array types
Array data types define a sequence of any other data type. You specify an empty set of square brackets [] to indicate that the type is an array.
Syntax
The syntax for array types is:
- type <type-name> = <property-data-type>[]
Examples
You can declare an array type that is an array of strings:
type fullnames = string[]
You can define an array of objects:
type personnel = object[]
You can also define an array of objects with this declaration:
type personnel = {*}[]
The object data type is one of the built-in data types. Object data types consist of key-value pairs. For example, the following object consists of the keys type and 'game-name':
{type: "competitive", 'game-name': "Ticket to Ride"}
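Because an array can be declared over any other data type, the element type can also be a custom type. The following is a minimal sketch, assuming a person structured type like the one used elsewhere in this topic. The staff type name is hypothetical, and the trailing semicolons assume that both declarations are placed in one module.

```spl2
// The element type: a structured type with three typed properties.
type person = { firstname:string, surname:string, age:int };
// Hypothetical type name: an array in which every element is a person.
type staff = person[];
```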
Regular expression examples
You can use constrained types for strings like IP addresses, credit card numbers, and social security numbers that match a particular regular expression.
The following examples use the match function for the <predicate-expression>.
This example declares a data type called ipv4 that is a string that matches a regular expression for valid IPv4 addresses:
type ipv4 = string WHERE match($value, "(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])")
This example declares a data type called ipv6 that is a string that matches a regular expression for IPv6 addresses written out in full as eight colon-separated groups:
type ipv6 = string WHERE match($value, "((([0-9a-fA-F]){1,4})\\:){7}([0-9a-fA-F]){1,4}")
With these two IP address data types, you can create a union data type that specifies that either IP address version is valid:
type ipaddress = ipv4 | ipv6
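The same pattern extends to the other formatted strings mentioned at the start of this section. The following sketch is an illustration rather than a vetted validator. The us_ssn type name and the simplified pattern are hypothetical, and the pattern checks only the digit grouping.

```spl2
// Hypothetical constrained type: groups of three, two, and four digits
// separated by dashes. This checks the format only, not whether the
// number was ever issued.
type us_ssn = string WHERE match($value, "[0-9]{3}-[0-9]{2}-[0-9]{4}")
```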
Structured type examples
You can create a constrained data type that is based on a property in a structured data type. Consider the following structured data type called person:
type person = { firstname:string, surname:string, age:int }
You can create a constrained data type called elderly_person that references the person data type. In this example, the value for the age property in the person data type must be greater than 70:
type elderly_person = person WHERE $value.age > 70
This data type says any person that is older than 70 is also an elderly_person.
Stream types
Stream data types are similar to array data types. Both define a sequence of data types. A stream data type is a continuous, unending sequence of object instances. Because they are unending, the instances of stream types might not all fit into memory. Older instances will need to roll off to make room for newer instances.
For example, consider a stream data type that defines new employees. The company is always hiring new employees. At some point an employee is no longer new to the company.
type 'new-hire' = { emp_ID:int, firstname:string, surname:string, division:string, manager:string, fulltime:boolean }
In this example, the type name new-hire is in single quotation marks because the type name contains a dash. Type names that contain anything other than a-z, A-Z, 0-9, and the underscore ( _ ) character must be enclosed in single quotation marks.
You can't define a stream data type that uses another stream type.
Structured types
Structured data types consist of a collection of property names and property data types. Use structured types to group data of different types into a single data type.
The value assigned to a structured type is always a JSON object, which can be rendered as flattened fields or as the object itself.
Syntax
The syntax for structured types follows. The part in square brackets [ ] is optional:
- type <type-name> = {<property-name> [: <property-data-type>], ...}
Separate each property with a comma. Enclose properties in curly brackets { }.
- type-name
- Syntax: <type-name>
- Description: The name of the data type. Valid type names are identical to valid field names.
- property-name
- Syntax: <property-name>
- Description: The name of the property.
- property-data-type
- Syntax: <property-data-type>
- Description: Optional. The data type of the property. You can specify any built-in or declared data type.
Property data types
Specifying property data types is optional. The property data types can be any built-in or declared data type.
If the properties are known to SPL2, that is, if SPL2 recognizes the property, then the data types don't have to be specified in the type declaration. If the properties are not known to SPL2, and you don't specify the data type, the default data type any is used.
The following example declares a structured data type called person with two properties, firstname and surname. The property data types are not declared.
type person = { firstname, surname }
The following example shows how to specify property data types. This example declares the person data type with three properties. The firstname and surname properties must be strings and the age property must be an integer.
type person = { firstname:string, surname:string, age:int }
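Because a property can use any built-in or declared data type, one custom type can serve as a property type in another. The following is a minimal sketch in which every type and property name is hypothetical. The trailing semicolons assume that the declarations are placed together in one module.

```spl2
// Hypothetical helper types that the customer type refers to.
type address = { street:string, city:string };
type phonelist = string[];
// home uses a structured type and phones uses an array type.
type customer = { fullname:string, home:address, phones:phonelist };
```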
Specifying unknown properties
Sometimes you don't know the properties in a structured object. Or you know some, but not all, of the properties. To create a structured data type that includes zero or more properties of any data type, use the wildcard ( * ) character.
For example, the built-in object data type allows for any property of any data type:
type object = {*}
You can also express the object data type like this:
type my_object = {*:any}
If you know some but not all of the properties, you can combine what you know with the wildcard character. In this example, the person type has at least three properties: firstname, surname, and age. The person type might also have other unknown properties, which can be of any data type.
type person = { firstname:string, surname:string, age:int, * }
Specifying intermittent properties
In a data type declaration, you can use the question mark ( ? ) character after the property name to indicate that a property might or might not be present.
The following table shows two structured type examples that use intermittent properties:
Example | Description |
---|---|
type person = { firstname:string, surname?, age:int, * } | In this object example, the surname property might or might not be present. The surname might appear in some events and be absent from other events. |
type person = { firstname:string, surname:string, age?:int, * } | In this object example, there might not be an age property. If the age property does exist, it must be an integer. |
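Typed properties, intermittent properties, and the wildcard can appear together in one declaration. The following sketch uses a hypothetical type name and hypothetical property names to show the three forms side by side.

```spl2
// Hypothetical type; the name is quoted because it contains a dash.
// ticket_id and summary are always present, assignee and priority are
// intermittent, and the wildcard allows any other properties.
type 'support-ticket' = { ticket_id:int, summary:string, assignee?:string, priority?, * }
```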
Union types
A union data type specifies multiple valid data types. You declare a union type using the pipe ( | ) character, which is like specifying an OR operator between each type.
Syntax
The syntax for union types is:
- type <type-name> = <data-type> | <data-type> ...
Example
Suppose you create constrained types for IPv4 and IPv6 addresses. You can create a union data type like this:
type ipaddress = ipv4 | ipv6;
For the regular expressions used to create the ipv4 and ipv6 data types, see Regular expression examples in the Constrained types section.
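A union type can be used anywhere a data type is expected, including as a property type in a structured type. The following is a minimal sketch that reuses the number union from the Declaring data types section. The reading type and its property names are hypothetical, and the trailing semicolons assume that both declarations are placed in one module.

```spl2
// Union type: any of the four numeric built-in types is valid.
type number = int | long | float | double;
// Hypothetical structured type: measurement accepts any number.
type reading = { sensor:string, measurement:number };
```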
Additional examples
Custom data types, custom functions, and searches
Custom data types are often used in custom functions, which are in turn used in searches. Here's an example.
You have some data about people that looks like this:
custID | info |
---|---|
0010 | {fname:"Claudia", sname:"Garcia"} |
0020 | {fname:"Ikraam", sname:"Rahat"} |
0025 | {fname:"Wei", sname:"Zhang"} |
You want to extract the information about each person from each info object.
You need a function to extract the information for the first names and surnames in each object. The function might look something like this:
function getPerson($x: object) : <some_data_type> { return {<firstname_expression>, <lastname_expression>} }
- The name of the function is getPerson.
- The name of the function parameter is $x.
- The data type of the function parameter is object.
To extract the information about each person, you need to declare a structured data type that represents the person information. This is the <some_data_type> portion of the function.
You create a data type called person that identifies the data types of each of the key-value pairs in the object. The data type looks like this:
type person = { firstname:string, surname:string }
You can now specify the name of the data type in your function:
function getPerson($x: object) : person { return {firstname:$x.info.fname, surname:$x.info.sname} }
The expressions for the firstname and surname indicate the path to the information in the objects.
You can now create a search that uses the custom function and custom data type. This example uses a dataset literal, instead of a dataset, to show the data:
FROM [{custID:0010, info:{fname:"Claudia", sname:"Garcia"}}, {custID:0020, info:{fname:"Ikraam", sname:"Rahat"}}, {custID:0025, info:{fname:"Wei", sname:"Zhang"}}] AS names SELECT getPerson(names)
The results look something like this:
getPerson(names) |
---|
{"firstname":"Claudia", "surname":"Garcia"} |
{"firstname":"Ikraam", "surname":"Rahat"} |
{"firstname":"Wei", "surname":"Zhang"} |
To flatten the key-value pairs in each object, add .* to the end of the search:
FROM [{custID:0010, info:{fname:"Claudia", sname:"Garcia"}}, {custID:0020, info:{fname:"Ikraam", sname:"Rahat"}}, {custID:0025, info:{fname:"Wei", sname:"Zhang"}}] AS names SELECT getPerson(names).*
The results look something like this:
firstname | surname |
---|---|
Claudia | Garcia |
Ikraam | Rahat |
Wei | Zhang |
Type checking example
This example has a function called baz that expects a value that's a number:
function baz($x: number) { return $x;}
This search calls the baz function but supplies a string value "hello":
$out = FROM [{}] SELECT baz("hello")
When the function performs a type check, it expects a number but receives a string instead. An error message is returned that says the argument "hello" is a string in the 'baz' function and that it doesn't match the parameter type 'number'.
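For contrast, a call that supplies a numeric literal should pass the same check. This sketch assumes that number is the union type declared earlier in this topic (int | long | float | double) and that the literal 42 is treated as one of those types.

```spl2
// Assumes the number type and the baz function shown above are in scope.
$out = FROM [{}] SELECT baz(42)
```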
See also
- Related information
- Built-in data types
- Predicate expressions in the SPL2 Search Manual
- Custom eval functions
- Custom command functions
This documentation applies to the following versions of Splunk® Cloud Services: current