Splunk® Cloud Services

SPL2 Search Reference


Custom data types

Data types define the characteristics of the data. With custom data types, you can specify a set of complex characteristics that define the shape of your data.

You can define your own data types by using either the built-in data types or other custom data types.

There are several advantages to defining your own data types:

Type checking
Prevents operations that expect a certain data type from being used with other, invalid data types. It's important to always use the right data type. Using the wrong data type can have a significant impact on search performance and storage. See Additional examples.
Auto-completion
Data types that are known to the system can take advantage of auto-completion features in a user interface (UI).
Identify data structures
One of the key advantages of using custom data types is the ability to identify data structures that don't match the SPL2 built-in data types.

Supported data types

SPL2 uses a rich data type system, including primitive types and derived types:

  • Primitive data types are used by themselves or as the basis for other data types.
  • Derived types are created from one or more primitive or other derived data types.

The following list describes the data types supported in SPL2:

Primitive types
A set of common data types, such as string, integer, long, and Boolean. All of the primitive types are built in to SPL2. For the complete list of built-in data types, see Built-in data types.

Array types
A derived data type that defines a sequence of homogeneous values. For example:

type fullnames = string[]

The built-in dataset data type is another example of an array type.

Constrained types
A derived data type that applies a constraint, in the form of a predicate expression, on an existing built-in or custom data type. For example:

type http_error=int where $value in([403, 404, 408])

Stream types
A derived data type that defines a continuous, unending sequence of values.

Structured types
A derived data type that groups different data types into a single data type. The built-in object data type is an example of a structured type. See Built-in data types. Examples of structured data types are authentication events or user records, such as:

type person = {
   firstname:string,
   surname:string,
   age:int
}

Union types
A derived type that specifies more than one valid data type. A union type can map to any of the specified types. For example:

type ipaddress=ipv4|ipv6

The built-in number data type is another example of a union type.

Declaring data types

Data type declarations always start with the keyword type, followed by the type name, an equal sign ( = ), and the type expression:

type <type-name> = <type-expression>

The type names follow the same conventions as field names:

  • Type names must begin with a-z, A-Z, or the underscore ( _ ) character.
  • Type names must be composed of a-z, A-Z, 0-9, and the underscore ( _ ) character.
  • If the name begins with or contains any other character, the type name must be enclosed in single quotation marks ( ' ).

The type expression depends on the data type that you are declaring.

For example, the following union data type declaration uses an expression that lists alternative data types. The expression declares that the number data type can be an integer, a long, a float, or a double.

type number = int | long | float | double

Array types

Array data types define a sequence of any other data type. You specify an empty set of square brackets [] to indicate that the type is an array.

Syntax

The syntax for array types is:

type <type-name> = <property-data-type>[]

Examples

You can declare an array type that is an array of strings:

type fullnames = string[]

You can define an array of objects:

type personnel = object[]

You can also define an array of objects with this declaration:

type personnel = {*}[]

The object data type is one of the built-in data types. An object consists of key-value pairs. For example, the following object has the keys type and 'game-name':

{type: "competitive", 'game-name': "Ticket to Ride"}
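As a rough analogy outside SPL2, the homogeneity that an array type such as string[] implies can be sketched in Python. The is_array_of helper is hypothetical and exists only for illustration. It is not how SPL2 performs the check.

```python
def is_array_of(values, element_type):
    """Return True when values is a list whose items all have element_type."""
    return isinstance(values, list) and all(
        isinstance(item, element_type) for item in values
    )


fullnames = ["Claudia Garcia", "Wei Zhang"]
print(is_array_of(fullnames, str))          # every element is a string
print(is_array_of(["Wei Zhang", 42], str))  # mixed element types are rejected
```

An empty list passes this check because there is no element that violates the element type.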

Constrained types

Sometimes a data type is too general. You can take a data type and put constraints on it to declare a narrower data type. This narrower data type is called a constrained data type.

You add a constraint by including a where clause followed by a predicate expression in the data type definition.

Syntax

The syntax for constrained types is:

type <type-name> = <data-type> where <predicate-expression>

For examples of predicate expressions, see Predicate expressions in the SPL2 Search Manual.

Integer examples

Integers can be positive or negative whole numbers. This example creates a data type called positive_integer that constrains an integer to values greater than or equal to zero ( 0 ):

type positive_integer = int where $value >= 0

This predicate could also be expressed as where $value > -1.

Suppose you want positive integers that are not zero. The following example declares a data type called non_zero_positive_integer that constrains an integer to values greater than zero:

type non_zero_positive_integer = int where $value > 0

These examples use the implicit $value variable to access the value in your data to check.
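The idea behind these declarations, a base type plus a predicate over the value, can be sketched in Python. This is only an analogy: the constrained helper is a hypothetical name, and the lambda parameter stands in for the implicit $value variable.

```python
def constrained(base_type, predicate):
    """Build a checker: the value must be of base_type and satisfy predicate."""
    def check(value):
        # The predicate runs only when the base type already matches
        return isinstance(value, base_type) and predicate(value)
    return check


# type positive_integer = int where $value >= 0
positive_integer = constrained(int, lambda value: value >= 0)

# type non_zero_positive_integer = int where $value > 0
non_zero_positive_integer = constrained(int, lambda value: value > 0)

print(positive_integer(0))            # zero satisfies >= 0
print(non_zero_positive_integer(0))   # zero does not satisfy > 0
```

Because the base type is checked first, a value such as the string "5" is rejected before the predicate is evaluated.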

Regular expression examples

You can use constrained types for strings like IP addresses, credit card numbers, and social security numbers that match a particular regular expression.

The following examples use the match function for the <predicate-expression>.

This example declares a data type called ipv4 that is a string that matches a regular expression for valid IPv4 addresses:

type ipv4 = string WHERE match($value, "(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])")

This example declares a data type called ipv6 that is a string that matches a regular expression for valid IPv6 addresses:

type ipv6 = string WHERE match($value, "((([0-9a-fA-F]){1,4})\\:){7}([0-9a-fA-F]){1,4}")

With these two IP address data types, you can create a union data type that specifies that either IP address version is valid:

type ipaddress = ipv4 | ipv6
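To try these patterns outside SPL2, the following Python sketch applies the same regular expressions and combines them the way a union type does. The function names are hypothetical. The sketch assumes the whole string must match, which is why it uses fullmatch. Whether the SPL2 match function needs explicit ^ and $ anchors to get that behavior is worth confirming in the function reference.

```python
import re

OCTET = r"([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
IPV4 = re.compile(r"(" + OCTET + r"\.){3}" + OCTET)
IPV6 = re.compile(r"((([0-9a-fA-F]){1,4})\:){7}([0-9a-fA-F]){1,4}")


def is_ipv4(value):
    return isinstance(value, str) and IPV4.fullmatch(value) is not None


def is_ipv6(value):
    return isinstance(value, str) and IPV6.fullmatch(value) is not None


def is_ipaddress(value):
    # Union: a value is valid if it satisfies either member type
    return is_ipv4(value) or is_ipv6(value)


print(is_ipaddress("192.168.1.10"))
print(is_ipaddress("2001:0db8:85a3:0000:0000:8a2e:0370:7334"))
print(is_ipaddress("256.1.1.1"))

# An unanchored search finds a valid-looking substring inside an invalid address
print(IPV4.search("256.1.1.1").group(0))  # 56.1.1.1
```

The last line shows why anchoring matters: a substring match accepts input that is not a valid address as a whole. The IPv6 pattern covers only the full eight-group form, so compressed notation such as ::1 does not match.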

Structured type examples

You can create a constrained data type that is based on a property in a structured data type. Consider the following structured data type called person:

type person =  { 
	firstname:string, 
	surname:string,
	age:int 
}

You can create a constrained data type called elderly_person that references the person data type. In this example, the value for the age property in the person data type must be greater than 70:

type elderly_person = person WHERE $value.age > 70

This data type says that any person who is older than 70 is also an elderly_person.

Stream types

Stream data types are similar to array data types. Both define a sequence of data types. A stream data type is a continuous, unending sequence of object instances. Because they are unending, the instances of stream types might not all fit into memory. Older instances will need to roll off to make room for newer instances.

For example, consider a stream data type that defines new employees. The company is always hiring new employees. At some point an employee is no longer new to the company.

type 'new-hire' = {
   emp_ID:int, 
   firstname:string, 
   surname:string, 
   division:string, 
   manager:string, 
   fulltime:boolean
}

In this example, the type name new-hire is in single quotation marks because the type name contains a dash. Type names that contain anything other than a-z, A-Z, 0-9, and the underscore ( _ ) character must be enclosed in single quotation marks.

You can't define a stream data type that uses another stream type.
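The roll-off behavior described above can be pictured with a bounded buffer. The following Python sketch is an analogy only and says nothing about how SPL2 manages streams internally. The recent_window and new_hires names are hypothetical, and the generator is a finite stand-in for an unending stream.

```python
from collections import deque


def recent_window(stream, capacity):
    """Consume a stream, keeping only the newest `capacity` items in memory."""
    window = deque(maxlen=capacity)  # the oldest item is dropped when full
    for item in stream:
        window.append(item)
    return list(window)


def new_hires():
    # Hypothetical, finite stand-in for an unending stream of new-hire events
    for emp_id in range(1, 6):
        yield {"emp_ID": emp_id, "fulltime": True}


print([hire["emp_ID"] for hire in recent_window(new_hires(), capacity=3)])
# [3, 4, 5]
```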

Structured types

Structured data types consist of a collection of property names and property data types. Use structured types to group data of different types into a single data type.

The value assigned to a structured type is always a JSON object, which can be rendered as flattened fields or as the object itself.

Syntax

The syntax for structured types follows. The property data type, shown in square brackets, is optional:

type <type-name> = {<property-name> [: <property-data-type>], ...}

Separate each property with a comma. Enclose properties in curly brackets { }.

type-name
Syntax: <type-name>
Description: The name of the data type. Valid type names are identical to valid field names.
property-name
Syntax: <property-name>
Description: The name of a property in the structured data type.
property-data-type
Syntax: <property-data-type>
Description: Optional. The data type of the property. You can specify any built-in or declared data type.

Property data types

Specifying property data types is optional. The property data types can be any built-in or declared data type.

If a property is known to SPL2, that is, if SPL2 recognizes the property, you don't have to specify its data type in the type declaration. If a property is not known to SPL2 and you don't specify a data type, the default data type any is used.

The following example declares a structured data type called person with two properties, firstname and surname. The property data types are not declared.

type person = { 
   firstname, 
   surname 
}

The following example shows how to specify property data types. This example declares the person data type with three properties. The firstname and surname properties must be strings and the age property must be an integer.

type person =  { 
	firstname:string, 
	surname:string,
	age:int 
}

Specifying unknown properties

Sometimes you don't know the properties in a structured object, or you know some but not all of them. To create a structured data type that includes zero or more properties of any data type, use the wildcard ( * ) character.

For example, the built-in object data type allows for any property of any data type:

type object = {*}

You can also express the object data type like this:

type my_object = {*:any}

If you know some but not all of the properties, you can combine what you know with the wildcard character. In this example, the person type has at least three properties: firstname, surname and age. The person type might also have other unknown properties which can be of any data type.

type person = {
	firstname:string,
	surname:string,
	age:int,
	*
}

Specifying intermittent properties

In a data type declaration, you can use the question mark ( ? ) character after the property name to indicate that a property might or might not be present.

The following examples show structured types with intermittent properties.

In this object example, the surname property might or might not be present. The surname might appear in some events and be absent from other events.

type person = {
	firstname:string,
	surname?,
	age:int,
	*
}

In this object example, there might not be an age property. If the age property does exist, it must be an integer.

type person = {
	firstname:string,
	surname:string,
	age?:int,
	*
}
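The combination of required properties, intermittent properties, and the wildcard can be sketched as a validation routine in Python. This is an illustrative analogy with hypothetical names (matches_structure, is_person), not the SPL2 implementation. A spec value of None models a property declared without a data type.

```python
def matches_structure(obj, required, optional=None, allow_extra=False):
    """Check a dict against property specs.

    required and optional map property names to Python types, or to None
    when no data type is declared (any value is accepted).
    allow_extra plays the role of the * wildcard.
    """
    optional = optional or {}
    if not isinstance(obj, dict):
        return False
    for name, expected in required.items():
        if name not in obj:
            return False  # a required property is missing
        if expected is not None and not isinstance(obj[name], expected):
            return False
    for name, expected in optional.items():
        # An intermittent property may be absent, but if present it must fit
        if name in obj and expected is not None and not isinstance(obj[name], expected):
            return False
    known = set(required) | set(optional)
    return allow_extra or set(obj) <= known


# Models: type person = { firstname:string, surname:string, age?:int, * }
def is_person(obj):
    return matches_structure(
        obj,
        required={"firstname": str, "surname": str},
        optional={"age": int},
        allow_extra=True,
    )


print(is_person({"firstname": "Wei", "surname": "Zhang"}))
print(is_person({"firstname": "Wei", "surname": "Zhang", "age": "41"}))
```

The first call passes because age is intermittent. The second fails because age is present but is a string rather than an integer.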

Union types

A union data type specifies multiple valid data types. You declare a union type using the pipe ( | ) character, which is like specifying an OR operator between each type.

Syntax

The syntax for union types is:

type <type-name> = <data-type> | <data-type> ...

Example

Suppose you create constrained types for IPv4 and IPv6 addresses. You can create a union data type like this:

type ipaddress = ipv4 | ipv6;

For the regular expressions used to create the ipv4 and ipv6 data types, see Regular expression examples in the Constrained types section.

Additional examples

Custom data types, custom functions, and searches

Custom data types are often used in custom functions, which are in turn used in searches. Here's an example.

You have some data about people that looks like this:

custID info
0010 {fname:"Claudia", sname:"Garcia"}
0020 {fname:"Ikraam", sname:"Rahat"}
0025 {fname:"Wei", sname:"Zhang"}

You want to extract the information about each person from each info object.

You need a function to extract the information for the first names and surnames in each object. The function might look something like this:

function getPerson($x: object) : <some_data_type> {
    return {<firstname_expression>, <surname_expression>}
}
  • The name of the function is getPerson
  • The name of the function parameter is $x
  • The data type of the function parameter is object

To extract the information about each person, you need to declare a structured data type that represents the person information. This is the <some_data_type> portion of the function.

You create a data type called person that identifies the data types of each of the key-value pairs in the object. The data type looks like this:

type person = {
   firstname:string, 
   surname:string
}

You can now specify the name of the data type in your function:

function getPerson($x: object) : person {
    return {firstname:$x.info.fname, surname:$x.info.sname}
}

The expressions for the firstname and surname indicate the path to the information in the objects.

You can now create a search that uses the custom function and custom data type. This example uses a dataset literal, instead of a dataset, to show the data:

FROM [{custID:0010, info:{fname:"Claudia", sname:"Garcia"}}, {custID:0020, info:{fname:"Ikraam", sname:"Rahat"}}, {custID:0025, info:{fname:"Wei", sname:"Zhang"}}] AS names SELECT getPerson(names)


The results look something like this:

getPerson(names)
{"firstname":"Claudia", "surname":"Garcia"}
{"firstname":"Ikraam", "surname":"Rahat"}
{"firstname":"Wei", "surname":"Zhang"}

To flatten the key-value pairs in each object, add .* to the end of the search:

FROM [{custID:0010, info:{fname:"Claudia", sname:"Garcia"}}, {custID:0020, info:{fname:"Ikraam", sname:"Rahat"}}, {custID:0025, info:{fname:"Wei", sname:"Zhang"}}] AS names SELECT getPerson(names).*


The results look something like this:

firstname surname
Claudia Garcia
Ikraam Rahat
Wei Zhang
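For readers who want to trace the data flow outside SPL2, the function and search can be mirrored in Python. This sketch is an analogy under stated assumptions: get_person is a hypothetical counterpart of getPerson, and the custID values are written as strings because Python does not allow integer literals with leading zeros.

```python
def get_person(row):
    """Mirror of getPerson: pull the names out of the nested info object."""
    return {"firstname": row["info"]["fname"], "surname": row["info"]["sname"]}


# Stand-in for the dataset literal used in the search
names = [
    {"custID": "0010", "info": {"fname": "Claudia", "sname": "Garcia"}},
    {"custID": "0020", "info": {"fname": "Ikraam", "sname": "Rahat"}},
    {"custID": "0025", "info": {"fname": "Wei", "sname": "Zhang"}},
]

# Comparable to SELECT getPerson(names): one object per input row
people = [get_person(row) for row in names]

# Comparable to the flattened results: one column per key
for person in people:
    print(person["firstname"], person["surname"])
```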

Type checking example

This example has a function called baz that expects a value that's a number:

function baz($x: number) { return $x;}

This search calls the baz function but supplies a string value "hello":

$out = FROM [{}] SELECT baz("hello")

When the function performs a type check, it expects a number but receives a string instead. An error message is returned that says the argument "hello" is a string in the 'baz' function and that it doesn't match the parameter type 'number'.
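The same kind of check can be imitated in Python, which makes the failure easy to reproduce. This is a loose analogy: the check here runs when the function is called, and the error text is made up for the sketch, modeled on the description above rather than copied from a real SPL2 message.

```python
from numbers import Number


def baz(x):
    """Return x unchanged, rejecting arguments that are not numbers."""
    # bool is excluded because Python treats True and False as integers
    if isinstance(x, bool) or not isinstance(x, Number):
        raise TypeError(
            f"argument {x!r} is a {type(x).__name__} in the 'baz' function "
            "and doesn't match the parameter type 'number'"
        )
    return x


print(baz(42))
try:
    baz("hello")
except TypeError as err:
    print(err)
```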

See also

Related information
Built-in data types
Predicate expressions in the SPL2 Search Manual
Custom eval functions
Custom command functions
Last modified on 24 June, 2021

This documentation applies to the following versions of Splunk® Cloud Services: current

