Create a technology add-on
- Get data into Splunk
- Choose a folder name for the technology add-on
- Define a source type for the data
- Confirm that your data has been captured
- Handling timestamp recognition
- Configure line-breaking
- Understand your data and Enterprise Security dashboards
- Static strings and event fields
- Choose event
- Identify fields
- Create field extractions
- Extract fields
- Create field aliases
- Normalize field values
- Verify field extractions
- Document the technology add-on
- Package the technology add-on
Before you create a technology add-on, it is helpful to understand the format, structure, and meaning of the IT security data that you wish to import into the Splunk App for Enterprise Security.
Creating a technology add-on involves the following five steps:
Step 1: Capture and index the data
The first step in creating a technology add-on involves capturing the data and indexing the data within Splunk. Although this step doesn't require anything beyond the normal process of getting data into Splunk, decisions made during this phase can affect how a technology add-on identifies and normalizes the data at search time.
The explicit tasks that are done in this step include:
- Get the source data into Splunk
- Choose a name for your add-on directory
- Define a source type name to apply to the data
Note: Changes made in Step 1 affect the indexing of data, so this step will require dumping the indexed data and starting over if the steps are done incorrectly. Steps 2-4 do not affect indexing and can be altered while Enterprise Security is running without restarting the server.
Get data into Splunk
Getting data into Splunk is necessary before Enterprise Security can normalize the data at search time and use the information to populate dashboards and reports. This step is highlighted due to the different ways that some data sources can be captured, thus resulting in different formats of data. As you develop a technology add-on, it is important to ensure that you are accurately defining the method by which the data will be captured and indexed within Splunk.
Common data input techniques used to bring in data from security-relevant devices, systems, and software include:
- Streaming data over TCP or UDP (for example, syslog on UDP 514)
- API-based scripted input (for example, Qualys)
- SQL-based scripted input (for example, Sophos, McAfee EPO)
For more detailed information on getting data into Splunk, see the section "What Splunk can index" in the core Splunk product documentation.
The specific way in which a particular data set is captured and indexed in Splunk should be clearly documented in the README file in the technology add-on directory.
Choose a folder name for the technology add-on
A technology add-on is packaged as a Splunk app and must include all of the basic components of a Splunk app in addition to the components required to process the incoming data. All Splunk apps (including technology add-ons) reside in
The following table lists the files and folders of a basic technology add-on:
The transforms defined in the transforms.conf file describe operations that can be performed on the data. The props.conf file references the transforms so that they execute for a particular source type. In actual use, the distinction is not so clear because many of the operations in props.conf can be accessed directly and avoid using
See "Create and maintain search time field extractions" in the core Splunk product documentation for more details.
Examples of these files are included in the TA-template.zip located in the
When building a new technology add-on, choose a name for the add-on folder using the following naming convention:
The technology add-on folder should always begin with "TA-". This allows you to distinguish between technology add-ons and other add-ons within your Splunk deployment. The <datasource> section of the name should represent the specific technology or data source that this technology add-on is for. Technology add-ons that are shipped as part of the Splunk App for Enterprise Security follow this convention, for example:
    TA-bluecoat
    TA-cisco
    TA-snort
For additional details on deploying and configuring technology add-ons with the Splunk App for Enterprise Security, see the Enterprise Security Installation and Configuration Manual.
Define a source type for the data
By default Splunk automatically sets a source type for a given data input. Each technology add-on should have at least one source type defined for the data that is captured and indexed within Splunk. This will require an override of the automatic source type that Splunk will attempt to assign to the data source. The source type definition is handled by Splunk at index time, along with line breaking, timestamp extraction, and timezone extraction. (All other information is set at search time.) See the section "Specify source type for a source" in the core Splunk product documentation.
Add a source type for your technology add-on, and make the source type name match the product so that the technology add-on works. Be aware that this process overrides some default Splunk behavior. For explicit information on how to define a source type within Splunk, see "Override automatic source type assignment" in the core Splunk product documentation.
The source type name should be chosen to match the name of the product for which you are building a technology add-on (for example, "nessus"). Technology add-ons can cover more than one product by the same vendor (for example, "juniper"), and may require multiple source type definitions as a result. If the data source for which you are building a technology add-on has more than one data file with different formats (for example, "apache_error_log" and "apache_access_log") you may choose to create multiple source types.
These source types can be defined as part of the technology add-on in
props.conf or using both
transforms.conf. These files must be created as part of the technology add-on. These files contain only the definitions for the source types that the technology add-on is designed to work with.
There are three ways to specify source types.
- Let Splunk automatically define source types in the data
- Define source types in
- Define source types in
We recommend that you define your source types in inputs.conf. See "Configure rule-based source type recognition" in the core Splunk product documentation for more information about source types.
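As a minimal sketch of a static source type assignment in inputs.conf, the fragment below covers a streamed input and a scripted input. The source type name "mydevice" and the script name "poll_api.sh" are hypothetical placeholders, not names taken from this manual.

```ini
# inputs.conf (sketch; "mydevice" and "poll_api.sh" are hypothetical)

# Syslog stream over UDP 514, with the source type set statically
[udp://514]
sourcetype = mydevice
connection_host = ip

# Scripted input that runs every 300 seconds (path is relative to the add-on directory)
[script://./bin/poll_api.sh]
interval = 300
sourcetype = mydevice
disabled = 0
```

Setting the source type on the input itself keeps the assignment explicit, so it does not depend on automatic source type detection.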
Tip: In addition to the source type definition, you can add a statement that forces the source type when a file is uploaded with the extension set to the source type name. This allows you to import files for a given source type simply by setting the file extension appropriately (for example, "mylogfile.nessus"). This statement can be added to the Splunk_TA-<technology_add-on_name>/default/props.conf file, as in the following example:
    [source::....nessus]
    sourcetype = nessus
Change the name "nessus" to match your source type.
Note: Usually, the source type of the data is statically defined for the data input. However, you may be able to define a statement that can recognize the source type based on the content of the logs.
Confirm that your data has been captured
Once you have decided on a folder name for your technology add-on and defined the source type(s) in the
props.conf or using both
props.conf files, your technology add-on can be enabled and you can start collecting your data within Splunk. Turn on the data source either by enabling the stream or the scripted input to begin data collection. Confirm that you are receiving the data and that the source type you've defined is working appropriately by searching for the source type you defined in your data.
Note: Restart Splunk in order for it to recognize the technology add-on and the source type defined in this step. After Splunk has been restarted, it automatically reloads the changes made to search time operations in
If the results of the search differ from what you expect, or no results are displayed, do the following:
- Confirm that the data source has been configured to send data to Splunk. This can be validated by using Splunk to search for keywords and other information within the data.
- If the data source is sent via scripted input, confirm that the scripted input is working correctly. This may be more challenging to confirm, but can be done by checking the data source, script, or other points of failure.
- If the data is indexed by Splunk, but the source type is not defined as you expect, confirm your source type logic is defined appropriately and retest.
Handling timestamp recognition
Splunk is designed to understand most common timestamps that are found in log data. Occasionally, however, Splunk may not recognize a timestamp when processing an event. If this situation arises, it is necessary to manually configure the timestamp logic into the technology add-on. This can be done by adding the appropriate statements in the props.conf file. For specific information on how to deal with manual timestamp recognition, check "How timestamp assignment works" in the Splunk documentation.
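A manual timestamp configuration might look like the following sketch. It assumes a hypothetical source type named "mydevice" whose events begin with a syslog-style timestamp such as "Jan 12 15:02:08"; adjust the format string to your own data.

```ini
# props.conf (sketch; "mydevice" is a hypothetical source type)
[mydevice]
# The timestamp starts at the beginning of each event
TIME_PREFIX = ^
# strptime-style format matching "Jan 12 15:02:08"
TIME_FORMAT = %b %d %H:%M:%S
# Stop looking for the timestamp after 15 characters
MAX_TIMESTAMP_LOOKAHEAD = 15
```

Because timestamp extraction happens at index time, a change here affects only data indexed after the change.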
Configure line-breaking
Multi-line events can be another challenge to deal with when creating a technology add-on. Splunk handles these types of events by default, but on some occasions Splunk will not recognize events as multi-line. If the data source is multi-line and Splunk has issues recognizing it, see the section "Configure linebreaking for multi-line events" in the core Splunk product documentation. The documentation provides additional guidance on how to configure Splunk for multi-line data sources. Use this information in a technology add-on by making the necessary modifications to the
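For illustration, a line-breaking configuration for a multi-line source could be sketched as follows. The source type "mydevice" is hypothetical, and the pattern assumes that every new event starts with a syslog-style timestamp; both are assumptions rather than details from this manual.

```ini
# props.conf (sketch; "mydevice" is a hypothetical source type)
[mydevice]
# Merge consecutive lines into a single event...
SHOULD_LINEMERGE = true
# ...and start a new event only at a line that begins with a timestamp such as "Jan 12 15:02:08"
BREAK_ONLY_BEFORE = ^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}
```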
Step 2: Identify relevant IT security events
After indexing, examine the data to identify the IT security events that the source includes and determine how these events fit into the Enterprise Security dashboards. This step allows selection of the correct Common Information Model tags and fields, so that the data automatically populates the proper dashboards.
Understand your data and Enterprise Security dashboards
To analyze the data, a data sample needs to be available. If the data is loaded into Splunk, view the data based on the source type defined in Step 1. Determine which dashboards might use this data. See the Dashboard Requirements Matrix in this manual to understand what each of the dashboards and panels within Enterprise Security requires and the types of events that are applicable. With this information, review the data to determine which events need to be normalized for use within Enterprise Security.
Note: The dashboards in Enterprise Security are designed to use information that typically appears in the related data sources. The data source for which the technology add-on is being built may not provide all the information necessary to populate the required dashboards. If this is the case, look further into the technology that drives the data source to determine if other data is available or accessible to fill the gap.
Reviewing the data source and identifying relevant events should be relatively easy to do. This document assumes familiarity with the data and the ability to determine what types of events the data contains.
After events and their contents have been reviewed, the next step is to match the data to the Enterprise Security dashboards. Review the dashboards directly in the Splunk App for Enterprise Security or check the descriptions in the Enterprise Security User Guide to see where the events might fit into these dashboards.
The dashboards in Enterprise Security are grouped into three domains:
- Access: Provides information about authentication attempts and access control related events (login, logout, access allowed, access failure, use of default accounts, etc.).
- Endpoint: Includes information about endpoints such as malware infections, system configuration, system state (CPU usage, open ports, uptime, etc.), patch status and history (which updates have been applied), and time synchronization information.
- Network: Includes information about network traffic provided from devices such as firewalls, routers, network-based intrusion detection systems, network vulnerability scanners, and proxy servers.
Within these domains, find the specific dashboard(s) that relate to your data. In some cases, your data source may include events that belong in different dashboards, or even in different domains.
Assume that you have decided to capture logs from one of your firewalls. You would expect that the data produced by the firewall will contain network traffic related events. However, in reviewing the data captured by Splunk you may find that it also contains authentication events (login and logoff) and device change events (policy changes, addition/deletion of accounts, etc.).
Knowing that these events exist will help determine which of the dashboards within Enterprise Security are likely to be applicable. In this case, the Access Center, Traffic Center, and Network Changes dashboards are designed to report on the events found in this firewall data source.
Taking the time in advance to review the data and Enterprise Security dashboard functionality will help make the next steps of defining the Splunk fields and tags easier.
The product and vendor fields need to be defined. These fields are not included in the data itself, so they will be populated using a lookup.
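One way to populate the vendor and product fields is a lookup keyed on the source type, sketched below. The source type "mydevice", the lookup name, the CSV file name, and the vendor and product values are all hypothetical.

```ini
# props.conf (sketch; all names are hypothetical)
[mydevice]
LOOKUP-mydevice_vendor_info = mydevice_vendor_info_lookup sourcetype OUTPUT vendor product

# transforms.conf
[mydevice_vendor_info_lookup]
filename = mydevice_vendor_info.csv

# lookups/mydevice_vendor_info.csv would contain rows such as:
#   sourcetype,vendor,product
#   mydevice,ExampleVendor,ExampleProduct
```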
Static strings and event fields
Do not assign static strings to event fields because this prevents the field values from being searchable. Instead, fields that do not exist should be mapped with a lookup.
For example, here is a log message:
Jan 12 15:02:08 HOST0170 sshd: [ID 800047 auth.info] Failed publickey for privateuser from 10.11.36.5 port 50244 ssh2
To extract a field "action" from the log message and assign a value of "failure" to that field, either a static field (non-searchable) or a lookup (searchable) could be used.
For example to extract a static string from the log message, this information would be added to
    ## props.conf
    [linux_secure]
    REPORT-action_for_sshd_failed_publickey = action_for_sshd_failed_publickey

    ## transforms.conf
    [action_for_sshd_failed_publickey]
    REGEX = Failed\s+publickey\s+for
    FORMAT = action::"failure"
    ## note the static assignment of action=failure above
This approach is not recommended; searching for "action=failure" in Splunk would not return these events, because the text "failure" does not exist in the original log message.
The recommended approach is to extract the actual text from the message and map it using a lookup:
    ## props.conf
    [linux_secure]
    LOOKUP-action_for_sshd_failed_publickey = sshd_action_lookup vendor_action OUTPUTNEW action
    REPORT-vendor_action_for_sshd_failed_publickey = vendor_action_for_sshd_failed_publickey

    ## transforms.conf
    [sshd_action_lookup]
    filename = sshd_actions.csv

    [vendor_action_for_sshd_failed_publickey]
    REGEX = (Failed\s+publickey)
    FORMAT = vendor_action::$1

    ## sshd_actions.csv
    vendor_action,action
    "Failed publickey",failure
With the event field mapped through a lookup, Splunk can now search for the text "failure" and find these events.
Step 3: Create field extractions and aliases
After you identify the events within your data source and determine which dashboards the events correspond to within Enterprise Security, create the field extractions needed to normalize the data and match the fields required by the Common Information Model and the Enterprise Security dashboards and panels.
Use the following process to work through the creation of each required field, for each class of event that you have identified:
Note: This process can be done manually by editing configuration files or graphically by using the Interactive Field Extractor. See the "Examples" sections in this document for more detail on these options.
Choose event
In the Splunk Search app, start by selecting a single event or several almost identical events to work with. Start the Interactive Field Extractor and use it to identify fields for this event. Verify your conclusions by checking that similar events contain the same fields.
Identify fields
Each event contains relevant information that you will need to map into the Common Information Model. Start by identifying the fields of information that are present within the event. Then check the Common Information Model to see which fields are required by the dashboards where you want to use the event.
Note: The Dashboard Requirements Matrix in this manual lists the fields required by each dashboard. Additional information about the Common Information Model can be found in the "Common Information Model" topic in the core Splunk product documentation.
Where possible, determine how the information within the event maps to the fields required by the Common Information Model (CIM). Some events will have fields that are not used by the dashboard. It is not necessary to create extractions for these fields. On the other hand, certain fields may be missing or have values other than those required by the CIM. Fields can be added or modified, or marked as unknown, in order to fulfill the dashboard requirements.
Note: In some cases, it may turn out that the events simply do not have enough information (or the right type of information) to be useful in the Splunk App for Enterprise Security. This can happen when the type of data being brought in is not applicable to security (for example, weather information).
Create field extractions
After the relevant fields have been identified, create the Splunk field extractions that parse and/or normalize the information within the event. Splunk provides numerous ways to populate search-time field information. The specific technique will depend on the data source and the information available within that source. For each field required by the CIM, create a field property that will do one or more of the following:
- Parse the event and create the relevant field (using field extractions)
Example: In a custom application, "Error" at the start of a line means authentication error, so it can be extracted to an authentication field and tagged "action=failed".
- Rename an existing field so that it matches the name expected by the Common Information Model (field aliasing)
Example: The "key=value" extraction has produced "target=10.10.10.1" and "target" needs to be aliased to "dest".
- Convert an existing field to a field that matches the value expected by the Common Information Model (normalizing the field value)
Example: The "key=value" extraction has produced "action=stopped", and "stopped" needs to be changed to "blocked".
Extract fields
Review each field required by the Common Information Model and find the corresponding portion of the log message. Use the create field extraction statement to extract the field. See "field extractions" in the core Splunk product documentation for more information on creating field extractions.
Note: Splunk may auto-extract certain fields at search time if they appear as "field"="value" in the data. Typically, the names of the auto-extracted fields will differ from the names required by the CIM. This can be fixed by creating field aliases for those fields.
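As a sketch of an inline search-time extraction, the fragment below pulls three fields out of the sshd sample message shown earlier ("Failed publickey for privateuser from 10.11.36.5 port 50244 ssh2"). The class name "sshd_failed_publickey" is a hypothetical label, and the choice of "user", "src", and "src_port" as target field names is an assumption about what your dashboards need.

```ini
# props.conf (sketch; the EXTRACT class name is hypothetical)
[linux_secure]
# Named capture groups become search-time fields
EXTRACT-sshd_failed_publickey = Failed\s+publickey\s+for\s+(?<user>\S+)\s+from\s+(?<src>\S+)\s+port\s+(?<src_port>\d+)
```

An inline EXTRACT statement keeps the regular expression in props.conf; the REPORT form used elsewhere in this document moves it to transforms.conf, which is useful when several source types share one extraction.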
Create field aliases
It may be necessary to rename an existing field to populate another field. For example, the event data may include the source IP address in the field "src_ip", while the Common Information Model requires the source to be placed in the "src" field. The solution is to create a field alias named "src" that contains the value from "src_ip". This is done by defining a field alias that will create a new field with a name that corresponds with the name defined in the Common Information Model.
See the "field aliases" topic in the core Splunk product documentation for more information about creating field aliases.
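The "src_ip" to "src" case described above could be written as the following sketch. The source type "mydevice" and the alias class name are hypothetical.

```ini
# props.conf (sketch; "mydevice" is a hypothetical source type)
[mydevice]
# Expose the existing src_ip value under the CIM field name src
FIELDALIAS-mydevice_src = src_ip AS src
```

The alias adds "src" alongside "src_ip"; the original field remains available in searches.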
Normalize field values
You must make sure that the value populated by the field extraction matches the field value requirements in the Common Information Model. If the value does not match (for example, the value required by the Common Information Model is "success" or "failure" but the log message uses "succeeded" and "failed") then create a lookup to translate the field so that it matches the value defined in the Common Information Model. See the core Splunk documentation topics "Setting up lookup fields" and "Tag and alias field values in Splunk Web" to learn more about normalizing field values.
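A lookup that translates "succeeded" and "failed" into the values the Common Information Model expects might be sketched as follows. It assumes the raw value has already been extracted into a field named "vendor_action"; the source type "mydevice", the lookup name, and the CSV file name are hypothetical.

```ini
# props.conf (sketch; all names are hypothetical)
[mydevice]
LOOKUP-mydevice_action = mydevice_action_lookup vendor_action OUTPUTNEW action

# transforms.conf
[mydevice_action_lookup]
filename = mydevice_actions.csv

# lookups/mydevice_actions.csv would contain:
#   vendor_action,action
#   succeeded,success
#   failed,failure
```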
Verify field extractions
Once field extractions have been created for each of the security-relevant events to be processed, validate that the fields are in fact extracted properly. To confirm whether the data is extracted properly, search for the source type.
Perform a search
Run a search for the source type defined in the technology add-on to verify that each of the expected fields of information is defined and available on the field picker. If a field is missing or displays the wrong information, go back through these steps to troubleshoot the technology add-on and figure out what is going wrong.
Example: A technology add-on for Netscreen firewall logs is created by having network-communication events identified, a source type of "netscreen:firewall" created, and required field extractions defined.
To validate that the field extractions are correct, run the following search:
These events are network-traffic events. The Dashboard Requirements Matrix in this manual shows that this type of data requires the following fields:
action, dvc, transport, src, dest, src_port, and
Use the field picker to display these fields at the bottom of the events and then scan the events to see that each of these fields is populated correctly.
If there is an incorrect value for a field, change the field extraction to correct the value. If there is an empty field, investigate to see whether the field is actually empty or should be populated.
Step 4: Create tags
The Splunk App for Enterprise Security uses tags to categorize information and specify which dashboards an event belongs in. Once you have the fields, you can create properties that tag the fields according to the Common Information Model.
Note: Step 4 is only required for Centers and Searches. Correlation searches can be created with data that is not tagged.
To create the tags, determine the dashboards where you want to use the data and see which tags those dashboards require. If there are different classes of events in the data stream, tag each class separately.
To determine which tags to use, see the Dashboard Requirements Matrix in this manual.
Create an event type
To tag the events, first create an event type that characterizes the class of events to tag, then tag that event type. An event type is defined by a search that describes the events in question. Each event type must have a unique name. The event types are created and stored in the
To create an event type for the technology add-on, edit the default/eventtypes.conf file in the add-on directory. Give the event type a name that corresponds to the name and vendor of the data source, such as cisco:asa. Creating an event type actually creates a new field, which can be tagged according to the Common Information Model.
Once the event type is created, create the tags (for example, "authentication", "network", "communicate", and so on) that are used to group events into categories. To create the necessary tags, edit the tags.conf file in the default directory and enable the necessary tags on the event type field.
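The event type and its tags could be sketched as the pair of stanzas below. The event type name "mydevice_authentication", the source type "mydevice", and the search string are hypothetical; take the actual tag names from the Dashboard Requirements Matrix.

```ini
# default/eventtypes.conf (sketch; names and search are hypothetical)
[mydevice_authentication]
search = sourcetype=mydevice vendor_action=*

# default/tags.conf
# Enable the "authentication" tag on events that match the event type above
[eventtype=mydevice_authentication]
authentication = enabled
```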
To verify that the data is being tagged correctly, display the event type tags and review the events. To do this, search for the source type you created and use the field discovery tool to display the field "tag::eventtype" at the bottom of each event. Then look at your events to verify that they are tagged correctly. If you created more than one event type, you also want to make sure that each event type is finding the events you intended.
Check Enterprise Security dashboards
Once the field extractions and tags have been created and verified, the data should begin to appear in the corresponding dashboard(s). Open each dashboard you wanted to populate and verify that the dashboard information displays properly. If it does not, check the fields and tags you created to identify and correct the problem.
Note: Many of the searches within the Splunk App for Enterprise Security run on a periodic scheduled basis. You may have to wait a few minutes for the scheduled search to run before data will be available within the respective dashboard.
Review the "Search view matrix" topic in the User Manual to see which searches need to run. Navigate to Manager > Searches to see if those searches are scheduled to run soon.
Note: Searches cannot be run directly from the Enterprise Security interface due to a known issue in the Splunk core permissions.
Step 5: Document and package the technology add-on
Once you have created your add-on and verified that it is working correctly, document and package the add-on for future reference, so that it is easy to deploy. Packaging your add-on correctly ensures that you can deploy and update multiple instances of the add-on.
Document the technology add-on
Edit or create the README file under the root directory of your technology add-on and add the information necessary to remember what you did and to help others who may use the technology add-on.
Note: If the Interactive Field Extractor has been used to build field extractions and tags, they will be found under
The following is the suggested format for your
    ===Product Name Technology Add-on===
    Author: The name of the person who created this technology add-on
    Version/Date: The version of the add-on or the date this version was created
    Supported product(s): The product(s) that the technology add-on supports, including which version(s) you know it supports
    Source type(s): The list of source types the technology add-on handles
    Input requirements: Any requirements for the configuration of the device that generates the IT data processed by the add-on

    ===Using this technology add-on===
    Configuration: Manual/Automatic
    Ports for automatic configuration: List of ports on which the technology add-on will automatically discover supported data (automatic configuration only)
    Scripted input setup: How to set up the scripted input(s) (if applicable)
Package the technology add-on
Next, prepare the technology add-on so that it can be deployed easily. In particular, you need to ensure that any modifications or upgrades will not overwrite files that need to be modified or created locally.
First, make sure that the archive does not include any files under the add-on's local directory. The local directory is reserved for files that are specific to an individual installation or to the system where the technology add-on is installed.
Next, add a .default extension to any files that may need to be changed on individual instances of Splunk running the technology add-on. This includes dynamically generated files (such as lookup files generated by saved searches) as well as lookup files that users must configure on a per-install basis. If you include a lookup file in the archive and do not add a .default extension, an upgrade will overwrite the corresponding file. Adding the .default extension makes it clear to the administrator that the file is a default version of the file, and should be used only if the file does not exist already.
Finally, compress the technology add-on into a single file archive (such as a tar.gz archive). To share the technology add-on, go to Splunk Apps, click upload an app and follow the instructions for the upload.
Add a custom technology add-on
Remove an app from app import
To remove or block a technology add-on from being automatically imported ("app import"), see "Remove a technology add-on from app import" in the Splunk App for Enterprise Security Installation and Configuration Manual for details.