Splunk® App for PCI Compliance

Data Source Integration Manual


Create a technology add-on

Before you create a technology add-on, it is helpful to understand the format, structure, and meaning of the PCI compliance data that you wish to import into the Splunk App for PCI Compliance.

Creating a technology add-on involves the following five steps:

  1. Capture and index the data
  2. Identify relevant PCI compliance events
  3. Create field extractions and aliases
  4. Create event types and tags
  5. Document and package the technology add-on

Step 1: Capture and index the data

The first step in creating a technology add-on involves capturing the data and indexing the data within Splunk. Although this step doesn't require anything beyond the normal process of getting data into Splunk, decisions made during this phase can affect how a technology add-on identifies and normalizes the data at search time.

The explicit tasks that are done in this step include:

  • Get the source data into Splunk
  • Choose a name for your add-on directory
  • Define a source type name to apply to the data

Note: Changes made in Step 1 affect the indexing of data, so if these steps are done incorrectly you will need to delete the indexed data and start over. Steps 2-4 do not affect indexing and can be altered while the app is running, without restarting the server.

Get data into Splunk

Getting data into Splunk is necessary before the Splunk App for PCI Compliance can normalize the data at search time and use the information to populate dashboards and reports. This step is called out because some data sources can be captured in more than one way, resulting in different data formats. As you develop a technology add-on, make sure that you accurately define the method by which the data will be captured and indexed within Splunk.

Common data input techniques used to bring in data from PCI compliance-relevant devices, systems, and software include:

  • Streaming data over TCP or UDP (for example, syslog on UDP 514)
  • API-based scripted input (for example, Qualys)
  • SQL-based scripted input (for example, Sophos, McAfee EPO)

For more detailed information on getting data into Splunk, see the section "What Splunk can index" in the core Splunk product documentation.
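For illustration, a minimal inputs.conf might define a UDP stream and a scripted input as follows. This is a sketch only: the port, source type names, script path, and interval are placeholder values, and the script itself is hypothetical.

## inputs.conf (illustrative sketch; all names and values are placeholders)
[udp://514]
sourcetype = mydevice
no_appending_timestamp = true

[script://./bin/get_vendor_data.py]
sourcetype = myvendor
interval = 3600
disabled = false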

The specific way in which a particular data set is captured and indexed in Splunk should be clearly documented in the README file in the technology add-on directory.

Choose a folder name for the technology add-on

A technology add-on is packaged as a Splunk app and must include all of the basic components of a Splunk app in addition to the components required to process the incoming data. All Splunk apps (including technology add-ons) reside in $SPLUNK_HOME/etc/apps.

The following table lists the files and folders of a basic technology add-on:

File/Directory Name    Description
default                Contains the files related to the processing of the data
    eventtypes.conf    Defines the event types (categories of events for the given source type)
    app.conf           Describes the app and provides the ability to disable it
    props.conf         Contains attribute/value pairs that define how the data is processed
    tags.conf          Defines the tags
    transforms.conf    Contains additional attribute/value pairs required to process the data
local                  Includes configuration files that are custom to a particular installation
metadata               Includes files that describe the app parameters
    default.meta       Defines the scope of the objects exported by the given app
lookups                Contains the lookup CSV files
README                 Describes the add-on, including configuration instructions and the supported product version

The stanzas in transforms.conf describe operations that Splunk Enterprise can perform on the data. The props.conf file references these transforms so that they run for a particular source type. In practice, the distinction is less clear-cut, because many of the same operations can be defined directly in props.conf without using transforms.conf.

See "Create and maintain search-time field extractions" in the core Splunk product documentation for more details.

When building a new technology add-on, you must decide on a name for the add-on folder. When choosing a name for your technology add-on folder, use the following naming convention:

  TA-<datasource>

The technology add-on folder should always begin with "TA-". This allows you to distinguish between technology add-ons and other add-ons within your Splunk deployment. The <datasource> section of the name should represent the specific technology or data source that this technology add-on is for. Technology add-ons that are shipped as part of the Splunk App for PCI Compliance follow this convention.

Examples include:

   TA-bluecoat
   TA-cisco
   TA-snort

For additional details on deploying and configuring technology add-ons with the Splunk App for PCI Compliance, see the PCI Compliance Installation and Configuration Manual.

Important: A template for creating your own add-on that includes a folder structure and sample files is available in the PCI Compliance Installer in: splunk/etc/apps/SplunkPCIComplianceSuiteInstaller/default/src/etc/apps/TA-template.zip.

Define a source type for the data

By default, Splunk automatically sets a source type for a given data input. Each technology add-on should have at least one source type defined for the data that is captured and indexed within Splunk. This requires overriding the automatic source type that Splunk attempts to assign to the data source. The source type definition is handled by Splunk at index time, along with line breaking, timestamp extraction, and time zone extraction. (All other information is set at search time.) See the section "Specify source type for a source" in the core Splunk product documentation.

You need to add a new source type for your technology add-on, and the source type name must match the product for the technology add-on to work. Be aware that this process overrides some default Splunk behavior. For explicit information on how to define a source type within Splunk, see "Override automatic source type assignment" in the core Splunk product documentation.

The source type name should be chosen to match the name of the product for which you are building a technology add-on (for example, "nessus"). Technology add-ons can cover more than one product by the same vendor (for example, "juniper"), and may require multiple source type definitions as a result. If the data source for which you are building a technology add-on has more than one data file with different formats (for example, "apache_error_log" and "apache_access_log") you may choose to create multiple source types.

These source types can be defined as part of the technology add-on in inputs.conf, in props.conf alone, or in both props.conf and transforms.conf. These files must be created as part of the technology add-on, and they contain only the definitions for the source types that the technology add-on is designed to work with.

There are three ways to specify source types.

  1. Let Splunk automatically define source types in the data
  2. Define source types in transforms.conf
  3. Define source types in inputs.conf

We recommend that you define your source types in inputs.conf. See "Configure rule-based source type recognition" in the core Splunk product documentation for more information about source types.
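For example, a monitor input that assigns the source type explicitly might look like this (the file path and source type name are placeholders):

## inputs.conf
[monitor:///var/log/nessus/scan_results.log]
sourcetype = nessus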

Tip: In addition to the source type definition, you can add a statement that forces the source type when a file is uploaded with its extension set to the source type name. This allows you to import files for a given source type simply by setting the file extension appropriately (for example, "mylogfile.nessus"). This statement can be added to the props.conf file in the TA-<technology_add-on_name>/default directory, as in the following example:

[source::....nessus]
sourcetype = nessus

Change the name "nessus" in both lines to match your source type.

Note: Usually, the source type of the data is statically defined for the data input. However, you may be able to define a statement that can recognize the source type based on the content of the logs.
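A sketch of such a content-based override, using an index-time transform (the source path, stanza name, and regex are placeholders):

## props.conf
[source::/var/log/mixed/...]
TRANSFORMS-set_sourcetype = set_sourcetype_nessus

## transforms.conf
[set_sourcetype_nessus]
REGEX = Nessus
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::nessus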

Confirm that your data has been captured

Once you have decided on a folder name for your technology add-on and defined the source type(s) in inputs.conf, props.conf, or both props.conf and transforms.conf, your technology add-on can be given life and you can start collecting data within Splunk. Turn on the data source, either by enabling the stream or the scripted input, to begin data collection. Confirm that you are receiving the data and that the source type you defined is working appropriately by searching for that source type in your data.
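For example, assuming a source type named "nessus", a quick check is:

sourcetype="nessus" | head 10

If events are returned and carry the expected source type, indexing is working.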

[Screenshot: search results for the "nessus" source type]

Note: Restart Splunk in order for it to recognize the technology add-on and the source type defined in this step. After Splunk has been restarted, it automatically reloads the changes made to search time operations in props.conf and transforms.conf.

If the results of the search are different than expected or no results are displayed, do the following:

  1. Confirm that the data source has been configured to send data to Splunk. This can be validated by using Splunk to search for keywords and other information within the data.
  2. If the data source is sent via scripted input, confirm that the scripted input is working correctly. This may be more challenging to confirm, but can be done by checking the data source, script, or other points of failure.
  3. If the data is indexed by Splunk, but the source type is not defined as you expect, confirm your source type logic is defined appropriately and retest.

Handling timestamp recognition

Splunk is designed to understand most common timestamps that are found in log data. Occasionally, however, Splunk may not recognize a timestamp when processing an event. If this situation arises, it is necessary to manually configure the timestamp logic into the technology add-on. This can be done by adding the appropriate statements in the props.conf file. For specific information on how to deal with manual timestamp recognition, check "How timestamp assignment works" in the core Splunk product documentation.
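A minimal sketch of such statements in props.conf, assuming a syslog-style timestamp at the start of each event (the source type name and format string are placeholders). Note that these are index-time settings, so they apply only to newly indexed data:

## props.conf
[mydevice]
## timestamp begins at the start of the event, e.g. "Jan 12 15:02:08"
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 15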

Configure line-breaking

Multi-line events can be another challenge to deal with when creating a technology add-on. Splunk handles these types of events by default, but on some occasions Splunk will not recognize events as multi-line. If the data source is multi-line and Splunk has issues recognizing it, see the section "Configure event line breaking" in the core Splunk product documentation. The documentation provides additional guidance on how to configure Splunk for multi-line data sources. Use this information in a technology add-on by making the necessary modifications to the props.conf file.
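For example, if each event begins with an ISO-style date, a sketch of the props.conf settings might be (the source type name and regex are placeholders):

## props.conf
[mydevice]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}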

Step 2: Identify relevant PCI compliance events

After indexing, examine the data to identify the PCI compliance events that the source includes and determine how these events fit into the PCI compliance dashboards. This step allows selection of the correct Common Information Model tags and fields, so that the data automatically populates the proper dashboards.

Understand your data and PCI compliance dashboards

To analyze the data, a data sample needs to be available. If the data is loaded into Splunk, view the data based on the source type defined in Step 1. Determine the dashboards into which this data might fit. See the "Reports" topic in the Splunk App for PCI Compliance Installation and Configuration Manual to understand what each of the reports and panels within the Splunk App for PCI Compliance require and the types of events that are applicable. With this information, review the data to determine what events need to be normalized for use within the app.

Note: The reports in the Splunk App for PCI Compliance are designed to use information that typically appears in the related data sources. The data source for which the technology add-on is being built may not provide all the information necessary to populate the required dashboards. If this is the case, look further into the technology that drives the data source to determine if other data is available or accessible to fill the gap.

Reviewing the data source and identifying relevant events should be relatively easy to do. This document assumes familiarity with the data and the ability to determine what types of events the data contains.

After events and their contents have been reviewed, the next step is to match the data to the application reports and dashboards. Review the dashboards directly in the Splunk App for PCI Compliance, or check the descriptions in the Splunk App for PCI Compliance User Guide, to see where the events might fit into these reports.

The dashboards in the Splunk App for PCI Compliance are grouped into Scorecards, Reports, and Audit:

  • Scorecards: Provide at-a-glance summary information about the current PCI compliance environment, by requirement area. Scorecards present real-time views of the environment, so you can determine your PCI compliance status in each of the requirement areas at a glance.
  • Reports: Provide a historical view of activity related to each of the requirement areas. With reports you can track your PCI compliance, by requirement, over time.
  • Audit: Provide a record of activity in the PCI compliance environment: changes to notable events and suppressions, forwarder activity, access to reports and scorecards, and the consistency and security of data.

Within these areas, find the specific dashboard(s) that relate to your data. In some cases, your data source may include events that belong in different dashboards, or even in different areas.

Example:

Assume that you have decided to capture logs from one of your firewalls. You would expect that the data produced by the firewall will contain network traffic related events. However, in reviewing the data captured by Splunk you may find that it also contains authentication events (login and logoff) and device change events (policy changes, addition/deletion of accounts, etc.).

Knowing that these events exist will help determine which of the dashboards within the Splunk App for PCI Compliance are likely to be applicable. In this case, the Network Traffic Activity Report, Firewall Rule Activity Report, and Requirement 1 Scorecard dashboards are designed to report on the events found in this firewall data source.

Taking the time in advance to review the data and the application dashboard functionality will help make the next steps of defining the Splunk fields and tags easier.

The product and vendor fields need to be defined. These fields are not included in the data itself, so they will be populated using a lookup.

Static strings and event fields

Do not assign static strings to event fields because this prevents the field values from being searchable. Instead, fields that do not exist should be mapped with a lookup.

For example, here is a log message:

 Jan 12 15:02:08 HOST0170 sshd[25089]: [ID 800047 auth.info] Failed publickey 
 for privateuser from 10.11.36.5 port 50244 ssh2

To extract a field "action" from the log message and assign a value of "failure" to that field, either a static field (non-searchable) or a lookup (searchable) could be used.

For example, to assign a static value based on the log message, this information would be added to props.conf and transforms.conf:

## props.conf
[linux_secure]
REPORT-action_for_sshd_failed_publickey = action_for_sshd_failed_publickey

## transforms.conf
[action_for_sshd_failed_publickey]
REGEX = Failed\s+publickey\s+for
FORMAT = action::failure
## note the static assignment of action=failure above

This approach is not recommended; searching for "action=failure" in Splunk would not return these events, because the text "failure" does not exist in the original log message.

The recommended approach is to extract the actual text from the message and map it using a lookup:

## props.conf
[linux_secure]
LOOKUP-action_for_sshd_failed_publickey = sshd_action_lookup vendor_action OUTPUTNEW action
REPORT-vendor_action_for_sshd_failed_publickey = vendor_action_for_sshd_failed_publickey

## transforms.conf
[sshd_action_lookup]
filename = sshd_actions.csv

[vendor_action_for_sshd_failed_publickey]
REGEX = (Failed\s+publickey)
FORMAT = vendor_action::$1

## sshd_actions.csv
vendor_action,action
"Failed publickey",failure

By mapping the event field to a lookup, Splunk is now able to search for the text "failure" in the log message and find it.
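With the lookup in place, a search like the following (assuming the linux_secure source type above) now matches the event:

sourcetype=linux_secure action=failure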

Step 3: Create field extractions and aliases

After you identify the events within your data source and determine which dashboards those events correspond to within the Splunk App for PCI Compliance, create the field extractions needed to normalize the data and match the fields required by the Common Information Model and the PCI compliance dashboards and panels.

Use the following process to work through the creation of each required field, for each class of event that you have identified:

[Diagram: process for creating fields]

Note: This process can be done manually by editing configuration files or graphically by using the Interactive Field Extractor. See the Examples section in this document for more detail on these options.

Choose event

In the Splunk Search app, start by selecting a single event or several almost identical events to work with. Start the Interactive Field Extractor and use it to identify fields for this event. Verify your conclusions by checking that similar events contain the same fields.

Identify fields

Each event contains relevant information that you will need to map into the Common Information Model. Start by identifying the fields of information that are present within the event. Then check the Common Information Model to see which fields are required by the dashboards where you want to use the event.

Note: See "Create reports" in the Splunk App for PCI Compliance Installation and Configuration Manual, which lists the fields required by each report (or dashboard). Additional information about the Common Information Model can be found in the "Common Information Model" topic in the Splunk documentation.

Where possible, determine how the information within the event maps to the fields required by the Common Information Model (CIM). Some events will have fields that are not used by the dashboard. It may be necessary to create extractions for these fields. On the other hand, certain fields may be missing or have values other than those required by the CIM. Fields can be added or modified, or marked as unknown, in order to fulfill the dashboard requirements.

Note: In some cases, it may turn out that the events simply do not have enough information (or the right type of information) to be useful in the Splunk App for PCI Compliance, because the type of data is not applicable to security (for example, weather information).

Create field extractions

After the relevant fields have been identified, create the Splunk field extractions that parse and/or normalize the information within the event. Splunk provides numerous ways to populate search-time field information. The specific technique will depend on the data source and the information available within that source. For each field required by the CIM, create a field property that will do one or more of the following:

  • Parse the event and create the relevant field (using field extractions)

Example: In a custom application, "Error" at the start of a line means an authentication error, so it can be extracted to an authentication field and tagged "action=failed".

  • Rename an existing field so that it matches (field aliasing)

Example: The "key=value" extraction has produced "target=10.10.10.1", and "target" needs to be aliased to "dest".

  • Convert an existing field to a field that matches the value expected by the Common Information Model (normalizing the field value)

Example: The "key=value" extraction has produced "action=stopped", and "stopped" needs to be changed to "blocked".

Extract fields

Review each field required by the Common Information Model and find the corresponding portion of the log message. Create a field extraction statement to extract the field. See "field extractions" in the core Splunk product documentation for more information on creating field extractions.

Note: Splunk may auto-extract certain fields at search time if they appear as "field"="value" in the data. Typically, the names of the auto-extracted fields will differ from the names required by the CIM. This can be fixed by creating field aliases for those fields.
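A sketch of an inline extraction in props.conf (the source type, field name, and regex are illustrative):

## props.conf
[netscreen:firewall]
EXTRACT-src_port = src_port=(?<src_port>\d+)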

Create field aliases

It may be necessary to rename an existing field to populate another field. For example, the event data may include the source IP address in the field src_ip, while the Common Information Model requires the source be placed in the "src" field. The solution is to create a field alias so that the "src" field contains the value from "src_ip". This is done by defining a field alias that creates a new field with a name that corresponds to the name defined in the Common Information Model.

See "field aliases" for more information about creating field aliases.

Normalize field values

You must make sure that the value populated by the field extraction matches the field value requirements in the Common Information Model. If the value does not match (for example, the value required by the Common Information Model is "success" or "failure" but the log message uses "succeeded" and "failed") then create a lookup to translate the field so that it matches the value defined in the Common Information Model.

See setting up lookup fields and tag and alias field values to learn more about normalizing field values.
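A sketch of such a lookup, following the same pattern as the sshd example above (the stanza, file, and source type names are illustrative):

## transforms.conf
[vendor_action_lookup]
filename = vendor_actions.csv

## props.conf
[netscreen:firewall]
LOOKUP-action = vendor_action_lookup vendor_action OUTPUTNEW action

## vendor_actions.csv
vendor_action,action
succeeded,success
failed,failure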

Verify field extractions

Once field extractions have been created for each of the security-relevant events to be processed, validate that the fields are extracted properly by searching for the source type.

Perform a search

Run a search for the source type defined in the technology add-on to verify that each of the expected fields of information is defined and available on the field picker. If a field is missing or displays the wrong information, go back through these steps to troubleshoot the technology add-on and figure out what is going wrong.

Example: A technology add-on for Netscreen firewall logs has been created: the network-communication events have been identified, a source type of "netscreen:firewall" has been defined, and the required field extractions are in place.

To validate that the field extractions are correct, run the following search:

sourcetype="netscreen:firewall"

These events are network-traffic events. The "Create reports" topic in the Splunk App for PCI Compliance Installation and Configuration Manual shows that this type of data requires the following fields: action, dvc, transport, src, dest, src_port, and dest_port.

Use the field picker to display these fields at the bottom of the events and then scan the events to see that each of these fields is populated correctly.
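A single search can also tabulate all of the required fields at once (a sketch using the source type above):

sourcetype="netscreen:firewall" | table _time, action, dvc, transport, src, dest, src_port, dest_port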

If there is an incorrect value for a field, change the field extraction to correct the value. If there is an empty field, investigate to see whether the field is actually empty or should be populated.

Step 4: Create event types and tags

The Splunk App for PCI Compliance uses event types and tags to categorize information and specify which dashboards an event belongs in. Once you have the fields, you can create properties that tag the fields according to the Common Information Model.

Note: Step 4 is only required for Centers and Searches. Correlation searches can be created with data that is not tagged.

Identify necessary tags

To create the tags, determine the dashboards where you want to use the data and see which tags those dashboards require. If there are different classes of events in the data stream, tag each class separately.

To determine which tags to use, see the "Search view matrix" topic in the Splunk App for PCI Compliance User Manual.

Create an event type

To tag the events, first create an event type that characterizes the class of events to tag, then tag that event type. An event type is defined by a search that describes the events in question. Each event type must have a unique name. Event types are created and stored in the eventtypes.conf file.

To create an event type for the technology add-on, edit the default/eventtypes.conf file in the add-on directory. Give the event type a name that corresponds to the name and vendor of the data source, such as cisco:asa. Creating an event type actually creates a new field, which can be tagged according to the Common Information Model.

Once the event type is created, you then create tags (for example "authentication", "network", "communicate", and so on) that are used to group events into categorizations. To create the necessary tags, edit the tags.conf file in the default directory and enable the necessary tags on the event type field.
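A sketch of the two files together, assuming the netscreen:firewall source type from the earlier example and illustrative tag names:

## eventtypes.conf
[netscreen_firewall]
search = sourcetype="netscreen:firewall"

## tags.conf
[eventtype=netscreen_firewall]
network = enabled
communicate = enabled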

Verify the tags

To verify that the data is being tagged correctly, display the event type tags and review the events. To do this, search for the source type you created and use the field discovery tool to display the field "tag::eventtype" at the bottom of each event. Then look at your events to verify that they are tagged correctly. If you created more than one event type, you also want to make sure that each event type is finding the events you intended.
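For example, assuming the tags above:

sourcetype="netscreen:firewall" tag="network"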

[Screenshot: events with the tag::eventtype field displayed]

Check PCI compliance dashboards

Once the field extractions and tags have been created and verified, the data should begin to appear in the corresponding dashboard(s). Open each dashboard you wanted to populate and verify that the dashboard information displays properly. If it does not, check the fields and tags you created to identify and correct the problem.

Note: Many of the searches within the Splunk App for PCI Compliance run on a periodic scheduled basis. You may have to wait a few minutes for the scheduled search to run before data will be available within the respective dashboard.

Review the Search View Matrix in the Splunk App for PCI Compliance User Manual to see which searches need to run. Navigate to Manager > Searches to see if those searches are scheduled to run soon.

Note: Searches cannot be run directly from the PCI compliance interface due to a known issue with Splunk core permissions.

Step 5: Document and package the technology add-on

Once you have created your add-on and verified that it is working correctly, document and package the add-on for future reference, so that it is easy to deploy. Packaging your add-on correctly ensures that you can deploy and update multiple instances of the add-on.

Document the technology add-on

Edit or create the README file under the root directory of your technology add-on and add the information necessary to remember what you did and to help others who may use the technology add-on.

Note: If the Interactive Field Extractor has been used to build field extractions and tags, they will be found under $SPLUNK_HOME/etc/users/$USER/.

The following is the suggested format for your README:

   ===Product Name Technology Add-on===

   Author: The name of the person who created this technology add-on

   Version/Date: The version of the add-on or the date this version was created

   Supported product(s): The product(s) that the technology add-on supports, including 
   which version(s) you know it supports

   Source type(s): The list of source types the technology add-on handles

   Input requirements: Any requirements for the configuration of the device that 
   generates the IT data processed by the add-on

   ===Using this technology add-on===

   Configuration: Manual/Automatic 

   Ports for automatic configuration: List of ports on which the technology add-on 
   will automatically discover supported data (automatic configuration only)

   Scripted input setup: How to set up the scripted input(s) (if applicable)

Package the technology add-on

Next, prepare the technology add-on so that it can be deployed easily. In particular, you need to ensure that any modifications or upgrades will not overwrite files that need to be modified or created locally.

First, make sure that the archive does not include any other files under the add-on's local directory. The local directory is reserved for files that are specific to an individual installation or to the system where the technology add-on is installed.

Next, add a .default extension to any files that may need to be changed on individual instances of Splunk running the technology add-on. This includes dynamically-generated files (such as lookup files generated by saved searches) as well as lookup files that users must configure on a per install basis. If you include a lookup file in the archive and do not add a .default extension, an upgrade will overwrite the corresponding file. Adding the .default extension makes it clear to the administrator that the file is a default version of the file, and should be used only if the file does not exist already.

Finally, compress the technology add-on into a single file archive (such as a zip or tar.gz archive). To share the technology add-on, go to Splunkbase, click Upload an app, and follow the instructions.
