Splunk® DB Connect

Deploy and Use Splunk DB Connect

Download manual as PDF

NOTE - Splunk DB Connect version 1.x reached its End of Life on July 28, 2016. Please see the migration information.
This documentation does not apply to the most recent version of DBX. Click here for the latest version.
Download topic as PDF

Configure database input queries

A database input lets you fetch and index data from a SQL database. Unlike other input sources, database inputs are retrieved periodically by the DBmon scheduler.

Note: Because Splunk DB Connect queries your database, it can have an impact on database performance. This is likely if your initial tail query retrieves a large amount of data. Subsequent queries that retrieve new data only are likely to have less impact on database performance.

To add a database input:

1. In Splunk Web, select Apps > Splunk DB Connect. The Splunk DB Connect app opens.

2. Select Settings > Manage database inputs. The Database Inputs page opens.

3. Click New. The Add New Database Inputs page opens. Use this page to configure database input, output, and schedule query intervals.

Configure input query

1. Assign a unique Name to your input.

2. Select your Input Type from the drop-down list.

  • Tail finds the new records you want and returns only those records with each query.
  • Dump invokes the same query each time and returns all results.

3. Select the Database from the drop-down list.

Important: On MySQL instances, the terms "database" and "schema" are interchangeable. Unlike other RDBMS systems, MySQL only supports a single schema per database. Therefore, when selecting from the database drop-down list, the schema dropdown will be always be set to "all."

4. (optional) Select the Specify SQL query check box if you want to run a custom SQL query against the database. This opens the SQL Query field where you type your query string. For example:

SELECT * FROM my_table {{WHERE $rising_column$ > ?}}

Place the WHERE clause in curly braces {{...}}. The literal $rising_column$ is replaced with the name specified in the Rising Column field in step 6. The checkpoint value is substituted for the literal ?.

For the initial run, when there is no checkpoint state, the query does not include the part inside the curly braces, {{...}}. On subsequent queries, the query includes the part inside the curly braces.

If Rising column is a date, wrap the checkpoint parameter in a "to_date" construct. For example: {{AND $rising_column$ > to_date (?,'YYYY-MM-DD"T"HH:MI:SS')}}. The correct "to_date" function to use depends on the database type. In MySQL, the to_date function is STR_TO_DATE.

For Oracle, use uppercase for the name of the Rising column.

For more information, see How dbmon-tail inputs work.

5. If you do not specify a SQL query, type the database table name that you want to query in the Table Name field. Splunk provides the appropriate query string.

6. If you selected the Tail input type in step 2, specify the Rising Column in the Tail input settings panel. Choose a column with an increasing value, such as the creation or modification timestamp, or a sequential identifier.

Caution: Do not rename the rising_column. Doing so can break your database input. See "Renaming rising_column breaks database input" in the Troubleshooting section of this manual.

7. Specify a data Sourcetype.

The following formats are associated with the sourcetypes:

  • Key-Value format with dbmon:kv sourcetype
  • Multi-line Key-Value format with dbmon:mkv sourcetype
  • Template with dbmon:tpl sourcetype
  • CSV format with CSV sourcetype

Note: If you leave the Sourcetype field blank, the pre-defined sourcetype associated with the format is used.

To use a custom sourcetype, specify line-break and timestamp settings in $SPLUNK_HOME$/etc/apps/dbx/local/props.conf or $SPLUNK_HOME$/etc/system/local/props.conf file.

For sourcetype line-break specifications, see $SPLUNK_HOME$/etc/apps/dbx/default/props.conf or $SPLUNK_HOME$/etc/system/default/props.conf.

8. Specify the name of the Index associated with this input.

9. Specify the name of the host as Host Field value, for this database.

How dbmon-tail inputs work

When you create a Tail input in the UI, DB Connect adds a dbmon-tail stanza to your inputs.conf file, in $SPLUNK_HOME/etc/apps/dbx/local.

Unlike dbmon-dump inputs, which index all data from the specified table each time the SQL query executes, a dbmon-tail input filters the table data input based on an increasing value specified in a "rising column." This lets you index only new data appended to the table since the last SQL query.

You can specify as rising column any column whose value increases over time, such as a timestamp or sequential ID. For example, a rising column could be last_update, employee_id, customer_id, transaction_id and so on.

For dbmon-tail inputs, the SQL query is broken into two parts: The main SQL, plus a filter condition, such as

{{WHERE $rising_column$ > ?}}

For example, the SQL statement:

SELECT customer_id, last_name, first_name FROM customer {{WHERE $rising_column$  > ?}}

is executed as follows:

Note: customer_id is set as rising column.

1. When DB Connect runs this SQL statement for the first time, only the main part of the SQL statement executes:

SELECT customer_id, last_name, first_name FROM customer

Once the last record is retrieved from the main SQL query, DB Connect stores the highest value of the rising column in the state.xml file pertaining to this input, under $SPLUNK_HOME/var/lib/splunk/persistentstorage/dbx.

2. Upon subsequent executions of this dbmon-tail, DB connect emits the full SQL statement, which includes both the main SQL statement and the tail portion containing the filter condition:

SELECT customer_id, last_name, first_name FROM customer {{WHERE $rising_column$  > ?}}

So this would be the actual SQL DB Connect emits to the database:

SELECT customer_id, last_name, first_name FROM customer WHERE customer_id  > 10

Note: In this case, the number 10 is the value that DB Connect stores in the state.xml file from the previous execution. "?" is the variable that represents the state.xml value.

Configure database output

Define your database output parameters.

1. Select an Output Format to determine how Splunk renders your output data.

  • Key-Value format
  • Multi-line Key-Value format
  • Template
  • CSV
  • CSV (with headers)

If you select Template output format, specify the template with placeholders for the column values returned from the database. DB Connect applies the template to each row returned by the query, then writes the resulting text to the output, which is then indexed. For example, this template:

Event ID $ID$ from $HOST$ at $timestamp$

returns output like this:

Event ID 4712 from myhost.foobar.com at 2013-10-30T13:59:12.201Z

Note: Line-breaking settings in props.conf and transforms.conf can impact template output. In addition, template output format does not provide automatic field extractions, so you must extract fields manually. We recommend that you use Template output format only if you have a specific format that you want to use for your database input.

If you select CSV or CSV (with headers) output format, and you want a complete CSV file without line breaks, then you must specify the SHOULD_LINEMERGE attribute as "true" in $SPLUNK_HOME/etc/apps/dbx/local/props.conf. For example:

[source::$sourcename$]
SHOULD_LINEMERGE=True

Note: For CSV and CSV (with headers) options, if you have modified the default source name in props.conf, but want to maintain line breaking, specify the regular expression for the LINE_BREAKER attribute in $SPLUNK_HOME/etc/apps/dbx/local/props.conf.

2. Select the Output timestamp check box to prefix the event with a timestamp.

3. Specify the Timestamp column of the table/query to use as the timestamp. If you do not specify a column, the current time is used.

4. Specify the Timestamp format. This is a Java SimpleDateFormat pattern. The default format is configurable during setup.

About timestamps and database output

Splunk assigns timestamps to event data at index time. In most cases, Splunk automatically recognizes and extracts timestamps from your data. If an event does not contain an explicit timestamp, Splunk tries to assign a timestamp through other means, according to specific timestamp precedence rules.

In some cases, you might need to help Splunk recognize the timestamps in your database output.

For example, when Splunk indexes your data, it looks for a timestamp of the DATETIME datatype. If your timestamp is a string value (such as VARCHAR, NVARCHAR, etc.), you can try to convert the timestamp to the correct datatype using a custom SQL statement with CAST, CONVERT, or TO_TIMESTAMP functions.

Or, if your data does not have a time reference, you can configure Splunk to use an alternate timestamp source, such as the system time when Splunk indexes your data.

Note: Incorrect timestamp formatting can cause line-breaking issues when Splunk indexes your data. For help, see Issues with bad line breaking/line merging in the Troubleshooting section of this manual.

Timestamp best practice

Follow these steps to help ensure that Splunk assigns proper timestamps to your database output at index time.

1. Look at your data. Is the time established in your source data?

  • If yes, then Splunk might be able to assign the timestamp from your data. Proceed to step 2.
  • If no, then skip ahead to step 3.

2. Is there a column with the time in it?

  • If yes, is that column configured as a DATETIME datatype?
    • If yes, then Splunk assigns the timestamp from this column to your data.
    • If no, then try to use an SQL statement to convert the timestamp to the DATETIME datatype (using CAST or CONVERT) functions. If this doesn't work, you can try specifying the timestamp parse format in $SPLUNK_HOME/etc/apps/dbx/inputs.conf, as shown in this workaround.
    • Note: Do not assign a timestamp to a rising column.

3. Is the time established outside of your data?

  • If no, then you can configure Splunk to set the timestamp to the time Splunk indexes the data, as follows:
    • Go to Database Inputs > Add New > Outputs > Output Format, then check Output timestamp and leave the Timestamp column field blank. Splunk uses the current time as timestamp as it indexes your data.

You can configure how Splunk recognizes timestamps in your data by editing timestamp attributes in $SPLUNK_HOME/etc/apps/dbx/local/props.conf. See "Edit timestamp properties in props.conf".

Specify input query interval

Specify an Interval for your database queries. This is the amount of time Splunk DB Connect waits between queries. If you leave the Interval field blank, DB Connect chooses a time interval based on the amount of data fetched.

You can specify the Interval using a relative time expression or a valid cron expression.

Relative time expressions

A relative time expression specifies the amount of time between each input query. The syntax for relative time expressions is:

<integer><time_unit>

Relative time units are specified as seconds (s), minute (m), hour (h), day (d), week (w), month (mon), quarter (q), and year (y). For example, if you want your input query to run every 15 minutes, you would enter 15m.

Cron expressions

Cron expressions let you schedule your input queries to run on a recurring basis. Cron expressions typically consist of five or six fields that encode a time specification similar to this:

minutes (0-59) | hours (0-23) | date of month (1-31) | month of year (1-12) | day of week (0-6, 0=Sunday)

For example, if you want your input query to run every Sunday at 3:30 AM, you would enter a cron expression such as:

30 3 * * 0

This literally translates into: "the 30th minute, of the 3rd hour, of any date of the month, of any Month of the year, on Sunday."

Note: When you specify an Interval value, DB Connect stores that value in the inputs.conf file, under the $SPLUNK_HOME/etc/apps/dbx/local directory. You can edit the interval value inside inputs.conf, under the dbmon stanza that specifies the input. See inputs.conf.spec

PREVIOUS
Manage a database connection
  NEXT
Set up a lookup table

This documentation applies to the following versions of Splunk® DB Connect: 1.1, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.2.0, 1.2.1, 1.2.2


Was this documentation topic helpful?

Enter your email address, and someone from the documentation team will respond to you:

Please provide your comments here. Ask a question or make a suggestion.

You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters