Create and manage database inputs
A database input object lets you retrieve and index data from a database. Database inputs are what enable Splunk Enterprise to query your database, identify the data to consume, and then tag and index the data. Once you have set up a database input, you can use that in the same way that you use other data input you have defined in Splunk Enterprise.
DB Connect supports running on a search head cluster in Splunk Enterprise as of DB Connect version 3.13.0, and in Splunk Cloud as of DB Connect version 3.10.0.
For ad hoc, interactive usage of database connections by live users, you must have DB Connect installed on search heads. For scheduled indexing from databases and output of data to databases, you must have DB Connect installed on heavy forwarders.
Splunk DB Connect 3.5.x supports creating inputs using templates created in both Splunk Add-ons and in Splunk DB Connect. This saves the work of recreating basic content of inputs. Splunk lists the templates in the template field of DB Connect. These add-ons include:
- Splunk Add-on for Microsoft SQL Server
- Splunk Add-on for Oracle Database
- Splunk Add-on for McAfee ePO Syslog
- Splunk Add-on for Nagios Core
Create a database input
You can create a new input from scratch. Additionally, you can create your new input using a template.
- If you have not yet created an identity or a connection, you must do so before you can create a database input.
- If you create the input using a template, to make a supported add-on appear in the Templates list in the DB Connect UI, you must copy the
db_input_templates.conf
file from your add-on'sdefault
directory to yoursplunk_db_connect/local
directory. You must reconfigure templates when upgrading to the latest version of Splunk DB Connect. - Inputs only support SQL queries such as "select...from...where", not DDL such as "create table..."
Complete the following instructions to create a database input in DB Connect:
- In Splunk DB Connect, select Data Lab > Inputs and then New Input.
- On the Set SQL Query page, complete the following steps and then select Next to go to the Set Properties page.
- Choose table. Select the database table you want to use with this input.
- Specify SQL query. Specify a query to run to retrieve data from your database.
- Choose input mode. Specify the input mode and related templates for this input.
- Choose input type. Specify the input type and related settings of this input.
- On the Set Properties page, complete the following steps and then select Finish.
- Basic information. Specify the name, description and the application of this input.
- Parameter settings. Configure the fetch size, execution frequency and max row to retrieve of this input.
- Metadata. Specify the metadata of this input, Splunk uses the value to index your data events.
Choose Table
- Select a connection that you want to use for this input from the list under the Connection field. The list displays all the connections you have configured in DB Connect, you can also enter the connection name to search the connection you want to use.
Once you have selected the connection, Splunk DB Connect validates the connection, and displays an error message if it is not able to do so. You cannot continue the new input setup process unless you specify a valid connection. - From the corresponding dialog menus, choose the Catalog, Schema, and Table that contain the data you want to pull into Splunk platform.
Specify SQL query
After you select the table, Splunk displays the corresponding SQL query in SQL Editor, you can preview the result of the query. If you need to further edit or write your own SQL query, you can write it directly in SQL Editor and select Execute SQL to preview the result. You can make complex SQL statements easier to see by selecting Format SQL.
Note: If you want to use SQL query from template, select the template from the menu. Be aware that using a template overwrites the previous SQL you specified.
Note: Inputs only support SQL queries such as "select...from...where", not DDL such as "create table..."
Choose input mode
Select an input mode for your query, either Metric or Event.
- Metric. An input mode that allows for the user to ingest performance metric data for DB Connect. Splunk does not currently support getting data into DB Connect as metrics.
- Event. An input mode that allows for the user to ingest event data.
After selecting the input mode, select the related templates for the input mode from the available list of templates in the Template menu.
If you create the input by using a template, Splunk sets all the settings from the template for the new input. You can change them based on your needs, but be aware that Splunk saves the changes to the input you create but not to the template.
Choose input type
Specify an input type for your query, either Batch or Rising. Then configure the related fields for the input type.
- Batch. A batch input invokes the same database query each time you run the input and returns all results.
- Rising. A rising input has a column that DB Connect uses to keep track of what rows are new from one input execution to the next.
Batch input
A batch input invokes the same database query each time you run the input and returns all results. It does not keep track of whether you add rows since the last time you ran the input. Batch input mode is ideal for unchanging historical data that you want to index into Splunk once. It can also be useful for reindexing data that updates through time, such as personnel records or state tables, though this has license impact; lookups might be a better choice.
To create a batch input type,
- Select Batch under Input Type field.
- Select the Timestamp column.
Specify which column contains the timestamp that you want to use to order this data in the Splunk index. Splunk uses this required value to populate the _time
variable and index time-series data.
- Current Index Time: Assigns indexed data a timestamp value that is equal to index time. Select this option if your data has no timestamp.
- Choose Column: Select the column that contains the timestamp value. If this column is in a format which Splunk can't parse as a timestamp by DB Connect, Splunk displays a field where a Java SimpleDateTimeFormat you can enter a compatible parser so that Splunk can understand the timestamp value.
- Column: This option only appears if you selected Choose Column in the previous option. Select the column that contains the timestamps you want to use.
- Query timeout: Enter the number of seconds to wait for the query to complete. The default is 30 if you leave it blank.
Rising input
A rising input has a column that DB Connect uses to keep track of what rows are new from one input execution to the next. When you create a rising input type, you must select the rising column. You can select a rising column as any column whose value increases or decreases over time, such as a timestamp or sequential ID. For example, you can use columns such as row_id, transaction_id, employee_id, customer_id, last_updated, and so on.
Timestamps are not ideal for rising columns, though they often are the best available choice. Using a timestamp for rising column can produce the following problem conditions:
- A high rate of event generation can cause data duplication or loss, because checkpointing in a stream of timestamp-distinguished records assumes there is never more than one row created in a given time. If you set the time to a one second level of resolution and get five records per second, you lose or duplicate four records from every run.
- Clock skew, NTP skew corrections, physical moves between timezones, and daylight savings events can cause data mis-ordering, duplication, or loss. If the skew is towards the future, then the resulting checkpoint value might temporarily or permanently stop data collection.
- Splunk does not evaluate Nonnumeric datetime values numerically, and lexical sorting can produce unpredictable results. If you order time series data lexically, then the resulting checkpoint value might temporarily or permanently stop data collection.
To create a rising input, select Rising and configure the following fields:
- Rising column: The rising column is the column from which DB Connect updates the checkpoint value each time you run the input.
- Checkpoint value: The checkpoint value is how DB Connect determines what rows are new from one input execution to the next. The first time you run the input, DB Connect only selects those rows that contain the value higher or lower than the value you specified in this column. Each time Splunk finishes running the input, DB Connect updates the input's checkpoint value with the value in the last row of the checkpoint column.
- Update the SQL query that includes a ? symbol and order by clause for the checkpoint value. The question mark (?) as the checkpoint placeholder and an order by clause. Every time you run the input, DB Connect replaces the question mark with the latest checkpoint value .
Note:
- From DB Connect 3.10.0 and later, the rising column checkpoints of the input are stored in the Splunk KV Store in the collection
dbx_db_input
. For each input DB Connect creates a separate entry in the collection.- To see what column types (varchar, number, timestamp, and so on) Splunk supports as rising column value types in DB Connect-supported databases, see supported rising column types by database.
With a rising column mode input, you can:
- Effectively create a "descending column", or a column wherein the checkpoint decreases every time the input runs.
SELECT abc, xyz FROM table WHERE abc < ? ORDER BY abc DSC
- Customize the comparison operation to determine the checkpoint value by evaluating a SQL function result, while simultaneously avoiding using query wrapping.
SELECT CONCAT(last_name, first_name) as NAME, ACTOR_ID, FIRST_NAME, LAST_NAME FROM actor WHERE CONCAT(last_name, first_name) > ? ORDER BY CONCAT(last_name, first_name)
- Customize the WHERE clause to add additional conditions other than the comparison expression that the rising column inputs specify.
SELECT *FROM (SELECT * FROM ACTOR WHERE ACTOR_ID > ?) ACTOR JOIN FILM_ACTOR ON actor.ACTOR_ID = FILM_ACTOR.ACTOR_ID JOIN FILM ON FILM_ACTOR.FILM_ID = FILM.FILM_ID ORDER BY ACTOR.ACTOR_ID
- Use other advanced SQL features in the WHERE clause; for example, a CASE statement.
After you specify the input type, select Next to go to the Set Properties page.
Basic information
Configure the following fields,
- Name: Specify the name of the input. Be aware the input name cannot contain space or special characters.
- Description: Optional. The description of the input.
- Application: The name of the Splunk Enterprise app where you want to save this input object. By default, Splunk sets the pop-up menu to Splunk DB Connect. This menu enables other apps to use DB Connect inputs, outputs, and lookups within their own context.
Parameter settings
Complete the following fields,
- Max Rows to Retrieve: Optional. The maximum number of rows to retrieve with each query. If you set this to 0 or leave it blank, you can have unlimited rows.
- Fetch size: Optional. The number of rows to return at a time from the database. The default is 300 if you leave it blank.
- Execution Frequency: The number of seconds or a valid cron expression, such as 0 18 * * * (every day at 6 PM).
Metadata
Splunk uses the metadata fields to index your data events. As you search the indexed data using Splunk Enterprise, you can identify data from the query by the fields that you enter here.
- Host: Optional. Splunk uses the host defined on the connection if you leave it blank.
- Source: Optional. Splunk uses the input name if you leave it blank.
- Source type: Enter a sourcetype field value for Splunk Enterprise to assign to queried data as you index. select the field and enter a value, or select an existing value from the menu that appears.
- Index: Enter an index value for the index in which you want Splunk Enterprise to store indexed data. You can enter the index name or select it from the typeahead menu.
Note: If you want to use the customized index, you have to make sure the index exists in the Splunk platform. Otherwise you have to create the index in Splunk Enterprise first to prevent data loss.
Edit database inputs
To see a list of the defined database inputs, first select the Data Lab>Inputs tab. Splunk displays a list of your database inputs.
To edit a database input, select its name. You can make changes to a database input using the following buttons on the input page:
- Enable/Disable: Disable an input by selecting Enable/Disable here.
- Edit: Edit the input by selecting the name or the Edit button.
- Clone: Creates a copy of the input. You must give the copy a new name.
- Delete: Deletes the input.
You can also edit any of the attributes of a database input listed in Create a database input, except its name. To change the name of an input, clone it, give the clone the name you want, and then delete the original input.
Supported rising column types by database
The following matrix summarizes what column types (varchar, number, timestamp, and so on) Splunk supports as rising column value types in DB Connect-supported databases.
Database | varchar | int, float, real, bigint, number |
timestamp | datetime | date |
---|---|---|---|---|---|
AWS RDS Aurora | x | x | x | x | x |
AWS RedShift | x | x | x | ||
DB2/Linux | x | x | x | ||
Informix * | |||||
MemSQL | x | x | x | x | x |
Microsoft SQL Server | x | x | x | x | x |
MySQL | x | x | x | x | x |
Oracle | x | x | x | x | x |
PostgreSQL | x | x | x | ||
SAP SQL Anywhere | x | x | x | ||
Sybase ASE | x | x | x | ||
Sybase IQ | x | x | x | ||
Teradata | x | x | x |
The column marked x means Splunk supports this column.
* Information for Informix databases is not currently available.
Use database inputs
Once you've configured a database input and Splunk Enterprise has indexed your data, you can use that input as you do any other data input you've defined in Splunk Enterprise. Use the Search Processing Language (SPL) to write a series of commands and arguments.
For a quick view of what an input returns to Splunk Enterprise, go to the database inputs page by selecting Data Lab>Inputs in the top navigation bar, select the name of the input to view, and then select the Find Events button. The search app opens with a pre-populated search that searches on your input.
You can search for keywords and use Boolean operators such as AND
, OR
, and NOT
, plus wildcard characters ("*
"):
manufacturing (widgets AND gadgets OR gewgaw*)
Fields in Splunk Enterprise correspond to columns in your database. Search for fields using the syntax: fieldname="field value"
source="db2input"
Chain search commands and arguments together using the pipe ("|
") character. For example, the following search retrieves indexed events from the database input db2input
that contain the term "manufacturing" and, for those events, reports the most common LOCATION
values:
manufacturing source="db2input" | top LOCATION
To refine your search further, show only the fields you want using the fields keyword. The following search only shows the five fields listed after fields, plus a timestamp, for each event retrieved:
source="db2input" | fields ADMRDEPT DEPTNAME DEPTNO LOCATION MGRNO
To learn more:
- Read the Search Tutorial. Since you've already gotten your data into Splunk Enterprise, start with Part 4, Using Splunk Search.
- See Search and reporting in the Splunk Enterprise Overview manual for a guide to documentation based on your level of familiarity with Splunk Enterprise and on what you want to do with your data.
- For information specific to SQL users, see SQL tips and tricks in this manual, and Splunk SPL for SQL users in the Splunk Enterprise Search Reference manual.
Create and manage database connections | Create and manage bulk operations of the database inputs |
This documentation applies to the following versions of Splunk® DB Connect: 3.11.0, 3.11.1
Feedback submitted, thanks!