Create and manage database inputs
A database input object lets you fetch and index data from a database. Database inputs are what enable Splunk Enterprise to query your database, identify the data to consume, and then tag and index the data. Once you have set up a database input, you can use that in the same way that you use other data input you have defined in Splunk Enterprise.
Splunk DB Connect 3.1.0 supports creating input(s) using the template(s) you create in Splunk Add-ons, you’ll be saved the work of repeatedly recreating basic content of inputs. The templates of the add-on will be listed in the template field of DB Connect. These add-on includes:
- Splunk Add-on for Microsoft SQL Server
- Splunk Add-on for Oracle Database
- Splunk Add-on for McAfee
- Splunk Add-on for Nagios Core
Create a database input
You have two ways to create a new input, either creating it from scratch or using a template.
Follow the instruction below to create a database input in DB Connect.
- If you have not yet created an identity or a connection, you are prompted to do so before you can create a database input.
- If you create the input by using a template, all the settings from the template will be set for the new input. You can change them based on your needs, be aware that the changes will be saved to the input you create but not to the template.
To create a new input, click Data Lab > Inputs and then New Input.
- On the Set SQL Query page, complete the following steps and then click Next to go to the Set Properties page.
- Choose table. Choose the database table you want to use with this input.
- Specify SQL query. Specify a query to run to fetch data from your database.
- Choose input type. Specify the input type and related settings of this input.
- On the Set Properties page, complete the following steps and then click Finish.
- Basic information. Specify the name, description and the application of this input.
- Parameter settings. Configure the fetch size, execution frequency and max row to retrieve of this input.
- Metadata. Specify the metadata of this input, the value will be used by Splunk to index your data events.
- Choose a connection that you want to use for this input from the drop-down list under the Connection field. The drop-down list lists all the connections you have configured in DB Connect, you can also enter the connection name to search the connection you want to use.
Once you have select the connection, Splunk DB Connect will try to validate the connection, and will display an error message if it is not able to do so. You cannot continue the new input setup process unless a valid connection is specified.
- From the corresponding dialog menus, choose the Catalog, Schema, and Table that contain the data you want to pull into Splunk platform.
Specify SQL query
After you choose the table, the corresponding SQL query will be displayed in SQL Editor, you can preview the result of the query. If you need further edit or write your own SQL query, you can write it directly in SQL Editor and click Execute SQL to preview the result. You can make complex SQL statements easier to read by clicking Format SQL.
Note: If you want to use SQL query from template, select the template from the drop-down menu. Be aware that using a template will overwrite the previous SQL you specified.
Choose input type
Specify an input type for your query, either Batch or Rising. Then configure the related fields for the input type.
- Batch. A batch input invokes the same database query each time the input is run and returns all results.
- Rising. A rising input has a column that DB Connect uses to keep track of what rows are new from one input execution to the next.
A batch input invokes the same database query each time the input is run and returns all results. It does not keep track of whether rows have been added since the last time the input was run. Batch input mode is ideal for unchanging historical data that will be indexed into Splunk once. It can also be useful for re-indexing data that updates through time, such as personnel records or state tables, though this has license impact; lookups may be a better choice.
To create a batch input type,
- Select Batch under Input Type field.
- Choose the Timestamp column.
Specify which column contains the timestamp that will be used to order this data in the Splunk index. This value will be used to populate the
_time variable and is required for indexing time-series data.
- Current Index Time: Assigns indexed data a timestamp value that is equal to index time. Specify this option if your data has no timestamp.
- Choose Column: Select the column that contains the timestamp value. If this column is in a format which cannot be parsed as a timestamp by DB Connect, you will be shown a field where a Java SimpleDateTimeFormat compatible parser can be entered so that Splunk can understand the timestamp value.
- Column: This option only appears if you selected Choose Column in the previous option. Choose the column that contains the timestamps you want to use.
- Query timeout: Enter the number of seconds to wait for the query to complete. The default is 30 if you leave it blank.
A rising input has a column that DB Connect uses to keep track of what rows are new from one input execution to the next. When you create a rising input type, you must specify the rising column. You can specify rising column as any column whose value increases or decreases over time, such as a timestamp or sequential ID. For example, you can use columns such as row_id, transaction_id, employee_id, customer_id, last_updated, and so on.
Note that timestamps are not ideal for rising columns, though they often are the best available choice. Using a timestamp for rising column can produce the following problem conditions:
- A high rate of event generation can cause data duplication or loss, because checkpointing in a stream of timestamp-distinguished records assumes there is never more than one row created in a given time. If you set the time to a one second level of resolution and get five records per second, you lose or duplicate four records from every run.
- Clock skew, NTP skew corrections, physical moves between timezones, and daylight savings events can cause data mis-ordering, duplication, or loss. If the skew is towards the future, then the resulting checkpoint value may temporarily or permanently stop data collection.
- Non-numeric datetime values cannot be evaluated numerically, and lexical sorting can produce unpredictable results. If time series data is ordered lexically, then the resulting checkpoint value may temporarily or permanently stop data collection.
To create a rising input, select Rising and configure the following fields:
- Rising column: The rising column is the column from which DB Connect will update the checkpoint value each time the input is run.
- Checkpoint value: The checkpoint value is how DB Connect determines what rows are new from one input execution to the next. The first time the input is run, DB Connect will only select those rows that contain the value higher or lower than the value you specified in this column. Each time the input is finished running, DB Connect updates the input's checkpoint value with the value in the last row of the checkpoint column.
- Update the SQL query that includes a ? symbol and order by clause for the checkpoint value. The question mark (?) as the checkpoint placeholder and an order by clause. Every time the input is run, DB Connect replaces the question mark with the latest checkpoint value .
- From DB Connect 3 and later, the rising column checkpoints of the input are stored in
$SPLUNK_HOME/var/lib/splunk/modinputs/server/splunk_app_db_connect. For each input, it creates a separate checkpoint file.
- To see what column types (varchar, number, timestamp, and so on) are supported as rising column value types in DB Connect-supported databases, see supported rising column types by database.
With a rising column mode input, you could:
- Effectively create a "descending column," or a column wherein the checkpoint decreases every time the input runs.
SELECT abc, xyz FROM table WHERE abc < ? ORDER BY abc DSC
- Customize the comparison operation to determine the checkpoint value by evaluating a SQL function result, while simultaneously avoiding using query wrapping.
SELECT CONCAT(last_name, first_name) as NAME, ACTOR_ID, FIRST_NAME, LAST_NAME FROM actor WHERE CONCAT(last_name, first_name) > ? ORDER BY CONCAT(last_name, first_name)
- Customize the WHERE clause to add additional conditions other than the comparison expression that is specified in rising column inputs.
SELECT *FROM (SELECT * FROM ACTOR WHERE ACTOR_ID > ?) ACTOR JOIN FILM_ACTOR ON actor.ACTOR_ID = FILM_ACTOR.ACTOR_ID JOIN FILM ON FILM_ACTOR.FILM_ID = FILM.FILM_ID ORDER BY ACTOR.ACTOR_ID
- Use other advanced SQL features in the WHERE clause—for example a CASE statement.
After the input type is specified, click Next to go to the Set Properties page.
Configure the following fields,
- Name: Specify the name of the input. Be aware the input name cannot contain space or special characters.
- Description: Optional. The description of the input.
- Application: The name of the Splunk Enterprise app where this input object will be saved. By default, the pop-up menu is set to Splunk DB Connect. This menu enables other apps to use DB Connect inputs, outputs, and lookups within their own context.
Complete the following fields,
- Max Rows to Retrieve: Optional. The maximum number of rows to retrieve with each query. If you set this to 0 or leave it blank, it will be unlimited.
- Fetch size: Optional. The number of rows to return at a time from the database. The default is 300 if you leave it blank.
- Execution Frequency: The number of seconds or a valid cron expression i.e. 0 18 * * * (every day at 6PM).
The metadata fields are used by Splunk to index your data events. As you search the indexed data using Splunk Enterprise, data from the query can be identified by fields value you enter here.
- Host: Optional. The host defined on the connection will be used if you leave it blank.
- Source: Optional. The input name will be used if you leave it blank.
- Source type: Enter a sourcetype field value for Splunk Enterprise to assign to queried data as it is indexed. Click the field and enter a value, or choose an existing value from the menu that appears.
- Index: Enter an index value for the index in which you want Splunk Enterprise to store indexed data. You can enter the index name or choose it from the typeahead menu.
Note: If you want to use the customized index, you have to make sure the index exists in Splunk platform. Otherwise you have to create the index in Splunk Enterprise first to prevent data loss.
Edit database inputs
To see a list of the defined database inputs, first click the Data Lab>Inputs tab. You will see a list of your database inputs.
To edit a database input, click its name. You can make changes to a database input using the following buttons on the input page:
- Enable/Disable: Disable an input by clicking Enable/Disable here.
- Edit: Edit the input by clicking the name or the Edit button.
- Clone: Creates a copy of the input. You must give the copy a new name.
- Delete: Deletes the input.
You can also edit any of the attributes of a database input listed in Create a database input, except its name. To change the name of an input, clone it, give the clone the name you want, and then delete the original input.
Supported rising column types by database
The following matrix summarizes what column types (varchar, number, timestamp, and so on) are supported as rising column value types in DB Connect-supported databases.
|Database||varchar||int, float, real,
|AWS RDS Aurora||x||x||x||x||x|
|Microsoft SQL Server||x||x||x||x||x|
|SAP SQL Anywhere||x||x||x|
The column marked x means this column is supported.
* Information for Informix databases is not currently available.
Use database inputs
Once you've configured a database input and your data has been indexed by Splunk Enterprise, you can use that input just as you do any other data input you've defined in Splunk Enterprise. Use the Search Processing Language (SPL) to write a series of commands and arguments.
For a quick view of what an input returns to Splunk Enterprise, go to the database inputs page by clicking Data Lab>Inputs in the top navigation bar, click the name of the input to view, and then click the Find Events button on the right side of the screen. The search app opens with a pre-populated search that searches on your input.
You can search for keywords and use Boolean operators such as
NOT, plus wildcard characters ("
manufacturing (widgets AND gadgets OR gewgaw*)
Fields in Splunk Enterprise correspond to columns in your database. Search for fields using the syntax:
Chain search commands and arguments together using the pipe ("
|") character. For example, the following search retrieves indexed events from the database input
db2input that contain the term "manufacturing" and, for those events, reports the most common
manufacturing source="db2input" | top LOCATION
To refine your search further, show only the fields you want using the fields keyword. The following search only shows the five fields listed after fields, plus a timestamp, for each event retrieved:
source="db2input" | fields ADMRDEPT DEPTNAME DEPTNO LOCATION MGRNO
Of course, you can do much more with search in Splunk Enterprise. To learn more:
- Read the Search Tutorial. Since you've already gotten your data into Splunk Enterprise, start with Part 4, Using Splunk Search.
- See Searching and Reporting for a guide to documentation that is based on your level of familiarity with Splunk Enterprise and on what you want to do with your data.
- For information specific to SQL users, see SQL tips and tricks in this manual, and Splunk SPL for SQL users in the Splunk Enterprise Search Reference manual.
Create and manage database connections
Bulk operations of the database inputs
This documentation applies to the following versions of Splunk® DB Connect: 3.1.0, 3.1.1, 3.1.2
Feedback submitted, thanks!