Use fields to search
This topic assumes you know how to run simple searches and use the time range picker and timeline. If you're not sure, review the previous topics, beginning with Start searching.
You can learn a lot about your data from just running ad hoc searches, using nothing more than keywords and the time range. But you can't take full advantage of Splunk's more advanced searching and reporting features without understanding what fields are and how to use them. This part of the tutorial will familiarize you with:
- default fields and other fields that Splunk automatically extracts
- using the fields sidebar and Fields dialog to find helpful fields
- searching with fields
Let's return to the happenings at the online Flower and Gift shop. You spent the morning investigating some general issues and reporting the problems you found to other teams. You feel pretty good about what you've learned about the online shop and its customers, but you want to capture this and share it with your team.
The best way to do this is to use fields.
Briefly, about fields
What are fields
Fields exist in machine data in many forms. Often, a field is a value (with a fixed, delimited position on the line) or a name and value pair, where there is a single value to each field name. A field can also be multivalued; that is, it appears more than once in an event and has a different value for each appearance.
In Splunk, fields are searchable name/value pairings that distinguish one event from another because not all events will have the same fields and field values. Fields enable you to write more tailored searches to retrieve the specific events that you want. Fields also enable you to take advantage of the search language, create charts, and build reports.
Some examples of fields are clientip for IP addresses accessing your Web server, _time for the timestamp of an event, and host for the domain name of a server.
One of the more common examples of multivalue fields is email address fields. While the "From" field will contain only a single email address, the "To" and "Cc" fields may have one or more email addresses associated with them.
For more information (and there's a lot more), read About fields in the Knowledge Manager manual.
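As a rough mental model only (this is not how Splunk stores events internally), you can picture an event's fields as a mapping from field names to one or more values. The Python sketch below uses a made-up email event; the field names, the values, and the helpers `is_multivalue` and `has_value` are assumptions for illustration.

```python
# Illustrative model only: an event pictured as field name -> list of values.
# Every name and value here is made up for the example.
event = {
    "host": ["mail.example.com"],
    "from": ["alice@example.com"],
    "to": ["bob@example.com", "carol@example.com"],
}

def is_multivalue(event, field):
    """Return True if the field holds more than one value in this event."""
    return len(event.get(field, [])) > 1

def has_value(event, field, value):
    """Return True if any value of the field equals the given value."""
    return value in event.get(field, [])
```

In this model, `is_multivalue(event, "to")` is true while `is_multivalue(event, "from")` is false, which mirrors the "To" versus "From" distinction described above.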
Splunk extracts fields from event data twice. It extracts default and other indexed fields during event processing when that data is indexed. And it extracts a different set of fields at search time, when you run a search. Read more about "Index time versus search time" in the Managing Indexers and Clusters manual.
At index time, Splunk automatically finds and extracts default fields for each event it processes. These fields include host, source, and sourcetype (which you should already be familiar with). For a complete list of the default fields, see "Use default fields" in the Knowledge Manager Manual.
Splunk also extracts certain fields at search time--when you run a search. You'll see some examples of these searches later. For more information, read the "Overview of search-time field extractions" in the Knowledge Manager manual.
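To make the idea of search-time extraction more concrete, here is a minimal Python sketch that pulls a few fields out of raw event text with a regular expression. It is an analogy, not Splunk's extraction mechanism. The sample line, the pattern, and the helper `extract_fields` are all assumptions made for illustration.

```python
import re

# Hypothetical sample line in Apache access_combined style; all values are made up.
LINE = ('10.2.1.44 - - [16/Jan/2012:18:24:02] '
        '"GET /cart.do?action=purchase HTTP/1.1" 200 3120 '
        '"http://www.example.com/" "Mozilla/5.0"')

# Named groups play the role of field names in this sketch.
PATTERN = re.compile(
    r'^(?P<clientip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" '
    r'(?P<status>\d{3}) '
)

def extract_fields(raw):
    """Return a dict of fields pulled from raw event text, or {} if no match."""
    match = PATTERN.match(raw)
    return match.groupdict() if match else {}

fields = extract_fields(LINE)
```

The raw text is stored once, and the name/value pairs are derived from it only when you ask for them. That is the general idea behind extracting fields at search time rather than at index time.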
1. Go back to the Search dashboard and search for web access activity. Select Other > Yesterday from the time range picker:
sourcetype=access_*
You were actually using fields all along! Each time you searched for sourcetype=access_*, you told Splunk to retrieve only events from your web access logs and nothing else.
To search for a particular field, specify the field name and value, separated by an equals sign. For example, in sourcetype=access_combined_wcookie, sourcetype is a field name and access_combined_wcookie is a field value. In sourcetype=access_*, the wildcarded value is used to match all field values beginning with access_ (which would include access_common, access_combined, and access_combined_wcookie).
Note: Field names are case sensitive, but field values are not!
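The matching rules above (exact field name, case-insensitive value, `*` as a wildcard) can be sketched in a few lines of Python. This is a simplified illustration rather than Splunk's implementation. The `matches` helper and the sample events, including the `secure` sourcetype, are assumptions.

```python
from fnmatch import fnmatchcase

def matches(event, field, pattern):
    """Sketch of a field=value test against one event (a dict of fields).

    The field name is looked up exactly (case sensitive). The value is
    compared case-insensitively, with * as a wildcard. Note that fnmatch
    also gives ? and [...] special meaning; only * is used here.
    """
    if field not in event:
        return False
    return fnmatchcase(event[field].lower(), pattern.lower())

# Made-up events for the example.
events = [
    {"sourcetype": "access_combined_wcookie"},
    {"sourcetype": "access_common"},
    {"sourcetype": "secure"},
]

web_events = [e for e in events if matches(e, "sourcetype", "access_*")]
```

Here `web_events` keeps the two `access_` events and drops the third. Searching on `SourceType` instead of `sourcetype` would match nothing in this sketch, because the field name must match exactly.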
2. Scroll through the search results.
If you're familiar with the access_combined format of Apache logs, you will recognize some of the information in each event, such as:
- IP addresses for the users accessing the website.
- URIs and URLs for the page request and referring page.
- HTTP status codes for each page request.
- Page request methods.
As Splunk retrieves these events, the Fields sidebar updates with selected fields and interesting fields. These are the fields that Splunk extracted from your data.
Notice that default fields host, source, and sourcetype are selected fields and are displayed in your search results:
3. Scroll through interesting fields to see what else Splunk extracted.
You should recognize the field names that apply to the Web access logs. For example, there are clientip, method, and status. These are not default fields; they have (most likely) been extracted at search time.
4. Click the Edit link in the fields sidebar.
The Fields dialog opens and displays all the fields that Splunk extracted.
- Available Fields are the fields that Splunk identified from the events in your current search (some of these fields were listed under interesting fields).
- Selected Fields are the fields you picked (from the available fields) to show in your search results (by default, host, source, and sourcetype are selected).
5. Scroll through the list of Available Fields.
You're already familiar with the fields that Splunk extracted from the Web access logs based on your search. You should also see other default fields that Splunk defined--some of these fields are based on each event's timestamp (everything beginning with date_*), punctuation (punct), and location.
You should also notice other extracted fields that are related to the online store. For example, there are action, category_id, and product_id. From conversations with your coworker, you may know that these fields are:
|action|what a user does at the online shop.|
|category_id|the type of product a user is viewing or buying.|
|product_id|the catalog number of the product the user is viewing or buying.|
6. From the Available fields list, select action, category_id, and product_id.
7. Click Save.
When you return to the Search view, the fields you selected will be included in your search results if they exist in that particular event. Different events will have different fields.
The fields sidebar doesn't just show you what fields Splunk has captured from your data. It also displays how many values exist for each of these fields. For the fields you just selected, there are 2 for action, 5 for category_id, and 9 for product_id. This doesn't mean that these are all the values that exist for each of the fields--these are just the values that Splunk knows about from the results of your search.
What are some of these values?
8. Under selected fields, click action.
This opens the field summary for the action field.
This window tells you that, in this set of search results, Splunk found two values for action: purchase and update. It also tells you that the action field appears in 71% of your search results. This means that nearly three-quarters of the Web access events are related to the purchase of an item or an update (of the item quantity in the cart, perhaps).
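The two numbers in the field summary (how many distinct values a field has, and what share of the results contain the field at all) are simple to compute. The Python sketch below shows one way to derive them from a set of results. The `field_summary` helper and the seven sample events are made up for illustration and are not taken from the tutorial data.

```python
from collections import Counter

def field_summary(events, field):
    """Return (value counts, percent of events that contain the field)."""
    values = [e[field] for e in events if field in e]
    coverage = 100.0 * len(values) / len(events) if events else 0.0
    return Counter(values), coverage

# Made-up result set: 5 of 7 events carry an action field.
results = [
    {"action": "purchase"}, {"action": "update"}, {"action": "purchase"},
    {"action": "purchase"}, {"action": "update"}, {}, {},
]

counts, coverage = field_summary(results, "action")
```

Both numbers describe only the events passed in. A different search or time range would produce a different result set, and possibly different values and a different percentage.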
9. Close this window and look at the other two fields you selected, category_id (what types of products the shop sells) and product_id (specific catalog numbers for products).
Now you know a little bit more about the information in your data relating to the online Flower and Gift shop. The online shop sells a selection of flowers, gifts, plants, candy, and balloons. Let's use these fields, category_id and product_id, to see what people are buying.
Use fields to run more targeted searches
The next two examples compare the results of searching with and without fields.
Return to the search you ran to check for errors in your data. Select Other > Yesterday from the time range picker:
error OR failed OR severe OR (sourcetype=access_* (404 OR 500 OR 503))
Run this search again, but this time, use fields in your search.
The HTTP error codes are values of the status field. Now your search looks like this:
error OR failed OR severe OR (sourcetype=access_* (status=404 OR status=500 OR status=503))
Notice the difference in the count of events between the two searches--because it's a more targeted search, the second search returns fewer events.
When you run simple searches based on arbitrary keywords, Splunk matches the raw text of your data. When you add fields to your search, Splunk looks for events that have those specific field/value pairs.
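A small Python sketch can show why the keyword search returns more events than the field search. It is deliberately simplified: it treats a keyword match as a plain substring test, whereas Splunk's own matching of raw text is more involved. The helpers, the `bytes` field name, and the two sample events are assumptions.

```python
def keyword_match(raw, keyword):
    """Keyword search: look for the term anywhere in the raw event text."""
    return keyword.lower() in raw.lower()

def field_match(fields, name, value):
    """Field search: the event must carry this exact field/value pair."""
    return fields.get(name) == value

# Made-up events: raw text plus the fields assumed to be extracted from it.
events = [
    {"raw": '"GET /missing.html HTTP/1.1" 404 512',
     "fields": {"status": "404", "bytes": "512"}},
    {"raw": '"GET /index.html HTTP/1.1" 200 404',
     "fields": {"status": "200", "bytes": "404"}},
]

by_keyword = [e for e in events if keyword_match(e["raw"], "404")]
by_field = [e for e in events if field_match(e["fields"], "status", "404")]
```

The keyword test matches both events, because 404 appears in the second one as a byte count. The field test matches only the event whose status is actually 404.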
Before you learned about the fields in your data, you might have run this search to see how many times flowers were purchased from the online shop:
sourcetype=access_* purchase flower*
As you type "flower", the search assistant shows you both "flower" and "flowers" in the typeahead. Since you don't know which one you want, you use the wildcard to match both.
If you scroll through the (many) search results, you'll see that some of the events have a category_id value other than flowers. These are not the events that you wanted!
Run this search instead. Select Other > Yesterday from the time range picker:
sourcetype=access_* action=purchase category_id=flower*
For the second search, even though you still used the wildcarded word "flower*", there is only one value of category_id that it matches (flowers).
Notice the difference in the number of events that Splunk retrieved for each search; the second search returns significantly fewer events. Searches with fields are more targeted and retrieve more exact matches against your data.
Now that you know how to use fields, you can start using the search language to filter, modify, reorder, and group your search results. When you're ready, proceed to the next topic and learn how to use the search language.