Splunk® Data Stream Processor

Function Reference

Sentiment Analysis

Sentiment Analysis uses natural language processing to assign a sentiment label to unstructured text input. The Sentiment Analysis function classifies raw text as positive, negative, or neutral. Raw text input can include data streams such as messages, customer reviews, IT tickets, or computer logs. The function predicts the output label for each observed text sample in real time on the stream. The user may provide optional sentiment labels, which are used to improve the model incrementally as each new labeled example is ingested and observed over time.

Sentiment Analysis is useful for downstream processes. For example, a customer service application can flag negative customer reviews so that a customer service representative can contact the reviewer.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs collections of records with schema S.

Syntax

The required fields are in bold.

| analyze_sentiment value="input"

Required arguments

input
Syntax: string
Description: Name of the column containing the free text, for example, reviews or tweets.
Example: ucast(map_get('json-map', "reviewText"), "string", null)

Optional arguments

label
Syntax: double
Description: If the label is given (-1 for negative and 1 for positive), the labeled text is used to update the model. If the label is 0 or not present, the model makes an inference for the sentiment of the text without making any change to the model.
Example: cast(label1, "double")
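
The label rule above amounts to a small decision: a label of -1 or 1 updates the model, and a label of 0 or a missing label results in inference only. The following Python sketch illustrates that rule outside of any pipeline. The function name sentiment_mode and the returned strings are hypothetical and are not part of the Data Stream Processor API. How the function treats other label values is not stated in this topic, so the sketch simply rejects them.

```python
from typing import Optional


def sentiment_mode(label: Optional[float]) -> str:
    """Return "update" or "infer" for a label value (hypothetical helper)."""
    # Missing label or 0: inference only, the model is left unchanged.
    if label is None or label == 0:
        return "infer"
    # -1 (negative) or 1 (positive): the labeled text updates the model.
    if label in (-1.0, 1.0):
        return "update"
    # Behavior for any other value is not documented; this sketch rejects it.
    raise ValueError(f"unsupported label: {label!r}")
```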

Usage

For each observed data point, Sentiment Analysis computes and outputs a probability score between 0 and 1. The closer the probability score is to 1, the more likely the sentiment is negative.

Apply thresholds to the Sentiment Analysis probability score p to classify each sample as positive, negative, or neutral. For example, the following thresholds can typically be applied for each class of labels:

  • p < 0.25 can be applied to label positive sentiment
  • 0.25 < p < 0.75 can be applied to label neutral sentiment
  • p > 0.75 can be applied to label negative sentiment
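
The example thresholds can be sketched as a small Python function. This is an illustration of the thresholding logic only, not code that runs inside a pipeline. The name classify_score and its parameters are hypothetical. The list above does not say how to treat a score of exactly 0.25 or 0.75, so assigning those boundary values to the neutral class is an assumption of this sketch.

```python
def classify_score(p: float, low: float = 0.25, high: float = 0.75) -> str:
    """Map a probability score in [0, 1] to a sentiment label.

    Scores close to 1 indicate negative sentiment, as described above.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {p!r}")
    if p < low:
        return "positive"
    if p > high:
        return "negative"
    # Assumption: boundary values (exactly low or high) count as neutral.
    return "neutral"
```

The thresholds are parameters so that they can be tuned for a given data stream, for example by narrowing the neutral band when more decisive labels are preferred.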

Sentiment Analysis is useful for monitoring the overall sentiment of text over time. A sudden change in overall sentiment may indicate an important shift in user behavior or customer satisfaction. It is possible to apply downstream operators, like Drift Detection or Anomaly Detection, on the output of the Sentiment Analysis model.

For example, it is possible to identify shifts in average sentiment in a stream of user feedback by applying the Drift Detection model to the average sentiment score on a rolling window. It is also possible to identify outliers that represent extremely negative or extremely positive reviews by applying an anomaly detection model, like Adaptive Thresholding, to the numeric stream of sentiment scores.
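
The idea of averaging sentiment scores on a rolling window can be sketched in Python as follows. The function names, the window size, the baseline, and the tolerance are all hypothetical. The fixed-baseline comparison in flag_shifts is a deliberately naive stand-in for illustration and is not the algorithm used by Drift Detection or Adaptive Thresholding.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def rolling_mean(scores: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the most recent `window` scores once the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf: Deque[float] = deque(maxlen=window)  # oldest score is dropped automatically
    for score in scores:
        buf.append(score)
        if len(buf) == window:
            yield sum(buf) / window


def flag_shifts(means: Iterable[float], baseline: float, tolerance: float) -> List[int]:
    """Return the positions of rolling means that differ from the baseline by more than the tolerance."""
    return [i for i, m in enumerate(means) if abs(m - baseline) > tolerance]
```

For example, flag_shifts(rolling_mean(scores, 50), baseline=0.3, tolerance=0.2) would list the windows whose average score moved more than 0.2 away from an assumed typical value of 0.3.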

SPL2 example

The following example uses Sentiment Analysis on review text:

| from splunk_firehose() 
| eval json=cast(body, "string"), 
	'json-map'=from_json_object(json), 
	input=ucast(map_get('json-map', "reviewText"), "string", null),
	label1=ucast(map_get('json-map', "label"), "integer", null),
	label=cast(label1, "double"),
	key="" 
| analyze_sentiment value="input";
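
The eval stage of the example can be read as the following Python sketch, which parses a JSON event body and derives the input and label fields. It is an approximation for illustration only and does not run in the Data Stream Processor. The function name prepare_record is hypothetical, and the sketch assumes that a value that is missing or of an unexpected type becomes null (None), mirroring the null default passed to ucast in the example.

```python
import json
from typing import Any, Dict, Optional


def prepare_record(body: str) -> Dict[str, Any]:
    """Build the `input` and `label` fields from a JSON-encoded event body."""
    json_map = json.loads(body)  # assumes the body is a JSON object
    review = json_map.get("reviewText")
    raw_label = json_map.get("label")
    # Assumption: values of an unexpected type fall back to None (null).
    text: Optional[str] = review if isinstance(review, str) else None
    is_int = isinstance(raw_label, int) and not isinstance(raw_label, bool)
    # The integer label is converted to a double, as in the SPL2 example.
    label: Optional[float] = float(raw_label) if is_int else None
    return {"input": text, "label": label, "key": ""}
```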
Last modified on 30 October, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0

