Query LLM with vector data

After pulling an LLM model to your local Docker container and encoding document or log data into the vector database, you can run inferences using the LLM. See Encode data into a vector database.

Before you begin, make sure you have encoded documents or log data into the vector database.

The Querying LLM with Vector Data menu includes the following options:

  • Standalone LLM: A one-shot Q&A agent that answers questions based on prior knowledge within the model's training data.
  • RAG-based LLM: Uses additional knowledge that has been encoded in the vector database.
  • LLM with Function Calling: Runs predefined functions to acquire additional information and generate answers.
  • Manage your LLMs: List, pull, and delete LLM models in your on-premises environment.

Standalone LLM

You can use the Standalone LLM to perform text-based classification or summarization by passing a text field to the algorithm along with a prompt that states the task.

Go to the Standalone LLM page:

  1. In DSDL, go to Assistants.
  2. Select LLM-RAG, then Querying LLM with Vector Data, and then Standalone LLM.

Parameters

The Standalone LLM page has the following parameters:

Parameter name Description
model_name The name of an LLM model that exists in your environment.
prompt A prompt that explains the task to the LLM model.

Run the fit or compute command

Use the following syntax to run the fit command or the compute command:

  • Run the fit command:
    | makeresults | eval text = "Email text: Click to win prize"
    | fit MLTKContainer algo=llm_rag_ollama_text_processing model_name="llama3" prompt="You will examine if the following email content is phishing." text into app:llm_rag_ollama_text_processing
    
  • Run the compute command:

    To use the Standalone LLM, append the compute command to a search pipeline that generates a table with a field called text, as shown in the example after this list.

    | makeresults | eval text = "Email text: Click to win prize"
    | compute algo:llm_rag_ollama_text_processing model_name:llama3 prompt:"You will examine if the following email content is phishing." text
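
For example, you can run the Standalone LLM over indexed events instead of sample text. The following is a minimal sketch: the index, sourcetype, and body field are hypothetical placeholders for your own data.

    index=mail sourcetype=email_events ```hypothetical index and sourcetype```
    | table body
    | rename body as text ```the compute command expects a field called text```
    | compute algo:llm_rag_ollama_text_processing model_name:llama3 prompt:"You will examine if the following email content is phishing." text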
    

Dashboard view

The following image shows the dashboard view for the Standalone LLM page:

This image shows an example of an Inference with Standalone LLM dashboard.

The dashboard view includes the following components:

Dashboard component Description
Search bar Create a search pipeline that generates a table with a field called text that contains the text you want to analyze. If there is no specific text to use, keep the default search string and run the search on that instead.
Select LLM Model The name of an LLM model that exists in your environment. If no model is shown in the dropdown menu, go to the LLM management page to pull models.
Prompt Write a prompt explaining the task to the LLM. For example, "Is the following email phishing?"
Run Inference Select to start the inference after completing all the inputs.
Refresh Page Reset all the tokens on this dashboard.
Return to Menu Return to the main menu.

RAG-based LLM

In the Splunk App for Data Science and Deep Learning (DSDL), navigate to Assistants, then LLM-RAG, then Querying LLM with Vector Data, and then RAG-based LLM.

Parameters

The RAG-based LLM page has the following parameters:

Parameter name Description
model_name The name of an LLM model that exists in your environment.
embedder_name The name of the sentence-transformers embedding model. Use all-MiniLM-L6-v2 for English and use intfloat/multilingual-e5-large for other languages. The embedder must be consistent with the one used for encoding.
use_local Whether the embedding models are stored locally. When set to 0, the embedder model is downloaded from the internet. When set to 1, the command assumes that the embedder model files are stored in /srv/app/model/data/. For more details, see Set up LLM-RAG in an air-gapped environment.
embedder_dimension Dimensionality of the vector produced by the embedder model. Set to 384 for all-MiniLM-L6-v2 and 1024 for intfloat/multilingual-e5-large.
collection_name The name of the collection that stores the vectors. The name must start with a letter or number and contain no spaces. Make sure to use the same embedder model that was used to create the collection.
top_k Number of document pieces to retrieve for generation.
rag_type Type of data in the vector collection to search on. Use Documents if using document data. Use Logs if using log data.

Run the fit or compute command

Use the following syntax to run the fit command or the compute command:

  • Run the fit command:
    | makeresults 
    | eval query = "Tell me more about the Buttercup online store architecture" 
    | fit MLTKContainer algo=llm_rag_script model_name="llama3" embedder_name="all-MiniLM-L6-v2" use_local=0 embedder_dimension=384 collection_name="document_collection_example" top_k=5 rag_type=Documents query into app:llm_rag_script as RAG
    
  • Run the compute command:
    | makeresults 
    | eval query = "Tell me more about the Buttercup online store architecture" 
    | compute algo:llm_rag_script model_name:llama3 embedder_name:"all-MiniLM-L6-v2" use_local:1 embedder_dimension:384 collection_name:"document_collection_example" top_k:5 rag_type:Documents query
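
If your collection contains encoded log data rather than documents, set rag_type to Logs. The following is a minimal sketch; the collection name is a hypothetical placeholder, and the embedder settings must match the ones used when the collection was encoded:

    | makeresults 
    | eval query = "Find log messages related to failed login attempts" 
    | compute algo:llm_rag_script model_name:llama3 embedder_name:"intfloat/multilingual-e5-large" use_local:0 embedder_dimension:1024 collection_name:"log_collection_example" top_k:5 rag_type:Logs query ```log_collection_example is a hypothetical collection name```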
    

Dashboard view

The following image shows the dashboard view for the RAG-based LLM page:

This image shows an example of an LLM with Retrieval-Augmented Generation (RAG) dashboard.

The dashboard view includes the following components:

Dashboard component Description
Collection Name An existing collection you want to use for the LLM-RAG.
Embedder Name Choose between English and Multilingual based on your use case. The embedder model must be consistent with the one used to create the selected collection.
Select LLM Model The name of an LLM model that exists in your environment. If no model is shown in the dropdown menu, go to the LLM management page to pull models.
Use Local Embedder Whether the embedder has been downloaded and saved on your Docker volume.
Number of docs to retrieve Number of document pieces or log messages to use in the RAG.
Input your query Write your query in the text box.
Next Submit the inputs and move on to the query input.
Query Select after entering your query to start Retrieval-Augmented Generation (RAG).
Refresh Page Reset all the tokens on this dashboard.
Return to Menu Return to the main menu.

LLM with Function Calling

There are two built-in function tools for the model to use:

  • Function 1: Search Splunk events
  • Function 2: Search Milvus database

You can configure which functions to enable within the command, and customize the functions for specific use cases.

In the Splunk App for Data Science and Deep Learning (DSDL), navigate to Assistants, then LLM-RAG, then Querying LLM with Vector Data, and then LLM with Function Calling.


Parameters

The LLM with Function Calling page has the following parameters:

Parameter name Description
prompt The question or task for the LLM, written in natural language.
model_name The name of an LLM model that exists in your environment.
func1 Whether to use function 1, the Splunk event search. Set to 1 to turn it on and 0 to turn it off.
func2 Whether to use function 2, the Milvus vector database search. Set to 1 to turn it on and 0 to turn it off.

Run the fit or compute command

Use the following syntax to run the fit command or the compute command:

  • Run the fit command:
    | makeresults 
    | fit MLTKContainer algo=llm_rag_function_calling prompt="Search Splunk for index _internal and sourcetype splunkd for events containing keyword error from 60 minutes ago to 30 minutes ago. Tell me how many events occurred" model_name="llama3" func1=1 func2=0 _time into app:llm_rag_function_calling as RAG
    
  • Run the compute command:
    | makeresults 
    | compute algo:llm_rag_function_calling prompt:"Search Splunk for index _internal and sourcetype splunkd for events containing keyword error from 60 minutes ago to 30 minutes ago. Tell me how many events occurred" model_name:mistral func1:1 func2:0 _time
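
To have the model search the vector database instead of Splunk events, turn on function 2 and turn off function 1. The following is a sketch only; the prompt and the collection contents it refers to are illustrative assumptions:

    | makeresults 
    | compute algo:llm_rag_function_calling prompt:"Search the vector database for records about the Buttercup online store architecture and summarize the findings" model_name:llama3 func1:0 func2:1 _time ```illustrative prompt; assumes a populated vector collection```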
    

Dashboard view

The following image shows the dashboard view for the LLM with Function Calling page:

This image shows an example of an LLM with Function Calling dashboard.

The dashboard view includes the following components:

Dashboard component Description
Select LLM Model The name of an LLM model that exists in your environment. If no model is shown in the dropdown menu, go to the LLM management page to pull models.
Function - Splunk Event Search Whether to use the function tool for searching Splunk events. The remote search token setting on the DSDL setup page is required for this function to work properly.
Function - VectorDB Record Search Whether to use the function tool for searching vectorDB collections.
Input your query Write your query in the text box.
Next Submit the inputs and move on to the query input.
Query Select after entering your query to start Retrieval-Augmented Generation (RAG).
Refresh Page Reset all the tokens on this dashboard.
Return to Menu Return to the main menu.

Manage your LLMs

In DSDL, go to Assistants, then LLM-RAG, then Querying LLM with Vector Data, and then Manage your LLMs.

Parameters

The Manage your LLMs page has the following parameters:

Parameter name Description
task The specific task for management. Choose LIST to list available models, PULL to download a model, and DELETE to delete a model.
model_type Type of model to pull or delete. Choose from LLM and embedder_model. Downloaded embedder models are stored under /srv/app/model/data.
model_name The specific model name. This field is required for downloading or deleting models.

Run the fit or compute command

Use the following syntax to run the fit command or the compute command:

  • Run the fit command:
    | makeresults 
    | fit MLTKContainer algo=llm_rag_ollama_model_manager task=pull model_type=LLM model_name=mistral _time into app:llm_rag_ollama_model_manager
    
  • Run the compute command:
    | makeresults 
    | compute algo:llm_rag_ollama_model_manager task:list model_type:LLM _time
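
You can manage embedder models and delete models with the same commands. The following sketches assume the model names shown earlier in this topic:

  • Pull an embedder model, which is stored under /srv/app/model/data:
    | makeresults 
    | fit MLTKContainer algo=llm_rag_ollama_model_manager task=pull model_type=embedder_model model_name="intfloat/multilingual-e5-large" _time into app:llm_rag_ollama_model_manager

  • Delete an LLM model:
    | makeresults 
    | compute algo:llm_rag_ollama_model_manager task:delete model_type:LLM model_name:mistral _time ```removes the mistral model pulled in the fit example```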
    

Dashboard view

The following image shows the dashboard view for the Manage your LLMs page:

This image shows an example of an LLM Model Management dashboard.

The dashboard view includes the following components:

Dashboard component Description
Task The task to perform. Choose PULL to download a model and DELETE to delete a model.
Model Type Type of model to pull or delete. Choose from LLM and embedder_model. The downloaded embedder models are saved under /srv/app/model/data.
Model Name The name of the model to perform the task on.
Submit Select to run the task.
Refresh Page Reset all the tokens on this dashboard.
Return to Menu Return to the main menu.