# Knowledge management agent
The FoundationaLLM (FLLM) Knowledge Management agent type supports the following scenarios:
- With an Inline Context: Knowledge Management agents with an Inline Context pass the user's prompt directly to the Large Language Model (LLM).
- Without an Inline Context: Knowledge Management agents without an Inline Context implement the Retrieval Augmented Generation (RAG) design pattern. RAG augments the user prompt with additional context to generate a more accurate response: the flow uses a retrieval model to retrieve relevant documents from a knowledge base, such as a vector store, and then uses the retrieved documents to augment the user prompt before sending it to the LLM (see the sketch after this list).
- Creating a Knowledge Management agent without an Inline Context requires an existing knowledge base, such as a vector store. Use the Vectorization API to create a vector store before creating the agent.
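To make the flow concrete, here is a minimal Python sketch of the generic RAG pattern. It is illustrative only, not FLLM's internal implementation; `embed`, `vector_search`, and `llm` are hypothetical callables standing in for the text embedding model, the vector store query, and the LLM call.

```python
from typing import Callable

def rag_completion(
    user_prompt: str,
    embed: Callable[[str], list[float]],          # hypothetical embedding model
    vector_search: Callable[[list[float], int], list[str]],  # hypothetical vector store query
    llm: Callable[[str], str],                    # hypothetical LLM call
    top_k: int = 3,
) -> str:
    """Augment a user prompt with retrieved context before calling the LLM."""
    # 1. Embed the user prompt with the same embedding model that populated the index.
    query_vector = embed(user_prompt)
    # 2. Retrieve the most relevant documents from the knowledge base (vector store).
    documents = vector_search(query_vector, top_k)
    # 3. Augment the user prompt with the retrieved documents.
    augmented_prompt = "\n\n".join(["Context:", *documents, f"Question: {user_prompt}"])
    # 4. Send the augmented prompt to the LLM and return its response.
    return llm(augmented_prompt)
```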
## Knowledge Management Agent Configuration
The Knowledge Management agent configuration may reference the following resources:

- Vectorization text embedding profile: the configuration of the text embedding model used to embed the user prompt and perform a vector search in the knowledge base. This must match the text embedding profile used to populate the knowledge base.
- Vectorization indexing profile: the configuration of the service hosting the index.
- Prompt: the system prompt of the agent, describing the agent's persona.
Note: The Knowledge Management agent implementation currently supports the `AzureAISearchIndexer` indexing profile.
The structure of a Knowledge Management agent is as follows:

```json
{
    "type": "knowledge-management",
    "name": "<name>",
    "object_id": "/instances/<instance_id>/providers/FoundationaLLM.Agent/agents/<name>",
    "description": "<description>",
    "display_name": "<display_name>",
    "inline_context": true,
    "vectorization": {
        "dedicated_pipeline": false,
        "data_source_object_id": "<data_source_object_id>",
        "indexing_profile_object_id": "<indexing_profile_object_id>",
        "text_embedding_profile_object_id": "<text_embedding_profile_object_id>",
        "text_partitioning_profile_object_id": "<text_partitioning_profile_object_id>",
        "vectorization_data_pipeline_object_id": "",
        "trigger_type": "",
        "trigger_cron_schedule": ""
    },
    "prompt_object_id": "<prompt_resource_objectid>",
    "language_model": {
        "type": "openai",
        "provider": "microsoft",
        "temperature": 0.0,
        "use_chat": true,
        "api_endpoint": "FoundationaLLM:AzureOpenAI:API:Endpoint",
        "api_key": "FoundationaLLM:AzureOpenAI:API:Key",
        "api_version": "FoundationaLLM:AzureOpenAI:API:Version",
        "version": "FoundationaLLM:AzureOpenAI:API:Completions:ModelVersion",
        "deployment": "FoundationaLLM:AzureOpenAI:API:Completions:DeploymentName"
    },
    "sessions_enabled": true,
    "conversation_history": {
        "enabled": true,
        "max_history": 5
    },
    "gatekeeper": {
        "use_system_setting": false,
        "options": [
            "ContentSafety",
            "Presidio"
        ]
    },
    "orchestration_settings": {
        "orchestrator": "LangChain",
        "endpoint_configuration": {
            "endpoint": "",
            "api_version": "",
            "auth_type": "",
            "api_key": "",
            "provider": "",
            "operation_type": "chat"
        },
        "model_parameters": {
            "deployment_name": ""
        }
    }
}
```
where:

- `<name>` is the name of the agent.
- `<instance_id>` is the instance ID of the deployment.
- `<description>` is the description of the agent. Ensure that this description details the purpose of the agent.
- `<display_name>` controls the title of the agent in the Chat UI dropdown menu.
- `<data_source_object_id>` is the object ID of the Data Source resource.
- `<indexing_profile_object_id>` is the object ID of the indexing profile resource.
- `<text_embedding_profile_object_id>` is the object ID of the text embedding profile resource.
- `<text_partitioning_profile_object_id>` is the object ID of the text partitioning profile resource.
- `<prompt_resource_objectid>` is the object ID of the prompt resource.
| Parameter | Description |
|---|---|
| `type` | The type of the agent; always `knowledge-management`. `type` must be the first key in the request body. |
| `name` | The name of the agent. |
| `object_id` | The object ID of the agent. Remove this element when creating an agent, as it is generated by the Management API. |
| `description` | The description of the agent. Ensure that this description details the purpose of the agent. |
| `display_name` | The title of the agent in the Chat UI dropdown menu. This field is optional. |
| `inline_context` | Whether or not the agent has an Inline Context. |
| `vectorization` | The `vectorization` object is only required for Knowledge Management agents without an Inline Context (`inline_context` is `false`). If the `vectorization` object is included, the `indexing_profile_object_id` and `text_embedding_profile_object_id` keys are required. See the example after this table. |
| `vectorization.dedicated_pipeline` | A boolean indicating whether or not the agent has a dedicated Vectorization pipeline (implemented in an upcoming release). |
| `vectorization.data_source_object_id` | The object ID of the Data Source resource. |
| `vectorization.indexing_profile_object_id` | The object ID of the indexing profile resource. |
| `vectorization.text_embedding_profile_object_id` | The object ID of the text embedding profile resource. |
| `vectorization.text_partitioning_profile_object_id` | The object ID of the text partitioning profile resource. |
| `vectorization.vectorization_data_pipeline_object_id` | The resource ID of the agent's Vectorization pipeline (implemented in an upcoming release). |
| `vectorization.trigger_type` | The trigger type of the agent's Vectorization pipeline (implemented in an upcoming release). Permissible values are `Manual`, `Schedule`, and `Event`. |
| `vectorization.trigger_cron_schedule` | The schedule of the trigger in Cron format (implemented in an upcoming release). This property is valid only when `trigger_type` is `Schedule`. |
| `prompt_object_id` | The object ID of the prompt resource. |
| `language_model` | The language model configuration. The `language_model` object has been deprecated as of release 0.6.0. |
| `language_model.type` | The type of the language model. Currently, OpenAI-based language models are supported. |
| `language_model.provider` | The provider of the language model. Currently supports `microsoft` or `openai`. |
| `language_model.temperature` | The temperature value for the language model, between 0 and 1. Values closer to 0 return more factual responses, whereas values closer to 1 yield more creative responses. |
| `language_model.use_chat` | Determines the type of language model to use. For example, when using Microsoft's Azure OpenAI, setting `use_chat` to `true` uses the `AzureChatOpenAI` model rather than the `AzureOpenAI` model in LangChain. |
| `language_model.api_endpoint` | The configuration setting key that houses the API endpoint of the language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.api_key` | The configuration setting key that houses a reference to a Key Vault value containing the API key for the language model service. Ensure these values are populated in Key Vault and the application configuration. |
| `language_model.api_version` | The configuration setting key that houses the API version of the language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.version` | The configuration setting key that houses the version of the language model deployment. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.deployment` | The configuration setting key that houses the name given to the deployed language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `sessions_enabled` | A boolean value that indicates whether the agent is session-less (`false`) or supports sessions (`true`). |
| `conversation_history` | The conversation history configuration. |
| `conversation_history.enabled` | Indicates whether conversation history is retained for subsequent agent interactions (`true`). |
| `conversation_history.max_history` | Indicates the number of messages to be retained. |
| `gatekeeper` | The gatekeeper configuration. |
| `gatekeeper.use_system_setting` | Indicates whether the system settings are used for the gatekeeper. |
| `gatekeeper.options` | The list of gatekeeper options. The sample provided overrides the system setting for the gatekeeper and enables Azure Content Safety and Microsoft Presidio in the messaging pipeline. |
| `orchestration_settings` | The settings for the agent orchestrator. |
| `orchestration_settings.orchestrator` | FoundationaLLM currently supports `LangChain` and `SemanticKernel` for both types of Knowledge Management agents; Knowledge Management agents with an Inline Context can also use the `AzureOpenAIDirect` and `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration` | The endpoint configuration of the hosted LLM. FoundationaLLM currently supports Azure OpenAI and OpenAI. |
| `orchestration_settings.endpoint_configuration.endpoint` | The endpoint URL of the hosted LLM. Provide the URL directly for the `LangChain` or `SemanticKernel` orchestrators; provide it as an Azure App Configuration key reference for the `AzureOpenAIDirect` or `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration.api_version` | The API version of the hosted LLM. For Azure OpenAI, set this value to the latest GA version. Provide the API version directly for the `LangChain` or `SemanticKernel` orchestrators; provide it as an Azure App Configuration key reference for the `AzureOpenAIDirect` or `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration.auth_type` | The authentication method of the hosted LLM: either `token` or `key`. For Azure OpenAI deployments, this value should be `token`, which configures the orchestrator to use Managed Identities for authentication. `key`-based authentication uses API keys. |
| `orchestration_settings.endpoint_configuration.api_key` | The name of the Azure App Configuration key storing the LLM endpoint API key. This parameter is required if `auth_type` is set to `key`. |
| `orchestration_settings.endpoint_configuration.provider` | The provider of the hosted LLM. FoundationaLLM currently supports `microsoft` (Azure OpenAI) or `openai`. |
| `orchestration_settings.endpoint_configuration.operation_type` | Set to `chat` by default; can be omitted. |
| `orchestration_settings.model_parameters` | Endpoint-specific model parameters. This field must be non-null if the provider is `microsoft`. |
| `orchestration_settings.model_parameters.deployment_name` | The name of the Azure OpenAI model deployment; required if the provider is `microsoft`. |
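To illustrate the table above, here is a hedged sketch of a minimal payload for an agent without an Inline Context, expressed as a Python dict so it can be submitted with the Management API sketch later in this document. The agent name and object IDs are placeholders, many optional keys are omitted, and the full structure above remains the authoritative reference.

```python
# Hypothetical example payload for a RAG (non-Inline Context) agent.
# All object IDs are placeholders; replace them with real resource object IDs.
rag_agent = {
    "type": "knowledge-management",          # must be the first key in the request body
    "name": "product-docs",                  # hypothetical agent name
    "description": "Answers questions about the product documentation.",
    "inline_context": False,
    "vectorization": {
        # These two keys are required whenever the vectorization object is included.
        "indexing_profile_object_id": "<indexing_profile_object_id>",
        "text_embedding_profile_object_id": "<text_embedding_profile_object_id>",
    },
    "prompt_object_id": "<prompt_resource_objectid>",
    "sessions_enabled": True,
    "conversation_history": {"enabled": True, "max_history": 5},
    "gatekeeper": {"use_system_setting": True, "options": []},
    "orchestration_settings": {"orchestrator": "LangChain"},
}
```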
## AzureOpenAIDirect Orchestrator

The `AzureOpenAIDirect` orchestrator passes the user's prompt to an LLM deployed in an instance of Azure OpenAI Service, bypassing LangChain and Semantic Kernel.
Example Configuration:

```json
{
    "orchestration_settings": {
        "orchestrator": "AzureOpenAIDirect",
        "endpoint_configuration": {
            "endpoint": "FoundationaLLM:AzureOpenAI:API:Endpoint",
            "api_version": "FoundationaLLM:AzureOpenAI:API:Version",
            "auth_type": "key",
            "api_key": "FoundationaLLM:AzureOpenAI:API:Key",
            "operation_type": "chat"
        },
        "model_parameters": {
            "deployment_name": "completions"
        }
    }
}
```
Note: `AzureOpenAIDirect` is only compatible with Knowledge Management agents with an Inline Context.
## AzureAIDirect Orchestrator

The `AzureAIDirect` orchestrator passes the user's prompt to an LLM deployed as an Azure AI Studio real-time endpoint. This orchestrator allows customers to use a wider range of LLMs with FLLM agents.
Example Configuration:

```json
{
    "orchestration_settings": {
        "orchestrator": "AzureAIDirect",
        "endpoint_configuration": {
            "endpoint": "<AZURE APP CONFIGURATION KEY>",
            "api_key": "<AZURE APP CONFIGURATION KEY>"
        },
        "model_parameters": {
            "temperature": 0.8,
            "max_new_tokens": 1000,
            "deployment_name": "<AZURE AI STUDIO DEPLOYMENT NAME>"
        }
    }
}
```
Note: `AzureAIDirect` is only compatible with Knowledge Management agents with an Inline Context.
## Managing Knowledge Management Agents

This section describes how to manage Knowledge Management agents using the Management API, where `{{baseUrl}}` is the base URL of the Management API and `{{instanceId}}` is the unique identifier of the FLLM instance.
### Retrieve

```http
GET {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents
```
### Create or update

```http
POST {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents/<name>
Content-Type: application/json

<agent_configuration>
```

where `<agent_configuration>` is the JSON agent configuration structure described above.
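As a hedged illustration of the create-or-update call, the Python sketch below POSTs an agent configuration with the `requests` library and then lists agents to confirm the result. It assumes you already hold a valid bearer token for the Management API; token acquisition and the exact response shapes are outside the scope of this sketch.

```python
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
token = "<Management API bearer token>"      # assumption: bearer-token auth

headers = {"Authorization": f"Bearer {token}"}
agents_url = f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents"

# The agent configuration follows the structure described above; the
# rag_agent dict from the earlier sketch would also work here.
agent_configuration = {
    "type": "knowledge-management",
    "name": "product-docs",                  # hypothetical agent name
    "description": "Answers questions about the product documentation.",
    "inline_context": True,
    "prompt_object_id": "<prompt_resource_objectid>",
    "sessions_enabled": True,
    "orchestration_settings": {"orchestrator": "LangChain"},
}

# Create or update the agent.
response = requests.post(
    f"{agents_url}/{agent_configuration['name']}",
    json=agent_configuration,
    headers=headers,
)
response.raise_for_status()

# Retrieve the agent list to confirm the agent was created.
agents = requests.get(agents_url, headers=headers).json()
```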
### Delete

```http
DELETE {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents/<name>
```
Note: FLLM currently implements logical deletes for Knowledge Management agents. This means that users cannot create a Knowledge Management agent with the same name as a deleted Knowledge Management agent. Support for purging Knowledge Management agents will be added in a future release.
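A matching deletion sketch, under the same assumptions as the example above:

```python
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
headers = {"Authorization": "Bearer <Management API bearer token>"}

# Logical delete: the agent name stays reserved and cannot be reused afterwards.
response = requests.delete(
    f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents/product-docs",
    headers=headers,
)
response.raise_for_status()
```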
## Validating a Knowledge Management Agent

Once configured, the Knowledge Management agent can be validated with an API call to the Core API or via the User Portal.

Note: It can take up to 5 minutes for a new Knowledge Management agent to appear in the User Portal or become accessible to requests from the Core API.
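Given that propagation delay, a simple polling loop against the Management API's agent listing can confirm when a new agent becomes visible. This sketch reuses the assumptions from the earlier examples and additionally assumes the listing returns a JSON array of agent objects with a `name` field:

```python
import time
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
headers = {"Authorization": "Bearer <Management API bearer token>"}
agents_url = f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents"

# Poll for up to 5 minutes, matching the propagation window noted above.
deadline = time.time() + 5 * 60
while time.time() < deadline:
    agents = requests.get(agents_url, headers=headers).json()
    if any(agent.get("name") == "product-docs" for agent in agents):
        print("Agent is available.")
        break
    time.sleep(15)
else:
    print("Agent did not appear within 5 minutes.")
```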
## Overriding agent parameters
The agent parameters can be overridden at the time of the API call. Refer to the Core API documentation for more information.