# Knowledge management agent
The FoundationaLLM (FLLM) Knowledge Management agent type supports the following scenarios:
- With an Inline Context: Knowledge Management agents with an Inline Context pass the user's prompt directly to the Large Language Model (LLM).
- Without an Inline Context: Knowledge Management agents without an Inline Context implement the Retrieval Augmented Generation (RAG) design pattern. RAG augments the user prompt with additional context to generate a more accurate response: the flow uses a retrieval model to retrieve relevant documents from a knowledge base, such as a vector store, and then uses the retrieved documents to augment the user prompt before sending it to the LLM (see the sketch after this list).
- Creating a Knowledge Management agent without an Inline Context requires an existing knowledge base, such as a vector store. Use the Vectorization API to create a vector store before creating the agent.
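To make the flow concrete, here is a minimal Python sketch of the generic RAG pattern. It is illustrative only, not FLLM's internal implementation; `embed`, `vector_search`, and `llm` are hypothetical callables standing in for the text embedding model, the vector store query, and the LLM call.

```python
from typing import Callable

def rag_completion(
    user_prompt: str,
    embed: Callable[[str], list[float]],          # hypothetical embedding model
    vector_search: Callable[[list[float], int], list[str]],  # hypothetical vector store query
    llm: Callable[[str], str],                    # hypothetical LLM call
    top_k: int = 3,
) -> str:
    """Augment a user prompt with retrieved context before calling the LLM."""
    # 1. Embed the user prompt with the same embedding model that populated the index.
    query_vector = embed(user_prompt)
    # 2. Retrieve the most relevant documents from the knowledge base (vector store).
    documents = vector_search(query_vector, top_k)
    # 3. Augment the user prompt with the retrieved documents.
    augmented_prompt = "\n\n".join(["Context:", *documents, f"Question: {user_prompt}"])
    # 4. Send the augmented prompt to the LLM and return its response.
    return llm(augmented_prompt)
```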
## Knowledge Management Agent Configuration
The Knowledge Management agent configuration may reference the following resources:

- Vectorization text embedding profile: the configuration of the text embedding model used to embed the user prompt and perform a vector search in the knowledge base. This must match the text embedding profile used to populate the knowledge base.
- Vectorization indexing profile: the configuration of the service hosting the index.
- Prompt: the system prompt of the agent, describing the agent's persona.
Note: The Knowledge Management agent implementation currently supports the `AzureAISearchIndexer` indexing profile.
The structure of a Knowledge Management agent is as follows:

```json
{
    "type": "knowledge-management",
    "name": "<name>",
    "object_id": "/instances/<instance_id>/providers/FoundationaLLM.Agent/agents/<name>",
    "description": "<description>",
    "display_name": "<display_name>",
    "inline_context": true,
    "vectorization": {
        "dedicated_pipeline": false,
        "data_source_object_id": "<data_source_object_id>",
        "indexing_profile_object_id": "<indexing_profile_object_id>",
        "text_embedding_profile_object_id": "<text_embedding_profile_object_id>",
        "text_partitioning_profile_object_id": "<text_partitioning_profile_object_id>",
        "vectorization_data_pipeline_object_id": "",
        "trigger_type": "",
        "trigger_cron_schedule": ""
    },
    "prompt_object_id": "<prompt_resource_objectid>",
    "language_model": {
        "type": "openai",
        "provider": "microsoft",
        "temperature": 0.0,
        "use_chat": true,
        "api_endpoint": "FoundationaLLM:AzureOpenAI:API:Endpoint",
        "api_key": "FoundationaLLM:AzureOpenAI:API:Key",
        "api_version": "FoundationaLLM:AzureOpenAI:API:Version",
        "version": "FoundationaLLM:AzureOpenAI:API:Completions:ModelVersion",
        "deployment": "FoundationaLLM:AzureOpenAI:API:Completions:DeploymentName"
    },
    "sessions_enabled": true,
    "conversation_history": {
        "enabled": true,
        "max_history": 5
    },
    "gatekeeper": {
        "use_system_setting": false,
        "options": [
            "ContentSafety",
            "Presidio"
        ]
    },
    "orchestration_settings": {
        "orchestrator": "LangChain",
        "endpoint_configuration": {
            "endpoint": "",
            "api_version": "",
            "auth_type": "",
            "api_key": "",
            "provider": "",
            "operation_type": "chat"
        },
        "model_parameters": {
            "deployment_name": ""
        }
    }
}
```
where:

- `<name>` is the name of the agent.
- `<instance_id>` is the instance ID of the deployment.
- `<description>` is the description of the agent. Ensure that this description details the purpose of the agent.
- `<display_name>` controls the title of the agent in the Chat UI dropdown menu.
- `<data_source_object_id>` is the object ID of the Data Source resource.
- `<indexing_profile_object_id>` is the object ID of the indexing profile resource.
- `<text_embedding_profile_object_id>` is the object ID of the text embedding profile resource.
- `<text_partitioning_profile_object_id>` is the object ID of the text partitioning profile resource.
- `<prompt_resource_objectid>` is the object ID of the prompt resource.
| Parameter | Description |
|---|---|
| `type` | The type of the agent; always `knowledge-management`. `type` must be the first key in the request body. |
| `name` | The name of the agent. |
| `object_id` | The object ID of the agent. Remove this element when creating an agent, as it is generated by the Management API. |
| `description` | The description of the agent. Ensure that this description details the purpose of the agent. |
| `display_name` | The title of the agent in the Chat UI dropdown menu. This field is optional. |
| `inline_context` | Whether or not the agent has an Inline Context. |
| `vectorization` | The `vectorization` object is only required for Knowledge Management agents without an Inline Context (`inline_context` is `false`). If the `vectorization` object is included, the `indexing_profile_object_id` and `text_embedding_profile_object_id` keys are required. See the example after this table. |
| `vectorization.dedicated_pipeline` | A boolean indicating whether or not the agent has a dedicated Vectorization pipeline (implemented in an upcoming release). |
| `vectorization.data_source_object_id` | The object ID of the Data Source resource. |
| `vectorization.indexing_profile_object_id` | The object ID of the indexing profile resource. |
| `vectorization.text_embedding_profile_object_id` | The object ID of the text embedding profile resource. |
| `vectorization.text_partitioning_profile_object_id` | The object ID of the text partitioning profile resource. |
| `vectorization.vectorization_data_pipeline_object_id` | The resource ID of the agent's Vectorization pipeline (implemented in an upcoming release). |
| `vectorization.trigger_type` | The trigger type of the agent's Vectorization pipeline (implemented in an upcoming release). Permissible values are `Manual`, `Schedule`, and `Event`. |
| `vectorization.trigger_cron_schedule` | The schedule of the trigger in Cron format (implemented in an upcoming release). This property is valid only when `trigger_type` is `Schedule`. |
| `prompt_object_id` | The object ID of the prompt resource. |
| `language_model` | The language model configuration. The `language_model` object has been deprecated as of release 0.6.0. |
| `language_model.type` | The type of the language model. Currently, OpenAI-based language models are supported. |
| `language_model.provider` | The provider of the language model. Currently supports `microsoft` or `openai`. |
| `language_model.temperature` | The temperature value for the language model, between 0 and 1. Values closer to 0 return more factual responses, whereas values closer to 1 yield more creative responses. |
| `language_model.use_chat` | Determines the type of language model to use. For example, when using Microsoft's Azure OpenAI, setting `use_chat` to `true` uses the `AzureChatOpenAI` model rather than the `AzureOpenAI` model in LangChain. |
| `language_model.api_endpoint` | The configuration setting key that houses the API endpoint of the language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.api_key` | The configuration setting key that houses a reference to a Key Vault value containing the API key for the language model service. Ensure these values are populated in Key Vault and the application configuration. |
| `language_model.api_version` | The configuration setting key that houses the API version of the language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.version` | The configuration setting key that houses the version of the language model deployment. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `language_model.deployment` | The configuration setting key that houses the name given to the deployed language model. The example above uses default FLLM values. Ensure this value is populated in the application configuration. |
| `sessions_enabled` | A boolean value that indicates whether the agent is session-less (`false`) or supports sessions (`true`). |
| `conversation_history` | The conversation history configuration. |
| `conversation_history.enabled` | Indicates whether conversation history is retained for subsequent agent interactions (`true`). |
| `conversation_history.max_history` | Indicates the number of messages to be retained. |
| `gatekeeper` | The gatekeeper configuration. |
| `gatekeeper.use_system_setting` | Indicates whether the system settings are used for the gatekeeper. |
| `gatekeeper.options` | The list of gatekeeper options. The sample provided overrides the system setting for the gatekeeper and enables Azure Content Safety and Microsoft Presidio in the messaging pipeline. |
| `orchestration_settings` | The settings for the agent orchestrator. |
| `orchestration_settings.orchestrator` | FoundationaLLM currently supports `LangChain` and `SemanticKernel` for both types of Knowledge Management agents; Knowledge Management agents with an Inline Context can also use the `AzureOpenAIDirect` and `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration` | The endpoint configuration of the hosted LLM. FoundationaLLM currently supports Azure OpenAI and OpenAI. |
| `orchestration_settings.endpoint_configuration.endpoint` | The endpoint URL of the hosted LLM. Provide the URL directly for the `LangChain` or `SemanticKernel` orchestrators; provide it as an Azure App Configuration key reference for the `AzureOpenAIDirect` or `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration.api_version` | The API version of the hosted LLM. For Azure OpenAI, set this value to the latest GA version. Provide the API version directly for the `LangChain` or `SemanticKernel` orchestrators; provide it as an Azure App Configuration key reference for the `AzureOpenAIDirect` or `AzureAIDirect` orchestrators. |
| `orchestration_settings.endpoint_configuration.auth_type` | The authentication method of the hosted LLM: either `token` or `key`. For Azure OpenAI deployments, this value should be `token`, which configures the orchestrator to use Managed Identities for authentication. `key`-based authentication uses API keys. |
| `orchestration_settings.endpoint_configuration.api_key` | The name of the Azure App Configuration key storing the LLM endpoint API key. This parameter is required if `auth_type` is set to `key`. |
| `orchestration_settings.endpoint_configuration.provider` | The provider of the hosted LLM. FoundationaLLM currently supports `microsoft` (Azure OpenAI) or `openai`. |
| `orchestration_settings.endpoint_configuration.operation_type` | Set to `chat` by default; can be omitted. |
| `orchestration_settings.model_parameters` | Endpoint-specific model parameters. This field must be non-null if the provider is `microsoft`. |
| `orchestration_settings.model_parameters.deployment_name` | The name of the Azure OpenAI model deployment; required if the provider is `microsoft`. |
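To illustrate the table above, here is a hedged sketch of a minimal payload for an agent without an Inline Context, expressed as a Python dict so it can be submitted with the Management API sketch later in this document. The agent name and object IDs are placeholders, many optional keys are omitted, and the full structure above remains the authoritative reference.

```python
# Hypothetical example payload for a RAG (non-Inline Context) agent.
# All object IDs are placeholders; replace them with real resource object IDs.
rag_agent = {
    "type": "knowledge-management",          # must be the first key in the request body
    "name": "product-docs",                  # hypothetical agent name
    "description": "Answers questions about the product documentation.",
    "inline_context": False,
    "vectorization": {
        # These two keys are required whenever the vectorization object is included.
        "indexing_profile_object_id": "<indexing_profile_object_id>",
        "text_embedding_profile_object_id": "<text_embedding_profile_object_id>",
    },
    "prompt_object_id": "<prompt_resource_objectid>",
    "sessions_enabled": True,
    "conversation_history": {"enabled": True, "max_history": 5},
    "gatekeeper": {"use_system_setting": True, "options": []},
    "orchestration_settings": {"orchestrator": "LangChain"},
}
```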
## AzureOpenAIDirect Orchestrator

The `AzureOpenAIDirect` orchestrator passes the user's prompt to an LLM deployed in an instance of Azure OpenAI Service, bypassing LangChain and Semantic Kernel.
Example Configuration:

```json
{
    "orchestration_settings": {
        "orchestrator": "AzureOpenAIDirect",
        "endpoint_configuration": {
            "endpoint": "FoundationaLLM:AzureOpenAI:API:Endpoint",
            "api_version": "FoundationaLLM:AzureOpenAI:API:Version",
            "auth_type": "key",
            "api_key": "FoundationaLLM:AzureOpenAI:API:Key",
            "operation_type": "chat"
        },
        "model_parameters": {
            "deployment_name": "completions"
        }
    }
}
```
Note: `AzureOpenAIDirect` is only compatible with Knowledge Management agents with an Inline Context.
## AzureAIDirect Orchestrator

The `AzureAIDirect` orchestrator passes the user's prompt to an LLM deployed as an Azure AI Studio real-time endpoint. This orchestrator allows customers to use a wider range of LLMs with FLLM agents.
Example Configuration:

```json
{
    "orchestration_settings": {
        "orchestrator": "AzureAIDirect",
        "endpoint_configuration": {
            "endpoint": "<AZURE APP CONFIGURATION KEY>",
            "api_key": "<AZURE APP CONFIGURATION KEY>"
        },
        "model_parameters": {
            "temperature": 0.8,
            "max_new_tokens": 1000,
            "deployment_name": "<AZURE AI STUDIO DEPLOYMENT NAME>"
        }
    }
}
```
Note: `AzureAIDirect` is only compatible with Knowledge Management agents with an Inline Context.
## Managing Knowledge Management Agents

This section describes how to manage Knowledge Management agents using the Management API, where `{{baseUrl}}` is the base URL of the Management API and `{{instanceId}}` is the unique identifier of the FLLM instance.
### Retrieve

```http
GET {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents
```
### Create or update

```http
POST {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents/<name>
Content-Type: application/json

<agent_configuration>
```

where `<agent_configuration>` is the JSON agent configuration structure described above.
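As a hedged illustration of the create-or-update call, the Python sketch below POSTs an agent configuration with the `requests` library and then lists agents to confirm the result. It assumes you already hold a valid bearer token for the Management API; token acquisition and the exact response shapes are outside the scope of this sketch.

```python
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
token = "<Management API bearer token>"      # assumption: bearer-token auth

headers = {"Authorization": f"Bearer {token}"}
agents_url = f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents"

# The agent configuration follows the structure described above; the
# rag_agent dict from the earlier sketch would also work here.
agent_configuration = {
    "type": "knowledge-management",
    "name": "product-docs",                  # hypothetical agent name
    "description": "Answers questions about the product documentation.",
    "inline_context": True,
    "prompt_object_id": "<prompt_resource_objectid>",
    "sessions_enabled": True,
    "orchestration_settings": {"orchestrator": "LangChain"},
}

# Create or update the agent.
response = requests.post(
    f"{agents_url}/{agent_configuration['name']}",
    json=agent_configuration,
    headers=headers,
)
response.raise_for_status()

# Retrieve the agent list to confirm the agent was created.
agents = requests.get(agents_url, headers=headers).json()
```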
### Delete

```http
DELETE {{baseUrl}}/instances/{{instanceId}}/providers/FoundationaLLM.Agent/agents/<name>
```
Note: FLLM currently implements logical deletes for Knowledge Management agents. This means that users cannot create a Knowledge Management agent with the same name as a deleted Knowledge Management agent. Support for purging Knowledge Management agents will be added in a future release.
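A matching deletion sketch, under the same assumptions as the example above:

```python
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
headers = {"Authorization": "Bearer <Management API bearer token>"}

# Logical delete: the agent name stays reserved and cannot be reused afterwards.
response = requests.delete(
    f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents/product-docs",
    headers=headers,
)
response.raise_for_status()
```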
## Validating a Knowledge Management Agent

Once configured, the Knowledge Management agent can be validated with an API call to the Core API or via the User Portal.

Note: It can take up to 5 minutes for a new Knowledge Management agent to appear in the User Portal or become accessible to requests from the Core API.
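Given that propagation delay, a simple polling loop against the Management API's agent listing can confirm when a new agent becomes visible. This sketch reuses the assumptions from the earlier examples and additionally assumes the listing returns a JSON array of agent objects with a `name` field:

```python
import time
import requests

base_url = "<Management API base URL>"       # corresponds to {{baseUrl}}
instance_id = "<FLLM instance ID>"           # corresponds to {{instanceId}}
headers = {"Authorization": "Bearer <Management API bearer token>"}
agents_url = f"{base_url}/instances/{instance_id}/providers/FoundationaLLM.Agent/agents"

# Poll for up to 5 minutes, matching the propagation window noted above.
deadline = time.time() + 5 * 60
while time.time() < deadline:
    agents = requests.get(agents_url, headers=headers).json()
    if any(agent.get("name") == "product-docs" for agent in agents):
        print("Agent is available.")
        break
    time.sleep(15)
else:
    print("Agent did not appear within 5 minutes.")
```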
## Overriding agent parameters
The agent parameters can be overridden at the time of the API call. Refer to the Core API documentation for more information.