Configuring vectorization

This section provides details on how to configure the vectorization API and workers in FoundationaLLM.

Note

These configurations should already be in place if you deployed FoundationaLLM (FLLM) using the recommended deployment scripts. The details presented here are provided for cases in which you need to troubleshoot or customize the configuration.

Configuration for Vectorization API

The following table describes the Azure artifacts required for the vectorization pipelines.

Artifact name Description
vectorization-input Azure storage container used by default to store documents to be picked up by the vectorization pipeline. Must be created on a Data Lake storage account (with the hierarchical namespace enabled).

The following table describes the environment variables required for the vectorization pipelines.

Environment variable Description
FoundationaLLM_AppConfig_ConnectionString Connection string to the Azure App Configuration instance.

The following table describes the required configuration parameters for the vectorization pipelines.

App Configuration Key Default Value Description
FoundationaLLM:APIs:VectorizationAPI:APIUrl The URL of the vectorization API.
FoundationaLLM:APIs:VectorizationAPI:APIKey Key Vault secret name: foundationallm-apis-vectorizationapi-apikey The API key of the vectorization API.
FoundationaLLM:APIs:VectorizationAPI:AppInsightsConnectionString Key Vault secret name: foundationallm-app-insights-connection-string The connection string to the Application Insights instance used by the vectorization API.
Note

Refer to the App Configuration values page for more information on how to set these and other configuration values.

Configuration for Vectorization workers

The following table describes the Azure artifacts required for the vectorization pipelines.

Artifact Name Description
embed Azure storage queue used for the embed vectorization pipeline. Can be created on the storage account used for the other queues.
extract Azure storage queue used for the extract vectorization pipeline. Can be created on the storage account used for the other queues.
index Azure storage queue used for the index vectorization pipeline. Can be created on the storage account used for the other queues.
partition Azure storage queue used for the partition vectorization pipeline. Can be created on the storage account used for the other queues.
vectorization-state Azure storage container used for the vectorization state service. Can be created on the storage account used for the other queues.
resource-provider Azure storage container used for the internal states of the FoundationaLLM resource providers.
resource-provider/FoundationaLLM.Vectorization/vectorization-pipelines.json Azure storage blob used for the vectorization pipeline resources managed by the FoundationaLLM.Vectorization resource provider. For more details, see vectorization pipelines.
resource-provider/FoundationaLLM.Vectorization/vectorization-text-partitioning-profiles.json Azure storage blob used for the text partitioning profiles managed by the FoundationaLLM.Vectorization resource provider. For more details, see vectorization text partitioning profiles.
resource-provider/FoundationaLLM.Vectorization/vectorization-text-embedding-profiles.json Azure storage blob used for the text embedding profiles managed by the FoundationaLLM.Vectorization resource provider. For more details, see vectorization text embedding profiles.
resource-provider/FoundationaLLM.Vectorization/vectorization-indexing-profiles.json Azure storage blob used for the indexing profiles managed by the FoundationaLLM.Vectorization resource provider. For more details, see vectorization indexing profiles.
resource-provider/FoundationaLLM.DataSources Azure storage directory where the data sources managed by the FoundationaLLM.DataSources resource provider are stored.
vectorization-state/requests/{requestid-yyyyMMdd}.json Azure storage directory where vectorization requests managed by the vectorization state service are stored.
vectorization-state/execution-state/{canonical_id} Azure storage directory where the execution state of the vectorization requests and their resulting artifacts are stored. The canonical_id is defined in the vectorization request.
vectorization-state/pipeline-state/{pipeline_name}/{pipeline_name}-{execution_id}.json Azure storage directory where the state of the vectorization pipeline execution is stored. The pipeline_name is the name of the vectorization pipeline and the execution_id is the unique identifier of the execution.

The following table describes the environment variables required for the vectorization pipelines.

Environment variable Description
FoundationaLLM_AppConfig_ConnectionString Connection string to the Azure App Configuration instance.

The following table describes the required App Configuration parameters for the vectorization pipelines.

App Configuration Key Default Value Description
FoundationaLLM:APIs:VectorizationWorker:APIUrl The URL of the vectorization worker API.
FoundationaLLM:APIs:VectorizationWorker:APIKey Key Vault secret name: foundationallm-apis-vectorizationworker-apikey The API key of the vectorization worker API.
FoundationaLLM:APIs:VectorizationWorker:AppInsightsConnectionString Key Vault secret name: foundationallm-app-insights-connection-string The connection string to the Application Insights instance used by the vectorization worker API.
FoundationaLLM:Vectorization:VectorizationWorker The settings used by each instance of the vectorization worker service. For more details, see default vectorization worker settings.
FoundationaLLM:Vectorization:Queues:Embed:AccountName The account name of the Azure Storage account used for the embed vectorization queue.
FoundationaLLM:Vectorization:Queues:Extract:AccountName The account name of the Azure Storage account used for the extract vectorization queue.
FoundationaLLM:Vectorization:Queues:Index:AccountName The account name of the Azure Storage account used for the index vectorization queue.
FoundationaLLM:Vectorization:Queues:Partition:AccountName The account name of the Azure Storage account used for the partition vectorization queue.
FoundationaLLM:Vectorization:StateService:Storage:AuthenticationType The authentication type used to connect to the underlying storage. Can be one of AzureIdentity, AccountKey, or ConnectionString.
FoundationaLLM:Vectorization:ResourceProviderService:Storage:AuthenticationType The authentication type used to connect to the underlying storage. Can be one of AzureIdentity, AccountKey, or ConnectionString.
FoundationaLLM:Vectorization:SemanticKernelTextEmbeddingService:APIKey Key Vault secret name: foundationallm-vectorization-semantickerneltextembedding-openai-apikey The API key used to connect to the Azure OpenAI service.
FoundationaLLM:Vectorization:SemanticKernelTextEmbeddingService:AuthenticationType The authentication type used to connect to the Azure OpenAI service. Can be one of AzureIdentity or APIKey.
FoundationaLLM:Vectorization:SemanticKernelTextEmbeddingService:DeploymentName The name of the Azure OpenAI model deployment. The default value is embeddings.
FoundationaLLM:Vectorization:SemanticKernelTextEmbeddingService:Endpoint The endpoint of the Azure OpenAI service.
FoundationaLLM:Vectorization:AzureAISearchIndexingService:APIKey Key Vault secret name: foundationallm-vectorization-azureaisearch-apikey The API key used to connect to the Azure OpenAI service.
FoundationaLLM:Vectorization:AzureAISearchIndexingService:AuthenticationType The authentication type used to connect to the Azure OpenAI service. Can be one of AzureIdentity or APIKey.
FoundationaLLM:Vectorization:AzureAISearchIndexingService:Endpoint The endpoint of the Azure OpenAI service.
Note

Refer to the App Configuration values page for more information on how to set these and other configuration values.

The following table describes the external content used by the vectorization worker to initialize:

Uri Description
https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken The public Azure Blob Storage account used to download the OpenAI BPE ranking files.
Note

The vectorization worker must be able to open HTTPS connections to the external content listed above.

Default vectorization worker settings

The default settings for the vectorization worker are stored in the FoundationaLLM:Vectorization:VectorizationWorker App Configuration key. The default structure for this key is:

{
    "RequestManagers": [
        {
            "RequestSourceName": "extract",
            "MaxHandlerInstances": 1,
            "QueueProcessingPace": 5,
            "QueuePollingInterval": 60,
            "QueueMaxNumberOfRetries": 5
        },
        {
            "RequestSourceName": "partition",
            "MaxHandlerInstances": 1,
            "QueueProcessingPace": 5,
            "QueuePollingInterval": 60,
            "QueueMaxNumberOfRetries": 5
        },
        {
            "RequestSourceName": "embed",
            "MaxHandlerInstances": 1,
            "QueueProcessingPace": 5,
            "QueuePollingInterval": 60,
            "QueueMaxNumberOfRetries": 5
        },
        {
            "RequestSourceName": "index",
            "MaxHandlerInstances": 1,
            "QueueProcessingPace": 5,
            "QueuePollingInterval": 60,
            "QueueMaxNumberOfRetries": 5
        }
    ],
    "RequestSources": [
        {
            "Name": "extract",
            "AccountName": "{{accountName}}",
            "VisibilityTimeoutSeconds": 600
        },
        {
            "Name": "partition",
            "AccountName": "{{accountName}}",
            "VisibilityTimeoutSeconds": 600
        },
        {
            "Name": "embed",
            "AccountName": "{{accountName}}",
            "VisibilityTimeoutSeconds": 600
        },
        {
            "Name": "index",
            "AccountName": "{{accountName}}",
            "VisibilityTimeoutSeconds": 600
        }
    ],
    "QueuingEngine": "AzureStorageQueue"
}

The following table provides details about the configuration parameters:

Parameter Description
RequestManagers The list of request managers used by the vectorization worker. Each request manager is responsible for managing the execution of vectorization pipelines for a specific vectorization step. The configuration must include all request managers.
RequestManagers.MaxHandlerInstances The maximum number of request handlers that process requests for the specified request source. By default, the value is 1. You can change the value to increase the processing capacity of each vectorization worker instance. The value applies to all istances of the vectorization worker. NOTE: It is important to align the value of this setting with the level of compute and memory resources allocated to the individual vectorization worker instances.
RequestManagers.QueueProcessingPace Optional The delay in seconds to wait between requests after a request has been processed. The default value is 5.
RequestManagers.QueuePollingInterval Optional The polling interval in seconds, this is the amount of time to wait if the previous check on the queue had no items. The default value is 60.
RequestManagers.QueueMaxNumberOfRetries Optional The maximum number of retries to attempt to process a request before being removed from the queue. The default value is 5.
RequestSources The list of request sources used by the vectorization worker. Each request source is responsible for managing the requests for a specific vectorization step. The configuration must include all request sources.
RequestSources.Name The name of the request source. The name must match the name of the request manager.
RequestSources.AccountName The name of the configuration key for the Azure Storage account used for the queue (include the tokens after FoundationaLLM:Vectorization:Queues:).
RequestSources.VisibilityTimeoutSeconds In the case of queue-based request sources (the default for the vectorization worker), specifies the time in seconds until a dequeued vectorization step request must be executed. During this timeout, the message will not be visible to other handler instances within the same worker or from other worker instances. If the handler fails to process the vectorization step request successfully and remove it from the queue within the specified timeout, the message will become visibile again. The default value is 600 seconds and should not be changed.