Class ModelDeploymentContext
- Namespace
- FoundationaLLM.Gateway.Services
- Assembly
- FoundationaLLM.Gateway.dll
Provides context associated with a model deployment used in text operations (embeddings or completions).
public class ModelDeploymentContext
- Inheritance
- object
- ModelDeploymentContext
Constructors
ModelDeploymentContext(AzureOpenAIAccountDeployment, double, ITextOperationService, ILoggerFactory, GatewayMetrics)
Initializes a new ModelDeploymentContext, which provides context associated with a model deployment used in text operations (embeddings or completions).
public ModelDeploymentContext(AzureOpenAIAccountDeployment deployment, double tokenRateLimitMultiplier, ITextOperationService textOperationService, ILoggerFactory loggerFactory, GatewayMetrics metrics)
Parameters
deployment
AzureOpenAIAccountDeployment
The AzureOpenAIAccountDeployment object with the details of the model deployment.
tokenRateLimitMultiplier
double
The token rate limit multiplier used to account for the tokenization differences between the Gateway API and the deployed model.
textOperationService
ITextOperationService
The service providing the implementation of the text operation.
loggerFactory
ILoggerFactory
The ILoggerFactory used to create loggers for logging.
metrics
GatewayMetrics
The FoundationaLLM Gateway telemetry metrics.
Fields
_deployment
protected readonly AzureOpenAIAccountDeployment _deployment
Field Value
- AzureOpenAIAccountDeployment
_effectiveRequestRateLimit
protected readonly int _effectiveRequestRateLimit
Field Value
- int
_effectiveTokenRateLimit
protected readonly int _effectiveTokenRateLimit
Field Value
- int
_embeddingDimensionsIndexMapping
Embedding operations are grouped by the number of dimensions they require. For each embedding dimension, we send a single request to the model. This dictionary maps the number of dimensions to the index in the _textOperationRequests list.
protected readonly Dictionary<int, int> _embeddingDimensionsIndexMapping
Field Value
- Dictionary<int, int>
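The grouping described for _embeddingDimensionsIndexMapping can be illustrated with a minimal sketch. EmbeddingBatch and EmbeddingBatcher are hypothetical stand-ins introduced here for illustration; they are not the actual InternalTextOperationRequest type or the class's real logic, which this page does not show.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a per-dimension request; not the real InternalTextOperationRequest.
public sealed class EmbeddingBatch
{
    public EmbeddingBatch(int dimensions) => Dimensions = dimensions;

    public int Dimensions { get; }
    public List<string> Texts { get; } = new List<string>();
}

// Hypothetical helper showing the dimension-to-index mapping idea.
public sealed class EmbeddingBatcher
{
    // Maps the number of embedding dimensions to an index in _batches.
    private readonly Dictionary<int, int> _dimensionsIndexMapping = new Dictionary<int, int>();
    private readonly List<EmbeddingBatch> _batches = new List<EmbeddingBatch>();

    public IReadOnlyList<EmbeddingBatch> Batches => _batches;

    public void Add(string text, int dimensions)
    {
        if (!_dimensionsIndexMapping.TryGetValue(dimensions, out var index))
        {
            // First text for this dimension count: open a new batch and remember its position.
            index = _batches.Count;
            _batches.Add(new EmbeddingBatch(dimensions));
            _dimensionsIndexMapping[dimensions] = index;
        }

        // All texts that need the same number of dimensions share one batch,
        // so each dimension count results in a single request.
        _batches[index].Texts.Add(text);
    }
}
```

With this layout, adding texts for 1536, 256, and 1536 dimensions yields two batches, and the dictionary lookup avoids scanning the list for the matching batch.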
_jsonSerializerOptions
protected readonly JsonSerializerOptions _jsonSerializerOptions
Field Value
- JsonSerializerOptions
_logger
protected readonly ILogger<ModelDeploymentContext> _logger
Field Value
- ILogger<ModelDeploymentContext>
_loggerFactory
protected readonly ILoggerFactory _loggerFactory
Field Value
- ILoggerFactory
_metrics
protected readonly GatewayMetrics _metrics
Field Value
- GatewayMetrics
_requestRateWindowActualRequestCount
The actual cumulative number of requests for the current request rate window.
protected int _requestRateWindowActualRequestCount
Field Value
- int
_requestRateWindowProjectedRequestCount
The projected cumulative number of requests for the current request rate window.
protected int _requestRateWindowProjectedRequestCount
Field Value
- int
_requestRateWindowStart
The start timestamp of the current request rate window.
protected DateTime _requestRateWindowStart
Field Value
- DateTime
_textOperationRequests
protected readonly List<InternalTextOperationRequest> _textOperationRequests
Field Value
- List<InternalTextOperationRequest>
_textOperationService
protected readonly ITextOperationService _textOperationService
Field Value
- ITextOperationService
_tokenRateLimitMultiplier
protected readonly double _tokenRateLimitMultiplier
Field Value
- double
_tokenRateWindowActualTokenCount
The actual cumulative number of tokens for the current token rate window.
protected int _tokenRateWindowActualTokenCount
Field Value
- int
_tokenRateWindowProjectedTokenCount
The projected cumulative number of tokens for the current token rate window.
protected int _tokenRateWindowProjectedTokenCount
Field Value
- int
_tokenRateWindowStart
The start timestamp of the current token rate window.
protected DateTime _tokenRateWindowStart
Field Value
- DateTime
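The rate window fields above (a window start timestamp plus projected and actual counters) suggest a fixed-window accounting scheme. The following is only a sketch of that general technique under stated assumptions: RateWindow is a hypothetical class, the reset policy and the window length passed by the caller are assumptions, and none of this is confirmed to match how ModelDeploymentContext uses its fields.

```csharp
using System;

// Hypothetical fixed-window counter; not the actual ModelDeploymentContext logic.
public sealed class RateWindow
{
    private readonly int _limit;
    private readonly TimeSpan _length;
    private DateTime _windowStart;
    private int _projectedCount;
    private int _actualCount;

    public RateWindow(int limit, TimeSpan length, DateTime now)
    {
        _limit = limit;
        _length = length;
        _windowStart = now;
    }

    public int ProjectedCount => _projectedCount;
    public int ActualCount => _actualCount;

    // Reserves capacity before a call is made. Returns false when the
    // projected total for the current window would exceed the limit.
    public bool TryReserve(int amount, DateTime now)
    {
        ResetIfExpired(now);
        if (_projectedCount + amount > _limit)
        {
            return false;
        }

        _projectedCount += amount;
        return true;
    }

    // Records what a completed call actually consumed.
    public void RecordActual(int amount, DateTime now)
    {
        ResetIfExpired(now);
        _actualCount += amount;
    }

    // Assumption: a new window starts, with both counters cleared,
    // once the window length has elapsed.
    private void ResetIfExpired(DateTime now)
    {
        if (now - _windowStart >= _length)
        {
            _windowStart = now;
            _projectedCount = 0;
            _actualCount = 0;
        }
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic. The same shape could be instantiated once for requests and once for tokens, mirroring the two groups of fields.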
Properties
HasInput
public bool HasInput { get; }
Property Value
- bool
ModelCanDoCompletions
Indicates whether the model in the deployment can perform completions.
public bool ModelCanDoCompletions { get; }
Property Value
- bool
ModelCanDoEmbeddings
Indicates whether the model in the deployment can perform embeddings.
public bool ModelCanDoEmbeddings { get; }
Property Value
- bool
Methods
ProcessTextOperationRequests()
public Task<List<InternalTextOperationResult>> ProcessTextOperationRequests()
Returns
- Task<List<InternalTextOperationResult>>
TryAddInputTextChunk(TextChunk, Dictionary<string, object>)
Attempts to add a new text chunk to the input for the text operation request.
public bool TryAddInputTextChunk(TextChunk textChunk, Dictionary<string, object> modelParameters)
Parameters
textChunk
TextChunkThe text chunk to be added.
modelParameters
Dictionary<string, object>The model parameters for the text operation.
Returns
- bool
Remarks
For embedding operations, modelParameters must always contain a single property named TextOperationContextPropertyNames.EmbeddingDimensions, which specifies the number of dimensions required for embedding.
For completion operations, modelParameters can contain the following parameters:
- TextOperationContextPropertyNames.Temperature - the completion model temperature.
- TextOperationContextPropertyNames.TopP - the completion model top-p value.
- TextOperationContextPropertyNames.MaxOutputTokenCount - the completion model max output token count.
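As a rough illustration of the two shapes of modelParameters described in these remarks, the sketch below builds one dictionary for embeddings and one for completions. PropertyNames is a hypothetical stand-in: the real TextOperationContextPropertyNames constants and their string values are not shown on this page, so the key strings here are assumptions, as are the value types chosen for each parameter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for TextOperationContextPropertyNames.
// The key strings are assumptions made for this example only.
public static class PropertyNames
{
    public const string EmbeddingDimensions = "embedding_dimensions";
    public const string Temperature = "temperature";
    public const string TopP = "top_p";
    public const string MaxOutputTokenCount = "max_output_token_count";
}

public static class ModelParameterExamples
{
    // Embedding operations: exactly one entry, the number of dimensions.
    public static Dictionary<string, object> ForEmbeddings(int dimensions)
    {
        return new Dictionary<string, object>
        {
            [PropertyNames.EmbeddingDimensions] = dimensions
        };
    }

    // Completion operations: any of the three parameters listed in the remarks.
    public static Dictionary<string, object> ForCompletions(
        double temperature, double topP, int maxOutputTokenCount)
    {
        return new Dictionary<string, object>
        {
            [PropertyNames.Temperature] = temperature,
            [PropertyNames.TopP] = topP,
            [PropertyNames.MaxOutputTokenCount] = maxOutputTokenCount
        };
    }
}
```

In real code, a dictionary like these would be passed as the second argument of TryAddInputTextChunk, keyed with the actual TextOperationContextPropertyNames constants rather than the placeholder strings above.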