Class ModelDeploymentContext
- Namespace
 - FoundationaLLM.Gateway.Services
 
- Assembly
 - FoundationaLLM.Gateway.dll
 
Provides the context associated with a model deployment used in text operations (embeddings or completions).
public class ModelDeploymentContext
- Inheritance
 - ModelDeploymentContext
 
Constructors
ModelDeploymentContext(AzureOpenAIAccountDeployment, double, ITextOperationService, ILoggerFactory, GatewayMetrics)
Provides the context associated with a model deployment used in text operations (embeddings or completions).
public ModelDeploymentContext(AzureOpenAIAccountDeployment deployment, double tokenRateLimitMultiplier, ITextOperationService textOperationService, ILoggerFactory loggerFactory, GatewayMetrics metrics)
  Parameters
deployment (AzureOpenAIAccountDeployment): The AzureOpenAIAccountDeployment object with the details of the model deployment.
tokenRateLimitMultiplier (double): The token rate limit multiplier used to account for the tokenization differences between the Gateway API and the deployed model.
textOperationService (ITextOperationService): The service providing the implementation of the text operation.
loggerFactory (ILoggerFactory): The ILoggerFactory used to create loggers for logging.
metrics (GatewayMetrics): The FoundationaLLM Gateway telemetry metrics.
Fields
_deployment
protected readonly AzureOpenAIAccountDeployment _deployment
  Field Value
- AzureOpenAIAccountDeployment
_effectiveRequestRateLimit
protected readonly int _effectiveRequestRateLimit
  Field Value
- int
_effectiveTokenRateLimit
protected readonly int _effectiveTokenRateLimit
  Field Value
- int
_embeddingDimensionsIndexMapping
Embedding operations are grouped by the number of dimensions they require. For each embedding dimension, we send a single request to the model. This dictionary maps the number of dimensions to the index in the _textOperationRequests list.
protected readonly Dictionary<int, int> _embeddingDimensionsIndexMapping
  Field Value
- Dictionary<int, int>
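
The grouping described for _embeddingDimensionsIndexMapping can be sketched as follows. This is a minimal illustration, not the class's actual implementation: SketchRequest and DimensionGroupingSketch are hypothetical stand-ins, since the internals of InternalTextOperationRequest are not shown on this page.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for InternalTextOperationRequest (real type not shown here).
public sealed class SketchRequest
{
    public int EmbeddingDimensions { get; init; }
    public List<string> Inputs { get; } = new List<string>();
}

// Hypothetical helper that groups inputs by embedding dimension count.
public sealed class DimensionGroupingSketch
{
    // Maps a dimension count to the index of its request in Requests.
    private readonly Dictionary<int, int> _dimensionsIndexMapping = new Dictionary<int, int>();

    public List<SketchRequest> Requests { get; } = new List<SketchRequest>();

    // Adds the text to the request for the given dimension count, creating
    // that request the first time the dimension count is seen.
    // Returns the index of the request that received the text.
    public int Add(int dimensions, string text)
    {
        if (!_dimensionsIndexMapping.TryGetValue(dimensions, out var index))
        {
            index = Requests.Count;
            Requests.Add(new SketchRequest { EmbeddingDimensions = dimensions });
            _dimensionsIndexMapping[dimensions] = index;
        }

        Requests[index].Inputs.Add(text);
        return index;
    }
}
```

With this layout, all inputs that ask for the same number of dimensions land in one request, so one call per distinct dimension count is enough.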
 
_jsonSerializerOptions
protected readonly JsonSerializerOptions _jsonSerializerOptions
  Field Value
- JsonSerializerOptions
_logger
protected readonly ILogger<ModelDeploymentContext> _logger
  Field Value
- ILogger<ModelDeploymentContext>
_loggerFactory
protected readonly ILoggerFactory _loggerFactory
  Field Value
- ILoggerFactory
_metrics
protected readonly GatewayMetrics _metrics
  Field Value
- GatewayMetrics
_requestRateWindowActualRequestCount
The actual cumulative number of requests for the current request rate window.
protected int _requestRateWindowActualRequestCount
  Field Value
- int
_requestRateWindowProjectedRequestCount
The projected cumulative number of requests for the current request rate window.
protected int _requestRateWindowProjectedRequestCount
  Field Value
- int
_requestRateWindowStart
The start timestamp of the current request rate window.
protected DateTime _requestRateWindowStart
  Field Value
- DateTime
_textOperationRequests
protected readonly List<InternalTextOperationRequest> _textOperationRequests
  Field Value
- List<InternalTextOperationRequest>
_textOperationService
protected readonly ITextOperationService _textOperationService
  Field Value
- ITextOperationService
_tokenRateLimitMultiplier
protected readonly double _tokenRateLimitMultiplier
  Field Value
- double
_tokenRateWindowActualTokenCount
The actual cumulative number of tokens for the current token rate window.
protected int _tokenRateWindowActualTokenCount
  Field Value
- int
_tokenRateWindowProjectedTokenCount
The projected cumulative number of tokens for the current token rate window.
protected int _tokenRateWindowProjectedTokenCount
  Field Value
- int
_tokenRateWindowStart
The start timestamp of the current token rate window.
protected DateTime _tokenRateWindowStart
  Field Value
- DateTime
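
The rate window fields above (window start, projected count, actual count) suggest a fixed-window accounting scheme. The sketch below shows one plausible way such counters could interact. It is an assumption, not the class's actual logic: TokenRateWindowSketch, its window length parameter, and its reset behavior are all hypothetical, and the role of the token rate limit multiplier is left out.

```csharp
using System;

// Hypothetical fixed-window token accounting; not the library's implementation.
public sealed class TokenRateWindowSketch
{
    private readonly int _tokenRateLimit;      // assumed limit per window
    private readonly TimeSpan _windowLength;   // assumed window length
    private DateTime _windowStart;

    public int ProjectedTokenCount { get; private set; }
    public int ActualTokenCount { get; private set; }

    public TokenRateWindowSketch(int tokenRateLimit, TimeSpan windowLength, DateTime now)
    {
        _tokenRateLimit = tokenRateLimit;
        _windowLength = windowLength;
        _windowStart = now;
    }

    // Reserves the estimated tokens for a pending request if the projected
    // total for the current window stays within the limit.
    public bool TryReserve(int estimatedTokens, DateTime now)
    {
        ResetIfWindowElapsed(now);
        if (ProjectedTokenCount + estimatedTokens > _tokenRateLimit)
        {
            return false;
        }

        ProjectedTokenCount += estimatedTokens;
        return true;
    }

    // Records the tokens a completed request actually consumed.
    public void RecordActual(int usedTokens, DateTime now)
    {
        ResetIfWindowElapsed(now);
        ActualTokenCount += usedTokens;
    }

    // Starts a new window and clears both counters once the window has elapsed.
    private void ResetIfWindowElapsed(DateTime now)
    {
        if (now - _windowStart >= _windowLength)
        {
            _windowStart = now;
            ProjectedTokenCount = 0;
            ActualTokenCount = 0;
        }
    }
}
```

Tracking projected and actual counts separately lets a caller refuse work before sending it, based on estimates, while still recording what the model reports afterwards.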
Properties
HasInput
public bool HasInput { get; }
  Property Value
- bool
ModelCanDoCompletions
Indicates whether the model in the deployment can perform completions.
public bool ModelCanDoCompletions { get; }
  Property Value
- bool
ModelCanDoEmbeddings
Indicates whether the model in the deployment can perform embeddings.
public bool ModelCanDoEmbeddings { get; }
  Property Value
- bool
Methods
ProcessTextOperationRequests()
public Task<List<InternalTextOperationResult>> ProcessTextOperationRequests()
  Returns
- Task<List<InternalTextOperationResult>>
TryAddInputTextChunk(TextChunk, Dictionary<string, object>)
Attempts to add a new text chunk to the input for the text operation request.
public bool TryAddInputTextChunk(TextChunk textChunk, Dictionary<string, object> modelParameters)
  Parameters
textChunk (TextChunk): The text chunk to be added.
modelParameters (Dictionary<string, object>): The model parameters for the text operation.
Returns
- bool
Remarks
For embedding operations, modelParameters must always contain a single property named TextOperationContextPropertyNames.EmbeddingDimensions, which specifies the number of dimensions required for the embedding.
For completion operations, modelParameters can contain the following parameters:
- TextOperationContextPropertyNames.Temperature - the completion model temperature.
- TextOperationContextPropertyNames.TopP - the completion model top-p value.
- TextOperationContextPropertyNames.MaxOutputTokenCount - the completion model max output token count.
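
As an illustration of the two shapes of modelParameters described above, the following sketch builds one dictionary for embeddings and one for completions. SketchPropertyNames is a hypothetical stand-in for TextOperationContextPropertyNames, and its string values are placeholders rather than the library's actual values. ModelParametersSketch is likewise an assumption made for this example.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for TextOperationContextPropertyNames.
// The string values are placeholders, not the library's actual values.
public static class SketchPropertyNames
{
    public const string EmbeddingDimensions = "embedding_dimensions";
    public const string Temperature = "temperature";
    public const string TopP = "top_p";
    public const string MaxOutputTokenCount = "max_output_token_count";
}

// Hypothetical helper that builds modelParameters dictionaries.
public static class ModelParametersSketch
{
    // Embedding operations: exactly one entry, the number of dimensions.
    public static Dictionary<string, object> ForEmbeddings(int dimensions)
    {
        return new Dictionary<string, object>
        {
            [SketchPropertyNames.EmbeddingDimensions] = dimensions
        };
    }

    // Completion operations: any subset of the three documented parameters.
    public static Dictionary<string, object> ForCompletions(
        double? temperature = null,
        double? topP = null,
        int? maxOutputTokenCount = null)
    {
        var parameters = new Dictionary<string, object>();
        if (temperature.HasValue)
        {
            parameters[SketchPropertyNames.Temperature] = temperature.Value;
        }
        if (topP.HasValue)
        {
            parameters[SketchPropertyNames.TopP] = topP.Value;
        }
        if (maxOutputTokenCount.HasValue)
        {
            parameters[SketchPropertyNames.MaxOutputTokenCount] = maxOutputTokenCount.Value;
        }
        return parameters;
    }
}
```

In real code, the keys would come from TextOperationContextPropertyNames, and the resulting dictionary would be passed as the second argument to TryAddInputTextChunk. The bool result presumably indicates whether the chunk was accepted, although this page does not spell that out.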