Interface ITokenizerService
- Namespace
- FoundationaLLM.Common.Interfaces
- Assembly
- FoundationaLLM.Common.dll
Represents a text tokenizer.
public interface ITokenizerService
Methods
CountTokens(string, string?)
Count the number of tokens in a given text.
long CountTokens(string text, string? encoderName = null)
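Parameters
text
string The text whose tokens are counted.
encoderName
string The name of the encoder used for tokenization.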
Returns
- long
The number of tokens in the text.
Decode(int[], string?)
Decode an array of integer token ids.
string Decode(int[] tokens, string? encoderName = null)
Parameters
tokens
int[] An array of integer token ids.
encoderName
string The name of the encoder used for tokenization.
Returns
- string
Decoded text.
Encode(string, string?)
Encode a string into integer token ids. Allowed special tokens are kept whole rather than broken apart.
List<int> Encode(string text, string? encoderName = null)
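Example
The following is a minimal sketch of how the three methods relate, not the FoundationaLLM implementation. The interface is redeclared locally so the snippet compiles on its own, and WhitespaceTokenizerService is a hypothetical toy class that splits on spaces and ignores encoderName. A real implementation would presumably delegate to an actual encoder selected by that name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: encode, decode back, and count tokens with the toy tokenizer.
ITokenizerService tokenizer = new WhitespaceTokenizerService();
List<int> ids = tokenizer.Encode("the cat saw the dog");
Console.WriteLine(string.Join(",", ids));                         // 0,1,2,0,3
Console.WriteLine(tokenizer.Decode(ids.ToArray()));               // the cat saw the dog
Console.WriteLine(tokenizer.CountTokens("the cat saw the dog"));  // 5

// Local copy of the documented interface (same three signatures).
public interface ITokenizerService
{
    long CountTokens(string text, string? encoderName = null);
    string Decode(int[] tokens, string? encoderName = null);
    List<int> Encode(string text, string? encoderName = null);
}

// Hypothetical toy implementation: one token per space-separated word.
// Ids are assigned in order of first appearance; encoderName is ignored.
public sealed class WhitespaceTokenizerService : ITokenizerService
{
    private readonly Dictionary<string, int> _ids = new();
    private readonly List<string> _words = new();

    public List<int> Encode(string text, string? encoderName = null)
    {
        var result = new List<int>();
        foreach (var word in text.Split(' ', StringSplitOptions.RemoveEmptyEntries))
        {
            if (!_ids.TryGetValue(word, out var id))
            {
                // First time this word is seen: give it the next free id.
                id = _words.Count;
                _ids[word] = id;
                _words.Add(word);
            }
            result.Add(id);
        }
        return result;
    }

    // Maps each id back to its word. Only ids produced by this instance are valid.
    public string Decode(int[] tokens, string? encoderName = null)
        => string.Join(' ', tokens.Select(t => _words[t]));

    // Counting is defined here as the length of the encoded sequence.
    public long CountTokens(string text, string? encoderName = null)
        => Encode(text, encoderName).Count;
}
```

In this sketch CountTokens is defined as the length of the Encode result, so the two always agree. Whether the real service guarantees that is not stated on this page.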