Class TikTokenizerConfig
- Namespace
- FoundationaLLM.Common.Services.Tokenizers
- Assembly
- FoundationaLLM.Common.dll
Provides the configuration values required to create a new TikTokenizer instance.
public record TikTokenizerConfig : IEquatable<TikTokenizerConfig>
- Inheritance
- object
- TikTokenizerConfig
- Implements
- IEquatable<TikTokenizerConfig>
Constructors
TikTokenizerConfig(string, string, Dictionary<string, int>)
Provides the configuration values required to create a new TikTokenizer instance.
public TikTokenizerConfig(string RegexPattern, string MergeableRanksFileUrl, Dictionary<string, int> SpecialTokens)
Parameters
RegexPattern string
Regex pattern to break a long string.
MergeableRanksFileUrl string
The URL used to download the BPE rank file.
SpecialTokens Dictionary<string, int>
Special tokens mapping.
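The constructor call can be sketched as follows. This is a minimal illustration only: the record declared here is a local stand-in with the documented shape so the snippet compiles on its own, and the regex, URL, and token id are hypothetical placeholders rather than values taken from FoundationaLLM.

```csharp
#nullable enable
using System.Collections.Generic;

// Stand-in with the documented shape so this sketch compiles on its own.
// In a real project, use the type from FoundationaLLM.Common.Services.Tokenizers.
public record TikTokenizerConfig(
    string RegexPattern,
    string MergeableRanksFileUrl,
    Dictionary<string, int> SpecialTokens)
{
    public byte[]? MergeableRanksFileContent { get; set; }
}

public static class TikTokenizerConfigExample
{
    // All three argument values below are hypothetical placeholders.
    public static TikTokenizerConfig Create() =>
        new TikTokenizerConfig(
            RegexPattern: @"\w+|\s+|[^\w\s]+",
            MergeableRanksFileUrl: "https://example.com/ranks.tiktoken",
            SpecialTokens: new Dictionary<string, int>
            {
                ["<|endoftext|>"] = 100257
            });
}
```

Named arguments are optional here; they are used only to make clear which value maps to which parameter.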
Properties
MergeableRanksFileContent
The raw content of the BPE rank file.
public byte[]? MergeableRanksFileContent { get; set; }
Property Value
- byte[]
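Unlike the other properties, MergeableRanksFileContent has a public setter, so it can be assigned after construction. The sketch below shows that difference under stated assumptions: the record is again a local stand-in with the documented shape, and the URLs and rank file bytes are hypothetical placeholders. How TikTokenizer itself uses this property is not covered by this page.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text;

// Stand-in with the documented shape so this sketch compiles on its own.
public record TikTokenizerConfig(
    string RegexPattern,
    string MergeableRanksFileUrl,
    Dictionary<string, int> SpecialTokens)
{
    public byte[]? MergeableRanksFileContent { get; set; }
}

public static class RankContentExample
{
    public static TikTokenizerConfig WithContent()
    {
        var config = new TikTokenizerConfig(
            @"\w+|\s+|[^\w\s]+",
            "https://example.com/ranks.tiktoken",
            new Dictionary<string, int>());

        // Placeholder bytes standing in for a previously obtained rank file.
        config.MergeableRanksFileContent = Encoding.UTF8.GetBytes("IQ== 0");

        // The init-only properties cannot be reassigned after construction;
        // a `with` expression copies the record with one value changed.
        return config with
        {
            MergeableRanksFileUrl = "https://example.com/other.tiktoken"
        };
    }
}
```

The `with` copy is shallow, so the returned record shares the same byte array and dictionary instances as the original.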
MergeableRanksFileUrl
The URL used to download the BPE rank file.
public string MergeableRanksFileUrl { get; init; }
Property Value
- string
RegexPattern
Regex pattern to break a long string.
public string RegexPattern { get; init; }
Property Value
- string
SpecialTokens
Special tokens mapping.
public Dictionary<string, int> SpecialTokens { get; init; }
Property Value
- Dictionary<string, int>