Class DataLakeContentSourceService

Namespace
FoundationaLLM.Vectorization.Services.ContentSources
Assembly
FoundationaLLM.Vectorization.Engine.dll

Implements a vectorization content source for content residing in data lake storage.

public class DataLakeContentSourceService : ContentSourceServiceBase, IContentSourceService
Inheritance
DataLakeContentSourceService
Implements
IContentSourceService

Constructors

DataLakeContentSourceService(BlobStorageServiceSettings, ILoggerFactory)

Creates a new instance of the vectorization content source service.

public DataLakeContentSourceService(BlobStorageServiceSettings storageSettings, ILoggerFactory loggerFactory)

Parameters

storageSettings BlobStorageServiceSettings
loggerFactory ILoggerFactory

Methods

ExtractTextAsync(ContentIdentifier, UnifiedUserIdentity, CancellationToken)

Reads the content of a data source item.

public Task<string> ExtractTextAsync(ContentIdentifier contentId, UnifiedUserIdentity userIdentity, CancellationToken cancellationToken)

Parameters

contentId ContentIdentifier

The ContentIdentifier providing the unique identifier of the item being read.

userIdentity UnifiedUserIdentity

The UnifiedUserIdentity providing information about the calling user identity.

cancellationToken CancellationToken

The cancellation token that signals that operations should be cancelled.

Returns

Task<string>

The string content of the item.

Remarks

contentId[0] = the URL of the storage account.
contentId[1] = the container name.
contentId[2] = the path of the file relative to the container.
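
The three-part layout described in the remarks can be illustrated with a small, self-contained sketch. It does not use any FoundationaLLM types: ContentIdLayoutSketch and BuildFileUri are hypothetical names introduced here for illustration only, and the way the service itself resolves these components is not documented on this page. The sketch assumes the three components simply concatenate into one file URL.

```csharp
using System;
using System.Collections.Generic;

public static class ContentIdLayoutSketch
{
    // Hypothetical helper (not part of FoundationaLLM): joins the three
    // components described in the remarks into a single file URL.
    public static Uri BuildFileUri(IReadOnlyList<string> parts)
    {
        if (parts is null || parts.Count != 3)
            throw new ArgumentException("Expected exactly three components.", nameof(parts));

        var account = parts[0].TrimEnd('/');    // [0] URL of the storage account
        var container = parts[1].Trim('/');     // [1] container name
        var path = parts[2].TrimStart('/');     // [2] file path relative to the container

        return new Uri($"{account}/{container}/{path}");
    }
}
```

Under these assumptions, calling BuildFileUri with the placeholder values "https://example.dfs.core.windows.net", "documents", and "reports/summary.txt" would yield the URI https://example.dfs.core.windows.net/documents/reports/summary.txt.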