Class DataPipelineStateService

Namespace
FoundationaLLM.DataPipelineEngine.Services
Assembly
FoundationaLLM.DataPipelineEngine.dll

Provides capabilities for data pipeline state management.

public class DataPipelineStateService : IDataPipelineStateService
Inheritance
DataPipelineStateService
Implements
IDataPipelineStateService

Constructors

DataPipelineStateService(IAzureCosmosDBDataPipelineService, IStorageService, ILogger<DataPipelineStateService>)

Initializes a new instance of the DataPipelineStateService class.

public DataPipelineStateService(IAzureCosmosDBDataPipelineService cosmosDBService, IStorageService storageService, ILogger<DataPipelineStateService> logger)

Parameters

cosmosDBService IAzureCosmosDBDataPipelineService

The Azure Cosmos DB service providing database services.

storageService IStorageService

The storage service providing blob storage capabilities.

logger ILogger<DataPipelineStateService>

The logger used for logging.

Methods

GetActiveDataPipelineRuns()

Gets a list of active data pipeline runs.

public Task<List<DataPipelineRun>> GetActiveDataPipelineRuns()

Returns

Task<List<DataPipelineRun>>

The list of active data pipeline runs.

GetDataPipelineRun(string)

Gets a data pipeline run by its identifier.

public Task<DataPipelineRun?> GetDataPipelineRun(string runId)

Parameters

runId string

The data pipeline run identifier.

Returns

Task<DataPipelineRun>

The requested data pipeline run object. The declared return type is nullable, so the result can be null.
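Because the declared return type is Task<DataPipelineRun?>, callers should check for null before using the result. The following is a minimal sketch rather than code from the library: the DataPipelineRun class here is a hypothetical stand-in with an assumed RunId property, and RunLookupExample and DescribeRunAsync are illustrative names.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

// Hypothetical stand-in so the sketch compiles on its own.
// The real DataPipelineRun type is defined in FoundationaLLM.
public class DataPipelineRun
{
    public string RunId { get; set; } = string.Empty; // assumed property
}

public static class RunLookupExample
{
    // getRun has the same shape as GetDataPipelineRun(string):
    // it takes a run identifier and yields a run or null.
    public static async Task<string> DescribeRunAsync(
        Func<string, Task<DataPipelineRun?>> getRun, string runId)
    {
        DataPipelineRun? run = await getRun(runId);

        // Handle the null case explicitly instead of dereferencing.
        return run is null
            ? $"The state service returned no run for '{runId}'."
            : $"The state service returned run '{run.RunId}'."
            ;
    }
}
```

With the real types in place of the stand-in, the method group can be passed directly, for example RunLookupExample.DescribeRunAsync(stateService.GetDataPipelineRun, runId), where stateService is an IDataPipelineStateService instance.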

GetDataPipelineRunStageWorkItems(string, string)

Gets the list of data pipeline run work items associated with a specified stage of a run.

public Task<List<DataPipelineRunWorkItem>> GetDataPipelineRunStageWorkItems(string runId, string stage)

Parameters

runId string

The data pipeline run identifier.

stage string

The stage of the data pipeline run.

Returns

Task<List<DataPipelineRunWorkItem>>

The list of data pipeline run work items associated with the specified stage of the run.

GetDataPipelineRunWorkItem(string, string)

Gets a data pipeline run work item by its identifier.

public Task<DataPipelineRunWorkItem?> GetDataPipelineRunWorkItem(string workItemId, string runId)

Parameters

workItemId string

The data pipeline run work item identifier.

runId string

The data pipeline run identifier.

Returns

Task<DataPipelineRunWorkItem>

The requested data pipeline run work item object. The declared return type is nullable, so the result can be null.

GetDataPipelineRuns(DataPipelineRunFilter)

Gets a list of data pipeline runs filtered by the provided filter criteria.

public Task<List<DataPipelineRun>> GetDataPipelineRuns(DataPipelineRunFilter dataPipelineRunFilter)

Parameters

dataPipelineRunFilter DataPipelineRunFilter

The filter criteria used to filter data pipeline runs.

Returns

Task<List<DataPipelineRun>>

The list of data pipeline runs that match the filter criteria.

InitializeDataPipelineRunState(DataPipelineRun, List<DataPipelineContentItem>)

Initializes the state of a data pipeline run.

public Task<bool> InitializeDataPipelineRunState(DataPipelineRun dataPipelineRun, List<DataPipelineContentItem> contentItems)

Parameters

dataPipelineRun DataPipelineRun

The details of the data pipeline run.

contentItems List<DataPipelineContentItem>

The list of content items to be processed by the data pipeline run.

Returns

Task<bool>

true if the initialization is successful.

LoadDataPipelineRunWorkItemArtifacts(DataPipelineDefinition, DataPipelineRun, DataPipelineRunWorkItem, string)

Loads the artifacts associated with a data pipeline run work item.

public Task<List<DataPipelineStateArtifact>> LoadDataPipelineRunWorkItemArtifacts(DataPipelineDefinition dataPipelineDefinition, DataPipelineRun dataPipelineRun, DataPipelineRunWorkItem dataPipelineRunWorkItem, string artifactsNameFilter)

Parameters

dataPipelineDefinition DataPipelineDefinition

The data pipeline definition associated with the work item.

dataPipelineRun DataPipelineRun

The data pipeline run associated with the work item.

dataPipelineRunWorkItem DataPipelineRunWorkItem

The data pipeline run work item.

artifactsNameFilter string

The name pattern used to identify a subset of the artifacts.

Returns

Task<List<DataPipelineStateArtifact>>

A list with the binary contents of the artifacts.

LoadDataPipelineRunWorkItemParts<T>(DataPipelineDefinition, DataPipelineRun, DataPipelineRunWorkItem, string)

Loads the content item parts associated with a data pipeline run work item.

public Task<IEnumerable<T>> LoadDataPipelineRunWorkItemParts<T>(DataPipelineDefinition dataPipelineDefinition, DataPipelineRun dataPipelineRun, DataPipelineRunWorkItem dataPipelineRunWorkItem, string fileName) where T : class, new()

Parameters

dataPipelineDefinition DataPipelineDefinition

The data pipeline definition associated with the work item.

dataPipelineRun DataPipelineRun

The data pipeline run associated with the work item.

dataPipelineRunWorkItem DataPipelineRunWorkItem

The data pipeline run work item.

fileName string

The name of the file that contains the content item parts.

Returns

Task<IEnumerable<T>>

A list with the content item parts associated with the data pipeline run work item.

Type Parameters

T

The type of the content item parts to be loaded.
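The constraint where T : class, new() in the signature above means that T must be a reference type with a public parameterless constructor. As a minimal sketch of what satisfies it, the TextChunk type and the CreateEmptyPart helper below are hypothetical names used only for illustration and are not part of FoundationaLLM.

```csharp
#nullable enable
using System;

// Hypothetical content item part type. It qualifies as T because it is a
// class and has an implicit public parameterless constructor.
public class TextChunk
{
    public int Position { get; set; }
    public string Content { get; set; } = string.Empty;
}

public static class PartTypeExample
{
    // Uses the same constraint as LoadDataPipelineRunWorkItemParts<T>.
    // The new() constraint is what allows "new T()" to compile.
    public static T CreateEmptyPart<T>() where T : class, new() => new T();
}
```

A struct, an abstract class, or a class whose only constructors take arguments would be rejected by the compiler when used as T.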

PersistDataPipelineRunWorkItems(List<DataPipelineRunWorkItem>)

Persists a list of data pipeline run work items.

public Task<bool> PersistDataPipelineRunWorkItems(List<DataPipelineRunWorkItem> workItems)

Parameters

workItems List<DataPipelineRunWorkItem>

The list of data pipeline run work items to be persisted.

Returns

Task<bool>

true if the items are successfully persisted.

SaveDataPipelineRunWorkItemArtifacts(DataPipelineDefinition, DataPipelineRun, DataPipelineRunWorkItem, List<DataPipelineStateArtifact>)

Saves the artifacts associated with a data pipeline run work item.

public Task SaveDataPipelineRunWorkItemArtifacts(DataPipelineDefinition dataPipelineDefinition, DataPipelineRun dataPipelineRun, DataPipelineRunWorkItem dataPipelineRunWorkItem, List<DataPipelineStateArtifact> artifacts)

Parameters

dataPipelineDefinition DataPipelineDefinition

The data pipeline definition associated with the work item.

dataPipelineRun DataPipelineRun

The data pipeline run associated with the work item.

dataPipelineRunWorkItem DataPipelineRunWorkItem

The data pipeline run work item.

artifacts List<DataPipelineStateArtifact>

The list with the binary contents of the artifacts.

Returns

Task

SaveDataPipelineRunWorkItemParts<T>(DataPipelineDefinition, DataPipelineRun, DataPipelineRunWorkItem, IEnumerable<T>, string)

Saves the content item parts associated with a data pipeline run work item.

public Task SaveDataPipelineRunWorkItemParts<T>(DataPipelineDefinition dataPipelineDefinition, DataPipelineRun dataPipelineRun, DataPipelineRunWorkItem dataPipelineRunWorkItem, IEnumerable<T> contentItemParts, string fileName) where T : class, new()

Parameters

dataPipelineDefinition DataPipelineDefinition

The data pipeline definition associated with the work item.

dataPipelineRun DataPipelineRun

The data pipeline run associated with the work item.

dataPipelineRunWorkItem DataPipelineRunWorkItem

The data pipeline run work item.

contentItemParts IEnumerable<T>

The list with the content item parts.

fileName string

The name of the file in which the content item parts are saved.

Returns

Task

Type Parameters

T

The type of the content item parts to be saved.

StartDataPipelineRunWorkItemProcessing(Func<DataPipelineRunWorkItem, Task>)

Starts processing data pipeline run work items.

public Task<bool> StartDataPipelineRunWorkItemProcessing(Func<DataPipelineRunWorkItem, Task> processWorkItem)

Parameters

processWorkItem Func<DataPipelineRunWorkItem, Task>

The asynchronous delegate that is invoked for each data pipeline run work item.

Returns

Task<bool>

true if the processing is successfully started.
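The processWorkItem parameter is an asynchronous delegate of type Func<DataPipelineRunWorkItem, Task>. The sketch below shows one way to build such a delegate. It makes several assumptions: the DataPipelineRunWorkItem class is a hypothetical stand-in with an assumed Id property, WorkItemCallbackExample and CreateHandler are illustrative names, and a thread-safe queue is used because this page does not say whether the delegate is invoked concurrently.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in so the sketch compiles on its own.
// The real DataPipelineRunWorkItem type is defined in FoundationaLLM.
public class DataPipelineRunWorkItem
{
    public string Id { get; set; } = string.Empty; // assumed property
}

public static class WorkItemCallbackExample
{
    // Builds a delegate with the shape expected by
    // StartDataPipelineRunWorkItemProcessing. It only records the
    // identifier of each work item it receives.
    public static Func<DataPipelineRunWorkItem, Task> CreateHandler(
        ConcurrentQueue<string> processedIds) =>
        async workItem =>
        {
            await Task.Yield(); // placeholder for real asynchronous work
            processedIds.Enqueue(workItem.Id);
        };
}
```

With the real types in place of the stand-in, the handler would be passed as await stateService.StartDataPipelineRunWorkItemProcessing(WorkItemCallbackExample.CreateHandler(ids)), and the returned bool indicates whether processing started.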

StopDataPipelineRunWorkItemProcessing()

Stops processing data pipeline run work items.

public Task StopDataPipelineRunWorkItemProcessing()

Returns

Task

UpdateDataPipelineRunStatus(DataPipelineRun)

Updates the status of a data pipeline run.

public Task<bool> UpdateDataPipelineRunStatus(DataPipelineRun dataPipelineRun)

Parameters

dataPipelineRun DataPipelineRun

The data pipeline run whose status is to be updated.

Returns

Task<bool>

true if the status update is successful.

UpdateDataPipelineRunWorkItem(DataPipelineRunWorkItem)

Updates a data pipeline run work item.

public Task<bool> UpdateDataPipelineRunWorkItem(DataPipelineRunWorkItem workItem)

Parameters

workItem DataPipelineRunWorkItem

The data pipeline run work item to be updated.

Returns

Task<bool>

true if the data pipeline run work item is successfully updated.

UpdateDataPipelineRunWorkItemsStatus(List<DataPipelineRunWorkItem>)

Updates the status of data pipeline run work items.

public Task<bool> UpdateDataPipelineRunWorkItemsStatus(List<DataPipelineRunWorkItem> workItems)

Parameters

workItems List<DataPipelineRunWorkItem>

The list of data pipeline run work items whose status must be updated.

Returns

Task<bool>

true if the item statuses are successfully updated.