Class DOCXContentTextExtractionPlugin

Namespace
FoundationaLLM.Plugins.DataPipeline.Plugins.ContentTextExtraction
Assembly
FoundationaLLM.DataPipelinePlugins.dll

A content text extraction plugin that extracts text from DOCX content.

public class DOCXContentTextExtractionPlugin : PluginBase, IContentTextExtractionPlugin
Inheritance
PluginBase
DOCXContentTextExtractionPlugin
Implements
IContentTextExtractionPlugin

Constructors

DOCXContentTextExtractionPlugin(Dictionary<string, object>, IPluginPackageManager, IPluginPackageManagerResolver, IServiceProvider)

Initializes a new instance of the DOCXContentTextExtractionPlugin class.

public DOCXContentTextExtractionPlugin(Dictionary<string, object> pluginParameters, IPluginPackageManager packageManager, IPluginPackageManagerResolver packageManagerResolver, IServiceProvider serviceProvider)

Parameters

pluginParameters Dictionary<string, object>

The dictionary containing the plugin parameters.

packageManager IPluginPackageManager

The package manager for the plugin.

packageManagerResolver IPluginPackageManagerResolver

The package manager resolver for the plugin.

serviceProvider IServiceProvider

The service provider of the dependency injection container.

Properties

Name

The name of the plugin.

protected override string Name { get; }

Property Value

string

Methods

ExtractText(BinaryData)

Extracts text from the provided raw content.

public Task<PluginResult<string>> ExtractText(BinaryData rawContent)

Parameters

rawContent BinaryData

The binary content to extract text from.

Returns

Task<PluginResult<string>>

A task that resolves to a PluginResult<string> containing the extracted text.
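
Example

The following is a minimal sketch of what a method with this shape might do. It is not the plugin's actual implementation, which this page does not show. It relies only on the fact that a DOCX file is a ZIP archive whose main body is stored in word/document.xml as WordprocessingML. The names DocxTextSketch and SketchResult<T> are hypothetical stand-ins introduced for illustration. In particular, SketchResult<T> is not the library's PluginResult<T>, whose members are not documented here. BinaryData is assumed to come from the System.Memory.Data package.

```csharp
#nullable enable
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.Linq;

// Hypothetical stand-in for PluginResult<T>; the real type's members may differ.
public record SketchResult<T>(T? Value, bool Success, string? ErrorMessage = null);

public static class DocxTextSketch
{
    // WordprocessingML main namespace used by elements in word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Mirrors the signature shape of ExtractText(BinaryData) for illustration only.
    public static Task<SketchResult<string>> ExtractText(BinaryData rawContent)
    {
        try
        {
            // A DOCX file is a ZIP archive; open it directly from the binary payload.
            using var archive = new ZipArchive(rawContent.ToStream(), ZipArchiveMode.Read);
            var entry = archive.GetEntry("word/document.xml");
            if (entry is null)
            {
                return Task.FromResult(
                    new SketchResult<string>(null, false, "word/document.xml was not found."));
            }

            using var stream = entry.Open();
            var document = XDocument.Load(stream);

            // Emit one line per paragraph (w:p) by joining its text runs (w:t).
            // Simplification: headers, footers, footnotes, and text boxes are not handled.
            var paragraphs = document
                .Descendants(W + "p")
                .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));

            return Task.FromResult(
                new SketchResult<string>(string.Join(Environment.NewLine, paragraphs), true));
        }
        catch (InvalidDataException ex)
        {
            // Thrown by ZipArchive when the payload is not a valid ZIP archive.
            return Task.FromResult(new SketchResult<string>(null, false, ex.Message));
        }
    }
}
```

A caller would typically build the input with BinaryData.FromBytes(File.ReadAllBytes(path)) and await the returned task. With the real plugin, the same BinaryData value would be passed to ExtractText and the resulting PluginResult<string> inspected instead.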