Class PDFTextExtractor
- Namespace
- FoundationaLLM.Vectorization.DataFormats.PDF
- Assembly
- FoundationaLLM.Vectorization.Engine.dll
Extracts text from PDF files.
public class PDFTextExtractor
- Inheritance
- object
- PDFTextExtractor
Methods
GetText(BinaryData)
Extracts the text content from a PDF document.
public static string GetText(BinaryData binaryContent)
Parameters
binaryContent
- BinaryData
The binary content of the PDF document.
Returns
- string
The text content of the PDF document.
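Example
A minimal usage sketch based only on the signature above: it reads a PDF from disk, wraps the bytes in a BinaryData instance, and passes it to the static GetText method. The file name sample.pdf is a hypothetical placeholder, and the sketch assumes the project references FoundationaLLM.Vectorization.Engine.dll. Behavior on malformed or encrypted PDFs is not documented here, so the sketch does not assume any.

```csharp
using System;
using System.IO;
using FoundationaLLM.Vectorization.DataFormats.PDF;

// Hypothetical input path; replace with a real PDF file.
string pdfPath = args.Length > 0 ? args[0] : "sample.pdf";

// Load the raw bytes and wrap them in BinaryData, the type GetText expects.
byte[] pdfBytes = File.ReadAllBytes(pdfPath);
BinaryData binaryContent = BinaryData.FromBytes(pdfBytes);

// GetText is static, so no PDFTextExtractor instance is needed.
string text = PDFTextExtractor.GetText(binaryContent);

Console.WriteLine($"Extracted {text.Length} characters from {pdfPath}.");
```

Because the method takes BinaryData rather than a file path, the same call should work for content obtained from a stream or a storage download, provided the bytes form a complete PDF document.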