PDFTextExtractionContext¶
Namespace: O2S.Components.PDF4NET.Content
Defines the context for extracting text from PDF files.
Inheritance Object → PDFTextExtractionContext
Constructors¶
PDFTextExtractionContext()¶
Initializes a new PDFTextExtractionContext object.
PDFTextExtractionContext(PDFDisplayRectangle)¶
Initializes a new PDFTextExtractionContext object with the specified extraction bounds.
Parameters
visualExtractionBounds PDFDisplayRectangle
Bounds for text extraction.
Properties¶
EnableExtendedInformation¶
Gets or sets a value indicating whether extended text information should be loaded for text.
Property Value
Boolean
If true then extended information is loaded.
Remarks
This flag is used only by the PDFContentExtractor.ExtractTextRuns() and PDFContentExtractor.ExtractTextRuns(PDFContentExtractionContext) methods.
By default this property is true, which allows text fragment positions to be analyzed in order to group the extracted text into lines.
If it is set to false, only the text is loaded, without any other properties (such as positions, font information, colors, etc.).
IncludePartialMatches¶
Gets or sets a value indicating whether characters that fall only partially within the extraction bounds should be included in the extracted text.
Property Value
Boolean
If true then the characters that fit partially inside the extraction bounds are included in the extracted text.
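The distinction between a full and a partial match can be pictured with a small, library-independent sketch. The Box type and the helper methods below are hypothetical and are not part of PDF4NET. The sketch also assumes that "partially fits" means the character's box overlaps the bounds without being fully contained in them, which this page does not spell out.

```csharp
using System;

public static class PartialMatchSketch
{
    // Hypothetical rectangle in visual coordinates (origin at top-left); not a PDF4NET type.
    public readonly struct Box
    {
        public Box(double left, double top, double width, double height)
        {
            Left = left;
            Top = top;
            Right = left + width;
            Bottom = top + height;
        }

        public double Left { get; }
        public double Top { get; }
        public double Right { get; }
        public double Bottom { get; }
    }

    // True when the character box lies entirely inside the bounds.
    public static bool IsFullyInside(Box ch, Box bounds) =>
        ch.Left >= bounds.Left && ch.Right <= bounds.Right &&
        ch.Top >= bounds.Top && ch.Bottom <= bounds.Bottom;

    // True when the character box and the bounds share a region of non-zero area.
    public static bool Overlaps(Box ch, Box bounds) =>
        ch.Left < bounds.Right && ch.Right > bounds.Left &&
        ch.Top < bounds.Bottom && ch.Bottom > bounds.Top;

    // Assumed reading of the flag: overlap is enough when partial matches are included,
    // otherwise the character must be fully contained.
    public static bool IsIncluded(Box ch, Box bounds, bool includePartialMatches) =>
        includePartialMatches ? Overlaps(ch, bounds) : IsFullyInside(ch, bounds);

    public static void Main()
    {
        Box bounds = new Box(0, 0, 100, 20);
        Box straddling = new Box(95, 5, 10, 10); // crosses the right edge of the bounds

        Console.WriteLine(IsIncluded(straddling, bounds, true));  // True
        Console.WriteLine(IsIncluded(straddling, bounds, false)); // False
    }
}
```

Under this reading, a character that straddles the edge of the bounds is extracted only when IncludePartialMatches is true.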
UseActualTextIfAvailable¶
Gets or sets a flag indicating whether the text extraction process should use the text included in the /ActualText entry applied to the current showText operator.
Property Value
Boolean
True if the text extraction process should ignore the glyph values and font encoding and instead use the text included in the /ActualText entry applied to the current showText operator.
VisualExtractionBounds¶
Gets or sets the bounds (in visual coordinates) for text extraction.
Property Value
PDFDisplayRectangle
A rectangle in visual coordinates that specifies the area on the page from which the text should be extracted.
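As a rough illustration of how the members on this page fit together, the following C# sketch builds a context that restricts extraction to a rectangular region. Two things in it are assumptions not confirmed by this page: that PDFDisplayRectangle lives in the O2S.Components.PDF4NET namespace, and that it has a (left, top, width, height) constructor. The class name, method name and numeric values are arbitrary.

```csharp
using O2S.Components.PDF4NET;          // assumed namespace of PDFDisplayRectangle
using O2S.Components.PDF4NET.Content;

public static class ExtractionContextSample
{
    public static PDFTextExtractionContext CreateRegionContext()
    {
        // Assumption: PDFDisplayRectangle(left, top, width, height) in visual coordinates.
        PDFDisplayRectangle bounds = new PDFDisplayRectangle(50, 50, 300, 200);

        // Constructor overload documented above: the bounds limit the extraction area.
        PDFTextExtractionContext context = new PDFTextExtractionContext(bounds);

        // Also keep characters that only partially fall inside the bounds.
        context.IncludePartialMatches = true;

        // Prefer the /ActualText entry over glyph values when one is present.
        context.UseActualTextIfAvailable = true;

        // Documented default, set explicitly: load positions, font info, colors, etc.
        context.EnableExtendedInformation = true;

        return context;
    }
}
```

The returned context would then be handed to one of the PDFContentExtractor extraction methods that accepts a context. The same bounds can also be assigned later through the VisualExtractionBounds property on a context created with the parameterless constructor.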