OCR seems correct, textlayer full of blank characters #681

Open
ReaderGuy42 opened this issue Aug 22, 2024 · 0 comments

The OCR process itself seems correct; I verified this by spot-checking the generated word list against the PDF.

However, when the PDF is processed to include the OCR'ed text, either as an invisible text layer or as visible text, the OCR'ed text turns into blank square characters, which makes the result useless.

[Screenshot: exported PDF showing blank square characters in place of the OCR'ed text]

I have all the spell-check and traineddata files installed and have tried switching languages. I also successfully used gImageReader to OCR a document a few weeks ago, on this same system and install.

Exporting to plain text works.
Exporting to ODT works.

Under "Export to PDF" you can select Show Preview, and the preview shows the desired text overlay.
However, the PDF that is actually exported has no usable text overlay. If I choose the invisible text overlay, I can't select anything except some hyphens, and if I choose a regular PDF, it just outputs the blank characters seen in the screenshot.

Any ideas how to solve this?

Thanks!
