[BUG] Extracting bullet points #416

Open
Elikrag opened this issue Oct 6, 2020 · 2 comments

Elikrag commented Oct 6, 2020

Description

UniPDF v3.12.1
I get pageText from ExtractPageText() and use pageText.Marks() to extract the words on a page. Bullet points show up as the letter x.

The problem can be seen in this example from page 4 of the attached PDF:
[Image: Screenshot from 2020-10-06 13-11-14]

x
All
combustion
devices
installed
on
or
after
May
1,
2014,
must
be
equipped
with
an
operational
auto-igniter
upon
installation
of
the
combustion
device;
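
Until the extraction itself handles this, one possible stopgap is to post-process the extracted words. The sketch below is not UniPDF API: `word`, `symbolFonts`, `isSymbolFont`, and `normalizeBullets` are hypothetical names, and the font names in `main` are made up for illustration rather than taken from the attached PDF. It assumes that the font name of each extracted mark is available to the caller and that the stray x comes from a bullet glyph in a symbol font with no Unicode mapping. Neither assumption is confirmed in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// word is a hypothetical stand-in for one extracted word; it is not a UniPDF type.
type word struct {
	Text     string
	FontName string // assumed to be obtainable from the extracted mark
}

// symbolFonts lists font-name fragments that commonly carry bullet glyphs.
// This list is an assumption and may need tuning for a given PDF.
var symbolFonts = []string{"wingdings", "symbol", "zapfdingbats"}

// isSymbolFont reports whether name contains one of the fragments above,
// ignoring case and any subset prefix such as "ABCDEF+".
func isSymbolFont(name string) bool {
	lower := strings.ToLower(name)
	for _, s := range symbolFonts {
		if strings.Contains(lower, s) {
			return true
		}
	}
	return false
}

// normalizeBullets returns a copy of words in which every single-character
// word set in a symbol font is replaced by the bullet character "•".
func normalizeBullets(words []word) []word {
	out := make([]word, len(words))
	copy(out, words)
	for i, w := range out {
		if len([]rune(w.Text)) == 1 && isSymbolFont(w.FontName) {
			out[i].Text = "•"
		}
	}
	return out
}

func main() {
	// Illustrative input only; the real font names in the PDF are unknown.
	words := []word{
		{Text: "x", FontName: "Wingdings"},
		{Text: "All", FontName: "TimesNewRoman"},
		{Text: "combustion", FontName: "TimesNewRoman"},
	}
	for _, w := range normalizeBullets(words) {
		fmt.Println(w.Text)
	}
}
```

A plain x in a regular text font is left alone, so ordinary words are not rewritten. If the bullet turns out to be set in a regular font, this heuristic will not catch it.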

Attachments

Full PDF: Speer_Permit.pdf


github-actions bot commented Oct 6, 2020

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/


Elikrag commented Oct 6, 2020

I'm not sure how to mark this as a "customer issue".

gunnsth added extract feature New feature labels Nov 24, 2020