Export to image support #55

werenall · 2020-04-02T12:01:37Z

First of all - kudos for this library! It has proved very useful for our project at Magnet.
However, we need the export-to-image functionality that Apache PDFBox provides. We thought it would be nice if your library had it as well.

We'd be happy to make a PR with this.

[Re dotemacs#55] This commit also slightly changes the prerequisites for the split function. Previously it only allowed strings as inputs. IMHO it should also accept files.

dotemacs · 2020-04-02T12:12:32Z

> First of all - kudos for this library! It has proved very useful for our project at Magnet.

Thank you, I'm glad that you're finding it useful.

> However, we need the export-to-image functionality that Apache PDFBox provides.

OK. Is this functionality already present in any of the Java examples here:
https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/

I'm asking because I'm trying to understand what exactly you are trying to do: extract images out of a PDF, or something else?

> We thought it would be nice if your library had it as well.
> We'd be happy to make a PR with this.

OK, but let me understand what you're trying to do first. If you're then willing to do the work, that would be great.

werenall · 2020-04-02T12:20:49Z

We have PDFs (possibly multi-page) that we need thumbnails for. In our case, each page gets converted into an image. It is similar to Google Drive: the preview does not display the PDF itself, just an image of the page along with its thumbnail.
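
For illustration, here is a minimal sketch of the thumbnail step in Java, using only the JDK. It assumes each page has already been rendered to a `BufferedImage` (with PDFBox that would typically come from `PDFRenderer`). The class name `Thumbnails` and the method `scaleToWidth` are hypothetical and are not part of this library or of PDFBox.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper: not part of pdfboxing or PDFBox.
// Assumes the page was already rendered to a BufferedImage elsewhere.
final class Thumbnails {
    private Thumbnails() {}

    /** Scales a rendered page to the given width, preserving its aspect ratio. */
    static BufferedImage scaleToWidth(BufferedImage page, int targetWidth) {
        if (targetWidth <= 0) {
            throw new IllegalArgumentException("targetWidth must be positive");
        }
        // Derive the height from the source aspect ratio; never go below 1 pixel.
        double ratio = (double) targetWidth / page.getWidth();
        int targetHeight = (int) Math.max(1L, Math.round(page.getHeight() * ratio));

        BufferedImage thumb =
            new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumb.createGraphics();
        try {
            // Bilinear interpolation gives smoother results than the default.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(page, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return thumb;
    }
}
```

Each resulting image could then be written out with `javax.imageio.ImageIO.write`, one file per page.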


avocade · 2024-03-11T16:30:00Z

We have a use case where we want to extract all images from an entire document so that we can run ML on each image. Text extraction is done separately. PDFBox looks like the right tool for it:

https://docs.aspose.com/pdf/java/extract-images-from-pdf-file/

There is a similar use case with the Node.js pdf-lib (the extract-images.zip example, which seems to work well):
Hopding/pdf-lib#83 (comment)

werenall mentioned this issue Apr 2, 2020

Add support for exporting pdf to image #56

Open
