
meaning of input_binary #95

Open
bertsky opened this issue Feb 16, 2023 · 7 comments
Labels: documentation (Improvements or additions to documentation)
Comments

bertsky (Contributor) commented Feb 16, 2023

The only documentation for this kwarg is in the standalone CLI:

> in general, eynollah uses RGB as input but if the input document is strongly dark, bright or for any other reason you can turn binarized input on. This option does not mean that you have to provide a binary image, otherwise this means that the tool itself will binarized the RGB input document

I find that second sentence very confusing (esp. around "otherwise").

So this means that binarization is attempted internally (when activated)? What steps of the pipeline are affected?

(Also, implementation-wise, it looks like binarization is repeated multiple times, without re-using the previous result...)

Can anything be said about how pretrained models would fare when passed (externally) binarized images?
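
For reference, a minimal sketch of switching the option on via the standalone CLI; the flag spellings below are assumptions based on the kwarg name and should be checked against `eynollah --help`:

```python
# Sketch only: invoke the standalone CLI with binarized input turned on.
# The short flags (-i, -o, -m) and the spelling --input_binary are assumed
# from the kwarg name, not verified against the current eynollah CLI.
import subprocess

subprocess.run([
    "eynollah",
    "-i", "page.png",          # input image (assumed flag)
    "-o", "out",               # output directory (assumed flag)
    "-m", "models_eynollah",   # model directory (assumed flag)
    "--input_binary",          # the option this issue is about
], check=True)
```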

@cneud added the documentation label on Mar 31, 2023
cneud (Member) commented Apr 12, 2023

As far as I understand (and please @vahidrezanezhad correct me), Eynollah will almost always produce a better result from a grayscale or color image than from a binarized image.

However, if the input image is "strongly dark or bright" (and this needs a bit more explanation), the user may try to get a better result by setting "input_binary" to true. In this case, Eynollah itself will binarize the image, so the user does not have to worry about binarizing it with another tool first. (Note: I would like to fully integrate sbb_binarization for this.)
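
To illustrate the intended behaviour (made-up names only, not the actual eynollah internals): with the flag set, the tool binarizes the RGB input itself before the rest of the pipeline runs:

```python
import numpy as np

def internal_binarizer(rgb_image):
    # Stand-in for eynollah's model-based binarization (hypothetical);
    # shown here as a trivial global threshold just to keep the sketch runnable.
    gray = rgb_image.mean(axis=2)
    return (gray > gray.mean()).astype(np.uint8) * 255

def preprocess(rgb_image, input_binary=False):
    # When input_binary is set, the tool binarizes the RGB input itself;
    # the user never supplies a pre-binarized image.
    if input_binary:
        return internal_binarizer(rgb_image)
    return rgb_image
```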

> I find that second sentence very confusing (esp. around "otherwise").

Agreed, we will try to reformulate this for better clarity.

> What steps of the pipeline are affected?

@vahidrezanezhad should be able to answer this.

> it looks like binarization is repeated multiple times, without re-using the previous result

We will also check this with respect to performance.

> Can anything be said about how pretrained models would fare when passed (externally) binarized images?

The only thing I can say is that it would be an interesting experiment to evaluate this :) But I am afraid it will require a lot of effort to do properly (per step, with different binarization methods/models and good metrics for OCR and layout) and will only be relevant for a few images of bad quality.

bertsky (Contributor, Author) commented Apr 12, 2023

Ok, then (besides reformulation of the description) I highly recommend renaming that option, e.g. apply_binarization: after all, it's not the input that must/can be binary, but the internal step that is performed.
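
If the option does get renamed, the old spelling could be kept as an alias so existing calls keep working; a sketch assuming the CLI continues to use click (option names and help text here are illustrative, not the actual eynollah code):

```python
import click

@click.command()
@click.option(
    "--apply_binarization", "--input_binary", "apply_binarization",
    is_flag=True,
    help="Binarize the RGB input internally before layout analysis.")
def main(apply_binarization):
    # Both spellings set the same parameter, so existing calls with
    # --input_binary would keep working after the rename.
    click.echo(f"apply_binarization={apply_binarization}")

if __name__ == "__main__":
    main()
```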

bertsky (Contributor, Author) commented Apr 12, 2023

Integrating sbb_binarization / experimenting with external tools: the OCR-D way would be to just use whatever derived images with binarized in their @comments can be found, i.e. whatever binarization has been run in the workflow. So whether it is sbb_binarization or any other tool – it would be up to the user to decide and experiment. (But if the internal binarizer here is different from sbb_binarize and perhaps better, then it gets more complicated...)
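
For illustration, the usual OCR-D processor pattern for picking up such a derived image via the workspace API (the processor class and file handling here are just a sketch, not eynollah's actual OCR-D wrapper):

```python
from ocrd import Processor
from ocrd_modelfactory import page_from_file

class DemoLayoutProcessor(Processor):  # hypothetical processor, for illustration only
    def process(self):
        for input_file in self.input_files:
            pcgts = page_from_file(self.workspace.download_file(input_file))
            page = pcgts.get_Page()
            # Select whichever derived image already carries "binarized" among
            # its @comments features, regardless of which binarizer produced it:
            page_image, page_coords, image_info = self.workspace.image_from_page(
                page, input_file.pageId,
                feature_selector='binarized')
            # ... run layout analysis on page_image ...
```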

cneud (Member) commented Apr 12, 2023

Let me first confirm the above and then we can rename the option, ideally also consistently for scaling, enhancing and resizing.

vahidrezanezhad (Member) commented:

> As far as I understand (and please @vahidrezanezhad correct me), Eynollah will almost always produce a better result from a grayscale or color image than from a binarized image.

This is exactly the case. Our best performance is achieved with a grayscale or color image.

vahidrezanezhad (Member) commented:

> (Also, implementation-wise, it looks like binarization is repeated multiple times, without re-using the previous result...)

I will check it. By the way, it should not be run multiple times.
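
For illustration, a sketch of the kind of re-use that would avoid repeating the work (made-up names and a trivial threshold, not the actual eynollah code):

```python
import numpy as np

class LayoutPipeline:
    """Illustrative only: binarize once and cache the result for later steps."""

    def __init__(self, rgb_image, input_binary=False):
        self.rgb_image = rgb_image
        self.input_binary = input_binary
        self._binarized = None

    def binarized(self):
        # Run the (expensive) binarization only on first request;
        # later pipeline steps re-use the cached result.
        if self._binarized is None:
            gray = self.rgb_image.mean(axis=2)
            self._binarized = (gray > gray.mean()).astype(np.uint8) * 255
        return self._binarized
```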

vahidrezanezhad (Member) commented:

> Integrating sbb_binarization / experimenting with external tools: the OCR-D way would be to just use whatever derived images with binarized in their @comments can be found, i.e. whatever binarization has been run in the workflow. So whether it is sbb_binarization or any other tool – it would be up to the user to decide and experiment. (But if the internal binarizer here is different from sbb_binarize and perhaps better, then it gets more complicated...)

The internal binarizer uses the same models as sbb_binarization.
