Memory usage explosion with very narrow images (e.g. book spine) #67

mikegerber · 2022-02-15T11:56:32Z

With this document (PPN894261851.zip) we experienced an OOM error. Further investigation revealed this memory usage (measured using procpath):

The culprit seems to be this "page" from the document - an image of a book spine:

Relevant parts from the log output:

18:25:30.757 INFO eynollah - INPUT FILE PHYS_0017 (17/18)
18:25:30.780 INFO eynollah - resize and enhance image
18:25:30.780 INFO eynollah - Detected 25 DPI
18:25:40.756 INFO eynollah - Found 5 columns ([[4.1955504e-01 1.7818451e-13 2.7631987e-21 7.5972243e-22 5.8044493e-01
  0.0000000e+00]])
18:31:39.449 INFO eynollah - Image is enhanced
18:31:40.369 INFO eynollah - Enhancing took 369.5891568660736s
18:31:47.043 INFO eynollah - Image dimensions: 448x672
18:43:35.935 INFO eynollah - Image dimensions: 224x448
18:52:07.638 INFO eynollah - Image dimensions: 448x672
19:01:28.031 INFO eynollah - Textregion detection took 1787.6620445251465s
19:01:36.604 INFO eynollah - Graphics detection took 8.571088552474976s
19:01:36.604 INFO eynollah - cont_page [array([[  519,   445],
       [ 4404,   445],
       [ 4404, 27685],
       [  519, 27685]])]
19:01:41.160 INFO eynollah - Image dimensions: 448x672
19:08:15.645 INFO eynollah - textline detection took 399.04073786735535s
19:26:32.295 INFO eynollah - slope_deskew: -90.0
19:26:32.451 INFO eynollah - deskewing took 1096.8060252666473s
19:26:33.040 INFO eynollah - detection of marginals took 0.5885534286499023s
19:26:55.466 INFO eynollah - Image dimensions: 896x896
19:27:51.663 INFO eynollah - Image dimensions: 896x896
19:34:22.576 INFO eynollah - areas_cnt_text [1.60449940e-05 3.67248936e-05 4.69396395e-05 1.78734430e-05
 6.68446924e-05 1.59316018e-05 2.67794541e-05 3.35782605e-05
 2.04153178e-05 1.02601028e-04 1.49299709e-05 2.50974700e-05
 1.09640792e-04 4.56729543e-04 1.69521315e-05 7.82122588e-05
 9.06334276e-05 2.25603199e-04 1.58796304e-05 4.07455914e-05
 1.44858515e-05 1.97103964e-04 3.92242463e-05 2.14925435e-05
 2.01601854e-05 1.57520642e-05 1.14313495e-04 2.90331237e-05
 1.44291554e-04 2.15615238e-04 3.12064739e-05 4.46585667e-04
 2.03675986e-04 4.18700639e-05 2.75817038e-04 2.86669615e-04
 4.78515016e-05 1.76816212e-04 2.13172581e-04 2.02211337e-04
 3.27372684e-05 1.72403366e-05 1.62434303e-05 3.26522243e-05
 2.49226571e-05 1.41551243e-05 2.55297777e-04 2.39352001e-05
 1.48591008e-05 1.77080794e-05 1.41844173e-04 7.28828262e-05
 1.27079565e-04 1.09125803e-04 5.03886517e-05 1.61253135e-05
 2.59356273e-04 3.43578317e-05 1.49417826e-04 1.00711158e-04
 1.49819423e-05 5.42553252e-04 2.48706857e-05 2.26875554e-03
 4.71257916e-04 8.13966893e-05 7.39080805e-05 4.21195267e-04
 3.22033802e-05 2.35572262e-04 2.46580753e-05 2.20656465e-04
 2.95670119e-05 1.99759231e-05 4.83650737e-04 2.61520173e-04
 1.14686745e-04 5.78111151e-05 1.14729267e-04 1.89081467e-05
 1.68529133e-04 1.66998339e-04 1.72875834e-05 2.23552691e-04
 1.04831546e-03 6.28268293e-04 5.47693697e-04 1.98365452e-04
 2.78094331e-05 6.26397322e-05 5.01098959e-05 1.08133621e-04
 9.64258784e-05 5.27179162e-05 6.81203545e-05 1.25246392e-04
 7.48104933e-04 8.99908719e-05 6.32440181e-04 1.75379911e-05
 9.17437261e-05 3.56807405e-05 3.17781595e-05 2.56077349e-05
 1.14162306e-04 3.40275770e-04 1.91113077e-05 2.73133423e-05
 2.53143326e-04 4.32118714e-05 1.93848663e-04 3.59594963e-05
 1.95918070e-04 1.34687236e-03 1.60180634e-04 2.35761249e-05
 6.63717525e-04 4.14731913e-05 1.89790168e-05 1.82136195e-05
 1.86530142e-05 2.08773909e-04 2.22569958e-04 3.77780235e-04
 4.02589500e-05 5.98474497e-05 1.02081314e-04 3.75233635e-05
 4.72098908e-04 5.47306274e-05 1.23058868e-04 1.49281755e-03
 8.34802707e-05 1.13349662e-04 2.02093220e-04 2.57681376e-03
 2.15686108e-04 5.79150579e-05 4.43079958e-05 2.98197820e-04
 2.61132750e-05 8.44677276e-05 5.68189335e-05 3.62051794e-05
 7.14342410e-04 1.95589233e-03 1.87621542e-04 2.56549816e-05
 1.75568898e-05 1.43630100e-05 9.49763483e-04 5.73769175e-04
 3.36840932e-04 1.75474405e-05 1.04953916e-04 6.89329984e-05
 6.42224981e-05 2.66504705e-04 6.18412623e-05 5.68283828e-05
 2.05906032e-04 1.20568964e-04 2.07554943e-05 2.06421021e-05
 6.66509807e-05 3.16127959e-05 1.37913244e-05 7.39458779e-05
 3.90399840e-05 2.61038257e-05 2.60187815e-05 2.02953110e-04
 4.78609509e-05 1.26876404e-04 8.87908046e-05 4.99917791e-05
 2.68890665e-04 4.74404549e-05 1.45269562e-04 1.67092832e-04]
19:43:02.340 INFO eynollah - Job done in 4651.560835599899s

This log output is not from the run that hit the OOM, but from another run I did on a different machine to investigate the problem. If I interpret the cont_page part correctly, the image is blown up to at least [4404, 27685], which would certainly explain the OOM error on the other machine.
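To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the cont_page corner coordinates are pixel positions in the working image, so 4404 x 27685 is a lower bound on its size, and that the image is held as a 3-channel array. The helper `array_bytes` is made up for illustration and is not part of eynollah.

```python
# Rough memory estimate for a single image array of the size suggested by
# the cont_page coordinates. All assumptions are listed in the text above.

def array_bytes(width, height, channels=3, bytes_per_value=1):
    """Size in bytes of a dense width x height x channels array."""
    return width * height * channels * bytes_per_value

WIDTH, HEIGHT = 4404, 27685  # largest x and y in the logged cont_page

uint8_rgb = array_bytes(WIDTH, HEIGHT)                       # 8-bit RGB copy
float32_rgb = array_bytes(WIDTH, HEIGHT, bytes_per_value=4)  # float32 copy

print(f"pixels:      {WIDTH * HEIGHT:,}")
print(f"uint8 RGB:   {uint8_rgb / 1e9:.2f} GB per copy")
print(f"float32 RGB: {float32_rgb / 1e9:.2f} GB per copy")
```

That is about 122 million pixels, so a single float32 copy already needs well over 1 GB, and any pipeline that keeps several intermediate copies multiplies that.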

Reproduce with `ocrd-eynollah-segment -I MAX -O TEST-SEGMENT -P models /path/to/models`.


mikegerber · 2022-02-21T10:30:40Z

While eynollah should handle this gracefully, we should also consider how to handle irrelevant images that are already marked as such in the METS structMap. In this case possibly spine and colour_checker (could also be SBB defined types):

  <mets:structMap TYPE="LOGICAL">
    <mets:div ADMID="AMD" CONTENTIDS="http://resolver.staatsbibliothek-berlin.de/SBB000205BC00000000" DMDID="DMDLOG_0000" ID="LOG_0000" LABEL="Disputationum Medicarum Undecima, De Chirurgia" ORDERLABEL="Disputationum Medicarum Undecima, De Chirurgia" TYPE="monograph">
      <mets:div ID="LOG_0001" TYPE="binding">
        <mets:div ID="LOG_0002" TYPE="cover_front"/>
        <mets:div ID="LOG_0003" TYPE="paste_down"/>
        <mets:div ID="LOG_0004" TYPE="endsheet">
          <mets:div ID="LOG_0005" TYPE="contents"/>
        </mets:div>
      </mets:div>
      <mets:div ID="LOG_0006" TYPE="title_page"/>
      <mets:div DMDID="DMDLOG_0001" ID="LOG_0007" LABEL="Quaestio Prima. [bis] 44." TYPE="section"/>
      <mets:div ID="LOG_0008" TYPE="binding">
        <mets:div ID="LOG_0009" TYPE="endsheet"/>
        <mets:div ID="LOG_0010" TYPE="paste_down"/>
        <mets:div ID="LOG_0011" TYPE="cover_back"/>
        <mets:div ID="LOG_0012" TYPE="spine"/>
      </mets:div>
      <mets:div ID="LOG_0013" TYPE="colour_checker"/>
    </mets:div>
  </mets:structMap>

(Full document: PPN894261851.zip)
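A minimal sketch of how such divs could be found, using only Python's standard library. The `SAMPLE` document is a shortened stand-in for the structMap above, and `SKIP_TYPES` and `skippable_divs` are hypothetical names, not part of eynollah or OCR-D core. Note that this only yields logical div IDs; mapping them to physical pages would additionally need the `mets:structLink` section, which is not shown here.

```python
import xml.etree.ElementTree as ET

METS_URI = "http://www.loc.gov/METS/"
METS_NS = {"mets": METS_URI}

# Hypothetical set of logical types whose pages should not be processed.
SKIP_TYPES = {"spine", "colour_checker"}

# Shortened stand-in for the logical structMap quoted above.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="LOGICAL">
    <mets:div ID="LOG_0000" TYPE="monograph">
      <mets:div ID="LOG_0006" TYPE="title_page"/>
      <mets:div ID="LOG_0008" TYPE="binding">
        <mets:div ID="LOG_0011" TYPE="cover_back"/>
        <mets:div ID="LOG_0012" TYPE="spine"/>
      </mets:div>
      <mets:div ID="LOG_0013" TYPE="colour_checker"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def skippable_divs(mets_xml, skip_types=SKIP_TYPES):
    """Map logical div ID -> TYPE for every div whose TYPE is in skip_types."""
    root = ET.fromstring(mets_xml)
    found = {}
    # Only look at the logical structMap, not the physical one.
    for struct_map in root.findall("mets:structMap[@TYPE='LOGICAL']", METS_NS):
        # iter() walks nested divs at any depth, in document order.
        for div in struct_map.iter(f"{{{METS_URI}}}div"):
            if div.get("TYPE") in skip_types:
                found[div.get("ID")] = div.get("TYPE")
    return found


print(skippable_divs(SAMPLE))
```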

@bertsky @kba @cneud What are your thoughts on this?

bertsky · 2022-02-21T11:23:08Z

Yes, it should be possible to skip pages marked as certain types in the logical structmap – not just in any one processor, but as a general mechanism for workflows in OCR-D.

For the concrete set of supported page types, we should stick to DFG Strukturdatenset, which is strangely missing colour_checker.

This set is also partially supported by ocrd-anybaseocr-layout-analysis:

{'annotation': 0, 'binding': 1, 'chapter': 2, 'colour_checker': 3, 'contained_work': 4, 'contents': 5, 'cover': 6, 'edge': 7, 'endsheet': 8, 'epicedia': 9, 'illustration': 10, 'index': 11, 'musical_notation': 12, 'page': 13, 'paste_down': 14, 'preface': 15, 'provenance': 16, 'section': 17, 'sermon': 18, 'table': 19, 'title_page': 20}

For the general mechanism, I suggest something along the lines of our --page-id CLI option's existing numerical range syntax, but more elaborate. For example, one could define filter operators that can look into the structmap, perhaps XPath expressions with predefined functions?
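As a starting point for that discussion, the simplest form of such a filter could look like the sketch below. Everything in it is hypothetical: `select_pages` is not an existing OCR-D function, and the `PAGE_TYPES` mapping from physical page IDs to logical types is made-up example data that a real implementation would have to derive from the METS.

```python
def select_pages(page_types, skip_types):
    """Return the page IDs (in input order) that carry none of skip_types.

    page_types: dict mapping a physical page ID to its logical types.
    skip_types: iterable of logical types that mark a page as irrelevant.
    """
    skip = set(skip_types)
    return [
        page_id
        for page_id, types in page_types.items()
        if not skip & set(types)  # keep pages with no overlap
    ]


# Made-up example data; the type assignments are assumptions.
PAGE_TYPES = {
    "PHYS_0001": ["binding", "cover_front"],
    "PHYS_0003": ["title_page"],
    "PHYS_0017": ["binding", "spine"],
    "PHYS_0018": ["colour_checker"],
}

print(select_pages(PAGE_TYPES, {"spine", "colour_checker"}))
```

A set-based predicate like this is easy to combine with an existing page selection, while an XPath-based operator would be more expressive at the cost of a more complex CLI syntax.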

mikegerber · 2022-02-21T11:28:15Z

> Yes, it should be possible to skip pages marked as certain types in the logical structmap – not just in any one processor, but as a general mechanism for workflows in OCR-D.
>
> For the concrete set of supported page types, we should stick to DFG Strukturdatenset, which is strangely missing colour_checker.

100% agree! Should we take this to an OCR-D core or spec issue? I have some additional thoughts to discuss (like: What happens with skipped pages in the output?)

bertsky · 2022-02-21T11:37:19Z

> Should we take this to an OCR-D core or spec issue?

Yes, we should elevate this to OCR-D/spec.

> I have some additional thoughts to discuss (like: What happens with skipped pages in the output?)

There is already some discussion on skip strategies for API changes in spec...

cneud · 2023-08-17T23:29:18Z

With the current version including #67 I was able to

* process `FILE_0017_MAX.tif` successfully without memory explosion
* process the whole document PPN894261851 using the `-di` flag without running into memory issues

Is there anything relevant from here that is still needed for OCR-D/spec#172 (comment) or can we close this?

mikegerber · 2023-10-19T10:01:08Z

> With the current version including #67 I was able to
>
> * process `FILE_0017_MAX.tif` successfully without memory explosion
> * process the whole document PPN894261851 using the `-di` flag without running into memory issues
>
> Is there anything relevant from here that is still needed for OCR-D/spec#172 (comment) or can we close this?

I wouldn't know; the current version is not working for OCR-D, so I can't reproduce until it's fixed. (Yes, there is an elaborate workaround, but I am not willing to invest the time to reproduce while a lengthy changeset (#86) is missing.)

mikegerber added the bug label Feb 15, 2022

mikegerber mentioned this issue Feb 22, 2022

Skipping OCR processing based on logical mets:structMap OCR-D/spec#192

Open

vahidrezanezhad pushed a commit that referenced this issue May 19, 2023

issue #67 solved

45c40a5
