"The configured limit of 1,000 object references was reached" after importing large variable XLSX #4458

Open
1 task done
vera opened this issue Aug 28, 2024 · 6 comments

vera commented Aug 28, 2024

This issue is unique

  • I have used the search tool and did not find an issue describing my bug.

Version information

5.2.3

Expected behavior

When uploading a large variable XLSX (the "Variables" sheet has 18,000+ rows, the "Categories" sheet has 66,000+ rows) for a collected dataset and then publishing the dataset, we expect all variables and, where applicable, their categorical values to be shown in MICA.

Actual behavior

When publishing the dataset, a "504 Gateway timeout" error message appears in the admin UI and the following warning appears in the MICA logs:

2024-08-28 11:37:19.044 WARN 59 --- [87606386-132162] n.s.e.pool.sizeof.ObjectGraphWalker : The configured limit of 1,000 object references was reached while attempting to calculate the size of the object graph. Severe performance degradation could occur if the sizing operation continues. This can be avoided by setting the CacheManger or Cache <sizeOfPolicy> elements maxDepthExceededBehavior to "abort" or adding stop points with @IgnoreSizeOf annotations. If performance degradation is NOT an issue at the configured limit, raise the limit value using the CacheManager or Cache <sizeOfPolicy> elements maxDepth attribute. For more information, see the Ehcache configuration documentation.
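
To check whether this warning still occurs after a configuration change, the log can be scanned for lines from the ObjectGraphWalker logger. The following is a minimal sketch, assuming the log line layout quoted above (timestamp, level, process id, thread, logger, message). The function name `count_sizeof_warnings` is hypothetical and not part of MICA.

```python
import re
from typing import Iterable

# Matches the layout of the log line quoted above:
# "<date> <time> <LEVEL> <pid> --- [<thread>] <logger> : <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\d+\s+---\s+\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<logger>\S+)\s+:\s+(?P<message>.*)$"
)


def count_sizeof_warnings(lines: Iterable[str]) -> int:
    """Count WARN lines emitted by the Ehcache ObjectGraphWalker logger."""
    count = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if (
            match
            and match["level"] == "WARN"
            and match["logger"].endswith("ObjectGraphWalker")
        ):
            count += 1
    return count
```

Passing an open file handle for the MICA log (its location depends on the installation) returns the number of such warnings, which makes it easy to compare runs before and after a change.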

Reproduction steps

  1. Log in to MICA
  2. Open a collected dataset
  3. Upload a file in MICA/OPAL format with a lot of variables (in our case, 18,000+ Variables and 66,000+ Categories rows)
  4. Edit the "Study Table" section: under Data source > Path, enter the path to the file
  5. "Publish"

Operating System (OS)

No response

Browser

No response

Contact info

NFDI4Health

cc @johannes-darms

Member

ymarcon commented Aug 28, 2024

Can you share this xlsx file? (at least privately)

Author

vera commented Aug 28, 2024

Yes, can I send it to your email address ([email protected])?

Author

vera commented Aug 28, 2024

Thank you, sent!

Member

ymarcon commented Aug 29, 2024

I was able to create the tables in my local Opal. Since Mica and Opal share the same Excel reader library, the problem is neither with your file nor with the file reader.

Have you tried increasing the memory available to Mica? It is set through the JAVA_OPTS environment variable; see https://micadoc.obiba.org/en/latest/admin/installation.html

Author

vera commented Aug 30, 2024

I have increased the memory setting, but that does not fully fix the problem. The error message no longer appears in the log, but I am still seeing the 504 Gateway Timeout.

Specifically, the request to PUT /ws/draft/collected-dataset/3601/_publish?cascading=NONE fails with 504 after 30 seconds. Is there anything I could do to increase the request speed or the timeout duration?
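
To find out how long the publish call actually takes, the request can be timed directly with a generous client-side timeout. The sketch below assumes a hypothetical base URL, omits authentication, and uses invented helper names (`build_publish_request`, `timed_publish`); only the path and query string come from the request quoted above. A client-side timeout does not change a limit enforced by a proxy or by the server itself, so this only helps to measure the duration.

```python
import time
import urllib.error
import urllib.request


def build_publish_request(base_url: str, dataset_id: str) -> urllib.request.Request:
    """Build the PUT request for publishing a collected dataset (no cascading)."""
    url = (
        f"{base_url.rstrip('/')}/ws/draft/collected-dataset/"
        f"{dataset_id}/_publish?cascading=NONE"
    )
    return urllib.request.Request(url, method="PUT")


def timed_publish(base_url: str, dataset_id: str, timeout_s: float = 600.0):
    """Send the request and return (HTTP status, elapsed seconds)."""
    request = build_publish_request(base_url, dataset_id)
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # Non-2xx responses such as 504 are raised as HTTPError.
        status = error.code
    return status, time.monotonic() - start
```

Calling `timed_publish("http://localhost:8082", "3601")` (the host and port are placeholders) against the server directly, bypassing any reverse proxy, would show whether the 30-second cutoff comes from the proxy or from the application server.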

Author

vera commented Sep 3, 2024

We were now able to fix the timeout (it was related to Jetty). We are now seeing the following behaviour: for a short time while the _publish request was still loading, we saw approx. 1,000 variables (out of 18,000) in the frontend (at /search#lists?type=variables&query=dataset(in(Mica_dataset.id,3601)),variable(limit(0,20))). After the publish was finished, it jumped back down to 89 variables shown. Do you have an idea what might cause this?

By the way, we are using the direct upload to MICA that was developed by you, without OPAL. Thanks!
