[Data Transfer] memory footprint #164

Open
JulienPeloton opened this issue Feb 9, 2023 · 0 comments

The Data Transfer service partitions data by class or time. Because we poll in batches, this produces many (potentially small) files. It has been reported that after post-processing the data size on disk shrinks by a factor of 3 (thanks @bregeon!). We need to check whether the gain comes from the compression or from the merge. My guess is compression, since the data is written uncompressed by default.
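
One way to separate the two effects is to write the same batches in all four merge/compress combinations and compare the sizes on disk. The sketch below is only an illustration under stated assumptions: the issue does not say which file format or codec the service uses, so raw bytes and `gzip` stand in for them, and the helper names (`dir_size`, `write_variant`, `compare`) and the toy batches are hypothetical.

```python
import gzip
import os
import tempfile


def dir_size(path):
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def write_variant(batches, out_dir, merge, compress):
    """Write `batches` (a list of bytes) as one file or many, raw or gzip-compressed.

    gzip is an assumed stand-in codec; the real service may use another one.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunks = [b"".join(batches)] if merge else batches
    for i, chunk in enumerate(chunks):
        data = gzip.compress(chunk) if compress else chunk
        with open(os.path.join(out_dir, f"part-{i}.bin"), "wb") as f:
            f.write(data)
    return dir_size(out_dir)


def compare(batches):
    """Return on-disk sizes keyed by (merge, compress) for the four combinations."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for merge in (False, True):
            for compress in (False, True):
                out_dir = os.path.join(tmp, f"merge{int(merge)}_comp{int(compress)}")
                sizes[(merge, compress)] = write_variant(batches, out_dir, merge, compress)
    return sizes


if __name__ == "__main__":
    # Hypothetical toy data: 200 small batches, one per poll.
    batches = [b'{"class": "SN", "mag": 18.2}\n' * 50 for _ in range(200)]
    for (merge, compress), size in compare(batches).items():
        print(f"merge={merge!s:5} compress={compress!s:5} -> {size} bytes")
```

With raw bytes, merging alone leaves the total size unchanged, so this toy only shows the compression effect and the extra gain from compressing one large file rather than many small ones. A columnar format with per-file metadata would likely also show a pure merge gain, which is why the comparison should be repeated on the real files with the real codec.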
