Parquet Modular Encryption support #3511
Comments
We do not currently support this, but would welcome contributions to add support for it.
@tustvold I'm interested in doing implementation work for this. I'd love to have a dedicated chat about it with a maintainer or community member who has context for this issue and could get me involved with contributor discussion spaces!
I'm afraid I don't really have any context on this, as it isn't a part of the standard I am familiar with. Implementing this will likely involve interpreting the spec at https://github.com/apache/parquet-format and applying it to the Rust reader. If this is anything like other aspects of Parquet, it will also involve a fair amount of spelunking in existing implementations to resolve ambiguities. The actual encryption part can probably use something like https://docs.rs/ring/latest/ring/ as an optional dependency, but I'm just guessing here that the encryption is something standard. I'm sorry I can't be of more help. I'd love to see this implemented and am happy to help review code contributions, but I don't really have the bandwidth at the moment to actively help with the actual implementation effort.
Thanks for the quick response! Parquet encryption uses two extremely standard primitives (which ring has perfectly fine implementations of). In principle, the encryption step is a very simple post-processing step, but I definitely anticipate the existing implementations having some weird quirks. Given your resources, I'll just try to roll an implementation and submit it for review.
Thank you, I'm happy to review code, especially if it is well tested.
Hi @bhoberman, I had started looking into building a Rust implementation of PME, but fortunately found this thread quickly.
Hey @ggershinsky thanks for reaching out! I'll contact you privately with more details. TL;DR for those using this thread as a status indicator: this was going to be a work project for me, and we decided after the research phase that it made the most sense to bind to Arrow C++ for our use-case and staffing. That said, I'd love to contribute some personal time to this project should @ggershinsky or someone else be willing to drive it.
Hey @tustvold, @ggershinsky and I have met and are starting on an implementation of this together. Would it be possible for us to get invites to the Apache slack (as mentioned in the README) for easier coordination than email/GitHub?
Sure, if you join the Discord you can then DM me your email addresses.
Hey @tustvold and @bhoberman, did you end up connecting or making any progress on the Rust implementation? I checked the Discord but didn't see any messages about encryption there. We're working on something that would depend on this and would love to help contribute if there's something already partially implemented.
We (@G-Research) would also like to see Parquet encryption support added and can contribute to this effort too, maybe we can work together on this @matthewgapp?
Hi @matthewgapp @adamreeve . Ben (@bhoberman) and I have worked on this for a while, but had to switch to other projects. Feel free to use the early draft (branch and an internal patch) any way you like.
@matthewgapp, are you on the Arrow Rust Discord? I'm adamrnz there if you want to discuss this further. It looks like I could also invite you to the ASF Slack workspace if that's easier.
Hey @adamreeve and @ggershinsky, apologies for the delayed response here. Adam, I'll message you on Discord. Happy to do Slack if that's better.
Thanks @ggershinsky for the draft and patch! Will let you know if I have questions.
Hi all. I'm @adamreeve's colleague and I happen to have time available to do some work on this. Could I help with any open tasks, @matthewgapp, or is it better that I pick up @ggershinsky's draft branch?
❤️ I believe the Apache DataFusion Comet project may be interested in this feature too; I believe its lack is one reason the project has its own Parquet decoder. Perhaps @andygrove, @viirya, @sunchao, or @kazuyukitanimura have more details they can share. cc @etseidl, who may also be interested.
Related issue in Comet: apache/datafusion-comet#1040
Which part is this question about
Documentation
Describe your question
Is Parquet Modular Encryption supported by this library?
Additional context
I have found some mentions of AES and encryption here and there in the documentation and code base, but there is no example of it. I am struggling to make it work, so I'm starting to think this is not fully supported yet.