Reimplement Arrow based intermediate records #45

Open
syucream opened this issue Jul 9, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

Contributor

syucream commented Jul 9, 2020

Let's retry implementing an Arrow record typed intermediate representation, once more! I think we can switch to it gradually with the steps below:

  • Prototype as a PoC: (various inputs) -> maps -> Arrow -> maps -> JSON -> Parquet

    • Implement Arrow -> JSON conversion in Go
    • Integrate it into the existing pipeline in the simplest way possible
  • Remove the Go intermediates on the Parquet writing side: (various inputs) -> maps -> Arrow -> JSON -> Parquet

  • Remove the Go intermediates on the input side: (various inputs) -> Arrow -> JSON -> Parquet

    • This requires an input -> Arrow formatter for each input type
  • Ideal: (various inputs) -> Arrow -> Parquet

    • This is complicated because of the Arrow -> Parquet step (part of it depends on parquet-go)
    • It will require some improvements to the Arrow Go implementation.
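
To make the "Arrow -> JSON conversion in Go" step more concrete, here is a rough sketch of the column-to-row pivot it involves. It does not use the Arrow Go library: `Column`, `Record`, and `RecordToJSONLines` are hypothetical stand-ins for an Arrow array, an Arrow record batch, and the converter, so treat it as an illustration of the shape of the work under those assumptions rather than the planned implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Column is a hypothetical stand-in for one Arrow array:
// a named sequence of values plus a validity flag per slot.
type Column struct {
	Name   string
	Values []interface{}
	Valid  []bool // false means the slot is null
}

// Record is a hypothetical stand-in for an Arrow record batch.
type Record struct {
	Columns []Column
	NumRows int
}

// RecordToJSONLines pivots the columnar record into one JSON object per row.
// Null slots are emitted as JSON null.
func RecordToJSONLines(rec Record) ([]string, error) {
	lines := make([]string, 0, rec.NumRows)
	for i := 0; i < rec.NumRows; i++ {
		row := make(map[string]interface{}, len(rec.Columns))
		for _, col := range rec.Columns {
			if i >= len(col.Values) || i >= len(col.Valid) {
				return nil, fmt.Errorf("column %q is shorter than %d rows", col.Name, rec.NumRows)
			}
			if col.Valid[i] {
				row[col.Name] = col.Values[i]
			} else {
				row[col.Name] = nil
			}
		}
		// encoding/json writes map keys in sorted order, so output is deterministic.
		b, err := json.Marshal(row)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}
```

Calling `RecordToJSONLines` on a record with an `id` column and a nullable `name` column yields one JSON string per row, which is the form the JSON -> Parquet side of the PoC would consume. A real version would switch on the Arrow data type of each column instead of relying on `interface{}` values.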