[Core] Optimize Parquet Reading by Enhancing Predicate Pushdown to the Page Level #4587

Closed
Aiden-Dong wants to merge 1 commit

Conversation

Aiden-Dong

Purpose

Linked issue: #4586

Tests

API and Format

Documentation

@JingsongLi
Contributor

This modification is not that simple; it requires us to modify the column reader.
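A minimal sketch of why page-level filtering can reach into the column reader. It rests on an assumption that this thread does not state: that the concern is row alignment, since page boundaries generally differ from column to column, so the pages that survive filtering can still contain rows outside the selected range. `PageAlignmentSketch`, `Page`, `filterPages`, `readNaive`, and `readAligned` are hypothetical names for illustration only; they are not Paimon or Parquet APIs.

```java
import java.util.ArrayList;
import java.util.List;

final class PageAlignmentSketch {

    /** Hypothetical stand-in for a data page: index of its first row plus its values. */
    static final class Page {
        final long firstRowIndex;
        final long[] values;

        Page(long firstRowIndex, long... values) {
            this.firstRowIndex = firstRowIndex;
            this.values = values;
        }

        long lastRowIndex() {
            return firstRowIndex + values.length - 1;
        }
    }

    /** Page-level pruning: keep every page that overlaps the row range [from, to]. */
    static List<Page> filterPages(List<Page> pages, long from, long to) {
        List<Page> kept = new ArrayList<>();
        for (Page p : pages) {
            if (p.lastRowIndex() >= from && p.firstRowIndex <= to) {
                kept.add(p);
            }
        }
        return kept;
    }

    /** Naive reader: emits every value of every surviving page. */
    static List<Long> readNaive(List<Page> pages) {
        List<Long> out = new ArrayList<>();
        for (Page p : pages) {
            for (long v : p.values) {
                out.add(v);
            }
        }
        return out;
    }

    /** Row-aware reader: additionally skips rows outside [from, to] inside each page. */
    static List<Long> readAligned(List<Page> pages, long from, long to) {
        List<Long> out = new ArrayList<>();
        for (Page p : pages) {
            for (int i = 0; i < p.values.length; i++) {
                long row = p.firstRowIndex + i;
                if (row >= from && row <= to) {
                    out.add(p.values[i]);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Column a has 4 rows per page, column b has 3; each value equals its row index.
        List<Page> a = List.of(
                new Page(0, 0, 1, 2, 3), new Page(4, 4, 5, 6, 7), new Page(8, 8, 9, 10, 11));
        List<Page> b = List.of(
                new Page(0, 0, 1, 2), new Page(3, 3, 4, 5),
                new Page(6, 6, 7, 8), new Page(9, 9, 10, 11));
        long from = 5, to = 6; // rows selected by the predicate

        // Without row skipping, the two columns disagree on which rows they return.
        System.out.println(readNaive(filterPages(a, from, to)));
        System.out.println(readNaive(filterPages(b, from, to)));

        // With row skipping, both columns return exactly the selected rows.
        System.out.println(readAligned(filterPages(a, from, to), from, to));
        System.out.println(readAligned(filterPages(b, from, to), from, to));
    }
}
```

In this toy setup the naive reads return different row counts for the two columns, so values from different rows would be paired in the same batch. A reader that tracks each page's first row index and skips rows outside the selected range keeps the columns aligned, which is the kind of change that would live in the column reader.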

Aiden-Dong changed the title from "Optimize Parquet Reading by Enhancing Predicate Pushdown to the Page Level" to "[Core] Optimize Parquet Reading by Enhancing Predicate Pushdown to the Page Level" on Nov 25, 2024
@Aiden-Dong
Author

> This modification is not that simple, it requires us to modify the column reader.

It works in my local tests: the column pages obtained through FilterRowGroup are already filtered. Is there an error I might have missed?

@JingsongLi
Contributor

#3610

@JingsongLi
Contributor

See implementation in Spark.

@JingsongLi
Contributor

Closing this for now; feel free to reopen it if you have more questions.

JingsongLi closed this on Nov 26, 2024
@Aiden-Dong
Author

> See implementation in Spark.

OK, thanks. I'll check how Spark implements Parquet reading.
