-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-43680: [Integration] Unskip nanoarrow in IPC integration tests #43715
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there are a lot of nanoarrow-related failures in the Integration build
8cce89d
to
94dd868
Compare
Just a few notes from some investigating:
--- a/dev/archery/archery/integration/tester_nanoarrow.py
+++ b/dev/archery/archery/integration/tester_nanoarrow.py
@@ -56,8 +56,8 @@ class NanoarrowTester(Tester):
'JSON_PATH': json_path,
'COMMAND': command,
**{
- f'QUIRK_{q}': 1
- for q in quirks or []
+ f'QUIRK_{q}': "1"
+ for q in quirks or ""
},
|
|
de12a7a
to
b855bb3
Compare
After apache/arrow-nanoarrow#627 and apache/arrow-nanoarrow#626, I believe all the integration tests targeting the monorepo implementations will pass. There are still failures for nanoarrow producing/nanoarrow consuming, but I think those are better solved there once this PR is merged since we have the CI job set up to pull from arrow main and build a fresh nanoarrow for each commit on a PR. # From a checkout of this PR
export gold_dir=$(pwd)/../arrow-testing/data/arrow-ipc-stream/integration
export ARROW_NANOARROW_PATH=$(pwd)/../arrow-nanoarrow/build
archery integration --with-nanoarrow 1 --run-ipc \
--gold-dirs=$gold_dir/0.14.1 \
--gold-dirs=$gold_dir/0.17.1 \
--gold-dirs=$gold_dir/1.0.0-bigendian \
--gold-dirs=$gold_dir/1.0.0-littleendian \
--gold-dirs=$gold_dir/2.0.0-compression \
--gold-dirs=$gold_dir/4.0.0-shareddict
|
...and there are still Rust failures due to google/flatbuffers#8150 (originally reported as apache/arrow-rs#5052 ). |
This failure is due to nanoarrow's reliance on post-0.15 stream continuations
and this one is because pre 1.0 unions had null bufffers, which adds one more buffer per union field than we expect
|
Similar to #626 but for union type_id arrays. This should fix the two remaining failures in the integration tests between the IPC implementations in the Arrow repository and the nanoarrow IPC implementation apache/arrow#43715.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
There are still some remaining issues in nanoarrow--nanoarrow and nanoarrow--rust to sort out; however, I think it will be easier to sort those out with this merged (at least from nanoarrow's end).
'COMMAND': command, | ||
**{ | ||
f'QUIRK_{q}': "1" | ||
for q in quirks or "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (Python strings are sequences, but still)
for q in quirks or "" | |
for q in quirks or () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will merge after CI is green again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, just a nit
Co-authored-by: Antoine Pitrou <[email protected]>
CI failure unrelated. Merging |
nanoarrow / rust integration tests are failing now it seems: apache/arrow-rs#6448 Do we know if the underlying issue is tracked anywhere? |
I believe it's apache/arrow-rs#5052 ! |
Rationale for this change
Nanoarrow can now read and write IPC files as of apache/arrow-nanoarrow#585 so it should no longer be skipped as a producer/consumer
What changes are included in this PR?
Nanoarrow's tester is updated to point to the new integration executable and to report nanoarrow as a consumer/producer of IPC files.
Notably the
null_trivial
case is skipped even though nanoarrow nominally supports it since it represents a corner case in which nanoarrow's flatbuffers library will not accept some vectors produced by other flatbuffers libraries dvidelabs/flatcc#287Are these changes tested?
Yes
Are there any user-facing changes?
No