Add tests for record batch size splitting logic in FlightClient #3481

alamb · 2023-01-06T18:29:42Z

Which issue does this PR close?

re #3478

Rationale for this change

There is ongoing drama downstream in IOx related to maximum message sizes.

This PR adds some tests for important cases to document the current behavior (and hopefully make fixing #3478 easier)

What changes are included in this PR?

Tests that encode RecordBatches and check how far the encoded size is from the requested "max_flight_data_size".
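
For readers who want the shape of the check without opening the diff, here is a minimal sketch of the kind of size assertion the tests make. It is not the helper from arrow-flight/src/encode.rs: the names `max_overage` and `verify_overage` are hypothetical, the message sizes are made-up numbers, and the real test encodes actual RecordBatches rather than taking sizes as input. The exact-match assertion is an assumption based on the test comment, so a change in behaviour in either direction would fail the test.

```rust
/// Hypothetical stand-in for the size check: returns how many bytes the
/// largest encoded message exceeds `max_size` by (0 if every message fits).
fn max_overage(encoded_sizes: &[usize], max_size: usize) -> usize {
    encoded_sizes
        .iter()
        .map(|&size| size.saturating_sub(max_size))
        .max()
        .unwrap_or(0)
}

/// Panics unless the worst-case overage equals `allowed_overage` exactly,
/// which pins down the current behaviour rather than the desired one.
fn verify_overage(encoded_sizes: &[usize], max_size: usize, allowed_overage: usize) {
    let actual = max_overage(encoded_sizes, max_size);
    assert_eq!(
        actual, allowed_overage,
        "largest message exceeded the limit of {max_size} by {actual} bytes, expected {allowed_overage}"
    );
}

fn main() {
    // Illustrative sizes only: three encoded messages against a 1024-byte limit.
    let sizes = [1000, 1136, 980];
    verify_overage(&sizes, 1024, 112);
    println!("max overage: {}", max_overage(&sizes, 1024));
}
```

Pinning the overage to an exact value means a future fix for #3478 has to update these numbers, which makes the improvement visible in the diff.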

Are there any user-facing changes?

No

alamb · 2023-01-06T18:30:21Z

arrow-flight/src/encode.rs

+        ])
+        .unwrap();
+
+        verify_encoded_split(batch, 112).await;


This is pretty good -- only 112 bytes above desired max (I think that is mostly the various alignment and padding overhead)

alamb · 2023-01-06T18:30:40Z

arrow-flight/src/encode.rs

+        .unwrap();
+
+        // 5k over limit (which is 2x larger than limit of 5k) -- not great :(
+        verify_encoded_split(batch, 5800).await;


This is pretty bad -- over 2x larger than the largest message size limit

tustvold

Might be worth working in the issue somewhere to make clear these are tests of the current behaviour, not necessarily the long-term desired behaviour

tustvold · 2023-01-06T18:49:29Z

arrow-flight/src/encode.rs

+            }
+        }
+
+        // ensure that the specified overage is exactly the maxmium than necessary


This comment is a little funky

agreed -- clarified

alamb · 2023-01-06T18:57:41Z

Might be worth working in the issue somewhere to make clear these are tests of the current behaviour, not necessarily the long-term desired behaviour

Good idea -- added

ursabot · 2023-01-06T19:22:48Z

Benchmark runs are scheduled for baseline = b4d5705 and contender = 7805a81. 7805a81 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.

Add tests for record batch size splitting logic in FlightClient

2862fcf

github-actions bot added the arrow-flight Changes to the arrow-flight crate label Jan 6, 2023

alamb marked this pull request as ready for review January 6, 2023 18:31

alamb mentioned this pull request Jan 6, 2023

Improve ability of FlightDataEncoder to respect max_flight_data_size for certain data types (strings, dictionaries, etc) #3478

Open

cargo clippy --fix

b53b257

tustvold approved these changes Jan 6, 2023

fix: Improve comments

8575cd8

alamb merged commit 7805a81 into apache:master Jan 6, 2023