Currently, if the streaming transformer is configured with 5-minute windows, it emits batches at exactly 12:00, 12:05, 12:10, etc. If there are, say, 50 instances of the streaming transformer running in parallel, then all 50 batches are emitted at exactly the same time. This creates a backlog for the loader, which it works through slowly over the course of a few minutes.
It would be slightly better if the 50 instances emitted batches at slight offsets from each other. For example, instance 1 emits batches at 12:01, 12:06, 12:11, and instance 2 emits batches at 12:02, 12:07, 12:12. This way, the loader receives a steadier stream of batches to load, which could reduce the overall latency of events reaching the warehouse.
This is best implemented by letting the transformer randomly choose the time of its first window when it first starts up.
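To make the idea concrete, here is a minimal sketch in Python of one way the random start could work. It is illustrative only and not the transformer's actual code. The helper names `pick_offset` and `next_emit_time` are hypothetical, and the sketch assumes each instance picks a single random offset at startup and then keeps emitting on a fixed grid shifted by that offset.

```python
import random
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def pick_offset(window: timedelta, rng: random.Random) -> timedelta:
    """Hypothetical helper: choose a random offset in [0, window), once at startup."""
    # Work in whole milliseconds so the result is always strictly below the window.
    window_ms = window // timedelta(milliseconds=1)
    return timedelta(milliseconds=rng.randrange(window_ms))


def next_emit_time(now: datetime, window: timedelta, offset: timedelta) -> datetime:
    """Hypothetical helper: first emit time strictly after `now`.

    Emit times lie on the grid EPOCH + offset + k * window for integer k.
    """
    elapsed = now - EPOCH - offset
    k = elapsed // window + 1  # timedelta // timedelta floors to an int
    return EPOCH + offset + k * window


if __name__ == "__main__":
    window = timedelta(minutes=5)
    offset = pick_offset(window, random.Random())  # differs per instance
    print(next_emit_time(datetime.now(timezone.utc), window, offset))
```

With a 5-minute window and a 1-minute offset, an instance whose clock reads 12:03 would next emit at 12:06, then 12:11, matching the example for instance 1 above. Random offsets do not guarantee even spacing across instances, but with 50 instances they should spread emits across the window without any coordination.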
See also #1197, which is the main reason we're going to need flexible emit times.