By design, the collector responds immediately to an HTTP request and then asynchronously attempts to sink the payload to the output queue (pubsub/kinesis). One problem with this design is that the collector can run out of memory if it is accepting HTTP requests more quickly than it can sink events. In particular, this can happen when the sink is unhealthy, and we have seen it happen during startup when the sink is not warmed up.
To avoid running out of memory, we need the collector to stop accepting new HTTP requests when the amount of memory held by in-memory events reaches a limit.
We can achieve this with a Semaphore: Akka's server threads block while trying to acquire memory permits, until enough permits are available.
Algorithm
The semaphore's permits represent bytes held in memory, which we treat as a scarce resource.
Start with a semaphore holding a large number of permits, representing a fairly large amount of memory in bytes. The default is one quarter of the maximum heap, and the value is configurable.
For each request, try to acquire enough permits to hold that event in memory.
If it cannot acquire the permits within 10 seconds (configurable), return a 503. This is unfortunate, but it represents the case where the collector is already buffering too many events, probably because of an unhealthy sink. It is more important for the collector to stay alive than to risk an OOM.
If it acquires the permits within 10 seconds, add the event to the buffer to send later.
When the event is flushed to the sink, release the permits.
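The steps above could be sketched roughly as follows. This is a minimal illustration using `java.util.concurrent.Semaphore`, not the collector's actual code: the class name `MemoryBoundedBuffer` and the methods `accept`, `onFlushed`, `defaultMaxBytes`, and `availableBytes` are hypothetical, and the real buffering and sink calls are left out.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one permit stands for one byte held in memory. */
class MemoryBoundedBuffer {
    private final Semaphore permits;
    private final long timeoutMillis;

    MemoryBoundedBuffer(int maxBytes, long timeoutMillis) {
        this.permits = new Semaphore(maxBytes);
        this.timeoutMillis = timeoutMillis;
    }

    /** Assumed default budget: a quarter of the max heap, capped to the int range. */
    static int defaultMaxBytes() {
        long quarter = Runtime.getRuntime().maxMemory() / 4;
        return (int) Math.min(quarter, Integer.MAX_VALUE);
    }

    /** Returns 200 if the payload was admitted, 503 if permits were not available in time. */
    int accept(byte[] payload) {
        try {
            // Blocks the calling thread until enough permits are free or the timeout expires.
            if (!permits.tryAcquire(payload.length, timeoutMillis, TimeUnit.MILLISECONDS)) {
                return 503;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return 503;
        }
        // A real collector would add the payload to its in-memory buffer here.
        return 200;
    }

    /** Call once the payload has been flushed to the sink. */
    void onFlushed(byte[] payload) {
        permits.release(payload.length);
    }

    int availableBytes() {
        return permits.availablePermits();
    }
}
```

In this sketch a payload larger than the total budget can never acquire its permits, so it always times out with a 503. Whether that is acceptable, or whether oversized payloads need separate handling, is a design choice the algorithm above leaves open.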