finagle-doc: ThreadingModel docs clean up

Problem / Solution

Cleaning up Threading Model docs.

JIRA Issues: CSL-8053

Differential Revision: https://phabricator.twitter.biz/D330112
vkostyukov authored and jenkins committed Jun 19, 2019
1 parent 2a53531 commit 6c63dc7
Showing 1 changed file with 32 additions and 32 deletions.

doc/src/sphinx/ThreadingModel.rst

Threading Model
===============

Similar to other non-blocking event-driven frameworks, Finagle leverages a fixed size thread pool
(I/O worker pool) shared between all clients and servers running within the same JVM process. While
unlocking tremendous scalability potential (fewer resources are spent to process more events), this
approach comes at the expense of services' responsiveness, being notably sensitive to blocking
operations. Put simply, when blocked, I/O threads can't process incoming events (new requests,
new connections) thereby degrading the system's overall velocity.

Somewhat conservative defaults for Finagle's worker pool size only exacerbate the problem: there are
few (two per logical CPU core, with a floor of 8 workers) threads in the pool, and blocking even one
of them may affect multiple clients and servers. While it's recommended to stick to these defaults
(numerous experiments showed they are good), it's possible to override the worker pool size with a
command line flag.

The following sets the worker pool size to 24:

.. code-block:: scala

  -com.twitter.finagle.netty4.numWorkers=24

I/O Thread Affinity
-------------------

As evidenced by the flag name, Finagle's I/O threads are allocated and managed within its underlying
networking library, Netty_. Built on top of native non-blocking I/O APIs (epoll_, kqueue_), Netty
dictates an affinity between I/O threads and the open file descriptors (network sockets) they are
watching over. Once a network connection is established, it's assigned a serving I/O thread that
can never be reassigned. This is why blocking even a single I/O thread halts progress in
multiple independent connections (clients, servers).

I/O Threads and User Code
-------------------------

Although not immediately obvious, all user code that's triggered by receiving a message is run on
Finagle I/O threads, unless explicitly executed somewhere else (e.g., the application's own thread
pool, a :ref:`FuturePool <future_pools>`). This includes a server's `Service.apply` and all `Future`
callbacks but excludes application startup code (executed in the main thread) and the work triggered
by a timeout expiring (executed in the timer thread).

This design is in keeping with the asynchronous nature of :doc:`Twitter Futures <developers/Futures>`,
which are purely a coordination mechanism that does not describe any execution environment
(callbacks are run in whatever thread satisfies a `Promise`). Given it's typically I/O threads that
satisfy promises (pending future RPC responses), they are also on the hook for running promises'
callbacks. Sticking with the caller thread reduces context switches at the expense of making I/O
threads vulnerable to users' blocking (or just slow) code.
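
A minimal sketch of this behavior, assuming a Finagle HTTP client and an illustrative address: the
callback below typically runs on whichever I/O thread satisfies the response `Future`.

.. code-block:: scala

  import com.twitter.finagle.{Http, Service}
  import com.twitter.finagle.http.{Request, Response}

  val client: Service[Request, Response] = Http.client.newService("localhost:8080")

  client(Request()).onSuccess { _ =>
    // Typically prints the name of a Netty I/O thread, not the main thread.
    println(Thread.currentThread.getName)
  }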

Blocking Examples
-----------------

Blocking I/O (e.g., calling a JDBC driver, using the JDK's file API) is not the only thing that can
block Finagle threads. CPU-intensive computations are equally dangerous: keeping I/O threads busy
processing application-level work means they underserve critical RPC events.

Consider a CPU-bound server application running an algorithm with profoundly bad asymptotic
complexity. Scala's `permutations` operation serves as a good example, with its O(n!) worst-case
running time. As implemented below, such a server would have a hard time staying responsive as its
I/O threads would be constantly busy (blocked) calculating permutations as opposed to serving
networking events.

.. code-block:: scala
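
  // Illustrative sketch (the service shape and port are assumptions):
  // the O(n!) work runs inline on the serving I/O thread.
  import com.twitter.finagle.{Http, Service}
  import com.twitter.finagle.http.{Request, Response}
  import com.twitter.util.{Await, Future}

  val service = new Service[Request, Response] {
    def apply(req: Request): Future[Response] = {
      val rep = Response()
      // O(n!) CPU-bound work executed directly on the I/O thread.
      rep.contentString = req.contentString.permutations.mkString("\n")
      Future.value(rep)
    }
  }

  Await.ready(Http.server.serve(":8080", service))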

A comparably bad thing can happen to clients too. Running `permutations` as part of any callback
(`flatMap`, `onSuccess`, `onFailure`, etc.) on a `Future` returned from a client has the exact same
effect - it saturates an I/O thread.

.. code-block:: scala
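
  // Illustrative sketch: the client address is an assumption.
  import com.twitter.finagle.{Http, Service}
  import com.twitter.finagle.http.{Request, Response}
  import com.twitter.util.Future

  val client: Service[Request, Response] = Http.client.newService("localhost:8080")

  val permuted: Future[String] = client(Request()).map { rep =>
    // O(n!) CPU-bound work running in a callback on an I/O thread.
    rep.contentString.permutations.mkString("\n")
  }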

Identifying Blocking
--------------------

General-purpose JVM profilers and even jstack_ can be quite effective in identifying bottlenecks on
the request path (within I/O threads). There are, however, two metrics that can provide valuable
insight with no need for external tools.

- `blocking_ms` - a counter of total time spent in `Await.result` and `Await.ready` blocking an I/O
  thread (see the sketch after this list). Refer to `this blog post
  <https://finagle.github.io/blog/2016/09/01/block-party/>`_ on what to do when this counter is not
  zero.

- `pending_io_events` - a gauge of the number of pending I/O events enqueued in all event loops
  serving this client or server. When this metric climbs, it indicates I/O queues are clogged
  and I/O threads are overloaded.
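
A minimal sketch of the kind of code that drives `blocking_ms` up (the nested client call below is
an illustrative assumption, not a recommendation):

.. code-block:: scala

  import com.twitter.finagle.Service
  import com.twitter.finagle.http.{Request, Response}
  import com.twitter.util.{Await, Future}

  def risky(client: Service[Request, Response]): Future[String] =
    client(Request()).map { _ =>
      // Blocks the I/O thread running this callback; the time spent
      // waiting here is recorded in the `blocking_ms` counter.
      Await.result(client(Request())).contentString
    }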

Whereas getting rid of `Await` on the request path is generally advised, there is no universal
guidance on what makes a healthy number of pending I/O events. Clearly, striving for "zero" or
"near zero" might be a reasonable strategy if taken not as the gold standard but as a friendly
recommendation. Depending on the workload, even double-digit values could be acceptable for
some applications.

Offloading
----------

Shifting users' work off of I/O threads can go a long way in improving an application's
responsiveness, minding the increase in context switches as well as the associated cost of managing
additional JVM threads. However, run your own tests to determine if offloading is good for your
service given its traffic profile and resource allocation.

:ref:`FuturePools <future_pools>` provide a convenient API to wrap any expression with a `Future`
that's scheduled in the underlying ExecutorService_. They come in handy for offloading the I/O
threads in Finagle while preserving first-class support for Twitter Futures (interrupts, locals).

Offloading can be done on a per-method (endpoint) basis, as sketched below.
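
A minimal sketch, assuming an illustrative permutations endpoint offloaded to
`FuturePool.unboundedPool` (the service below is an assumption, not Finagle's own example):

.. code-block:: scala

  import com.twitter.finagle.Service
  import com.twitter.finagle.http.{Request, Response}
  import com.twitter.util.{Future, FuturePool}

  // A dedicated pool keeps CPU-bound work off the I/O threads.
  val pool: FuturePool = FuturePool.unboundedPool

  val service = new Service[Request, Response] {
    def apply(req: Request): Future[Response] = pool {
      val rep = Response()
      rep.contentString = req.contentString.permutations.mkString("\n")
      rep
    }
  }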