-
Notifications
You must be signed in to change notification settings - Fork 212
Sieve of Eratosthenes
Using channels to generate a sequence of prime numbers is an example often seen in presentations about CSP and channels. This turns out to be slightly tricky to implement. This page captures the process of developing a reasonable prime number generator using core.async channels.
We begin with an example in Clojure(Script) that filters out non-prime values using a naive implementation. Look for the chan-of-primes-naive
function below.
One go-loop
creates a channel of integers beginning with the value of 2
, the initial seed value in our sequence of prime numbers. The other go-loop
populates a channel of prime numbers. Notice how it recursively pipes the values of each preceding channel into the newly created channel. Each channel is thereby fed integers from the preceding channel, but through a filtering transducer only retains those integers that are not multiples of the newly known prime value.
Each time through the loop means that the next prime was found (i.e, it wasn't filtered out by the preceding series of transducers as a multiple of a known prime, therefore it must be the next prime value). When that happens, yet another new channel is created with a new filter for the new prime, and the previous channel is piped into this new channel. The result is a series of piped-together channels, each one further constrained than the previous though a combination of its own filter and the effective sum of all the filters in all the channels hooked together in the series.
(defn chan-of-primes-naive []
(let [ints (chan)
primes (chan)]
(go-loop [n 2]
(>! ints n)
(recur (inc n)))
(go-loop [cur-ch ints]
(let [prime (<! cur-ch)
new-ch (chan 1 (filter #(not= 0 (mod % prime))))]
(>! primes prime)
(pipe cur-ch new-ch)
(recur new-ch)))
primes))
The previous code works, but notice that it doesn't take advantage of the composability of transducers. To do that, within our main go-loop we can create a new channel of integers that begins with the previous prime number and has a transducer that combines the new prime filter with the previous filter (using the standard comp
function), like this:
(defn chan-of-ints [xform start-n]
(let [ints (chan 1 xform)]
(go-loop [n start-n]
(>! ints n)
(recur (inc n)))
ints))
(defn chan-of-primes-doomed []
(let [primes (chan)]
(go-loop [cur-xf (map identity)
cur-ch (chan-of-ints cur-xf 2)]
(let [prime (<! cur-ch)
new-xf (comp cur-xf (filter #(not= 0 (mod % prime))))
new-ch (chan-of-ints new-xf prime)]
(>! primes prime)
(recur new-xf new-ch)))
primes))
Notice how this version, which recursively composes a new transducer, also avoids the need to pipe values from the previous channel to the new channel. To see the difference this makes in performance you can do some simple timings from a REPL using code like the following:
(defn consume [n ch]
(dorun (repeatedly n #(<!! ch))))
(time (consume 2000 (chan-of-primes-naive)))
(time (consume 2000 (chan-of-primes-doomed)))
Note also that the order of composition of the transducer is significant. If the order is reversed, as in this example line of code, performance is significantly worse:
new-xf (comp (filter #(not= 0 (mod % prime))) cur-xf)
Now, if we lived in a world where computers had infinite stack sizes, then the previous example wouldn't have been "doomed". But we don't, so it is, and you ought to know why.
The reason is simple. Although composition (through the use of comp
) is powerful and useful in most cases, in this one, where we are using a recursive loop, the composed function is growing and growing on the stack until the point that you get an unwelcome stack overflow error. Doomed.
Here is our third and final attempt. It is neither naive, nor doomed. Which means it is much faster than our first attempt, but won't blow the call stack like our second attempt. And it is still concise and readable.
(defn chan-of-ints [xform start-n]
(let [ints (chan 1 xform)]
(go-loop [n start-n]
(>! ints n)
(recur (inc n)))
ints))
(defn new-prime? [n knowm-primes]
(every? #(not= 0 (mod n %)) knowm-primes))
(defn chan-of-primes []
(let [primes (chan)]
(go-loop [cur-xf (map identity)
cur-ch (chan-of-ints cur-xf 2)
knowns [2]]
(let [prime (<! cur-ch)
knowns (conj knowns prime)
new-xf (filter #(new-prime? % knowns))
new-ch (chan-of-ints new-xf prime)]
(>! primes prime)
(recur new-xf new-ch knowns)))
primes))
The main change here is that we have traded the composability of functions for the conjoinability of data. Specifically, instead of comp
ing the transducer until we run out of room on the call stack, we conj
each newly discovered prime number onto a vector of known primes. We then pass these known primes to our simplified transducer filter function so it can check each integer being fed into the channel against the entire collection of known primes. Once again, simple data trumps all the rest.
And that's how you can generate an infinite sequence of prime numbers using core.async
channels. Keep in mind, this is not a particularly good way to get prime numbers. There are far better ways to do that. There are only two justifications for the existence of this example. The first reason is that everyone else is doing it, so we should as well. A classic justification if ever there was one. The second reason is simply to illustrate an approach to solving a problem using particular tools. So if you have problems to solve where channels and go-loops are your best choice of tools, the hope is that you will learn a thing or two from this not-so-practical example that you can apply to your own, real-world, practical problems.