While the Ruby 1.8 and 1.9 internal threading models differ significantly, the exposed concurrency model is fundamentally the same. Unlike Ruby 1.8 threads, Ruby 1.9 threads do map to native threads. Unfortunately, the 1.9 interpreter forces user-created threads to acquire a global mutex lock before executing. The upshot is that 1.9 thread execution is serialized and unable to benefit from multiple cores.
In-process, thread-based concurrency is not the only way to effectively chew cores, of course. The traditional Ruby approach to utilizing processing power is to spin up multiple isolated processes. Rails apps backed by Mongrel or Phusion Passenger processes are the canonical examples. When using this approach, inter-process communication is relatively rare and is done indirectly through shared resources such as a database, memcached instances, or files. Copious real-world deployment success stories have proven the efficacy of this approach for occasional inter-process communication. That said, there is a class of application that requires communication between code executing on separate cores at a volume that is only viable via in-process shared memory access. For the time being, JRuby is the best and only option for building this class of application in Ruby.
While any in-process concurrency strategy necessitates some form of shared memory access, decades of deadlocks, race conditions, and contention issues have led many developers (including this one) to conclude that explicit management of shared memory is best avoided. Software languages and libraries can be used to raise the level of abstraction in concurrent software, relegating synchronization of shared memory access to system-provided infrastructure. One common alternative to explicit, lock-based synchronization is message-passing concurrency. In message-passing concurrency, agents or actors execute code concurrently, passing information to each other via messages. It should be noted that part of what makes this strategy reasonable is a requirement (language-enforced or convention-based) that information sent as part of a message be immutable. Mutating message contents effectively negates the benefits of message passing, as the mutations themselves will require synchronization.
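To make the immutability requirement concrete, here is a minimal sketch in plain Ruby, using only the standard `Queue` and `Thread` classes rather than any particular library. The `immutable_message` helper is a hypothetical name of my own; it only freezes a flat hash and its values, so nested structures would need deeper treatment.

```ruby
require 'thread'

# Hypothetical helper: freeze a flat hash message (and its values) before it
# crosses a thread boundary, so neither sender nor receiver can mutate it.
def immutable_message(fields)
  fields.each_value { |value| value.freeze }
  fields.freeze
end

inbox = Queue.new
received = []

# The consumer only reads what it is sent.
consumer = Thread.new do
  while (incoming = inbox.pop) != :stop
    received << incoming[:text]
  end
end

message = immutable_message(:text => "disk almost full")
inbox << message
inbox << :stop
consumer.join

# Any later attempt to mutate the shared message fails loudly.
mutation_error = nil
begin
  message[:text] << "!"
rescue RuntimeError => error # FrozenError is a RuntimeError on modern Rubies
  mutation_error = error
end
```

Because the message is frozen by convention before it is enqueued, the sender cannot quietly change data the receiver is still reading, which is exactly the situation that would otherwise call for a lock.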
Broadly, message-based concurrency implementations can be divided into two styles: Actor Model and Process Calculus. The differences between the two styles are well summarized here.
My friend and colleague Joel Friedman and I have published a gem, Mailbox (code here), that provides concise message-passing infrastructure in JRuby. This post looks at how both styles of message-passing concurrency can be expressed with Mailbox.
To kick things off, here is a simple example of Actor-model-style usage:
Inclusion of the Mailbox module ensures an instance-specific thread will be created for any new instance of the Actor class. Work items are read off of an instance-specific queue or “mailbox”. Items are posted to the instance’s mailbox via calls to mailslot methods.
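The mechanism described above (one thread and one queue per instance, with methods that enqueue work instead of running it) can be approximated in plain Ruby. This is only a sketch of the idea using the standard `Thread` and `Queue` classes; `SketchActor`, `greet`, and `stop` are hypothetical names, and none of this is Mailbox's actual code or API.

```ruby
require 'thread'

# Hypothetical stand-in for the behavior described above; not Mailbox itself.
class SketchActor
  def initialize
    @mailbox = Queue.new
    # Each instance owns one worker thread that drains its own mailbox.
    @worker = Thread.new do
      while (work = @mailbox.pop) != :stop
        work.call
      end
    end
  end

  # Plays the role of a mailslot method: enqueue the work and return at once.
  def greet(name, replies)
    @mailbox << lambda { replies << "hello, #{name}" }
    nil
  end

  # Let queued work finish, then end the worker thread.
  def stop
    @mailbox << :stop
    @worker.join
  end
end

replies = Queue.new
actor = SketchActor.new
actor.greet("world", replies) # returns immediately; work runs on the actor's thread
actor.stop
puts replies.pop # prints "hello, world"
```

Since a single thread drains the queue, work items for a given instance run one at a time and in the order they were posted, so the instance's own state needs no locking.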
A less trivial example:
Here the method log acts as an entry point to the Logger instance’s mailbox. That is to say, classes that use an instance of the Logger class can send information as a message via the log method. This message will be processed asynchronously on the thread owned by the Logger instance, leaving the caller’s thread free to continue execution.
Note that it is possible to define multiple methods on the Logger class, all of which enqueue work to be processed on the one thread owned by the Logger instance:
Conceptually, think of this as two mailslots on the Logger class, each of which delivers messages to the Logger instance’s one mailbox. This approach forsakes some of the encapsulation of pattern-matching-based Actor model implementations for a reduction in code indirection. While callers now know a bit more about the Logger class, the intent of the call is a bit easier to reason about. Note that it is still possible to use a pattern matching approach when desirable:
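As a library-independent illustration of the trade-off, the sketch below shows both styles side by side in plain Ruby (standard `Thread` and `Queue` only, not Mailbox's API). `SketchLogger` and its method names are hypothetical. The `info` and `error` methods are two named entry points feeding one queue, while `deliver` is a single generic entry point whose messages are sorted out by a `case` on their shape.

```ruby
require 'thread'

# Hypothetical logger: several entry points, one mailbox, one worker thread.
class SketchLogger
  attr_reader :lines

  def initialize
    @lines = []
    @mailbox = Queue.new
    @worker = Thread.new do
      while (message = @mailbox.pop) != :stop
        dispatch(message)
      end
    end
  end

  # Style 1: named entry points; the caller's intent is visible at the call site.
  def info(text)
    @mailbox << [:info, text].freeze
  end

  def error(text)
    @mailbox << [:error, text].freeze
  end

  # Style 2: one generic entry point; the receiver decides what a message means.
  def deliver(message)
    @mailbox << message.freeze
  end

  def stop
    @mailbox << :stop
    @worker.join
  end

  private

  # Runs only on the worker thread, so @lines needs no lock.
  def dispatch(message)
    level, text = message
    case level
    when :info  then @lines << "INFO  #{text}"
    when :error then @lines << "ERROR #{text}"
    else             @lines << "UNKNOWN #{message.inspect}"
    end
  end
end

logger = SketchLogger.new
logger.info("started")
logger.deliver([:error, "disk full"])
logger.stop
puts logger.lines
```

With the named methods, a reader of the calling code sees at a glance what is being asked for. With `deliver`, the caller depends on nothing but the message format, at the cost of an extra hop through the dispatch code.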
Mailbox can also be used to take advantage of inherent data parallelism:
While Mailbox works well for this sort of thing, if data parallelism is your primary interest, it’s worth considering Peach as well.
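For readers who want to see the shape of the data-parallel idea without either library, here is a rough sketch using plain Ruby threads. `parallel_map` is a hypothetical helper of my own, not part of Mailbox or Peach. Keep in mind the point made at the top of this post: the slices only run on separate cores under an interpreter without a global lock, such as JRuby.

```ruby
# Hypothetical helper: split the input into roughly equal slices, map each
# slice on its own thread, and stitch the results back together in order.
def parallel_map(items, worker_count, &block)
  return [] if items.empty?

  slice_size = (items.size / worker_count.to_f).ceil
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.map { |item| block.call(item) } }
  end

  # Thread#value joins the thread and returns the block's result.
  threads.flat_map { |thread| thread.value }
end

squares = parallel_map((1..10).to_a, 4) { |n| n * n }
puts squares.inspect # prints [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```

Each slice is read by exactly one thread and the block is assumed to be free of side effects, so no synchronization is needed beyond waiting for the threads to finish.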
Process Calculus based approaches to message-based concurrency add a layer of indirection between message senders and message receivers. Rather than sending messages directly to a known endpoint, agents (usually referred to as Processes) publish messages on a channel. Processes interested in handling messages subscribe to this same channel. One advantage to this approach is that it lends itself to a loosely coupled design. Processes can publish and receive on the same channel, choosing to act or not act on a given message, without creating any dependencies between the various Processes. This approach is fully supported by Mailbox.
Here a new Jretlang channel (more on Jretlang below) is created and passed into the Logger’s constructor where it is registered. Registering a channel associates the channel with any mailslot method defined with the same channel key (here the log method). Publishing a message on the channel results in an invocation of the mailslot method.
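The publish and subscribe relationship can also be sketched without Jretlang. The plain-Ruby code below is an illustration of the concept only; `SketchChannel` and `ChannelLogger` are hypothetical classes that do not mirror the Jretlang or Mailbox APIs. Subscribing plays the role of registration, and publishing enqueues the message on each subscriber's own mailbox.

```ruby
require 'thread'

# Hypothetical channel: fans each published message out to every subscriber.
class SketchChannel
  def initialize
    @subscribers = []
    @lock = Mutex.new
  end

  def subscribe(&handler)
    @lock.synchronize { @subscribers << handler }
  end

  def publish(message)
    # Copy the list under the lock, then call handlers outside of it.
    handlers = @lock.synchronize { @subscribers.dup }
    handlers.each { |handler| handler.call(message) }
  end
end

# Hypothetical subscriber that processes channel messages on its own thread.
class ChannelLogger
  attr_reader :lines

  def initialize(channel)
    @lines = []
    @mailbox = Queue.new
    @worker = Thread.new do
      while (message = @mailbox.pop) != :stop
        @lines << "LOG #{message}"
      end
    end
    # "Registration": published messages land on this instance's mailbox.
    channel.subscribe { |message| @mailbox << message }
  end

  def stop
    @mailbox << :stop
    @worker.join
  end
end

channel = SketchChannel.new
first = ChannelLogger.new(channel)
second = ChannelLogger.new(channel)
channel.publish("deploy started") # the publisher knows only the channel
[first, second].each { |logger| logger.stop }
puts first.lines.inspect, second.lines.inspect
```

The publisher holds no reference to either logger, which is the loose coupling described above: subscribers can be added or removed without touching the publishing code.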
Any object with a reference to a channel can publish on it. Note how the ping pong game below is started by code exterior to the players, which publishes the first “ping” on ping_channel:
The choice between simple mailslots and channel-based uses of Mailbox can be made on a case-by-case basis. Where communication is strictly between one actor and another, I’d usually prefer the reduced complexity and indirection of a simple mailslot. When communication is one-to-many, I’d seriously consider taking advantage of the loose coupling afforded by a channel-based approach.
Mailbox’s implementation leans heavily on Jretlang, a gem published by Gareth Reeves that provides a lightweight Ruby wrapper around the Jetlang concurrency library authored by Mike Rettig and Peter Royal.
As always, I hope this post and Mailbox are of some use to you. Comments and feedback are greatly appreciated. Joel and I would like to thank Mike Rettig and Peter Royal for providing a great messaging library and Gareth Reeves for making it a pleasure to use in JRuby.
Our thanks also to Nate Austin, Steve Deobald, Jay Fields, Ajit George, Paul Gross, Shane Harvie, John Hume, Gareth Jones, Steve McLarnon, Gareth Reeves, Brian Tatnall, and Zak for their input on this post and the Mailbox gem.