On Fri, 12 Oct 2007 15:18:26 +0900, M. Edward (Ed) Borasky wrote:
- There are basically two ways to do concurrency/parallelism: shared memory and message passing. I’m not sure what the real tradeoffs are – I’ve pretty much had my brain hammered with “shared memory bad – message passing good”, but there obviously must be some counter-arguments, or shared memory wouldn’t exist.
(Putting on my fake beard) In the old days…
Shared memory is (probably) always the fastest solution; in fact, on some OS’s, local message passing is implemented as a layer on top of shared memory.
But, of course, if you implement concurrency in terms of shared memory, you have to worry about lock contention, queue starvation, and all the other things that generally get handled for you if you use a higher-level messaging protocol. And your software is now stuck with assuming that the sender and receiver are on the same machine; most other messaging libraries will work equally well on the same or different machines.
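To make that concrete, here is a rough C sketch (nothing here is from a real system; the segment name and fields are invented) of what even a trivial shared counter drags in once it lives in shared memory, assuming POSIX shm_open/mmap and a process-shared pthread mutex, with error handling omitted:

/* Sketch only: a shared counter protected by a process-shared mutex.
 * The segment name "/demo_cache" is invented; error handling omitted. */
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

struct shared_block {
    pthread_mutex_t lock;   /* must be created PTHREAD_PROCESS_SHARED */
    long counter;
};

int main(void)
{
    int fd = shm_open("/demo_cache", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(struct shared_block));
    struct shared_block *blk = mmap(NULL, sizeof(*blk),
                                    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Exactly one process should initialize the mutex. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&blk->lock, &attr);

    /* Every touch of the shared data needs the lock; that bookkeeping,
     * and the contention it can cause, is what a messaging layer hides. */
    pthread_mutex_lock(&blk->lock);
    blk->counter++;
    pthread_mutex_unlock(&blk->lock);

    munmap(blk, sizeof(*blk));
    close(fd);
    return 0;
}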
Back when machines were much slower, I had an application that already used shared memory for caching, but was bogging down on system-level message-passing calls. There were three worker servers that would make requests to the cache manager to load a record from the database into the shared-memory cache for the worker to use. (This serialized database access and reduced the number of open file handles.)
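The original request path was roughly the shape of the sketch below. System V message queues are only a stand-in for whatever kernel queuing facility that OS actually provided, and the struct fields and reply-type convention are invented:

/* Sketch of the original request path: one syscall to send, one to wait
 * for the reply, and a full copy of the message each way. */
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

struct cache_request {
    long mtype;          /* required first field for SysV messages */
    long record_id;      /* record to pull into the shared cache */
    int  worker_id;      /* so the manager knows whom to answer */
};

int request_record(int queue_id, long record_id, int worker_id)
{
    struct cache_request req = { 1, record_id, worker_id };

    if (msgsnd(queue_id, &req, sizeof(req) - sizeof(long), 0) == -1)
        return -1;

    /* Assume the manager sends replies typed as worker_id + 100. */
    struct cache_request reply;
    if (msgrcv(queue_id, &reply, sizeof(reply) - sizeof(long),
               worker_id + 100, 0) == -1)
        return -1;
    return 0;
}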
So I changed them to stop using the kernel-level message-queuing routines; instead, they’d store their requests in a linked list that was kept in a different shared-memory region. The cache manager would unlink the first request from the list, process it, and link that same request structure back onto the “reply” list with a return code. The requests/replies were very small, stayed in processor cache, etc., and there was much less context-switching in and out of kernel mode since the queuing was now all userland. This also saved a lot of memory allocates/frees, another expensive operation at the time; most message-passing involves at least one full copy of the message.
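In outline, the scheme was something like the sketch below. All the names and the pool size are invented, the locking is shown as a process-shared mutex purely for brevity, and the push is LIFO where the real thing presumably preserved request order. The point is that a request node is never allocated, freed, or copied; it just moves between the two lists:

#include <pthread.h>

#define POOL_SIZE 64          /* invented; big enough for the workers */

struct request {
    struct request *next;
    long record_id;           /* what the worker wants loaded */
    int  worker_id;           /* who asked */
    int  return_code;         /* filled in by the cache manager */
};

struct queue_region {         /* lives in its own shared-memory segment */
    pthread_mutex_t lock;     /* initialized PTHREAD_PROCESS_SHARED */
    struct request *requests; /* pending requests from the workers */
    struct request *replies;  /* completed requests, with return codes */
    struct request pool[POOL_SIZE];  /* nodes are reused, never freed */
};

/* Worker side: link a node from the pool onto the request list. */
void submit_request(struct queue_region *q, struct request *node)
{
    pthread_mutex_lock(&q->lock);
    node->next = q->requests;
    q->requests = node;
    pthread_mutex_unlock(&q->lock);
}

/* Cache-manager side: unlink a request, process it, and push the very
 * same node onto the reply list.  No allocation, no copy, no syscall. */
void service_one(struct queue_region *q, int (*load_record)(long))
{
    pthread_mutex_lock(&q->lock);
    struct request *node = q->requests;
    if (node)
        q->requests = node->next;
    pthread_mutex_unlock(&q->lock);

    if (!node)
        return;               /* nothing pending right now */

    node->return_code = load_record(node->record_id);

    pthread_mutex_lock(&q->lock);
    node->next = q->replies;
    q->replies = node;
    pthread_mutex_unlock(&q->lock);
}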
Occasionally, the request list would empty out, in which case we had to use the kernel to notify the cache manager to wake up and check its list, but that was a rare occasion, and “notify events” on that system were still much cheaper than a queue message.
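Continuing the sketch above, the wake-up path would look something like this; a POSIX semaphore stands in for that system’s “notify event”, and again the details are invented:

#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

extern sem_t *manager_wakeup;    /* assumed: a semaphore shared by all */

/* Worker side: only poke the kernel when the list was empty, i.e. when
 * the manager may actually be asleep. */
void submit_request_with_wakeup(struct queue_region *q, struct request *node)
{
    int was_empty;

    pthread_mutex_lock(&q->lock);
    was_empty = (q->requests == NULL);
    node->next = q->requests;
    q->requests = node;
    pthread_mutex_unlock(&q->lock);

    if (was_empty)
        sem_post(manager_wakeup);    /* the rare "notify event" */
}

/* Manager side: drain the list; sleep in the kernel only when idle. */
void manager_loop(struct queue_region *q, int (*load_record)(long))
{
    for (;;) {
        pthread_mutex_lock(&q->lock);
        int empty = (q->requests == NULL);
        pthread_mutex_unlock(&q->lock);

        if (empty)
            sem_wait(manager_wakeup);   /* cheap compared to a queue read */
        else
            service_one(q, load_record);
    }
}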
I would doubt that any of this type of optimization applies to Ruby on
modern OS’s, however.