[kepler-dev] mysterious merge ... (or not)
ludaesch at ucdavis.edu
Wed Apr 13 20:41:42 PDT 2005
Here are some earlier notes I made when trying to understand better
what the issues of fair (and not so fair) merge are.
[[Digression: *after* you're done with reading this email, you might
find it an amusing that I had written the first version of it while
sitting in an airport shuttle waiting at Terminal A (Sacramento) for a
*potential* passenger that *might* arrive on the next plane. The
driver was in a dilemma: should he stay at Terminal A continuing to
wait for a passenger who might never show up, or should he drive to
Terminal B, picking up any waiting passengers there (and then maybe go
back to A!?) ... well, that's a variant of the mysterious merge ...]]
OK, so here is the merge problem (as I understand it):
Given are two input "assembly lines" L1 and L2, on which tokens arrive
every now and then. There is a third line L3 on which a "merge" of the
incoming token streams of L1 and L2 is to be placed by an actor, and in
a way, that preserves the ordering of each incoming stream L1, L2:
L2 | [_]------->
------->[_] / \
Let's say we want the merge actor in the middle to be 'deterministic'.
That is, for any pair of sequences showing up on L1 and L2, the output
sequence L3 will always be the same. This is usually a desirable
property and often care is taken in the underlying models of computation
(e.g. PN = Process Network) to guarantee it.
For example, in PN an actor shouldn't be able to test for the presence
of a token (else it's very easy to accidently write a non-deterministic
actor); e.g. this can be achieved by always returning 'true' when
testing token presence. Only when a token is to be actually consumed,
and there isn't a token, then the actor will be put to "rest" by the
director .. until the token really arrives (if it ever does).
So what are examples of determinstic merge actors? Here is one:
deterministic_merge : L1, L2 --> L3
What is the problem with this? This actor is deterministic but NOT
'fair'. For example assume we have these sequences:
L1 = [x1,x2]
L2 = [y1,y2,y3, ...]
The deterministic actor above will output
L3 = [x1,y1,x2,y2] and then STALL, waiting FOREVER on x3 at L1.
(The passenger that never arrives..)
There is no way for the merge actor to forsee whether and when the next
token might arrive.
Alternative solution: use a *non-deterministic* (but fair) actor.
How can it work?
Simply merge the tokens according to their time of arrival.
Highly non-deterministic, but quite useful to guarantee fairness.
Indeed it seems that there is a trade-off between the two.
You can't always have them together.
But NOTE: In practical settings, one may have a timeout associated
with each channel, so that after not having seen a token after a
sufficiently long time on some channel (say L1), it is "known" that no
token will arrive on L1, thereby allowing the actor to process the
tokens from L2. If the "time-out assumption" is valid (no token can
arrive after the time-out), then this yields a fair (and still
deterministic!) merge actor. However, if the time-out assumption is
violated (which even happens in practice), then the actor becomes
non-deterministic again, since the merge actor can prematurely pick
the "wrong" token (say if tokens came with an increasing serial number
and the task was to merge the tokens in-order.)
PS for a figure of the poor merge actor in action see
(last 2.5 pages)
PPS The original PIW had been modeled deterministically in Haskell;
see this Tech Report from August 2003:
In there is a deterministic merge using the "zip" operator.
PPPS I wrote this originally while being a "token" that was merged
into the "shuttle actor" to Davis; the driver did in fact decide to
move to the second terminal (there were indeed people!) and then came
back to the first terminal to pick up yet another person... It took a
while to get home on that day..
More information about the Kepler-dev