[kepler-dev] mysterious merge ... (or not)

Thu Apr 14 09:47:31 PDT 2005

BTW: I forgot to mention what the definition of a fair merge is. My
working definition is: A merge operator is fair, if on for any given
input token sequences, it eventually produces an (interleaved)
sequence of *all* input tokens in the output sequence.

When both input sequences/streams have equal length, there is usually
no problem. 

When they have different lengths, things get more complicated, since a
deterministic merge shouldn't depend on the arrival order and can't
forsee whether tokens will arrive or not. 

Hence one might need to resort to a non-determinstic (e.g.,
first-come, first-served) merge to be fair..

The relation to Xiaowen's email seems to be this:
- She ran into a merge problem and along the way found a bug(?) in the
PN director and proposed a fix to ptolemy-hackers
- I'm also curious why the particular merge cannot be solved 
(a) deterministically (we had a deterministic workflow at first;
see the PPS below; but maybe it didn't have the case that Xiaowen is
dealing with now)
(b) via a simple non-deterministic but fair merge actor. Seems the PN
director prevents one from writing such an actor. If one "hacks" a
a workaround, this might break something in the model of computation
of PN!?

Bertram

Bertram Ludaescher writes:
 > 
 > Dear all:
 > 
 > Here are some earlier notes I made when trying to understand better
 > what the issues of fair (and not so fair) merge are. 
 > 
 > [[Digression: *after* you're done with reading this email, you might
 > find it an amusing that I had written the first version of it while
 > sitting in an airport shuttle waiting at Terminal A (Sacramento) for a
 > *potential* passenger that *might* arrive on the next plane. The
 > driver was in a dilemma: should he stay at Terminal A continuing to
 > wait for a passenger who might never show up, or should he drive to
 > Terminal B, picking up any waiting passengers there (and then maybe go
 > back to A!?) ... well, that's a variant of the mysterious merge ...]]
 > 
 > OK, so here is the merge problem (as I understand it): 
 > 
 > Given are two input "assembly lines" L1 and L2, on which tokens arrive
 > every now and then. There is a third line L3 on which a "merge" of the
 > incoming token streams of L1 and L2 is to be placed by an actor, and in
 > a way, that preserves the ordering of each incoming stream L1, L2:
 > 
 > L1
 > ------->[_]  0
 >             \|/        L3
 > L2           |    [_]------->
 > ------->[_] / \ 
 >            merge
 >            actor  
 > 
 > Let's say we want the merge actor in the middle to be 'deterministic'.
 > That is, for any pair of sequences showing up on L1 and L2, the output
 > sequence L3 will always be the same.  This is usually a desirable
 > property and often care is taken in the underlying models of computation
 > (e.g. PN = Process Network) to guarantee it.
 > 
 > For example, in PN an actor shouldn't be able to test for the presence
 > of a token (else it's very easy to accidently write a non-deterministic
 > actor); e.g. this can be achieved by always returning 'true' when
 > testing token presence. Only when a token is to be actually consumed,
 > and there isn't a token, then the actor will be put to "rest" by the
 > director .. until the token really arrives (if it ever does).
 > 
 > So what are examples of determinstic merge actors? Here is one: 
 > deterministic_merge : L1, L2 --> L3 
 > {   consume(L1,X); 
 >     consume(L2,Y);
 >     produce(L3,X);
 >     produce(L3,Y) }
 > 
 > What is the problem with this? This actor is deterministic but NOT
 > 'fair'. For example assume we have these sequences:
 > 
 > L1 = [x1,x2]
 > L2 = [y1,y2,y3, ...]
 > 
 > The deterministic actor above will output 
 > L3 = [x1,y1,x2,y2] and then STALL, waiting FOREVER on x3 at L1. 
 > (The passenger that never arrives..)
 > 
 > There is no way for the merge actor to forsee whether and when the next
 > token might arrive. 
 > 
 > Alternative solution: use a *non-deterministic* (but fair) actor. 
 > How can it work? 
 > 
 > Simply merge the tokens according to their time of arrival. 
 > Highly non-deterministic, but quite useful to guarantee fairness.
 > Indeed it seems that there is a trade-off between the two.
 > You can't always have them together.
 > 
 > But NOTE: In practical settings, one may have a timeout associated
 > with each channel, so that after not having seen a token after a
 > sufficiently long time on some channel (say L1), it is "known" that no
 > token will arrive on L1, thereby allowing the actor to process the
 > tokens from L2.  If the "time-out assumption" is valid (no token can
 > arrive after the time-out), then this yields a fair (and still
 > deterministic!) merge actor.  However, if the time-out assumption is
 > violated (which even happens in practice), then the actor becomes
 > non-deterministic again, since the merge actor can prematurely pick
 > the "wrong" token (say if tokens came with an increasing serial number
 > and the task was to merge the tokens in-order.)
 > 
 > Bertram
 > 
 > PS for a figure of the poor merge actor in action see
 > http://users.sdsc.edu/~ludaesch//ECS289F-W05/ECS289F-W05-17-misc.pdf
 > (last 2.5 pages)
 > 
 > PPS The original PIW had been modeled deterministically in Haskell;
 > see this Tech Report from August 2003:
 > http://kbi.sdsc.edu/SciDAC-SDM/scidac-tn-map-constructs.pdf
 > 
 > In there is a deterministic merge using the "zip" operator. 
 > 
 > PPPS I wrote this originally while being a "token" that was merged
 > into the "shuttle actor" to Davis; the driver did in fact decide to
 > move to the second terminal (there were indeed people!) and then came
 > back to the first terminal to pick up yet another person... It took a
 > while to get home on that day..
 > 
 > _______________________________________________
 > Kepler-dev mailing list
 > Kepler-dev at ecoinformatics.org
 > http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/kepler-dev