[kepler-dev] mysterious merge ... (or not)
Bertram Ludaescher
ludaesch at ucdavis.edu
Thu Apr 14 09:47:31 PDT 2005
BTW: I forgot to mention what the definition of a fair merge is. My
working definition is: A merge operator is fair, if on for any given
input token sequences, it eventually produces an (interleaved)
sequence of *all* input tokens in the output sequence.
When both input sequences/streams have equal length, there is usually
no problem.
When they have different lengths, things get more complicated, since a
deterministic merge shouldn't depend on the arrival order and can't
forsee whether tokens will arrive or not.
Hence one might need to resort to a non-determinstic (e.g.,
first-come, first-served) merge to be fair..
The relation to Xiaowen's email seems to be this:
- She ran into a merge problem and along the way found a bug(?) in the
PN director and proposed a fix to ptolemy-hackers
- I'm also curious why the particular merge cannot be solved
(a) deterministically (we had a deterministic workflow at first;
see the PPS below; but maybe it didn't have the case that Xiaowen is
dealing with now)
(b) via a simple non-deterministic but fair merge actor. Seems the PN
director prevents one from writing such an actor. If one "hacks" a
a workaround, this might break something in the model of computation
of PN!?
Bertram
Bertram Ludaescher writes:
>
> Dear all:
>
> Here are some earlier notes I made when trying to understand better
> what the issues of fair (and not so fair) merge are.
>
> [[Digression: *after* you're done with reading this email, you might
> find it an amusing that I had written the first version of it while
> sitting in an airport shuttle waiting at Terminal A (Sacramento) for a
> *potential* passenger that *might* arrive on the next plane. The
> driver was in a dilemma: should he stay at Terminal A continuing to
> wait for a passenger who might never show up, or should he drive to
> Terminal B, picking up any waiting passengers there (and then maybe go
> back to A!?) ... well, that's a variant of the mysterious merge ...]]
>
> OK, so here is the merge problem (as I understand it):
>
> Given are two input "assembly lines" L1 and L2, on which tokens arrive
> every now and then. There is a third line L3 on which a "merge" of the
> incoming token streams of L1 and L2 is to be placed by an actor, and in
> a way, that preserves the ordering of each incoming stream L1, L2:
>
> L1
> ------->[_] 0
> \|/ L3
> L2 | [_]------->
> ------->[_] / \
> merge
> actor
>
> Let's say we want the merge actor in the middle to be 'deterministic'.
> That is, for any pair of sequences showing up on L1 and L2, the output
> sequence L3 will always be the same. This is usually a desirable
> property and often care is taken in the underlying models of computation
> (e.g. PN = Process Network) to guarantee it.
>
> For example, in PN an actor shouldn't be able to test for the presence
> of a token (else it's very easy to accidently write a non-deterministic
> actor); e.g. this can be achieved by always returning 'true' when
> testing token presence. Only when a token is to be actually consumed,
> and there isn't a token, then the actor will be put to "rest" by the
> director .. until the token really arrives (if it ever does).
>
> So what are examples of determinstic merge actors? Here is one:
> deterministic_merge : L1, L2 --> L3
> { consume(L1,X);
> consume(L2,Y);
> produce(L3,X);
> produce(L3,Y) }
>
> What is the problem with this? This actor is deterministic but NOT
> 'fair'. For example assume we have these sequences:
>
> L1 = [x1,x2]
> L2 = [y1,y2,y3, ...]
>
> The deterministic actor above will output
> L3 = [x1,y1,x2,y2] and then STALL, waiting FOREVER on x3 at L1.
> (The passenger that never arrives..)
>
> There is no way for the merge actor to forsee whether and when the next
> token might arrive.
>
> Alternative solution: use a *non-deterministic* (but fair) actor.
> How can it work?
>
> Simply merge the tokens according to their time of arrival.
> Highly non-deterministic, but quite useful to guarantee fairness.
> Indeed it seems that there is a trade-off between the two.
> You can't always have them together.
>
> But NOTE: In practical settings, one may have a timeout associated
> with each channel, so that after not having seen a token after a
> sufficiently long time on some channel (say L1), it is "known" that no
> token will arrive on L1, thereby allowing the actor to process the
> tokens from L2. If the "time-out assumption" is valid (no token can
> arrive after the time-out), then this yields a fair (and still
> deterministic!) merge actor. However, if the time-out assumption is
> violated (which even happens in practice), then the actor becomes
> non-deterministic again, since the merge actor can prematurely pick
> the "wrong" token (say if tokens came with an increasing serial number
> and the task was to merge the tokens in-order.)
>
> Bertram
>
> PS for a figure of the poor merge actor in action see
> http://users.sdsc.edu/~ludaesch//ECS289F-W05/ECS289F-W05-17-misc.pdf
> (last 2.5 pages)
>
> PPS The original PIW had been modeled deterministically in Haskell;
> see this Tech Report from August 2003:
> http://kbi.sdsc.edu/SciDAC-SDM/scidac-tn-map-constructs.pdf
>
> In there is a deterministic merge using the "zip" operator.
>
> PPPS I wrote this originally while being a "token" that was merged
> into the "shuttle actor" to Davis; the driver did in fact decide to
> move to the second terminal (there were indeed people!) and then came
> back to the first terminal to pick up yet another person... It took a
> while to get home on that day..
>
> _______________________________________________
> Kepler-dev mailing list
> Kepler-dev at ecoinformatics.org
> http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/kepler-dev
More information about the Kepler-dev
mailing list