[kepler-dev] Re: [kepler-cvs] kepler/src/org/sdm/spa ArrayToSequence.java

Tue Jul 6 16:56:45 PDT 2004

Hi Bertram,

Thanks for the links.

In this case, I believe that the workflow would be much more efficient 
if the tokens were processed one by one rather than all in parallel 
because the processing involves querying web services, which is fairly 
expensive.  Since we only want the first couple of tokens that pass
certain criteria, there's no need to process all the tokens, and that's
why I needed a count actor.

With a map actor, I'm not aware of a way to get it to process tokens
sequentially and also stop processing the rest of them once enough of
the previous ones fit the criteria.  Do you know whether that's possible?
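
To make the behaviour I'm after concrete, here is a minimal Java sketch
written outside of Ptolemy.  The names FirstMatches and firstMatching are
hypothetical and exist only for illustration; the predicate stands in for
the expensive web service query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; not part of Kepler or Ptolemy. */
class FirstMatches {
    /**
     * Tests items one at a time and stops as soon as `limit` of them pass,
     * so the (assumed expensive) predicate is never run on the rest.
     */
    static <T> List<T> firstMatching(Iterable<T> items, Predicate<T> passes, int limit) {
        List<T> chosen = new ArrayList<>();
        if (limit <= 0) {
            return chosen;
        }
        for (T item : items) {
            if (passes.test(item)) {
                chosen.add(item);
                if (chosen.size() == limit) {
                    break; // enough tokens passed; skip the remaining ones
                }
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the "expensive" test runs
        List<Integer> picked = firstMatching(
                Arrays.asList(3, 8, 5, 10, 12, 14),
                n -> { calls[0]++; return n % 2 == 0; },
                2);
        System.out.println(picked + " after " + calls[0] + " tests");
        // prints "[8, 10] after 4 tests"
    }
}
```

The last two elements are never tested, which is the saving I'd like to
keep in the workflow.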

I agree with you completely that there are other places in the workflow 
that could benefit from using a map actor.

Thanks for the suggestion, and I'll add links to some of the papers at 
http://kbi.sdsc.edu/SciDAC-SDM/

Xiaowen

On Jul 6, 2004, at 4:20 PM, Bertram Ludaescher wrote:

>
> Hi Xiaowen, Tobin, and friends of PIW:
>
> I'm not sure whether this solves your problems, but in an earlier
> report, Ilkay and I looked in more detail at the PIW as it was first
> put together for SSDBM'03. The finding was that the complex
> control flow could be significantly simplified by using higher-order
> functions such as
> 	map :: (a->b) -> [a] -> [b]
> that can effectively act as a means to iterate over lists of tokens.
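
For readers more at home in Java than in the Haskell notation above, the
same signature can be sketched roughly as follows.  MapSketch is a
hypothetical name used only here; this is not the Ptolemy map actor, just
the plain higher-order function applied to an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Illustration only: a list-based reading of map :: (a->b) -> [a] -> [b]. */
class MapSketch {
    /** Applies f to every token and returns the results in the same order. */
    static <A, B> List<B> map(Function<A, B> f, List<A> tokens) {
        List<B> out = new ArrayList<>(tokens.size());
        for (A token : tokens) {
            out.add(f.apply(token));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> lengths = map(String::length, Arrays.asList("ACGT", "GG", "TTAGC"));
        System.out.println(lengths); // prints "[4, 2, 5]"
    }
}
```

Note that this version always visits every element; it has no notion of
stopping early.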
>
> This has been documented in a SPA report:
>
> [LA03] On Providing Declarative Design and Programming Constructs for
> Scientific Workflows based on Process Networks,
> B. Ludäscher, I. Altintas, Technical Note, SciDAC-SPA-TN-2003-01, 2003.
> http://kbi.sdsc.edu/SciDAC-SDM/scidac-tn-map-constructs.pdf
> (some slides are also here:
> http://kbi.sdsc.edu/SciDAC-SDM/spa-ptolemy-extensions.ppt)
>
> Also some of the possibilities of controlling iterations through 'map'
> vs forwarding of 'length tokens' vs 'delimiter tokens' are briefly
> mentioned there (if I remember right)
>
> As the next Kepler release will be based on Ptolemy 4.0, the new map
> feature therein will allow us to redo the PIW much more cleanly than the
> original one.
>
> hope this helps
>
> Bertram
>
> PS Xiaowen: There are several reports etc. at
> 	http://kbi.sdsc.edu/SciDAC-SDM/
> that you might want to add to
> 	https://www-casc.llnl.gov/sdm/publications.php
>
>
>
>
>>>>>> "x" == xiaowen  <xin2 at llnl.gov> writes:
> x>
> x> Hi Tobin,
> x> I'll freely admit that org.sdm.spa.Count is something of a hack.  Let me
> x> explain what I want it to do, then ask for suggestions of how it could
> x> be better implemented, whether with ptolemy.actor.lib.Accumulator
> x> instead or using another method =)
> x>
> x> Inside the PIW workflow, there's a point where we want to select the
> x> first two elements of a sequence that fit certain criteria.  The
> x> general flow goes like this:
> x>
> x> [sequence of tokens, outputted one by one]
> x>
> x> -->
> x>
> x> [Discard if we've already chosen two]
> x>
> x> -->
> x>
> x> [Discard if it doesn't fit other criteria]
> x>
> x> -->
> x>
> x> [Update count if the token passes]
> x>
> x>
> x> This process takes place for multiple sets of sequences.
> x>
> x>
> x> Because of this, we need an actor that's capable of keeping track of a
> x> count, and being able to reset the count when a new sequence starts.
> x>
> x>
> x> Let me trace through what org.sdm.spa.Count does at run time.
> x>
> x> When an element arrives to be processed, org.sdm.spa.Count outputs the
> x> current count, whereupon a decision is made whether to discard the
> x> token.  If the token isn't discarded, then it's submitted to further
> x> actors that determine whether it fits other criteria.  At the end of
> x> this, org.sdm.spa.Count is notified of whether the token passed, and if
> x> it did, then the internal count is updated.
> x>
> x> We don't want the processing of the tokens to overlap because processing
> x> one token to figure out whether it passes is fairly expensive in this
> x> workflow.  So the workflow must wait until it knows whether the previous
> x> token passed the test and the count has been updated before it starts
> x> processing the next token.
> x>
> x> So there's a one-to-one correspondence between the output count tokens
> x> sent by org.sdm.spa.Count and the input update tokens.  However, the
> x> output token is sent _before_ it receives the input token.  And the
> x> sending and receiving of them are kept synchronized so that a count is
> x> _not_ sent out until it receives notification of the fate of the
> x> previous token.  This is how the actor helps to ensure that the
> x> processing of one element doesn't occur until we're done processing the
> x> previous element.  Since we're operating in the PN domain, actors that
> x> expect a count from org.sdm.spa.Count will hang until they receive it.
> x>
> x> Thus org.sdm.spa.Count serves a two-fold purpose:
> x>
> x> 1. keep the count
> x> 2. keep the synchronization
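
The count-then-update handshake described above can be sketched in plain
Java as a small state machine.  This is only an illustration under stated
assumptions: CountSketch and its method names are hypothetical, it is not
the real org.sdm.spa.Count source, and it runs in a single thread, so an
out-of-order call throws an exception where a PN actor would simply block.
It also assumes that a discarded token still reports "did not pass", to
preserve the one-to-one correspondence between counts and updates.

```java
/** Hypothetical stand-in for the described behaviour; not the real actor. */
class CountSketch {
    private int count = 0;
    private boolean awaitingUpdate = false;

    /** Emits the current count, but only once the previous token is resolved. */
    int emitCount() {
        if (awaitingUpdate) {
            throw new IllegalStateException("previous token not yet resolved");
        }
        awaitingUpdate = true;
        return count;
    }

    /** Reports the fate of the token that received the last emitted count. */
    void update(boolean passed) {
        if (!awaitingUpdate) {
            throw new IllegalStateException("no count is outstanding");
        }
        if (passed) {
            count++;
        }
        awaitingUpdate = false;
    }

    /** Starts a new sequence with a fresh count. */
    void reset() {
        count = 0;
        awaitingUpdate = false;
    }

    public static void main(String[] args) {
        CountSketch counter = new CountSketch();
        boolean[] outcomes = {false, true, true, true}; // assumed test results
        int tested = 0;
        for (boolean passes : outcomes) {
            if (counter.emitCount() >= 2) {
                counter.update(false); // already chose two: discard untested
                continue;
            }
            tested++;                  // the expensive test would run here
            counter.update(passes);
        }
        System.out.println("tokens tested: " + tested); // prints "tokens tested: 3"
    }
}
```

The fourth token is discarded without being tested because two tokens
have already passed by the time its count is emitted.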
> x>
> x>
> x> ptolemy.actor.lib.Accumulator expects both the reset token and the
> x> input token before it sends out an output token.  This is not
> x> interchangeable with org.sdm.spa.Count because I need it to output a
> x> count _before_ it receives the input token.  Perhaps the way to use
> x> ptolemy.actor.lib.Accumulator would be to send it a token in the input
> x> port when a sequence starts, and discard the last notification for each
> x> sequence.  It would be a bit messy, but could work.  What do you think?
> x>
> x>
> x> Hopefully, I got the main idea across and why I needed this actor.  If
> x> something's not clear please ask.  Also, will you please suggest
> x> alternative ways of implementing this?
> x>
> x>
> x> Thanks!
> x> Xiaowen
> x>
> x>
> x> Tobin Fricke wrote:
>>> On Fri, 2 Jul 2004, Stephen Andrew Neuendorffer wrote:
>>>
>>>
>>>> Out of curiosity: Why do you think you need this?
>>>> An array in Ptolemy always has elements of the same type.
>>>
>>>
>>> Likewise, how does org.sdm.spa.Count differ from
>>> ptolemy.actor.lib.Accumulator?
>>>
>>> I think the behavior that led to org.sdm.spa.ArrayToSequence is similar
>>> to the problem I have with my ObjectToRecord actor (described earlier).
>>> It is "tempting" to circumvent the type system, but I'd rather not do
>>> that.
>>>
>>> Tobin
>>>
>>> _______________________________________________
>>> kepler-dev mailing list
>>> kepler-dev at ecoinformatics.org
>>> http://www.ecoinformatics.org/mailman/listinfo/kepler-dev