[kepler-dev] Re: [kepler-cvs] kepler/src/org/sdm/spa ArrayToSequence.java

Bertram Ludaescher ludaesch at sdsc.edu
Thu Jul 8 04:15:22 PDT 2004


>>>>> "x" == xiaowen  <xin2 at llnl.gov> writes:
x> 
x> Hi Bertram,
x> Thanks for the links.
x> 
x> In this case, I believe that the workflow would be much more efficient 
x> if the tokens were processed one by one rather than all in parallel 
x> because the processing involves querying web services, which is fairly 
x> expensive.  Since we only want the first couple of tokens that meet 
x> certain criteria, there's no need to process all the tokens, and that's 
x> why I needed a count actor.
x> 
x> With a map actor, I'm not aware of a way to get it to process tokens 
x> sequentially and also stop processing the rest of them if the previous 
x> ones meet the criteria.  Do you know whether that's possible?

The way map works, it always returns the same number of tokens/list
elements that it has been fed, so it does not act as a filter.

However, looking at your "count problem", the functional programming
(FP) way might prove useful here too. E.g., in FP languages one often
uses a "take" function (e.g., 'take N Xs' would return the first N
elements of the list Xs). A generalization could be a higher-order
filter actor that accepts a filter predicate as one of its inputs
(with the hope that the signature and implementation effort of the
filter predicate would be simpler than writing a special filter
actor).
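
As a rough illustration of the "take" idea, here is a minimal sketch in
Python rather than in Ptolemy/Kepler terms. The names take_matching and
expensive_check are made up for this example and are not existing
actors; the criterion is arbitrary. The point is only that a lazily
evaluated filter tests elements one at a time, and "take" stops the
testing once N elements have passed.

```python
from itertools import islice


def take_matching(n, predicate, items):
    """Return the first n items that satisfy predicate, testing lazily."""
    # filter() is lazy in Python 3: predicate runs only when the next
    # element is requested, so items are tested one by one, in order.
    matching = filter(predicate, items)
    # islice stops pulling from the filter after n matches, so the
    # remaining items are never tested.
    return list(islice(matching, n))


# Hypothetical stand-in for the expensive web-service query.
tested = []


def expensive_check(token):
    tested.append(token)      # record which tokens were actually tested
    return token % 3 == 0     # arbitrary criterion for the example


first_two = take_matching(2, expensive_check, range(1, 100))
```

With these inputs first_two is [3, 6] and only the tokens 1 through 6
are ever passed to expensive_check; the other 93 are never tested.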

Hmm.. sounds like an interesting topic for our meeting next week..

Bertram

x> 
x> I agree with you completely that there are other places in the workflow 
x> that could benefit from using a map actor.
x> 
x> Thanks for the suggestion, and I'll add links to some of the papers at 
x> http://kbi.sdsc.edu/SciDAC-SDM/
x> 
x> 
x> Xiaowen
x> 
x> On Jul 6, 2004, at 4:20 PM, Bertram Ludaescher wrote:
x> 
>> 
>> Hi Xiaowen, Tobin, and friends of PIW:
>> 
>> I'm not sure whether this solves your problems, but in an earlier
>> report, Ilkay and I looked in more detail at the PIW as it was first
>> put together for SSDBM'03. The finding there was that the complex
>> control flow could be significantly simplified by using higher-order
>> functions such as
>> map :: (a->b) -> [a] -> [b]
>> that can effectively act as a means to iterate over lists of tokens.
>> 
>> This has been documented in a SPA report:
>> 
>> [LA03] On Providing Declarative Design and Programming Constructs for
>> Scientific Workflows based on Process Networks,
>> B. Ludäscher, I. Altintas, Technical Note, SciDAC-SPA-TN-2003-01, 2003.
>> http://kbi.sdsc.edu/SciDAC-SDM/scidac-tn-map-constructs.pdf
>> (some slides are also here:
>> http://kbi.sdsc.edu/SciDAC-SDM/spa-ptolemy-extensions.ppt)
>> 
>> Also some of the possibilities of controlling iterations through 'map'
>> vs forwarding of 'length tokens' vs 'delimiter tokens' are briefly
>> mentioned there (if I remember right)
>> 
>> As the next Kepler release will be based on Ptolemy 4.0, the new map
>> feature therein will allow us to redo the PIW much more cleanly than
>> the original one.
>> 
>> hope this helps
>> 
>> Bertram
>> 
>> PS Xiaowen: There are several reports etc. at
>> http://kbi.sdsc.edu/SciDAC-SDM/
>> that you might want to add to
>> https://www-casc.llnl.gov/sdm/publications.php
>> 
>> 
>> 
>> 
>>>>>>> "x" == xiaowen  <xin2 at llnl.gov> writes:
x> 
x> Hi Tobin,
x> I'll freely admit that org.sdm.spa.Count is something of a hack.  Let me
x> explain what I want it to do, then ask for suggestions of how it could
x> be better implemented, whether with ptolemy.actor.lib.Accumulator
x> instead or using another method =)
x> 
x> Inside the PIW workflow, there's a point where we want to select the
x> first two elements of a sequence that fit certain criteria.  The
x> general flow goes like this:
x> 
x> [sequence of tokens, outputted one by one]
x>     -->
x> [Discard if we've already chosen two]
x>     -->
x> [Discard if it doesn't fit other criteria]
x>     -->
x> [Update count if the token passes]
x> 
x> 
x> This process takes place for multiple sets of sequences.
x> 
x> 
x> Because of this, we need an actor that's capable of keeping track of a
x> count, and being able to reset the count when a new sequence starts.
x> 
x> 
x> Let me trace through what org.sdm.spa.Count does at run time.
x> 
x> When an element arrives to be processed, org.sdm.spa.Count outputs the
x> current count, whereupon a decision is made whether to discard the
x> token.  If the token isn't discarded, then it's submitted to further
x> actors that determine whether it fits other criteria.  At the end of
x> this, org.sdm.spa.Count is notified of whether the token passed, and if
x> it did, then the internal count is updated.
x> 
x> We don't want the processing of the tokens to overlap because processing
x> one token to figure out whether it passes is fairly expensive in this
x> workflow.  So the workflow must wait until it knows whether the previous
x> token passed the test and the count has been updated before it starts
x> processing the next token.
x> 
x> So there's a one-to-one correspondence between the output count tokens
x> sent by org.sdm.spa.Count and the input update tokens.  However, the
x> output token is sent _before_ it receives the input token.  And the
x> sending and receiving of them are kept synchronized so that a count is
x> _not_ sent out until it receives notification of the fate of the
x> previous token.  This is how the actor helps to ensure that the
x> processing of one element doesn't occur until we're done processing the
x> previous element.  Since we're operating in the PN domain, actors that
x> expect a count from org.sdm.spa.Count will hang until they receive it.
x> 
x> Thus org.sdm.spa.Count serves a two-fold purpose:
x> 
x> 1. keep the count
x> 2. keep the synchronization
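
The two roles listed above can be modelled with a toy sketch in Python.
This is not the code of org.sdm.spa.Count: the class body, the method
names and the select_first helper are all invented for illustration.
Two assumptions are made here: a token discarded because two have
already been chosen is still reported back as "did not pass", which
keeps the one-to-one correspondence, and a protocol violation raises
an error where a PN actor would simply block.

```python
class Count:
    """Toy model of the count/update protocol (not the real actor)."""

    def __init__(self):
        self.count = 0
        self.awaiting_update = False  # True between emit_count and update

    def reset(self):
        """Start counting afresh for a new sequence."""
        self.count = 0
        self.awaiting_update = False

    def emit_count(self):
        """Output the current count, only once the previous token is resolved."""
        if self.awaiting_update:
            raise RuntimeError("previous token not yet resolved")
        self.awaiting_update = True
        return self.count

    def update(self, passed):
        """Receive the fate of the token for which a count was emitted."""
        if not self.awaiting_update:
            raise RuntimeError("no count outstanding")
        if passed:
            self.count += 1
        self.awaiting_update = False


def select_first(counter, tokens, criterion, wanted=2):
    """Process tokens strictly one by one, keeping the first `wanted` that pass."""
    counter.reset()
    chosen = []
    for token in tokens:
        if counter.emit_count() >= wanted:
            counter.update(False)      # discarded without running the test
            continue
        passed = criterion(token)      # the expensive step, never overlapped
        counter.update(passed)
        if passed:
            chosen.append(token)
    return chosen


chosen = select_first(Count(), [5, 8, 3, 10, 12], lambda t: t % 2 == 0)
```

Here chosen ends up as [8, 10]; the criterion is never evaluated for
12 because the count has already reached two by then.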
x> 
x> 
x> ptolemy.actor.lib.Accumulator expects both the reset token and the
x> input token before it sends out an output token.  This is not
x> interchangeable with org.sdm.spa.Count because I need it to output a
x> count _before_ it receives the input token.  Perhaps the way to use
x> ptolemy.actor.lib.Accumulator would be to send it a token on the input
x> port when a sequence starts, and discard the last notification for each
x> sequence.  It would be a bit messy, but could work.  What do you think?
x> 
x> 
x> Hopefully, I got the main idea across and why I needed this actor.  If
x> something's not clear please ask.  Also, will you please suggest
x> alternative ways of implementing this?
x> 
x> 
x> Thanks!
x> Xiaowen
x> 
x> 
x> Tobin Fricke wrote:
>>>> On Fri, 2 Jul 2004, Stephen Andrew Neuendorffer wrote:
>>>> 
>>>> 
>>>>> Out of curiosity: Why do you think you need this?
>>>>> An array in Ptolemy always has elements of the same type.
>>>> 
>>>> 
>>>> Likewise, how does org.sdm.spa.Count differ from
>>>> ptolemy.actor.lib.Accumulator?
>>>> 
>>>> I think the behavior that led to org.sdm.spa.ArrayToSequence is
>>>> similar to the problem I have with my ObjectToRecord actor (described
>>>> earlier).
>>>> It is "tempting" to circumvent the type system, but I'd rather not do
>>>> that.
>>>> 
>>>> Tobin
>>>> 
>>>> _______________________________________________
>>>> kepler-dev mailing list
>>>> kepler-dev at ecoinformatics.org
>>>> http://www.ecoinformatics.org/mailman/listinfo/kepler-dev
x> 


