[kepler-dev] Re: sampling

Wed May 26 12:02:11 PDT 2004

This sounds like an interesting example, but I am not sure I completely 
understand the problem ...

 > I want to be able to sample in any dimension (space: x,y,z, time t,
 > or other dimensions).

So, 'sample dimension' would be a parameter for the workflow?

 > We should be able to figure out which fields in a parsed dataset are
 > which, but I need to also allow for the selection of other
 > dimensions, which will have to be identified by the user.

I am not sure what this implies ... do you mean that you can 
determine the 'sample dimension' from the columns of a database, and 
that one of these dimensions can be selected as the parameter value for 
the workflow? Or are these independent: the columns and the 'sample 
dimension'?

 > I thought about setting this up so that we could use your eml
 > ingestion actor to parse a file, then send it to the sampler.
 > However, that requires mapping specific eml output ports to specific
 > sampler input ports, which will not be known until runtime.  What is
 > the best way to set it up so that the user can send a file to the
 > sampler actor, see some information about the fields, and either
 > select or parameterize the correct fields to be sampled?

What do you mean by "parameterize the correct fields" here?

 >> Seems like we could have the SMS figure out, at the
 >> beginning of the run, what actors need to be parameterized based on
 >> runtime data, and could prompt the user for those inputs.

In this case, by analyzing the semantic annotations, we would know that 
the sample actor requires a 'sample dimension' parameter prior to 
executing the workflow. So, prior to running the workflow, this port 
(parameter, or whatever) would need to be bound, and bound to something 
that provides a 'sample dimension'. If a user has found data to be 
pushed through the workflow, SMS could tell the user at workflow setup 
time the ports that are required, possibly offering suggestions as to 
how the ports could be bound via the given data. Note that this is a 
static analysis based on semantic annotations, i.e., the analysis is not 
done at runtime by analyzing the data.
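
As a rough illustration of the static check described above (this is 
not how SMS is implemented; the Actor class, the parameter names, and 
the 'SampleDimension' label are all made up for the sketch), the setup 
step could simply walk the actors and report every annotated parameter 
that has no binding yet:

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    """Hypothetical stand-in for a workflow actor (not a Kepler class)."""
    name: str
    # parameter name -> semantic type the annotation says it needs
    required: dict = field(default_factory=dict)
    # parameter name -> value already bound by the user or by a link
    bound: dict = field(default_factory=dict)


def unbound_parameters(actors):
    """List (actor, parameter, semantic type) for each required but unbound parameter.

    Only the annotations are inspected; no data is read and nothing runs.
    """
    missing = []
    for actor in actors:
        for param, sem_type in actor.required.items():
            if param not in actor.bound:
                missing.append((actor.name, param, sem_type))
    return missing


# Assumed example workflow: an ingestion actor feeding a sampler.
workflow = [
    Actor("EMLIngest"),
    Actor("Sampler", required={"sampleDimension": "SampleDimension"}),
]

# Report what the user must bind before the workflow can start.
for actor_name, param, sem_type in unbound_parameters(workflow):
    print(f"{actor_name}.{param} needs a value of type {sem_type}")
```

Because the check only reads annotations, it can run at workflow setup 
time, before any data is pushed through.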

For this example, SMS could state that the workflow requires a 'sample 
dimension', and present the set of sample dimensions that can be found 
in the data provided by the user (by analyzing the semantic annotations 
over the data), allowing the user to select one of the possible 
dimensions. (I am not sure what the 'other dimensions' might be that you 
list above ...) SMS could then figure out how to feed the information 
correctly to the port, which, in this case, doesn't sound that hard.
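
To make the selection step concrete, here is a small sketch under 
assumed names (the column names, the annotation labels, and both helper 
functions are hypothetical, not part of SMS or the eml actor). It 
filters the annotated columns of the user's data down to those that 
could serve as a 'sample dimension' and validates the user's pick:

```python
def candidate_columns(column_annotations, required_type):
    """Return the columns annotated with required_type, in input order."""
    return [col for col, types in column_annotations.items()
            if required_type in types]


def bind_parameter(selection, candidates):
    """Accept the user's choice only if it is one of the offered candidates."""
    if selection not in candidates:
        raise ValueError(f"{selection!r} is not one of {candidates}")
    return selection


# Assumed semantic annotations over the columns of the user's dataset.
annotations = {
    "longitude": {"SampleDimension", "SpatialX"},
    "latitude": {"SampleDimension", "SpatialY"},
    "obs_date": {"SampleDimension", "Time"},
    "species": {"Taxon"},
}

# Columns the user would be offered for the 'sample dimension' parameter.
candidates = candidate_columns(annotations, "SampleDimension")

# Simulate the user choosing the time column at workflow setup time.
sample_dimension = bind_parameter("obs_date", candidates)
print(candidates, sample_dimension)
```

The user only ever sees columns that the annotations say are usable, so 
an invalid binding is rejected before the workflow starts rather than in 
the middle of a run.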

Is this what you were thinking?

shawn

Chad Berkley wrote:

> I think this is the ideal solution, but I don't think we're anywhere 
> close to having that kind of functionality (Shawn and Bertram, correct 
> me if I'm wrong).  If we want garp and WFs like it to work in the 
> meantime, we're still going to need user input functionality.  Efrat 
> just pointed out on IRC that we can do the basics with the browserUI 
> actor, which may be a good stopgap.
> 
> chad
> 
> On May 26, 2004, at 11:17 AM, Deana Pennington wrote:
> 
>> Is this something we could work towards with the SMS workflow 
>> analysis?  Seems like we could have the SMS figure out, at the 
>> beginning of the run, what actors need to be parameterized based on 
>> runtime data, and could prompt the user for those inputs.  That way, 
>> it wouldn't have to pause in the middle, which is really not a great 
>> idea.  Some models run for days, and you don't want them pausing in 
>> the middle of the night, waiting for user input.
>>
>> I think we should just add this to Shawn/Bertram's list of things to 
>> do...doesn't the SMS fix everything??? :-)
>>
>> Deana
>>
>>
>> Chad Berkley wrote:
>>
>>> Hi Deana,
>>>
>>> See my comments below:
>>>
>>> On May 26, 2004, at 1:24 AM, Deana Pennington wrote:
>>>
>>>> Chad,
>>>>
>>>> I worked on this for several hours yesterday, and still have some work to
>>>> do today.  It would be easy to figure this out just for the mammal
>>>> project...it's turning out to be more difficult to think through a generic
>>>> sampling routine.  I think I have it figured out, now, though, and am
>>>> writing up some instructions for you.  I'll send those by the end of the
>>>> day.  Then we can talk about it on IRC, or by phone.
>>>
>>>
>>>
>>> Cool.  I'll be on IRC all day so just let me know when you're ready 
>>> to chat.
>>>
>>>>
>>>> Question:  I want to be able to sample in any dimension (space: x,y,z,
>>>> time t, or other dimensions).  We should be able to figure out which
>>>> fields in a parsed dataset are which, but I need to also allow for the
>>>> selection of other dimensions, which will have to be identified by the
>>>> user.  I thought about setting this up so that we could use your eml
>>>> ingestion actor to parse a file, then send it to the sampler.  However,
>>>> that requires mapping specific eml output ports to specific sampler input
>>>> ports, which will not be known until runtime.  What is the best way to set
>>>> it up so that the user can send a file to the sampler actor, see some
>>>> information about the fields, and either select or parameterize the
>>>> correct fields to be sampled?
>>>
>>>
>>>
>>> I've been tossing this idea of stopping the execution to ask for 
>>> input around in my head since the SEV meeting.  I haven't really come 
>>> up with a good solution other than writing an extension to the pause 
>>> actor that allows an input dialog to be popped up.  How to configure 
>>> that dialog or get the actual information you need is a tough 
>>> question since I'd like to make it generic enough to work with 
>>> workflows other than GARP.  Edward or Christopher, do you have any 
>>> examples of workflows that stop execution and ask users for input 
>>> then continue executing based on that input?  I haven't seen any.  
>>> Does anyone have any good ideas on how this could be done generically?
>>>
>>> The flow as I see it is:
>>> 1) execution is paused
>>> 2) a dialog that is partially preconfigured at design-time and fully 
>>> configured at run-time is presented to the user
>>> 3) the user makes a choice, altering the exec-time parameters of the 
>>> rest of the workflow
>>>
>>> Note that in 2, the run-time configuration may include such things as 
>>> dialog widget configuration with run-time produced data (e.g. 
>>> populating a list box with a run-time data stream).  The design-time 
>>> configuration would include issues such as choosing the input/output 
>>> ports and configuring what the logic of the dialog is (this may be 
>>> tricky).
>>>
>>> I think this is probably a necessary bit of functionality since I 
>>> have seen a couple of different eco workflow prototypes that want to do 
>>> this.
>>>
>>> comments?
>>>
>>> chad
>>>
>>>>
>>>> Deana
>>>>
>>>>
>>>> Chad Berkley wrote:
>>>>
>>>>> Hi Deana,
>>>>>
>>>>> I was going to start working on the sampling actor for garp.  could you
>>>>> refresh my memory as to how that should work.  I have the inputs as
>>>>> a species and a scaling metric and the outputs as the intrinsic and
>>>>> extrinsic data.  aren't there different sampling techniques?  I'd like
>>>>> to build one generic sampling actor that can use one of a number of
>>>>> different techniques.  I'm on IRC now if you want to chat about this in
>>>>> real-time.
>>>>>
>>>>> thanks,
>>>>> chad
>>>>
>>>>
>>>
>>
>> -- 
>> ********
>>
>> Deana D. Pennington, PhD
>> Long-term Ecological Research Network Office
>>
>> UNM Biology Department
>> MSC03  2020
>> 1 University of New Mexico
>> Albuquerque, NM  87131-0001
>>
>> 505-272-7288 (office)
>> 505 272-7080 (fax)
>>