[seek-dev] Link-Up Meeting bits

Bertram Ludaescher ludaesch at sdsc.edu
Wed Oct 20 01:57:50 PDT 2004


A few updates on the ongoing Link-Up meeting (SDSC Auditorium):

(1) Soaplab services and the Command Line actor
(2) Thursday HackaThonLet:  Kepler/SRB <-> Taverna/Soaplab
(3) Data-dependent (parameter-independent) service specialization
(4) Process Control and Data Summary tabs for Kepler (a la Taverna)


(1) Soaplab services and the Command Line actor:

Soaplab was mentioned in the meeting several times.  In chatting with
Ilkay a while ago, I had suggested the need for a "kind-of WSDL for
the Command-Line actor", and generally to have a "specialization
feature" for the Command-Line (CMD) actor similar to the instantiation
feature available for the web service (WS) actor.
   The benefits should be obvious: once a generic WS or CMD actor has
been instantiated, it becomes typed (i.e., it gets a specific i/o
signature) according to the WSDL. The service, now XML Schema-typed
(and possibly also semantically typed), can then be stored back in the
user library as a "custom actor" (adding a custom icon and a
yet-to-be-defined overall color-coding will make any new custom actor
more recognizable and spiffy too ;-)
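
To make the specialization idea concrete, here is a minimal sketch in
Python (for brevity; Kepler itself is Java). Everything in it is an
assumption made for illustration: CommandSpec, CommandLineActor, the
tool name "seqtool", and the "-name value" option convention are
hypothetical and are not taken from Kepler, Soaplab, or ACD.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CommandSpec:
    """Hypothetical 'WSDL for the command line': a typed tool description."""
    executable: str
    inputs: Dict[str, str]   # parameter name -> type name
    outputs: Dict[str, str]  # result name -> type name


class CommandLineActor:
    """Generic actor: untyped until it is specialized with a CommandSpec."""

    def __init__(self, spec: Optional[CommandSpec] = None) -> None:
        self.spec = spec

    def specialize(self, spec: CommandSpec) -> "CommandLineActor":
        # Return a new, typed actor; the generic one is left unchanged.
        return CommandLineActor(spec)

    def signature(self) -> str:
        if self.spec is None:
            return "untyped"
        ins = ", ".join(f"{n}: {t}" for n, t in self.spec.inputs.items())
        outs = ", ".join(f"{n}: {t}" for n, t in self.spec.outputs.items())
        return f"({ins}) -> ({outs})"

    def build_argv(self, **values: str) -> List[str]:
        # Reject calls that do not match the declared input signature.
        if self.spec is None:
            raise TypeError("actor has not been specialized")
        if set(values) != set(self.spec.inputs):
            raise TypeError(f"expected inputs {sorted(self.spec.inputs)}")
        argv = [self.spec.executable]
        for name in self.spec.inputs:
            argv += [f"-{name}", values[name]]
        return argv


# The user library of "custom actors", keyed by name.
library: Dict[str, CommandLineActor] = {}
spec = CommandSpec("seqtool", {"sequence": "DNAseq"}, {"outfile": "Report"})
library["SeqTool"] = CommandLineActor().specialize(spec)
```

The point of the sketch is only that the generic actor and the typed
one share an implementation; the descriptor alone carries the i/o
signature that gets stored in the library.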

   It turns out that Soaplab folks have already come up with a way to
"standardize" command line tool access -- inherited from EMBOSS
bioinformatics tools:
  http://twiki.mygrid.org.uk/twiki/pub/Mygrid/AllHands2003Local/115.pdf

We should take a close look at Soaplab (http://industry.ebi.ac.uk/soaplab/)
and the underlying ACD/EMBOSS stuff--it might very well provide us
with the desired "WSDL for command line jobs".

   SDSC folks: in addition to the unique opportunity to learn more
about Soaplab from our Link-Up participants (Tom is a co-author of the
above paper), this might also be a good topic for one of our "Kepler
bits" weekly meetings.

(2) A speculation about the "interoperability HackaThonLet"
(another neologism ;-) on Thursday: It seems attractive to try ...

  (a) to "plug-in" a Soaplab service from Taverna into Kepler, and
  (b) to plug-in a Kepler actor into Taverna 

For (b) I'd like to suggest one of the recent SRB actors -- there are
already some folks who are interested in invoking SRB capabilities
from within Taverna (-> Stefan Egglestone). Since Kepler now has such
(JARGON-based) SRB actors, we might kill two birds with one stone.
Similarly for (a): while learning about Taverna, we might as well
learn about Soaplab along the way.

Ilkay, Efrat: 
Maybe you can already link-up with the Taverna folks on this on
Wednesday so things are rolling nicely on Thursday!?

Also, if Yang has some ideas on this, Wednesday is the day to ask her
(since she leaves Wednesday PM).

(3) Another interesting comment made today was that some (web)
services (or command line invocations) are specialized NOT by setting
certain parameters (e.g., via command line switches, or selection of a
WSDL operation), but just by the actual data being supplied. For
example, several NCBI web service operations seem to be very generic
in their parameters, and the actual type of operation cannot be seen
from the signature. Say if you had a service function: 
	SomeAnalysis :: DNAseq -> [DNAseq]
you might be looking at an overloaded/polymorphic family of functions
with the specific operation depending on the type of the input (e.g.,
cDNA vs rDNA vs mDNA etc).

(a) This creates an interesting challenge for the semantic typing of such
services (the type would be a kind of conditional:
 SomeAnalysis :: 
    if in_type = cDNA then out_type = [cDNA], Analysis_type = foo
    if in_type = mDNA then out_type = [mDNA], Analysis_type = bar )

While this situation seems still doable (I wonder what our PL typing
experts call this), things get really nasty with (web) services that
embed their own "hidden" language within a completely generic (and
thus almost useless) signature as follows:

(b)	getInfo :: QueryString -> Result

Yes, such generically typed services do exist...  While handy and
flexible for the practitioner, a (static) type system can no longer do
its job here.

An interesting workaround for (a) and maybe even (b) was mentioned
today: just create a *set* of specialized signatures (a kind of union
type for the function signature), all pointing to the *same* (web)
service implementation. This indeed resembles the way polymorphic
functions deal with the situation (the actual type depends on the
types of the host object and function arguments).
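
A minimal Python sketch of that workaround, under stated assumptions:
the Signature record, the resolve helper, and the placeholder body of
some_analysis are all hypothetical; only the type names (cDNA, mDNA)
and the analysis labels (foo, bar) come from the example in (a).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def some_analysis(seqs: List[str]) -> List[str]:
    """Stand-in for the single generic service; the body is a placeholder."""
    return [s.upper() for s in seqs]


@dataclass(frozen=True)
class Signature:
    in_type: str
    out_type: str
    analysis_type: str
    impl: Callable[[List[str]], List[str]]


# The "union" of specialized signatures: one entry per input type,
# every entry pointing at the same implementation.
SIGNATURES: Dict[str, Signature] = {
    "cDNA": Signature("cDNA", "[cDNA]", "foo", some_analysis),
    "mDNA": Signature("mDNA", "[mDNA]", "bar", some_analysis),
}


def resolve(in_type: str) -> Signature:
    """Pick the specialized signature for a given input type."""
    try:
        return SIGNATURES[in_type]
    except KeyError:
        raise TypeError(f"no specialized signature for {in_type!r}") from None
```

A type checker (or a semantic annotation tool) would then work against
the entries of SIGNATURES rather than against the generic signature,
and an input type without an entry is rejected before the service is
ever called.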

(4) Last but not least, looking at some of the Taverna capabilities
suggests several Kepler extensions to make the system more user
friendly. Among them:

(a) "Control Central": Taverna has a "process table view" that shows
the status of each process (running, waiting?, failed, completed,
...). Through "listen to director" and "listen to actor" and other
means, Ptolemy II should have lots of information available
already. For workflow debugging purposes it would be very useful to
have a table view that can display all the processes in a single
window (with clickable columns to sort by process name, status, type
etc).  This should be particularly useful when having hundreds or
thousands of Grid jobs (that would be monitored by proxy actors).
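
As a rough illustration of the table model behind such a view, here is
a small Python sketch. It does not use any Ptolemy II or Taverna API;
ProcessRow, ProcessTable, and on_status are hypothetical names, and
on_status stands in for whatever listener callback would deliver the
status changes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcessRow:
    name: str
    kind: str
    status: str = "waiting"


class ProcessTable:
    """Collects status events and exposes them as sortable rows."""

    def __init__(self) -> None:
        self._rows: Dict[str, ProcessRow] = {}

    def register(self, name: str, kind: str) -> None:
        self._rows[name] = ProcessRow(name, kind)

    def on_status(self, name: str, status: str) -> None:
        # Called by whatever listener observes the actor or proxy actor.
        self._rows[name].status = status

    def sorted_by(self, column: str, reverse: bool = False) -> List[ProcessRow]:
        # 'column' is one of the ProcessRow field names: name, kind, status.
        return sorted(
            self._rows.values(),
            key=lambda row: getattr(row, column),
            reverse=reverse,
        )

    def count(self, status: str) -> int:
        return sum(1 for row in self._rows.values() if row.status == status)
```

Clicking a column header in the GUI would then map to a call such as
sorted_by("status"), and count("failed") gives the kind of summary one
wants when there are thousands of jobs.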

(b) "Data Central": Taverna has a "tabbed view" of all the data
products (files) being produced by a workflow. This seems to be very
useful as well since many scientific workflows are data centric and
produce lots of different intermediate files. Having all data files
(inputs and outputs) browsable from one place (tabbed view: one tab
per file, or table view: one clickable row per file) would solve some
of the problems with keeping track of many pop-up displays (or an
overly large Kepler/Ptolemy Run-window). 

A simple implementation might be to have a special actor port property
("data view on/off") which is by default on, and which says whether
the tokens on a channel are shown in the "Data Central" view. 
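
The port property could work roughly as in the following Python
sketch. Port, DataCentral, and on_token are hypothetical names chosen
for illustration, not existing Kepler or Ptolemy II classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Port:
    name: str
    data_view: bool = True  # the proposed "data view on/off" property, on by default


class DataCentral:
    """One tab per port; a tab only exists once a visible token has arrived."""

    def __init__(self) -> None:
        self.tabs: Dict[str, List[Any]] = {}

    def on_token(self, port: Port, token: Any) -> None:
        # Tokens on ports with data_view switched off are simply not shown.
        if port.data_view:
            self.tabs.setdefault(port.name, []).append(token)
```

With the default left on, every channel shows up in the view; a user
would switch the property off only for ports that produce noise.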

Bertram 


