[kepler-dev] [Bug 1342] - need R actor

bugzilla-daemon at ecoinformatics.org
Tue Feb 22 15:32:04 PST 2005


http://bugzilla.ecoinformatics.org/show_bug.cgi?id=1342





------- Additional Comments From jones at nceas.ucsb.edu  2005-02-22 15:32 -------
A limited R actor is now working on Windows, Linux, and Mac (with limited
graphical capabilities on Mac because of the lack of jpg support).  It uses the
commandline actor to execute an R script.  However, it requires a custom
workflow to stage the inputs and outputs properly for each script.  We need to
eliminate the need for this per-script 'plumbing' while still supporting inputs
and outputs that vary with each R script.

One approach is to have an interface definition language for R scripts that
provides formal documentation of an R script's inputs, outputs, and function.
This could be very similar in nature to what Chad and I did in Monarch
previously: basically a 'WSDL' for R.  The interface would list each of the
inputs that the script expects, its formal type, and how the script expects to
access it (e.g., from a particular file, or through a particular stream).  It
would do the same for the outputs, listing each output with its type and how it
is delivered (e.g., a file with name '/tmp/graph1.jpg' as output).  The 'R
actor' would then parse this information and expose a dynamic set of ports with
appropriate types, just as the WSDL actor does.  A user could then use an R
script just by having the R actor read this interface description.  The
interface description could even contain the R script itself, which would make
it a self-contained unit that could be transported as an actor (only needing
the 'R' actor to interpret and run it).  This is basically the framework we had
in Monarch.
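
To make the idea concrete, here is a rough sketch of what such an interface
description might look like and how its ports could be read out of it.  The
XML element and attribute names (rscript, input, output, access, location) are
invented for illustration and are not an existing Kepler or Monarch schema.
The sketch is in Python for brevity; a real actor would be written in Java.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical descriptor: every element and attribute name here is an
# assumption made for this example, not a defined format.
DESCRIPTOR = """\
<rscript name="plot_summary">
  <input name="data" type="string" access="file" location="/tmp/input.csv"/>
  <input name="title" type="string" access="stream" location="stdin"/>
  <output name="graph" type="string" access="file" location="/tmp/graph1.jpg"/>
  <script><![CDATA[
d <- read.csv("/tmp/input.csv")
jpeg("/tmp/graph1.jpg")
plot(d)
dev.off()
]]></script>
</rscript>
"""


@dataclass(frozen=True)
class Port:
    name: str
    type: str
    access: str      # how the script reaches the value: "file" or "stream"
    location: str    # file name or stream name
    direction: str   # "input" or "output"


def parse_descriptor(xml_text):
    """Return (script name, list of ports, embedded R source)."""
    root = ET.fromstring(xml_text)
    ports = []
    for direction in ("input", "output"):
        for el in root.findall(direction):
            ports.append(Port(el.get("name"), el.get("type"),
                              el.get("access"), el.get("location"),
                              direction))
    script = root.findtext("script", default="").strip()
    return root.get("name"), ports, script
```

The list of Port records is what the actor would turn into its dynamic input
and output ports; the embedded script is what makes the description
self-contained.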

This system would also enable us to have an 'R import dialog' in which someone
can list the inputs and outputs of an R script, associate them with files or
streams, and paste in the R script.  The dialog would then generate an R
interface definition and allow it to be stored in the library tree as a
specialized R actor.  This interface already partly exists in the 'New actor'
functionality that Chad developed for Kepler, in that it allows the i/o
signature of an actor to be stubbed out.
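
As a sketch of the generation step, the function below turns the values a user
might enter in such a dialog into an interface definition.  The XML vocabulary
and the function name build_descriptor are hypothetical, and the sketch is in
Python for brevity rather than the Java a Kepler dialog would use.

```python
import xml.etree.ElementTree as ET


def build_descriptor(name, inputs, outputs, script):
    """Build a (hypothetical) R interface definition as an XML string.

    inputs and outputs are lists of (name, type, access, location) tuples,
    i.e. the rows a user would fill in within the import dialog.
    """
    root = ET.Element("rscript", {"name": name})
    for tag, entries in (("input", inputs), ("output", outputs)):
        for port_name, port_type, access, location in entries:
            ET.SubElement(root, tag, {
                "name": port_name,
                "type": port_type,
                "access": access,
                "location": location,
            })
    # ElementTree escapes '<' and '&' in the pasted R source on output.
    ET.SubElement(root, "script").text = script
    return ET.tostring(root, encoding="unicode")
```

The resulting string is what would be saved into the library tree as the
specialized R actor.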

In this scenario, the R actor would be rewritten to contain only one
parameter, which is a pointer to the R-script interface file.  From that file it
can get everything it needs to ingest the inputs, run the script (through
commandline or otherwise), and expose the outputs on the output ports with the
proper types.  This is exactly analogous to how the EML and WSDL actors work,
so we have existing code that shows how this can be done.
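
A rough sketch of that flow, under several assumptions: the descriptor uses the
hypothetical XML vocabulary of an rscript element with input, output, and
script children; all inputs are delivered as files; and R is invoked as
'R --vanilla -f <file>', which may differ by R version and platform.  The class
name RActorSketch and its fire method are illustrative only, and the sketch is
in Python for brevity rather than Java.

```python
import subprocess
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


class RActorSketch:
    """Single parameter: the path to the R-script interface file."""

    def __init__(self, descriptor_path, runner=subprocess.run):
        root = ET.parse(descriptor_path).getroot()
        # Map each port name to the file the script reads or writes.
        self.inputs = {e.get("name"): e.get("location")
                       for e in root.findall("input")}
        self.outputs = {e.get("name"): e.get("location")
                        for e in root.findall("output")}
        self.script = root.findtext("script", default="")
        self.runner = runner  # injectable so the flow can be tried without R

    def fire(self, tokens):
        """Stage inputs, run the script, return output name -> location."""
        # Write each incoming value to the file the script expects.
        for name, location in self.inputs.items():
            Path(location).write_text(tokens[name])
        with tempfile.NamedTemporaryFile("w", suffix=".R",
                                         delete=False) as f:
            f.write(self.script)
            script_path = f.name
        try:
            # Assumed invocation; adjust for the local R installation.
            self.runner(["R", "--vanilla", "-f", script_path], check=True)
        finally:
            Path(script_path).unlink()
        return dict(self.outputs)
```

The returned mapping is what the actor would publish on its output ports.
Stream-based inputs and typed outputs are left out of this sketch.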


