[kepler-dev] Thoughts on an "R" Actor

Jing Tao tao at nceas.ucsb.edu
Thu Jun 10 15:27:43 PDT 2004


Hi, Dan:

I am playing with SAS now; it is a candidate SQL engine for the "data
query" feature in Kepler.

If we integrate R into Kepler and R can act as a SQL engine, that would
be great. As you mentioned, it is free compared to SAS. Can we write a
script to load data files into R as relational tables and run SQL
queries against those tables?

Thanks.

Jing

On Thu, 10 Jun 2004, Dan Higgins wrote:

> Date: Thu, 10 Jun 2004 15:04:08 -0700
> From: Dan Higgins <higgins at nceas.ucsb.edu>
> To: kepler-dev at ecoinformatics.org
> Subject: [kepler-dev] Thoughts on an "R" Actor
> 
> Hi All,
> 
> I have been working on trying to understand some of the details of the R 
> system (http://www.r-project.org/) and how it might be integrated into 
> Kepler. For those who are unfamiliar with R, "R is a system for 
> statistical computation and graphics. It consists of a language plus a 
> run-time environment with graphics, a debugger, access to certain system 
> functions, and the ability to run programs stored in script files." 
> (from the "R FAQ"). R is a powerful system for statistical and other 
> calculations. It is comparable to Matlab or SAS but has the advantage of 
> being free, easily extended, and available for PCs, Macs (OS X), and 
> Unix systems. There are also numerous extensions from a variety of 
> sources. It thus appears to be fairly widely accepted and used by 
> numerous researchers.
> 
> A first-cut on building an R actor would seem to be to use a local 
> version of R (since it can be freely installed on almost any computer) 
> and run it as a sub-process to Kepler. An obvious method for doing this 
> is to use one of the CommandLine/Exec actors.
> 
> I say 'one of ...' because there are at least 2 existing actors for 
> running arbitrary subprocesses from within Kepler/Ptolemy. The 
> "CommandLine" actor can be found in the Kepler graph editor tree 
> under "actors/kepler/spa/CommandLine". The author listed in the source 
> is Ilkay Altintas, and this actor runs under the 3.0.2 version of 
> Ptolemy/Kepler. A second similar actor, called "Exec", is included with 
> the Ptolemy 4.0Beta release under "MoreLibraries/Esoteric/Exec". The 
> Exec actor was written for Ptolemy 4 by Chris Brooks and (I think) uses 
> some new features that are not available in version 3.0.2. 
> [Specifically, there is an "Expert Mode" for setting additional parameters.]
> 
> Both the CommandLine and Exec actors use the Java 'exec' method to 
> launch a subprocess. They differ in the details, however. CommandLine 
> actually launches a command processor ('cmd.exe' or 'command.com' on 
> Windows and 'sh' on Mac/Linux) so that the command entered by a user is 
> essentially identical to that entered in a terminal window to launch a 
> process. This can include I/O redirection like "< myfile.in". In the 
> Exec actor, the command follows the underlying Java method more closely 
> and has ports for input and output streams. The command string cannot 
> include redirection. Both actors wait for the subprocess to finish 
> before their 'fire' action completes.
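
The difference between the two launch styles can be sketched in plain Java. This is only an illustration under stated assumptions, not the source of either actor: the class name LaunchStyles and both method names are made up here, a Unix-like system with "sh" and "cat" on the PATH is assumed, and java.lang.ProcessBuilder stands in for whatever each actor calls internally. The first method hands the whole command line to a shell (so redirection and pipes work); the second starts the program directly and feeds its standard input from a string, roughly as a port would.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

class LaunchStyles {

    // CommandLine style: the whole line is interpreted by a shell, so
    // redirection such as "< myfile.in" and pipes are understood.
    // Assumes a Unix-like system with "sh" on the PATH.
    public static String runViaShell(String commandLine)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("sh", "-c", commandLine);
        pb.redirectErrorStream(true);        // merge stderr into stdout
        Process p = pb.start();
        p.getOutputStream().close();         // nothing to send on stdin
        String out = new String(p.getInputStream().readAllBytes(),
                                StandardCharsets.UTF_8);
        p.waitFor();                         // block until the process exits
        return out;
    }

    // Exec style: the program is started directly from an argument list
    // (no shell, so no redirection syntax); input arrives as a string.
    // Writing all input before reading is fine for short inputs only;
    // long inputs would need a separate writer thread.
    public static String runDirect(List<String> argv, String stdin)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(argv);
        pb.redirectErrorStream(true);
        Process p = pb.start();
        try (OutputStream os = p.getOutputStream()) {
            os.write(stdin.getBytes(StandardCharsets.UTF_8));
        }                                    // closing stdin signals end of input
        String out = new String(p.getInputStream().readAllBytes(),
                                StandardCharsets.UTF_8);
        p.waitFor();
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runViaShell("echo hello | tr a-z A-Z"));
        System.out.print(runDirect(List.of("cat"), "passed on stdin\n"));
    }
}
```

Both methods block until the subprocess has finished, which matches the behaviour described for the two actors.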
> 
> Now consider just how we might integrate R into Kepler. R can be run in 
> an interactive mode (start up; type a command; see response; type 
> another command) or in a batch mode (start R with a script file which 
> has a series of commands and write the results to an output file). 
> Creating an R workflow in the batch mode is fairly easy. A screen shot 
> of a workflow which uses the CommandLine actor to run R to create a jpeg 
> plot and then display it is shown below.
> 
> [Screen shot of the workflow is not included in this text.]
> 
> The script file used in the example is:
> 
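> # Grid of x and y values over which the surface is evaluated
> x <- seq(-10, 10, length = 50)
> y <- x
> # Radially symmetric sinc surface: 10 * sin(r)/r with r = sqrt(x^2 + y^2)
> rotsinc <- function(x, y) {
>     sinc <- function(x) {
>         y <- sin(x)/x
>         y[is.na(y)] <- 1   # sin(0)/0 is NaN; the limit at 0 is 1
>         y
>     }
>     10 * sinc(sqrt(x^2 + y^2))
> }
> sinc.exp <- expression(z == Sinc(sqrt(x^2 + y^2)))
> z <- outer(x, y, rotsinc)
> # Send the plot to a JPEG file instead of the screen
> jpeg(filename = "RTest.jpg", width = 480, height = 480, pointsize = 12,
>      quality = 75, bg = "white")
> par(bg = "white")
> persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue")
> dev.off()   # close the device so the JPEG file is written out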
> 
> It can be seen in this batch approach that one can get the results from 
> an R calculation from the output stream or from a file created by R that 
> is then read by other Kepler actors. A problem comes up, however, if one 
> considers how to dynamically input instructions/data to R. In batch 
> mode, this could require the dynamic creation of script files, although 
> it would be nicer if ports for inputting data/instructions existed for an 
> R actor. One thus has the question of how to import information from 
> other parts of a workflow to an R actor.
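
As a rough idea of what "dynamic creation of script files" could look like, here is a minimal Java sketch. Everything in it is hypothetical: the class RScriptBuilder and its methods are invented for illustration and are not part of Kepler or Ptolemy. It turns numeric values arriving from elsewhere in a workflow into R assignments, writes them in front of a fixed script body in a temporary file, and builds an "R CMD BATCH" command line for that file. It assumes finite numeric values and that R is on the PATH when the command is eventually run.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.StringJoiner;

class RScriptBuilder {

    // Render a numeric vector as an R assignment such as: x <- c(1.0, 2.5)
    // Assumes finite values; Java prints infinities in a form that is
    // not valid R.
    public static String vectorAssignment(String name, double[] values) {
        StringJoiner sj = new StringJoiner(", ", name + " <- c(", ")");
        for (double v : values) {
            sj.add(Double.toString(v));
        }
        return sj.toString();
    }

    // Write the generated assignments, followed by the fixed script body,
    // to a temporary file and return its path.
    public static Path writeScript(List<String> assignments, String body)
            throws IOException {
        Path script = Files.createTempFile("kepler-r-", ".R");
        String text = String.join("\n", assignments) + "\n" + body + "\n";
        Files.write(script, text.getBytes(StandardCharsets.UTF_8));
        return script;
    }

    // Argument list for a batch run: R CMD BATCH [options] infile outfile
    public static List<String> batchCommand(Path script, Path output) {
        return List.of("R", "CMD", "BATCH", "--vanilla",
                       script.toString(), output.toString());
    }

    public static void main(String[] args) throws IOException {
        Path script = writeScript(
                List.of(vectorAssignment("x", new double[] {1, 2.5, 4})),
                "print(mean(x))");
        System.out.println(batchCommand(script, Path.of("result.Rout")));
    }
}
```

The resulting argument list could be passed to a subprocess launcher; the workflow would then read the output file once R has finished.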
> 
> And what about using R in an interactive mode? Both the CommandLine 
> actor and the Exec actor start a subprocess and then wait for it to 
> finish. This means that the R code is loaded, executed, and then removed 
> from memory. For an interactive environment (or for the case where the 
> R calculation is repeatedly executed), it would be desirable to only 
> load R once! There doesn't seem to be any reason why the R process has to 
> be stopped between firings. One could keep the process in memory (a 
> static variable?) and simply read the input stream, execute it, write 
> the output to the output stream, and then wait for the next input as 
> part of a fire event. [Or perhaps there needs to be some class level R 
> actor and a set of instances that do certain calculations by 
> communicating with the class actor???]
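
The "load once, fire many times" idea might look roughly like the following Java sketch. The class PersistentSession, its constructor arguments, and the marker convention are all assumptions made for illustration; nothing like this is claimed to exist in Kepler. The process is started once and kept alive. Each send() writes a command followed by a second command that prints a marker line, then reads output until the marker appears. The demo drives "sh" so that it runs without R installed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

class PersistentSession implements AutoCloseable {
    private final Process process;
    private final BufferedWriter toChild;
    private final BufferedReader fromChild;
    private final String markerCommand; // makes the child print the marker
    private final String marker;        // line meaning "request finished"

    public PersistentSession(List<String> argv, String markerCommand,
                             String marker) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(argv);
        pb.redirectErrorStream(true);   // merge stderr into stdout
        this.process = pb.start();      // started once, reused by send()
        this.toChild = new BufferedWriter(new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8));
        this.fromChild = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8));
        this.markerCommand = markerCommand;
        this.marker = marker;
    }

    // One "firing": send a command, then collect output up to the marker.
    // Assumes the command's own output ends with a newline, so that the
    // marker appears on a line by itself.
    public synchronized String send(String command) throws IOException {
        toChild.write(command + "\n" + markerCommand + "\n");
        toChild.flush();
        StringBuilder out = new StringBuilder();
        String line;
        while ((line = fromChild.readLine()) != null) {
            if (line.equals(marker)) {
                return out.toString();
            }
            out.append(line).append('\n');
        }
        throw new IOException("child exited before printing the marker");
    }

    @Override
    public void close() throws Exception {
        toChild.close();                // end of input stops the interpreter
        process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        try (PersistentSession s = new PersistentSession(
                List.of("sh"), "echo __DONE__", "__DONE__")) {
            s.send("X=5");                       // state set in one firing
            System.out.print(s.send("echo $X")); // is visible in the next
        }
    }
}
```

For R, one would presumably start something like "R --vanilla --quiet" and use cat("__DONE__\n") as the marker command. That combination is untested here, and whether R flushes its output promptly when writing to a pipe would need to be checked.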
> 
> In any case, it is possible to simulate an interactive R session using 
> save/load workspace options when starting and ending an R session. But 
> it would be useful if the CommandLine actor had an 'input' port to 
> receive commands. Also, it might be useful if the Exec actor really had 
> input and output streams instead of the String tokens currently used (to 
> handle long inputs).
> 
> That ends these semi-random thoughts for now.
> 
> Any comments or suggestions?
> 
> Dan
> 
> 

-- 
Jing Tao
National Center for Ecological
Analysis and Synthesis (NCEAS)
735 State St. Suite 204
Santa Barbara, CA 93101



