[kepler-dev] Thoughts on an "R' Actor
Jing Tao
tao at nceas.ucsb.edu
Thu Jun 10 15:27:43 PDT 2004
Hi, Dan:
I am playing with SAS now; it is a candidate SQL engine for the "data query"
feature in Kepler.
If we can integrate R into Kepler and R can act as a SQL engine, that would
be great. As you mentioned, it is free, unlike SAS. Can we write a
script to load data files into R as relational tables and run SQL queries
against those tables?
Thanks.
Jing
On Thu, 10 Jun 2004, Dan Higgins wrote:
> Date: Thu, 10 Jun 2004 15:04:08 -0700
> From: Dan Higgins <higgins at nceas.ucsb.edu>
> To: kepler-dev at ecoinformatics.org
> Subject: [kepler-dev] Thoughts on an "R' Actor
>
> Hi All,
>
> I have been working on trying to understand some of the details of the R
> system (http://www.r-project.org/) and how it might be integrated into
> Kepler. For those who are unfamiliar with R, "R is a system for
> statistical computation and graphics. It consists of a language plus a
> run-time environment with graphics, a debugger, access to certain system
> functions, and the ability to run programs stored in script files."
> (from the "R FAQ"). R is a powerful system for statistical and other
> calculations. It is comparable to Matlab or SAS but has the advantage of
> being free, easily extended, and available for PCs, Macs (OS X), and
> Unix systems. There are also numerous extensions from a variety of
> sources. It thus appears to be fairly widely accepted and used by
> numerous researchers.
>
> A first-cut on building an R actor would seem to be to use a local
> version of R (since it can be freely installed on almost any computer)
> and run it as a sub-process to Kepler. An obvious method for doing this
> is to use one of the CommandLine/Exec actors.
>
> I say 'one of ...' because there are at least 2 existing actors for
> running arbitrary subprocesses from within Kepler/Ptolemy. The
> "CommandLine" actor can be found in the Kepler graph editor tree
> under "actors/kepler/spa/CommandLine". The author listed in the source
> is Ilkay Altintas, and this actor runs under the 3.0.2 version of
> Ptolemy/Kepler. A second similar actor, called "Exec" is included with
> the Ptolemy 4.0Beta release under "MoreLibraries/Esoteric/Exec". The
> Exec actor was written for Ptolemy 4 by Chris Brooks and (I think) uses
> some new features that are not available in version 3.0.2.
> [Specifically, there is an "Expert Mode" for setting additional parameters.]
>
> Both the CommandLine and Exec actors use the Java 'exec' method to
> launch a subprocess. They differ in the details, however. CommandLine
> actually launches a command processor ('cmd.exe/command.com' on Windows
> and 'sh' on Mac/Linux) so that the command entered by a user is
> essentially identical to that entered in a terminal window to launch a
> process. This can include I/O redirection like "< myfile.in". In the
> Exec actor, the command follows the underlying Java method more closely
> and has ports for input and output streams. The command string cannot
> include redirection. Both actors wait for the subprocess to finish
> before their 'fire' action completes.
>
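The difference between the two launch styles can be sketched in Java as follows. This is only an illustration, not the source of either actor: the class and method names (LaunchStyles, viaShell, direct, run) are hypothetical, and it uses the standard ProcessBuilder class rather than whatever call the actors actually make. The R command line in the note after the code is likewise just an example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not taken from the CommandLine or Exec actor sources.
class LaunchStyles {

    /** CommandLine style: hand the whole string to a command processor,
     *  so shell syntax such as "< myfile.in" is interpreted by the shell. */
    static List<String> viaShell(String userCommand) {
        boolean windows =
            System.getProperty("os.name").toLowerCase().startsWith("windows");
        return windows
            ? Arrays.asList("cmd.exe", "/c", userCommand)
            : Arrays.asList("sh", "-c", userCommand);
    }

    /** Exec style: program name plus arguments, no shell in between.
     *  A "<" here would be passed to the program as a literal argument.
     *  (Naive whitespace split; quoted arguments are not handled.) */
    static List<String> direct(String userCommand) {
        return Arrays.asList(userCommand.trim().split("\\s+"));
    }

    /** Starts the process, waits for it to finish, returns its output. */
    static String run(List<String> command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
            .redirectErrorStream(true)   // merge stderr into stdout
            .start();
        byte[] out = p.getInputStream().readAllBytes();   // needs Java 9+
        p.waitFor();
        return new String(out, StandardCharsets.UTF_8);
    }
}
```

With this sketch, something like run(viaShell("R --vanilla < myscript.R")) would behave like typing the command in a terminal, while direct(...) only makes sense for commands that contain no redirection.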
> Now consider just how we might integrate R into Kepler. R can be run in
> an interactive mode (start up; type a command; see response; type
> another command) or in a batch mode (start R with a script file which
> has a series of commands and write the results to an output file).
> Creating an R workflow in the batch mode is fairly easy. A screen shot
> of a workflow which uses the CommandLine actor to run R to create a JPEG
> plot and then display it is shown below.
>
> The script file used in the example is:
>
> x <- seq(-10, 10, length = 50)
> y <- x
> rotsinc <- function(x, y) {
>     sinc <- function(x) {
>         y <- sin(x)/x
>         y[is.na(y)] <- 1
>         y
>     }
>     10 * sinc(sqrt(x^2 + y^2))
> }
> sinc.exp <- expression(z == Sinc(sqrt(x^2 + y^2)))
> z <- outer(x, y, rotsinc)
> jpeg(filename = "RTest.jpg", width = 480, height = 480, pointsize = 12,
>      quality = 75, bg = "white")
> par(bg = "white")
> persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue")
>
> It can be seen in this batch approach that one can get the results from
> an R calculation from the output stream or from a file created by R that
> is then read by other Kepler actors. A problem comes up, however, if one
> considers how to dynamically input instructions/data to R. In batch
> mode, this could require the dynamic creation of script files, although
> it would be nicer if ports for inputting data/instructions existed for an
> R actor. One thus has the question of how to import information from
> other parts of a workflow to an R actor.
>
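One way the "dynamic creation of script files" could look is sketched below. It assumes the upstream values arrive as named numeric vectors. The names RScriptBuilder, render and writeTemp are hypothetical, as is the "kepler-r-" file prefix; nothing here is an existing Kepler API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper that turns workflow inputs into an R batch script.
class RScriptBuilder {

    /** Writes each named vector as an R assignment ("x <- c(...)"),
     *  then appends the fixed script body that uses those variables.
     *  Caveat: infinite values print as "Infinity", which R does not
     *  parse; a real actor would have to map them to Inf. */
    static String render(Map<String, double[]> inputs, String body) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, double[]> e : inputs.entrySet()) {
            sb.append(e.getKey()).append(" <- c(");
            double[] v = e.getValue();
            for (int i = 0; i < v.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(v[i]);
            }
            sb.append(")\n");
        }
        return sb.append(body).append('\n').toString();
    }

    /** Saves the script to a temporary file that a batch run can read. */
    static Path writeTemp(String script) throws IOException {
        Path file = Files.createTempFile("kepler-r-", ".R");
        Files.write(file, script.getBytes(StandardCharsets.UTF_8));
        return file;
    }
}
```

The path returned by writeTemp could then be handed to the CommandLine actor as the file that R reads its commands from.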
> And what about using R in an interactive mode? Both the CommandLine
> actor and the Exec actor start a subprocess and then wait for it to
> finish. This means that the R code is loaded, executed, and then removed
> from memory. For an interactive environment (or for the case where the
> R calculation is repeatedly executed), it would be desirable to load R
> only once! There doesn't seem to be any reason why the R process has to
> be stopped between firings. One could keep the process in memory (a
> static variable?) and simply read the input stream, execute it, write
> the output to the output stream, and then wait for the next input as
> part of a fire event. [Or perhaps there needs to be some class level R
> actor and a set of instances that do certain calculations by
> communicating with the class actor???]
>
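A rough sketch of the "keep the process in memory" idea: start the child once, and on each firing write commands to its input stream and read its output up to a marker line. All names here (PersistentProcess, fire, the "__DONE__" marker) are hypothetical. The sketch also assumes the child flushes its output after each command and that the marker never appears in normal output, neither of which has been checked against R.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical wrapper around a long-lived subprocess.
class PersistentProcess implements AutoCloseable {
    private final Process process;
    private final BufferedWriter toChild;
    private final BufferedReader fromChild;
    private final String sentinelCommand;  // command that prints the marker
    private final String sentinel;         // the marker line itself

    PersistentProcess(List<String> command, String sentinelCommand,
                      String sentinel) throws IOException {
        this.process = new ProcessBuilder(command)
            .redirectErrorStream(true)     // merge stderr into stdout
            .start();
        this.toChild = new BufferedWriter(new OutputStreamWriter(
            process.getOutputStream(), StandardCharsets.UTF_8));
        this.fromChild = new BufferedReader(new InputStreamReader(
            process.getInputStream(), StandardCharsets.UTF_8));
        this.sentinelCommand = sentinelCommand;
        this.sentinel = sentinel;
    }

    /** One firing: send the commands, then collect every output line
     *  until the marker shows up. The child keeps running afterwards,
     *  so variables defined in one firing are visible in the next. */
    synchronized String fire(String commands) throws IOException {
        toChild.write(commands);
        toChild.newLine();
        toChild.write(sentinelCommand);
        toChild.newLine();
        toChild.flush();
        StringBuilder out = new StringBuilder();
        String line;
        while ((line = fromChild.readLine()) != null
                && !line.equals(sentinel)) {
            out.append(line).append('\n');
        }
        return out.toString();
    }

    @Override
    public void close() throws IOException {
        toChild.close();     // end of input lets the child exit
        process.destroy();   // make sure it is gone either way
    }
}
```

For R, the command list might be something like "R", "--vanilla" with cat("__DONE__\n") as the marker command. If R echoes its input, the echoed commands would show up in the returned text as well.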
> In any case, it is possible to simulate an interactive R session using
> save/load workspace options when starting and ending an R session. But
> it would be useful if the CommandLine actor had an 'inport' port to
> receive commands. Also, it might be useful if the Exec actor really had
> input and output streams instead of the String tokens currently used (to
> handle long inputs).
>
> That ends these semi-random thoughts for now.
>
> Any comments or suggestions?
>
> Dan
>
>
--
Jing Tao
National Center for Ecological
Analysis and Synthesis (NCEAS)
735 State St. Suite 204
Santa Barbara, CA 93101