[kepler-dev] Kepler Alpha2 Installer Problems

Matt Jones jones at nceas.ucsb.edu
Mon Sep 20 13:16:58 PDT 2004


I think that we ultimately want to be able to move actors (code, binary, 
and metadata) around, not just reference remotely stored MoML.  The 
email I sent the other day described just such a system, where the actor 
code and description (MoML, etc.) are passed in a signed jar file and can 
be loaded dynamically at runtime.  This lets us move code to the data, 
which is very significant for our use cases.
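
To make that concrete, a rough sketch of the loading side in Java might
look like the following.  The repository URL and actor class name are
made-up placeholders, and error handling and the signature check are left
out (see the verification sketch further down for that part):

    import java.net.URL;
    import java.net.URLClassLoader;

    // Sketch only: fetch an actor jar from a repository URL and
    // instantiate the actor class it declares.
    public class RemoteActorLoader {
        public static Object loadActor(String jarLocation, String className)
                throws Exception {
            URL jarUrl = new URL(jarLocation);
            ClassLoader loader = new URLClassLoader(
                    new URL[] { jarUrl },
                    RemoteActorLoader.class.getClassLoader());
            Class actorClass = loader.loadClass(className);
            // The MoML description bundled in the jar would be read here
            // and handed to the MoML parser along with the new actor.
            return actorClass.newInstance();
        }
    }

Something like loadActor("http://repository.example.org/actors/garp.jar",
"org.example.actors.GarpModelActor") would then hand the workflow engine
a live actor instance without that class ever having been on the local
classpath (both names above are invented for illustration).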

Christopher Brooks wrote:
> In principle, you can download .class files over the web and invoke
> those classes.  However, this is a huge security issue.
> It is also best practice to sign the jar file.
> 
> Running the model in the sandbox might be appropriate; see
> $PTII/doc/sandbox.htm
> 
> In Ptolemy Classic, we found that the latency of passing
> tokens between actors running on remote processes was rather high.
> This prevented high-speed simulation.  However, for Kepler
> applications that transmit data only once in a while, transmitting
> tokens might be ok.

Yeah, it's really just a matter of actor granularity.  Many of our cases 
run far longer and involve some very large data tokens, so moving the 
actor and its code to the node where the data resides is going to be 
pretty efficient for many of our cases.  For example, in the 
ecological niche modeling workflow, we want to run a model for 1000 
species, where each species takes hours to run.  Running this on 1000 
(or 250, or 16, etc.) nodes would be an amazing time saver, and 
transferring a few hundred KB or even a few MB of actor code to the 
individual nodes would be an insignificant cost compared to the overall 
computation time.  This is a pretty different scenario from many 
traditional Ptolemy models, which are short-duration simulations.
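
Just to put rough, illustrative numbers on it: if each species run takes
about two hours, 1000 species is on the order of 2000 CPU-hours, or
roughly 8 hours of wall-clock time spread across 250 nodes, while shipping
even a few MB of actor code to each of those nodes takes seconds per node.
The transfer cost really does disappear into the noise.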

The scientists are experimenting with the models themselves and will 
want to substitute new algorithms for old ones, so it's easy to 
see the need for code to propagate out to a series of grid nodes as a 
new actor is developed.  Making such 'pluggable' actors that can be 
dynamically distributed and loaded would be fundamental to solving many 
of our more complex modeling needs.
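
One (purely hypothetical) way to make the "pluggable" part concrete is to
have each actor jar name its actor class in the jar manifest, so that
substituting a new algorithm is just a matter of pointing the workflow at
a different jar.  "Kepler-Actor-Class" below is an invented attribute
name, not something Kepler or Ptolemy defines today:

    import java.util.jar.Attributes;
    import java.util.jar.JarFile;

    // Sketch: read a (hypothetical) Kepler-Actor-Class attribute from an
    // actor jar's manifest so the framework can load the class by name
    // without knowing it in advance.
    public class ActorManifestReader {
        public static String actorClassName(String jarPath) throws Exception {
            JarFile jar = new JarFile(jarPath);
            try {
                Attributes attrs = jar.getManifest().getMainAttributes();
                return attrs.getValue("Kepler-Actor-Class");
            } finally {
                jar.close();
            }
        }
    }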

Security is certainly an issue, but we are also talking about doing this 
in an authenticated grid context where users submitting the jobs are 
authenticated using PKI certificates.  The sandbox would still be 
recommended, but I think we have more latitude to run riskier code in 
an authenticated environment than if just anyone could submit code.  In 
addition, there would at least be an audit trail for any problems that 
arise.
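
For what it's worth, the verification piece is not a lot of code: opening
the jar in verify mode and reading each entry forces the signature check,
and the signer certificates can then be compared against whatever CAs the
grid trusts.  A rough sketch (the trust check and the audit logging are
only hinted at):

    import java.io.InputStream;
    import java.security.cert.Certificate;
    import java.util.Enumeration;
    import java.util.jar.JarEntry;
    import java.util.jar.JarFile;

    // Sketch: require that every class in an actor jar is signed, and
    // surface the signer certificates so they can be checked against the
    // grid's trusted CAs before the code is ever loaded.
    public class ActorJarVerifier {
        public static void verify(String jarPath) throws Exception {
            JarFile jar = new JarFile(jarPath, true);  // true = verify
            byte[] buffer = new byte[8192];
            Enumeration entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = (JarEntry) entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || name.startsWith("META-INF/")) {
                    continue;
                }
                // An entry must be read completely before its certificates
                // are available; a SecurityException while reading means a
                // tampered entry.
                InputStream in = jar.getInputStream(entry);
                while (in.read(buffer) != -1) {
                    // draining the stream just to trigger verification
                }
                in.close();
                Certificate[] certs = entry.getCertificates();
                if (certs == null || certs.length == 0) {
                    throw new SecurityException("Unsigned entry: " + name);
                }
                // ...compare certs against the trusted grid CAs and log
                // the signer identity for the audit trail...
            }
            jar.close();
        }
    }

On top of that, turning on the standard Java sandbox for whatever does get
loaded is just a matter of installing a SecurityManager with a suitably
restrictive policy file, presumably along the lines of what
$PTII/doc/sandbox.htm covers.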

Matt

> 
> -Christopher
> --------
> 
>     Hi All,
>     
>         Shawn's comment and some other discussions bring up some questions 
>     that I thought I would post for comments/discussion.
>     
>         Ptolemy has an example workflow (Network Integration) where the 
>     workflow is a URL rather than a local file. This allows one to post 
>     workflows anywhere on a network. But as I understand MoML, this only 
>     works for composite workflows, i.e., XML/MoML descriptions. Each 
>     workflow, when executed, is ultimately broken down into atomic actors, 
>     written in Java and executed by calling the LOCAL class loader. In other 
>     words, there is no mechanism for using remote atomic actors. Is that 
>     correct?
>     
>         So, do we want to have a remote classloader for Kepler (load a 
>     remote atomic actor into the local system), or do we want to execute 
>     actors remotely and just transfer input and output tokens? (both?/neither?)
>     
>     Dan
>     
>     Shawn Bowers wrote:
>     
>     >
>     >
>     > Something to keep on the back-burner that I think would be really 
>     > useful, and would even fit into SEEK's vision with EcoGrid and the more 
>     > recent conversations concerning a file-management subsystem, would be 
>     > to ship a version of Kepler that doesn't include the many 
>     > actors/workflows that have been developed.
>     >
>     > Instead, there should be some mechanism that would allow a user to 
>     > download, install, and run specific workflows/actors or packages of 
>     > these that they find relevant.
>     >
>     > This would make it much smaller and even streamline much of the build 
>     > process ... There are a ton of applications (even for Java) that have 
>     > this type of "add-in" capability, and I believe in many ways Ptolemy 
>     > already supports this.
>     >
>     > shawn
>     >
>     >
>     >
>     > Matt Jones wrote:
>     >
>     >> Well, I think we need to work on the size of the installer.  We 
>     >> should remove the GARP test data (get it from EcoGrid instead) and 
>     >> trim down the installer as much as possible.  If we can make the 
>     >> installer itself much smaller, then machines with less memory would be 
>     >> better able to handle it.  I think we do need a traditional 
>     >> installer, as the zip/batch file approach is simply too 
>     >> unconventional for our target audience.
>     >>
>     >> Matt
>     >>
>     >> Dan Higgins wrote:
>     >>
>     >>> Hi All,
>     >>>
>     >>>    Mark Schildhauer recently reported problems with the Kepler 
>     >>> Alpha2 installers for Windows platforms. On a machine with 256MB of 
>     >>> RAM and 1-2GB of disk space, he received messages of 'not enough 
>     >>> space available' when running the installers.
>     >>>
>     >>>    Installer tests that I (and others) have done have worked 
>     >>> successfully, but they were done on Windows machines with at least 
>     >>> 10GB of free disk space and 512MB of RAM. Has anyone else 
>     >>> experienced difficulties with the installers (as downloaded from the 
>     >>> Kepler website)?
>     >>>
>     >>>    An alternative which worked for Mark is to use the zip file that 
>     >>> is listed as a Linux installer on the Kepler distribution page. 
>     >>> Expanding this zip file creates a directory image of Kepler that 
>     >>> really is platform independent. Running 'kepler.bat' will start 
>     >>> Kepler on a Windows machine, while running 'kepler.sh' will launch 
>     >>> it on Linux. (This assumes that Java is already installed on your 
>     >>> system.)
>     >>>
>     >>>    Given the large size of the Kepler distribution and problems with 
>     >>> installers, perhaps we should consider just distributing zipped 
>     >>> images and asking that users install Java separately?
>     >>>
>     >>> Dan
>     >>>
>     >>
>     
>     
>     -- 
>     *******************************************************************
>     Dan Higgins                                  higgins at nceas.ucsb.edu
>     http://www.nceas.ucsb.edu/    Ph: 805-892-2531
>     National Center for Ecological Analysis and Synthesis (NCEAS) 
>     735 State Street - Room 205
>     Santa Barbara, CA 93195
>     *******************************************************************
>     
>     _______________________________________________
>     kepler-dev mailing list
>     kepler-dev at ecoinformatics.org
>     http://www.ecoinformatics.org/mailman/listinfo/kepler-dev
> --------
> _______________________________________________
> kepler-dev mailing list
> kepler-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/kepler-dev

-- 
-------------------------------------------------------------------
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------


