[kepler-dev] autowrapping scientific code

Tue Jan 13 13:38:16 PST 2004

Chad,

I've been thinking about and looking into the problem of wrapping 
existing simulation models in Kepler, given your experience wrapping 
GARP for use there.  It seems that there are two components we need to 
deal with: 1) compiling the code (C/C++ usually) on the right platform 
and getting it to run, and 2) refactoring/splitting the code so that it 
makes sense from a component reuse perspective.

For (1), it seems like we should be able to extend your Java actor 
skeleton tool to link to existing code, compile that code dynamically, 
and make the actor available in the workflow GUI.  There's an 
interesting paper on this from the Triana team 
(http://trianacode.org/triana/papers/pdf/MedliHIPS2003.pdf) that I think 
you should read if you haven't.  All of the Triana papers listed on 
their site are relevant to us.  If we can accomplish this, then we would 
be much more able to create workflows where the computation can move to 
the data (by sending code that gets compiled on the grid node where the 
computation should run).  Seems like the autoconf approach to 
portability would help us a lot, so code that is autoconf enabled (if 
any ecologists use it!) should compile much more smoothly than GARP did.
Of course, there's still a bunch of data passing/port communication 
issues in this model.  Regardless, we need for people to be able to use 
their existing models in Kepler without a month of compiling and 
integration work.  That'll be a challenge given what you've experienced 
with GARP, but it is necessary.
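
To make (1) a bit more concrete, here is a minimal sketch of the build 
step, assuming the model ships a standard autoconf build (configure, 
make, make install).  It is in Python only for brevity; the function 
names build_steps and build_model are hypothetical and are not part of 
Kepler or of the actor skeleton tool.

```python
import subprocess


def build_steps(install_prefix):
    """Return the conventional autoconf build commands, in order.

    Assumes the source tree has a generated ./configure script.
    """
    return [
        ["./configure", f"--prefix={install_prefix}"],
        ["make"],
        ["make", "install"],
    ]


def build_model(source_dir, install_prefix):
    """Run each build step inside source_dir, stopping at the first failure.

    check=True raises CalledProcessError if a step exits non-zero.
    """
    for cmd in build_steps(install_prefix):
        subprocess.run(cmd, cwd=source_dir, check=True)
```

On a grid node, something like build_model("/path/to/model-src", 
"/path/to/install") would run before the actor is first fired.  This 
only covers compilation; the data passing/port issues are untouched.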

For (2) the problem is much harder.  For a monolithic model like GARP, 
we need to figure out how to factor out the data preprocessing from the 
actual algorithm, so that people can use the same preprocessing steps 
and substitute the interesting computational bits in the workflow.  
Since ecologists tend not to use modular or OO techniques when writing 
their models, this is likely to be very challenging.  I don't have any good 
solutions in mind right off, but we should probably start reading up on 
the literature.  Maybe the Ptolemy group at Berkeley has some insight 
into this from their code generation work.
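
As a toy illustration of the factoring I mean in (2), the sketch below 
keeps one shared preprocessing step and lets the algorithm be swapped.  
Everything in it is made up for illustration (the names, the cleaning 
rule, and the two stand-in algorithms); it says nothing about how GARP 
is actually structured.

```python
def preprocess(records):
    """Shared preprocessing: drop records that contain a missing value."""
    return [r for r in records if None not in r]


def run_workflow(records, algorithm):
    """Apply the shared preprocessing, then whichever algorithm is plugged in."""
    return algorithm(preprocess(records))


def mean_of_first_column(rows):
    """Stand-in algorithm: average of the first field."""
    return sum(r[0] for r in rows) / len(rows)


def count_rows(rows):
    """Stand-in algorithm: number of usable records."""
    return len(rows)


# Same preprocessing, two different computational steps.
data = [(1.0, 2.0), (None, 3.0), (3.0, 4.0)]
mean_result = run_workflow(data, mean_of_first_column)
count_result = run_workflow(data, count_rows)
```

The hard part with a monolithic model is finding where that seam 
between preprocess and algorithm actually lies in the existing code.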

Thought I'd bring this up for you to think about some.  Could you write 
up a brief summary of the difficulties in getting GARP to work on Linux 
and Windows in Kepler, and maybe speculate which of these issues might 
be general problems we'll encounter a lot, and which are idiosyncratic 
to GARP?  Just a summary would be useful.  We can chat about it later.

Matt
-- 
-------------------------------------------------------------------
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------