[kepler-dev] where do actor packages meant to work with Kepler 1.0 go in the repository?

Timothy McPhillips tmcphillips at mac.com
Wed Jun 25 19:59:17 PDT 2008


Dear All,

With the Kepler 1.0 release out and the migration to svn complete, I
am about to begin migrating into the Kepler repository some code
we've been developing here at UCD.  Some of these checkins will be  
relatively straightforward, but other code will present some  
integration, source code management, and build system challenges I'd  
really like to get everyone's input and help with.


*** Upcoming checkins ***

1.  First, I will be committing a new version of the COMAD framework  
to src/org/kepler/comad.  The current version in Kepler is included  
under src/org/nddp.  The major difference with the new version is  
that it includes comprehensive support for recording provenance from  
COMAD workflows and storing the results as provenance-annotated  
'trace' files.

2.  Second, we would like to begin committing the code specific to  
the pPOD extension to Kepler (phylogenetics actors and data types for  
the AToL community).  This integration will be trickier.  Please see  
"complications" below.

3.  Third, when the COMAD and pPOD sources have been successfully  
integrated with the main Kepler system, I will delete the code under  
src/org/nddp, this package having been superseded by the two sets of  
checkins above.


*** Complications ***

Ok, here's the conundrum.  The pPOD extension is mostly a set of  
actors for phylogenetics analyses.  Prior to the Kepler 1.0 release  
we provided a preview release of the pPOD actors (based on the COMAD  
framework) to the AToL community (see a poster and presentation about  
the Kepler/ppod preview).  This was not meant for production use, but  
rather as a way to get feedback from the community on our work.   
However, in the very near future we would like to release this code  
as a package of actors and community-specific customizations to the  
Kepler GUI.  (Note that this release will not be targeted only at an
abstract community of folks who might possibly try it out, but also  
at particular scientists who have asked for specific workflows based  
on the pPOD package and who need them very badly for their own  
research.)

As you might expect, we do *not* want to:

N1.  Wait for the next release of Kepler before distributing these  
actors.

N2.  Release a package of actors that has been tested only against   
the revision of Kepler that we happen to find at the trunk of the  
Kepler repository on a particular day.

Instead, we want to (read 'must') release a package of actors that  
works with Kepler 1.0.  When another release of Kepler is made  
(Kepler 1.1, say), we will need to release a version of the pPOD  
actors that works with *that* version of Kepler.

There are a number of questions implied by this need to release a
package of actors designed to work with a particular, supported  
release of Kepler:

Q1. How do we build the pPOD actors against Kepler 1.0 rather than  
against the current revision in the repository?

Q2.  Where should we put the pPOD actors and support classes in the
Kepler repository directory structure?  If we put them in, say,
src/org/ppod (under kepler), then all of this code will be seen by the
build system and all kepler developers will find themselves building  
code that is not necessarily meant to be compiled against the current  
revision of Kepler in the repository.  If any classes the pPOD  
package depends on elsewhere in Kepler are renamed, moved, removed,  
or changed significantly, the pPOD code will no longer build against  
the current revision of Kepler in the repository.  The build will break.

A further complication is that once the 1.0-compatible version of the  
pPOD package is out, we likely will want to begin building against  
the latest revision of Kepler in the repository until the next  
version of Kepler is out, at which point we'll want to build against  
*that* supported version of Kepler.

In short, we need to be able to choose which revision of Kepler in  
the repository we build the pPOD package against, without disturbing  
folks working with the current revision of Kepler.


*** Solutions? ***

S1.  On the surface this problem sounds like one that could be solved  
with branches.  And I believe it possibly could, and that branches  
likely will be involved in some way (for example, we will still want  
to provide bug fixes to the 1.0-compatible version of the pPOD actors  
when we're working on the 1.1-compatible release).

However, is this how we want to treat the Kepler 1.0 release branch  
in the repository?  Do we want to be checking hundreds of source  
files into the Kepler 1.0 release branch indefinitely, and almost  
certainly introducing new bugs and instabilities there?

S2.  An alternative would be to create a separate source tree for  
pPOD in the svn repository (as a peer to kepler, for example).  The  
build system would not include this 'extension' when compiling the  
Kepler java sources.  Instead, we might build the pPOD package  
against the Kepler 1.0 release jar itself.  This approach might scale
better, and it would let everyone who provides 1.0-compatible actors
to their users share that work with the community through the Kepler
repository without requiring that the code work with the current
revision of Kepler.

The problem with this second approach is that it might not be so easy  
to build against the current revision of Kepler, should one choose to  
do so (and as we likely will between releases of the pPOD package).   
At least, not without building the Kepler jar each time.
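
To make S2 concrete, here is a rough sketch of what a standalone Ant
build for an extension might look like.  Everything in it is an
assumption made for illustration (the project layout, the property
names, and the jar file name lib/kepler-1.0.jar); it is not taken from
the actual Kepler build.xml.

```xml
<!-- Hypothetical build.xml for a pPOD extension stored as a peer to kepler. -->
<!-- All property names, paths, and the jar file name are assumptions. -->
<project name="ppod" default="jar" basedir=".">

  <!-- The Kepler jar to compile against.  Ant properties are immutable,
       so a value given on the command line (-Dkepler.jar=...) wins. -->
  <property name="kepler.jar" location="lib/kepler-1.0.jar"/>
  <property name="src.dir"    location="src"/>
  <property name="build.dir"  location="build/classes"/>

  <target name="compile">
    <mkdir dir="${build.dir}"/>
    <!-- Only the chosen Kepler jar is on the classpath, not the trunk sources. -->
    <javac srcdir="${src.dir}" destdir="${build.dir}">
      <classpath>
        <pathelement location="${kepler.jar}"/>
      </classpath>
    </javac>
  </target>

  <target name="jar" depends="compile">
    <jar destfile="build/ppod.jar" basedir="${build.dir}"/>
  </target>
</project>
```

Building against a different Kepler would then be a matter of pointing
kepler.jar at another jar, for example one freshly built from the
current revision, which is exactly the extra step noted above.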

S3.  A variant of S2 would be to enable developers to tell the build  
system to optionally include extensions in the build.  With the  
extensions stored as peers to the kepler directory in the repository,  
developers could choose which extensions to check out, and specify  
different branches or tags for Kepler and for each extension checked  
out.
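
As an illustration of S3 only, an optional extension could be wired
into an Ant build roughly as follows.  This is a minimal sketch under
assumed names (the ../ppod checkout location, the ppod.present
property, and all target names are hypothetical) and should not be
read as the contents of any existing build.xml.

```xml
<!-- Hypothetical sketch only: not Kepler's actual build.xml. -->
<project name="kepler-with-extensions" default="compile-all" basedir=".">

  <property name="src.dir"   location="src"/>
  <property name="build.dir" location="build/classes"/>
  <!-- Where an optional extension working copy would be checked out,
       as a peer to this directory (assumed layout). -->
  <property name="ppod.dir"  location="../ppod"/>

  <target name="compile">
    <mkdir dir="${build.dir}"/>
    <javac srcdir="${src.dir}" destdir="${build.dir}"/>
  </target>

  <!-- Sets ppod.present only if the extension sources are on disk. -->
  <target name="check-ppod">
    <available file="${ppod.dir}/src" type="dir" property="ppod.present"/>
  </target>

  <!-- Skipped when ppod.present is unset; the targets it depends on still run. -->
  <target name="compile-ppod" depends="compile, check-ppod" if="ppod.present">
    <mkdir dir="build/ppod-classes"/>
    <javac srcdir="${ppod.dir}/src" destdir="build/ppod-classes">
      <classpath>
        <pathelement location="${build.dir}"/>
      </classpath>
    </javac>
  </target>

  <target name="compile-all" depends="compile-ppod"/>
</project>
```

With a layout like this, developers who never check out the extension
get a plain Kepler build, while those who do can pick the branch or tag
of each working copy independently when they check it out.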

This final approach is the one we currently use at UCD.  We store our  
Kepler extensions in our own repository at UCD and use a tweaked  
version of Kepler's standard build.xml file that supports extensions  
in this way. One nice feature of this approach is that it does not  
require all extensions to be stored in the same repository.  The  
problem with our current approach is that it is harder for us to  
share our extensions with the rest of the community.  We'd much  
rather use the Kepler repository, but at the moment we can't, as
explained above.  In short...


*** Help! ***

Does anyone else have similar needs?  Does anyone have any thoughts  
or advice?  Are there other solutions?  The UCD team needs a solution  
soon because we want to share our source code widely with others in  
the community, *and* we need to make a release of the pPOD actors (as  
well as a number of other extensions we're currently working on  
outside the Kepler repository for the same reasons) to users  
(including people working here at the UCD Genome Center) in the very  
near future (a few weeks from now).

Thanks very much for your thoughts and help!

Tim
