[kepler-dev] Kepler startup times

Thu Jun 7 15:13:28 PDT 2007

Hi all,

    In a previous email I outlined some issues with Kepler startup times 
and gave some times for code inside the KSWLibraryBuilder class where 
KAR files get cached and the ActorLibrary is built. In summary, some 
times given there were

Current KSWLibraryBuilder.buildLibrary   - 56.0 sec (coldStart with no cache)

Current KSWLibraryBuilder.buildLibrary   - 13.0 sec (warmStart with cache)

Further review of the code shows some interesting things.

When there is no cache, each of the KAR files is unzipped and an 
ActorCacheObject is created and added to Kepler's cache. The code for 
doing this reads the actor's MOML inside the KAR (actually it uses the 
MOML parser to read it). But it then writes it to the cache by 
converting the Java/Ptolemy objects back to MOML XML strings! Since the 
MOML parser also calls a ClassLoader to load the Java classes it finds, 
all the classes for all the actors get loaded then too. But there is no 
need to do it then since we just end up writing the MOML to the cache.
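
To make the cold-start idea concrete, here is a minimal sketch of reading the MOML text out of a KAR as plain bytes, with no MOML parser and therefore no class loading. The class name RawMomlReader and the rule that actor MOML lives in entries ending in ".xml" are assumptions for illustration, not the existing Kepler code, and it uses Java 7+ syntax.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: copies MOML text out of a KAR (zip) stream untouched. */
public class RawMomlReader {

    /** Returns entry name -> raw MOML text for every ".xml" entry (assumed convention). */
    public static Map<String, String> readMomlEntries(InputStream karStream) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(karStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Skip directories and anything that is not an XML entry (e.g. the manifest).
                if (entry.isDirectory() || !entry.getName().endsWith(".xml")) {
                    continue;
                }
                // Read the entry as bytes; the XML is never interpreted here.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                result.put(entry.getName(),
                        new String(buffer.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return result;
    }
}
```

The returned strings could be written to the cache as they are, which is the point: nothing in this path needs the actor classes to be on hand.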

The other time-consuming action is building the actual ActorLibrary. 
This step reads the ActorCacheObjects from the cache and converts their 
MOML strings to ActorMetadata objects, which contain various information 
from the MOML and a Ptolemy entity that ends up in the actor tree. This 
also requires applying Ptolemy's MOML parser. So all the classes for all 
the actors get loaded here, even if we avoid loading them when the KAR 
files are read!

I was pretty much convinced that a lot of our startup time is due to 
loading of class files for all the actors. I thus tried modifying both 
the reading of the KAR files and the creation of the actor tree to not 
require applying the MOML parser. Instead, I first modified the 
ActorCacheObject so that it just directly saved the MOML XML to the 
cache. It also retrieves its name, id, and semantic info from the MOML 
XML file. It then builds a Component Entity for the actor tree model 
that just has the name and id rather than all of the Actor information 
stored in the ActorCacheObject's MOML. [This avoids ever creating an 
ActorMetadata object.] I have an (offline) version of Kepler that works 
well enough to see what might be gained by this approach. ['Well enough' 
will be discussed below.]
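
As an illustration of pulling the name and id out of the MOML without the MOML parser, here is a rough sketch using the generic XML parser that ships with Java. It assumes the actor id is stored in a top-level property named "entityId"; that property name, the class MomlSummary, and the sample LSID in the usage below are assumptions for the example, not the code in my offline version.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical holder for the two fields the actor tree needs. */
public class MomlSummary {
    public final String name;
    public final String id; // null if no "entityId" property was found

    private MomlSummary(String name, String id) {
        this.name = name;
        this.id = id;
    }

    /** Reads name and id with a plain XML parser; no actor classes are loaded. */
    public static MomlSummary fromMoml(String moml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Answer any DTD request with an empty document so nothing is fetched.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        Element root = builder.parse(new InputSource(new StringReader(moml)))
                .getDocumentElement();

        String id = null;
        // Only direct children of the top-level entity are inspected.
        for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                Element element = (Element) child;
                if ("property".equals(element.getTagName())
                        && "entityId".equals(element.getAttribute("name"))) {
                    id = element.getAttribute("value");
                    break;
                }
            }
        }
        return new MomlSummary(root.getAttribute("name"), id);
    }
}
```

For example, MomlSummary.fromMoml(momlString).name gives the label for the tree node. Because the generic parser treats class="..." as an ordinary attribute, it never asks a ClassLoader for anything.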

Modified KSWLibraryBuilder.buildLibrary   - ~15 sec (coldStart with no cache)

Modified KSWLibraryBuilder.buildLibrary   - 5-6 sec (warmStart with cache)

These results would seem to indicate that loading all the class files 
(our huge jar assortment) is indeed a major cause of slowness!

Now for some caveats. The actor tree my test code builds does show the 
same nodes as the current CVS version of Kepler in terms of the text 
that appears (at least for Entities) but icons don't appear, so there 
are still some bugs to work out. Also, I have yet to modify the 
code that 'drops' an object from the tree to the main canvas. I think 
the main task there is to use the actor id (which is in my tree) to look 
up the full set of MOML at drop time and then instantiate it with the 
MOML parser.
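
The drop-time lookup could be shaped roughly like the sketch below. The class LazyActorLookup is hypothetical, and the real MOML parser is stood in for by a plain Function so the example stays self-contained; in Kepler that function would be whatever call hands the MOML string to the MOML parser.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical id -> MOML lookup. MOML text is stored at startup, but the
 * expensive parse (and the class loading it triggers) is deferred until an
 * actor is actually dropped on the canvas.
 */
public class LazyActorLookup<T> {
    private final Map<String, String> momlById = new HashMap<>();
    private final Function<String, T> parser; // stand-in for the MOML parser

    public LazyActorLookup(Function<String, T> parser) {
        this.parser = parser;
    }

    /** Called while building the tree: stores text only, parses nothing. */
    public void register(String id, String moml) {
        momlById.put(id, moml);
    }

    /** Called at drop time: parses the full MOML for this one actor. */
    public T instantiate(String id) {
        String moml = momlById.get(id);
        if (moml == null) {
            throw new IllegalArgumentException("No cached MOML for actor id: " + id);
        }
        return parser.apply(moml);
    }
}
```

With this shape, the cost of loading an actor's classes is paid once per dropped actor instead of once per actor in the library.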

The other thing that recently came to mind is Chad's recent code to 
display the documentation from the tree. That code would also need to be 
modified to look up the documentation nodes from the id since it is not 
in my tree node objects.

So there is still a fair amount of work needed for a complete 
demonstration, but there is some indication that we can reduce both cold 
and warm startup times by avoiding loading everything at startup.

Any thoughts or comments?

Dan

-- 
*******************************************************************
Dan Higgins                                  higgins at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Ph: 805-893-5127
National Center for Ecological Analysis and Synthesis (NCEAS) Marine Science Building - Room 3405
Santa Barbara, CA 93195
*******************************************************************