[kepler-dev] ObjectManager/DataCacheManager analysis

Wed Nov 2 18:39:55 PST 2005

Chad,

Sorry for the late reply.

I was just wondering what the relation was between a CacheObject and a
KeplerLSID.  For example, does each CacheObject instance have at least
one KeplerLSID object?  If so, it seems reasonable to add the method

   KeplerLSID getKeplerLSID()

to CacheObject. Also, it might be useful to be able to obtain all
KeplerLSIDs that have a corresponding CacheObject instance in the
ObjectCache. So, e.g., you might consider adding the methods

   Iterator<KeplerLSID> getLSIDsForCacheObjects()
   boolean hasCacheObjectWithLSID(KeplerLSID)

to the ObjectCache class.
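
A rough sketch of how these additions could fit together. Everything
beyond the three method names above is an assumption for illustration:
the KeplerLSID class here is a hypothetical stand-in that just wraps a
string, the cache is backed by a plain in-memory map, the
FileCacheObject body is made up, and the example LSID strings are
invented. It also assumes exactly one LSID per CacheObject, which is
the open question above.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the real KeplerLSID class: wraps the LSID string.
final class KeplerLSID {
    private final String lsid;
    KeplerLSID(String lsid) { this.lsid = lsid; }
    @Override public boolean equals(Object o) {
        return o instanceof KeplerLSID && ((KeplerLSID) o).lsid.equals(lsid);
    }
    @Override public int hashCode() { return lsid.hashCode(); }
    @Override public String toString() { return lsid; }
}

// Assumption: every CacheObject carries exactly one LSID, fixed at creation.
abstract class CacheObject {
    private final KeplerLSID lsid;
    CacheObject(KeplerLSID lsid) { this.lsid = lsid; }
    KeplerLSID getKeplerLSID() { return lsid; }
    abstract Object getObject();
}

// Made-up minimal subclass so the sketch has something concrete to store.
final class FileCacheObject extends CacheObject {
    private final String path;
    FileCacheObject(KeplerLSID lsid, String path) { super(lsid); this.path = path; }
    @Override Object getObject() { return path; }
}

final class ObjectCache {
    private static final ObjectCache INSTANCE = new ObjectCache();
    // Insertion-ordered map from LSID to cached object (in-memory only here).
    private final Map<KeplerLSID, CacheObject> objects = new LinkedHashMap<>();

    private ObjectCache() { }
    static ObjectCache getInstance() { return INSTANCE; }

    void insertObject(CacheObject co) { objects.put(co.getKeplerLSID(), co); }
    CacheObject getObject(KeplerLSID lsid) { return objects.get(lsid); }
    CacheObject removeObject(KeplerLSID lsid) { return objects.remove(lsid); }

    // Read-only iteration over the LSIDs that currently have a CacheObject.
    Iterator<KeplerLSID> getLSIDsForCacheObjects() {
        return Collections.unmodifiableSet(objects.keySet()).iterator();
    }

    boolean hasCacheObjectWithLSID(KeplerLSID lsid) { return objects.containsKey(lsid); }
}

class LsidLookupDemo {
    public static void main(String[] args) {
        ObjectCache cache = ObjectCache.getInstance();
        KeplerLSID id = new KeplerLSID("urn:lsid:example.org:file:1:1");
        cache.insertObject(new FileCacheObject(id, "/tmp/example.dat"));
        System.out.println(cache.hasCacheObjectWithLSID(id));
        for (Iterator<KeplerLSID> it = cache.getLSIDsForCacheObjects(); it.hasNext();) {
            System.out.println(it.next());
        }
    }
}
```

Keying the map on the LSID makes hasCacheObjectWithLSID a simple
lookup, and returning a read-only iterator keeps callers from removing
entries behind the cache's back.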

Finally, what would the relation now be between the ActorMetadata class
and the CacheObject class?

Thanks,
-shawn

Chad Berkley wrote:
 > Hey,
 >
 > Last week I was tasked with looking into the current ObjectManager and
 > more specifically the DataCacheManager implementation to figure out
 > whether we should stick with the DataCacheManager as the underlying
 > cache system for Kepler, or whether we should re-write it so that it
 > acts more in conjunction with the ObjectManager.  After looking at the
 > current code and the recommendations made by Kevin and others on the
 > optimal configuration, I think we should re-write the cache.  Below is
 > an outline of how I think it should be redesigned.  I think the
 > re-design has a much simpler API and a more logical process flow.  In
 > writing this, I've taken into consideration the original OM design on
 > the wiki, comments made by Kevin and others, Shawn's and my experiences
 > trying to integrate SMS and my own experience writing the current OM on
 > top of the original cache.
 >
 > Objects:
 >
 > ObjectCache
 > -----------
 > ObjectCache getInstance() //singleton
 > void insertObject(CacheObject)
 > CacheObject removeObject(KeplerLSID)
 > CacheObject getObject(KeplerLSID)
 > CacheObject getTempObject() //request a single session temp object
 > void requestPurge(KeplerLSID) //request an object be purged
 > void requestPurgeExtension(KeplerLSID) //an object being purged can
 >                                         //request that it not get purged
 > void purgeAll() //clear the cache
 >
 >
 > abstract CacheObject
 > -----------
 > void addAttribute(String name, Object value)
 > Object getAttribute(String name)
 > Object removeAttribute(String name)
 > void addCacheObjectListener() //listeners for cache events
 > abstract void serialize()
 > abstract Object getObject()
 >
 >
 > interface CacheObjectListener
 > -------------------
 > void objectAdded(CacheEvent)
 > void objectRemoved(CacheEvent)
 > void objectPurged(CacheEvent)
 >
 >
 > CacheEvent
 > ----------
 > CacheObject getSource()
 >
 >
 > The classes that would extend CacheObject are:
 > KARCacheObject extends JarCacheObject
 > DataCacheObject
 > ActorCacheObject
 > XMLMetadataCacheObject
 > JarCacheObject
 > NativeLibraryCacheObject
 > WorkflowCacheObject
 > FileCacheObject
 >
 > The listener interface will allow CacheObjects to have automatic actions
 > take place when they are added, removed or purged from the cache.  This
 > will allow, for instance, the KARCacheObject to process a kar file upon
 > being added or an ActorCacheObject to add itself to the tree
 > automatically.  This will keep the cache-item-specific code inside each
 > cache item instead of locating it in the cache itself.  The listener
 > will also allow items such as DataCacheObjects to request to not be
 > purged if they are large or recently used.  I think (correct me if I'm
 > wrong) this will also allow cache objects that are going to take a while
 > to retrieve (like DataCacheObject) to do their work in a separate thread
 > so the user can perform other tasks while the object is downloading.
 >
 > The current cache uses an xml file to store an index of cache items.  I
 > would, instead, like to use the embedded database for this.  I think it
 > will allow more flexibility in indexing the cache as well as speed up
 > loading of cache items.  Because of the BLOB/CLOB problem, I think the
 > cache objects should still be stored on disk with a pointer from the
 > database.
 >
 > This is going to require some reworking of existing code.  Basically the
 > current ObjectManager interface will go away and be replaced by this
 > cache.  This shouldn't be too big of a deal because the only place the
 > OM is being used is in the kar support classes.  This code can be
 > re-worked into the KARCacheObject class.  The one place that I'm
 > uncertain of the work required is in the various data actors.  I know
 > Jing has a bunch of code that uses the cache for the EML and other
 > datasource actors.  This will have to be re-written.
 >
 > Please take a look at this and let me know if I've forgotten anything.
 > Unless there is something hugely wrong with what I've written, I'd
 > rather not have a long, drawn-out discussion about this since it needs
 > to get implemented soon if we are going to make our Dec. 9 deadline.
 > Please reply with any comments within the next day or so.
 >
 > thanks,
 > chad
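
To make the listener and purge-extension idea in the quoted outline
concrete, here is a minimal sketch of one way it might work. The
outline does not say when objectPurged fires or how an extension is
honored, so the following are assumptions: listeners are notified
before the object is actually dropped, and a call to
requestPurgeExtension during that notification cancels the pending
purge. A plain String stands in for KeplerLSID, addCacheObjectListener
takes the listener as an argument, the "recently used" flag and the
create() factory are hypothetical, and the LSID strings are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CacheEvent {
    private final CacheObject source;
    CacheEvent(CacheObject source) { this.source = source; }
    CacheObject getSource() { return source; }
}

interface CacheObjectListener {
    void objectAdded(CacheEvent e);
    void objectRemoved(CacheEvent e);
    void objectPurged(CacheEvent e);
}

abstract class CacheObject {
    private final String lsid; // String used as a stand-in for KeplerLSID
    private final List<CacheObjectListener> listeners = new ArrayList<>();
    CacheObject(String lsid) { this.lsid = lsid; }
    String getLsid() { return lsid; }
    void addCacheObjectListener(CacheObjectListener l) { listeners.add(l); }
    List<CacheObjectListener> getListeners() { return listeners; }
    abstract Object getObject();
}

final class ObjectCache {
    private static final ObjectCache INSTANCE = new ObjectCache();
    private final Map<String, CacheObject> objects = new HashMap<>();
    private final Set<String> extensions = new HashSet<>();

    private ObjectCache() { }
    static ObjectCache getInstance() { return INSTANCE; }

    void insertObject(CacheObject co) {
        objects.put(co.getLsid(), co);
        CacheEvent e = new CacheEvent(co);
        for (CacheObjectListener l : co.getListeners()) l.objectAdded(e);
    }

    CacheObject getObject(String lsid) { return objects.get(lsid); }

    // Assumed protocol: notify first, then purge unless an extension was requested.
    void requestPurge(String lsid) {
        CacheObject co = objects.get(lsid);
        if (co == null) return;
        extensions.remove(lsid); // forget any stale extension request
        CacheEvent e = new CacheEvent(co);
        for (CacheObjectListener l : co.getListeners()) l.objectPurged(e);
        if (extensions.remove(lsid)) return; // a listener asked to stay cached
        objects.remove(lsid);
    }

    void requestPurgeExtension(String lsid) { extensions.add(lsid); }
}

final class DataCacheObject extends CacheObject {
    private final boolean recentlyUsed; // hypothetical flag for this sketch

    private DataCacheObject(String lsid, boolean recentlyUsed) {
        super(lsid);
        this.recentlyUsed = recentlyUsed;
    }

    // Factory so the listener is registered only after construction completes.
    static DataCacheObject create(String lsid, boolean recentlyUsed) {
        final DataCacheObject dco = new DataCacheObject(lsid, recentlyUsed);
        dco.addCacheObjectListener(new CacheObjectListener() {
            public void objectAdded(CacheEvent e) { }
            public void objectRemoved(CacheEvent e) { }
            public void objectPurged(CacheEvent e) {
                if (dco.recentlyUsed) {
                    ObjectCache.getInstance().requestPurgeExtension(dco.getLsid());
                }
            }
        });
        return dco;
    }

    @Override Object getObject() { return "data for " + getLsid(); }
}

class PurgeExtensionDemo {
    public static void main(String[] args) {
        ObjectCache cache = ObjectCache.getInstance();
        DataCacheObject hot = DataCacheObject.create("urn:lsid:example.org:data:1:1", true);
        DataCacheObject cold = DataCacheObject.create("urn:lsid:example.org:data:2:1", false);
        cache.insertObject(hot);
        cache.insertObject(cold);
        cache.requestPurge(hot.getLsid());
        cache.requestPurge(cold.getLsid());
        System.out.println("hot kept:  " + (cache.getObject(hot.getLsid()) != null));
        System.out.println("cold kept: " + (cache.getObject(cold.getLsid()) != null));
    }
}
```

The point of the sketch is that the decision to resist a purge lives in
DataCacheObject's own listener, not in ObjectCache, which matches the
goal of keeping item-specific code inside each cache item.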