[kepler-dev] Caching of Data in Kepler
Rod Spears
rods at ku.edu
Tue Sep 14 07:31:54 PDT 2004
Yesterday morning I put together a caching mechanism for the Ecogrid
DataSources.
It is a hybrid memory cache and file cache with threading. Here is how
it works:
The CacheManager maintains a list of cached items. The cache item base
class is abstract, so the implementing classes decide "how" the data
is obtained. The base class is responsible for threading, loading old
data, and saving out new data.
When a request is made, the cache item is created on its own thread and
begins to download the data; in the meantime it marks itself as "busy."
When it finishes, it notifies any listeners that it is done and marks
itself "complete."
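A minimal sketch of the item lifecycle described above, written in present-day Java. The names CacheItem, CacheListener, fetch(), and the Status values are assumptions made for illustration; they are not taken from the Kepler source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener interface; the name is an assumption. */
interface CacheListener {
    void cacheItemDone(CacheItem item);
}

/** Hypothetical base class: owns threading and state, subclasses supply the data. */
abstract class CacheItem implements Runnable {
    enum Status { EMPTY, BUSY, COMPLETE, ERROR }

    private final List<CacheListener> listeners = new CopyOnWriteArrayList<>();
    private volatile Status status = Status.EMPTY;
    private volatile byte[] data;

    /** Subclasses decide "how" the data is obtained (for example, a download). */
    protected abstract byte[] fetch() throws Exception;

    void addListener(CacheListener listener) { listeners.add(listener); }
    Status getStatus() { return status; }
    byte[] getData() { return data; }

    /** Marks the item busy, starts the fetch on its own thread, returns at once. */
    Thread start() {
        status = Status.BUSY;
        Thread worker = new Thread(this, "cache-item");
        worker.start();
        return worker;
    }

    @Override
    public void run() {
        try {
            data = fetch();
            status = Status.COMPLETE;
        } catch (Exception e) {
            status = Status.ERROR;
        }
        // Notify listeners whether the fetch succeeded or failed.
        for (CacheListener listener : listeners) {
            listener.cacheItemDone(this);
        }
    }
}
```

In this sketch a listener has to be registered before start() is called; one added after the fetch has finished is never notified.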
The cache manager serializes itself out as an XML file, and each entry in
the cache is saved in a separate file, which keeps it simple and flexible.
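One way to picture the layout is an XML index plus one data file per entry. The sketch below is only an illustration: the CacheCatalog class, the catalog.xml file name, and the use of java.util.Properties for the XML are all assumptions, not the format the actual implementation writes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.UUID;

/** Hypothetical catalog: an XML index plus one data file per cached entry. */
class CacheCatalog {
    private final Path dir;
    // Maps an item name to the name of the file holding its data.
    private final Properties index = new Properties();

    CacheCatalog(Path dir) { this.dir = dir; }

    /** Writes the entry's bytes to its own file and records it in the index. */
    void put(String name, byte[] data) throws IOException {
        String fileName = index.getProperty(name);
        if (fileName == null) {
            fileName = UUID.randomUUID() + ".dat";
            index.setProperty(name, fileName);
        }
        Files.write(dir.resolve(fileName), data);
    }

    /** Returns the entry's bytes, or null if the name is not in the index. */
    byte[] get(String name) throws IOException {
        String fileName = index.getProperty(name);
        return fileName == null ? null : Files.readAllBytes(dir.resolve(fileName));
    }

    /** Saves only the index; the data files were already written by put(). */
    void save() throws IOException {
        try (OutputStream out = Files.newOutputStream(dir.resolve("catalog.xml"))) {
            index.storeToXML(out, "cache catalog");
        }
    }

    void load() throws IOException {
        try (InputStream in = Files.newInputStream(dir.resolve("catalog.xml"))) {
            index.loadFromXML(in);
        }
    }
}
```

Keeping the data out of the index means a single entry can be refreshed or deleted without rewriting anything but its own file and the small catalog.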
The items keep track of their creation date and I could easily add the
capability for them to automatically retrieve a newer version of their
contents. The impl I have now keeps track of the ecogrid info necessary
to retrieve the data.
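If automatic refresh were added, the creation date could drive a simple age check. This is a sketch of one possible policy, not something the current impl does; CacheEntryInfo, isStale(), and the idea of a maximum age are assumptions.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical per-item record of when the cached copy was created. */
class CacheEntryInfo {
    private final Instant created;

    CacheEntryInfo(Instant created) { this.created = created; }

    /** True when the cached copy is older than maxAge as of the given time. */
    boolean isStale(Instant now, Duration maxAge) {
        return Duration.between(created, now).compareTo(maxAge) > 0;
    }
}
```

A cache manager could call isStale() when an item is requested and start a fresh download for any item that has aged out.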
So at the moment, when a DataSource needs its data it just asks for it;
the cache then gets the data and notifies the DataSource when it is there.
It is all very transparent to the DS. The big difference is that it is more
asynchronous than before.
I also created a quick little "Data Cache Viewer" that displays the
entries in the cache "catalog."
Under the File menu item you can:
* Refresh a selected cached item
* Refresh all the items
* Delete a single cache item
* Delete all the cache items
I could easily add to the viewer a way to view the actual contents of a
cached item (if we need it). Some scientists may want that....
After I got this working with the EML200DataSource, Jing informs me that
Monarch has a "generic" memory cache and file cache. I haven't had the
time yet to review the impls. Now, we can go with this specific impl
that is tailored to our DataSource objects, or I could adapt it to use
Monarch's file cache for serializing the output.
Any thoughts? Or maybe we just use this for now and look at the issue
again after our Oct. deadline.
Also, it seems that we will want to cache some of the metadata
(table entity info) so a user could actually run "offline" if they
wanted to or needed to.
Rod
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dcv2.jpg
Type: image/jpeg
Size: 20644 bytes
Desc: not available
URL: <http://mercury.nceas.ucsb.edu/kepler/pipermail/kepler-dev/attachments/20040914/e71a2c6a/attachment.jpg>