[kepler-dev] eml2 datasources/memory problems in Kepler

Kevin Ruland kruland at ku.edu
Sun Jan 8 17:44:12 PST 2006


There are two problems.  First, the ecogrid get operation retrieves a 
byte[] for the entire dataset: the whole dataset is assembled into a 
single byte[] before being buffered to disk.  Needless to say, this can 
be a very large chunk of memory, and since the array is dynamically 
resized, you can only imagine the thrashing the JVM needs to do.

Second, after this byte[] is read into memory, the alpha8 version of the 
code, IIRC, continues to hold the bytes in memory even if they are not 
really needed in that form.

The head version (which still has a couple of threading issues -- I have 
a partially completed fix at work) no longer holds onto the byte[].  
However, it does still require the byte[] for the ecogrid get 
operation.  With this new revelation, I will rewrite the get client to 
stream directly to disk.
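
To illustrate the streaming idea, here is a minimal sketch, not the actual 
ecogrid client code.  It assumes the get response can be obtained as a 
java.io.InputStream (the real client API is not shown in this thread), and 
the class name StreamToDisk, the method copyToFile, and the 8 KiB buffer 
size are all hypothetical choices for illustration.  It also uses 
java.nio.file, so it needs a newer Java than the one Kepler currently 
targets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: copies a response stream to disk in small chunks
// instead of first collecting the whole dataset in one byte[].
public class StreamToDisk {
    // Fixed-size copy buffer; 8 KiB is an arbitrary choice for this sketch.
    private static final int BUFFER_SIZE = 8192;

    /**
     * Copies everything from 'in' to the file at 'target' using one small,
     * reusable buffer, so heap use does not grow with the dataset size.
     * Returns the number of bytes written.
     */
    public static long copyToFile(InputStream in, Path target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[BUFFER_SIZE];
        // try-with-resources closes the output file even if a read fails.
        try (OutputStream out = Files.newOutputStream(target)) {
            int n;
            // read() returns -1 at end of stream.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

With something along these lines, the only dataset bytes held on the heap 
at any moment are the ones in the fixed copy buffer, so there is no 
repeated resizing of a large array.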

I now have a copy of JProfiler and can run these same experiments with 
more careful analysis of memory utilization.  What was your quick search 
term, IPCC?



Dan Higgins wrote:

>Hi All,
>    FYI, I was doing some tests with eml2 datasources using the new 
>alpha8 release version of Kepler. In particular, I was searching for 
>IPCC climate data and dragging the data sources to the work area. This 
>works pretty well on a new machine at the office with 2 GBytes of RAM 
>(Windows OS), but the time it takes for the datasource to turn from red 
>to yellow is VERY SLOW (tens of minutes) on a portable with only 512M of 
>RAM (especially if one tries to put 2 or more datasources in the work 
>area). [Note that such sources will work, given enough time, and once in 
>the cache, performance is reasonable.]
>    If you watch the available memory readings in the Windows Task 
>Manager while waiting for the datasource to load, you can see the reason 
>for this slow performance. There is just not enough physical RAM on a 
>512Meg machine! Java is set to ask for 512M of RAM,  but it is thrashing 
>the disk (virtual memory) due to other processes using memory. If we 
>want to run Kepler on 512M machines, it looks like we need to reduce 
>its memory requirements!
>Dan Higgins
>Kepler-dev mailing list
>Kepler-dev at ecoinformatics.org
