[kepler-dev] proposed changes to .kepler

Thu Nov 19 09:27:41 PST 2009

Hi Ben,  I would agree that this is the most sensible approach.  Have  
a user defined directory for persistent data that is version specific  
and have a .kepler directory that is transient.

-Aaron
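
A minimal sketch of the split described above, assuming a version-specific persistent directory and a transient .kepler directory, each overridable at startup. The class name, the "KeplerData" default folder, and the property names mentioned in the usage note are hypothetical and are not taken from the Kepler code base.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeplerDirs {
    // Hypothetical default folder name for persistent data (an assumption).
    static final String DEFAULT_PERSISTENT_FOLDER = "KeplerData";

    /**
     * Version-specific directory for data that should survive upgrades.
     * If the user supplied an override it is used as the base directory;
     * otherwise the base falls back to a folder under the user's home.
     */
    static Path persistentDir(String override, String userHome, String version) {
        Path base = (override != null && !override.isEmpty())
                ? Paths.get(override)
                : Paths.get(userHome, DEFAULT_PERSISTENT_FOLDER);
        return base.resolve(version);
    }

    /**
     * Directory for transient data (e.g. cache) that can be deleted safely.
     * Defaults to .kepler under the user's home unless overridden.
     */
    static Path transientDir(String override, String userHome) {
        return (override != null && !override.isEmpty())
                ? Paths.get(override)
                : Paths.get(userHome, ".kepler");
    }
}
```

A caller could pass System.getProperty("kepler.persistent.dir") and System.getProperty("kepler.transient.dir") (both hypothetical property names) as the overrides, together with System.getProperty("user.home") and a version label such as "kepler-2.0".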

On Nov 19, 2009, at 8:05 AM, ben leinfelder  
<leinfelder at nceas.ucsb.edu> wrote:

> In response to "What constitutes a version of Kepler" --
> What if we offloaded the burden of figuring that out so that the  
> "correct" .kepler is selected by the person actually running an  
> instance of Kepler?
> Think about how Eclipse uses workspaces: It's up to me to select the  
> appropriate one at startup.
> 1) When I upgrade Eclipse (i.e. Kepler) I can either create a new  
> workspace (.kepler) or opt to "convert" the existing one  
> (upgrade .kepler)
> 2) If I've installed plugins (i.e. Kepler modules) and they are  
> referenced in my workspace (.kepler) but then I attempt to run a  
> version of Eclipse (Kepler) that lacks a plugin that error is my  
> fault - I can either install the plugin (Kepler module) or use a  
> different workspace (.kepler)
>
> We'd of course lose the ability for different installations to  
> share across .kepler directories, but that simplifies things for us.
>
> At the very least, I think it would be a good feature [that  
> potentially saves us a lot of trouble] to specify an  
> alternative .kepler location on startup.
>
> -ben (expanded from Oliver's input)
>
> On Nov 18, 2009, at 7:23 PM, David Welker wrote:
>
>> If we want a more sophisticated solution, as Ben is suggesting, I  
>> think a good place to start thinking about the data that we  
>> interact with in Kepler is the chart linked below.
>>
>> Basically, the problem that Ben is getting at is what if we have  
>> persistent data that is inconsistent across versions? There are two  
>> ways to handle this problem:
>>
>> 1.) Make sure that there is no persistent data that is incompatible  
>> across versions.
>> 2.) Store inconsistent persistent data in different folders.
>>
>> The issue with solution 2 is that it is probably unwieldy. What  
>> constitutes a different version of Kepler? I would argue that a  
>> different version of Kepler does not constitute merely a release,  
>> but rather any unique combination of modules. That is, if I am  
>> running Kepler 2.0 with the WRP suite, that is a different version  
>> than Kepler 2.0, which is a different version than Kepler 2.0 with  
>> the tagging suite, which is different than Kepler 2.0 with the PPOD  
>> suite, which is different than Kepler 2.0 with a custom ad hoc  
>> suite, which is different than Kepler 1.0 with the WRP suite... You  
>> get the idea. The number of different versions is unwieldy. For  
>> that reason, I feel that solution 2.) is unworkable (although at  
>> one time I did lean towards that myself).
>>
>> So, we need to implement solution 1. Basically, we should make sure  
>> that no input files from .kepler are capable of causing any version  
>> of Kepler to crash. To do this, we might include meta-data within  
>> the data files that identifies the sort of data that is stored in a  
>> particular file or section of a file. Then, a particular version of  
>> Kepler could be made to be smart enough to know what sorts of data  
>> it can handle and which sort of data it can't. This also has the  
>> advantage of avoiding the situation where multiple versions of  
>> Kepler are unable to share persistent data that is in fact  
>> compatible.
>>
>> Back in the day, Tim and I made a chart that explores the sort of  
>> data that we interact with in Kepler. It might be useful to think  
>> about. Check that out here:
>>
>> https://kepler-project.org/developers/teams/framework/classification-of-persistent-system-state
>>
>>
>> On Nov 18, 2009, at 8:41 PM, ben leinfelder wrote:
>>
>>> A few questions:
>>> 1) What other files in .kepler will be considered "persistent"?
>>> 2) What is the plan for different versions of Kepler running with  
>>> the same .kepler?
>>>    a) do we have another directory branch for each version:
>>>    ~/.kepler/kepler-2.0/...
>>>    ~/.kepler/kepler-2.1/...
>>> 3) When thinking about multiple instances of Kepler on one machine  
>>> are we aiming to support:
>>>    a) different versions being run at different times
>>>    b) the same version being run concurrently
>>>    c) different versions being run concurrently
>>>
>>> Depending on what we support, we'll have to revisit the embedded  
>>> HSQLDB that is used for the cache and provenance in addition to  
>>> these directory structures.
>>>
>>> -ben
>>>
>>> On Nov 18, 2009, at 2:08 PM, Derik Barseghian wrote:
>>>
>>>> Kepler devs,
>>>>
>>>> After some discussion with Aaron, Ben, Dan, and Chad, I'm  
>>>> wondering if anyone objects to dividing .kepler into two  
>>>> different areas -- there would be areas for 1) persistent items  
>>>> (e.g. provenance database) and 2) temporary items (e.g. cache).  
>>>> This would make it more apparent which things could be deleted  
>>>> without serious ramification (temp/), and the idea would be items  
>>>> in persistent/ should stick around and be dealt with during Kepler  
>>>> upgrades for backwards compatibility.
>>>>
>>>> Also, I think we should utilize the InstanceAuthNamespace in  
>>>> these paths, so that items from different Kepler instances are  
>>>> separated.
>>>>
>>>> This could look like (imagine multiple namespace dirs):
>>>>
>>>> a)
>>>> .kepler/persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>> .kepler/temp/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>>
>>>> or b)
>>>> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/persistent/
>>>> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/temp/
>>>>
>>>> or c)
>>>> .kepler_temp/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>> .kepler_persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>>
>>>> I prefer a).
>>>>
>>>> This partly came out of a discussion of bug 4514. I think the  
>>>> configuration files could be stored beneath these new paths,  
>>>> probably in persistent.
>>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=4514
>>>>
>>>> A better solution might be to just have a .kepler to store  
>>>> temporary things, and to store persistent items in an OS- 
>>>> appropriate location, but I think this might be a larger change  
>>>> than we want to take on at the moment, as we try to get 2.0 out  
>>>> of the door.
>>>>
>>>> Let me know what you think,
>>>> Derik
>>>> _______________________________________________
>>>> Kepler-dev mailing list
>>>> Kepler-dev at kepler-project.org
>>>> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-dev
>>>
>>
>
>
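
David's solution 1, quoted above, relies on metadata inside each data file that identifies the sort of data it holds, so that a given Kepler version can tell what it can and cannot handle. A rough sketch of that check follows, assuming each file starts with a header line of the form "kepler-data-kind: <kind>". The header format, the kind names, and the class name are all hypothetical, not an existing Kepler convention.

```java
import java.util.Set;

public class DataKindCheck {
    // Hypothetical header prefix; the thread does not specify a format.
    static final String PREFIX = "kepler-data-kind:";

    /**
     * Extracts the declared data kind from a header line such as
     * "kepler-data-kind: provenance/1". Returns null if the line is
     * missing, lacks the prefix, or declares an empty kind.
     */
    static String parseKind(String headerLine) {
        if (headerLine == null || !headerLine.startsWith(PREFIX)) {
            return null;
        }
        String kind = headerLine.substring(PREFIX.length()).trim();
        return kind.isEmpty() ? null : kind;
    }

    /**
     * Returns true only when the file declares a kind that this Kepler
     * instance lists as supported. Unlabeled or unknown data is skipped
     * rather than loaded, so it cannot crash the running version.
     */
    static boolean canLoad(String headerLine, Set<String> supportedKinds) {
        String kind = parseKind(headerLine);
        return kind != null && supportedKinds.contains(kind);
    }
}
```

With this shape, two versions that both list "provenance/1" as supported would share that file, while a version that does not recognize a newer kind would simply ignore it.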