[kepler-dev] proposed changes to .kepler

Bertram Ludaescher ludaesch at ucdavis.edu
Thu Nov 19 08:25:26 PST 2009


Hi Ben et al:

I think we should distinguish different types of users here:

Sometimes, Kepler developers are also users, and for those, I'm less
concerned. They can hack the system any way they like ;-)
However, I don't think we should burden our principal users, i.e.,
scientists (biologists, ecologists, etc) with low-level issues such as
figuring out which version of Kepler works with what directory etc.
So we need to do the necessary engineering to get this right, at least for
all public Kepler releases, including Kepler 2.0 of course...

Bertram

On Thu, Nov 19, 2009 at 8:05 AM, ben leinfelder
<leinfelder at nceas.ucsb.edu> wrote:

> In response to "What constitutes a version of Kepler" --
> What if we offloaded the burden of figuring that out so that the "correct"
> .kepler is selected by the person actually running an instance of Kepler?
> Think about how Eclipse uses workspaces: It's up to me to select the
> appropriate one at startup.
> 1) When I upgrade Eclipse (i.e. Kepler) I can either create a new workspace
> (.kepler) or opt to "convert" the existing one (upgrade .kepler)
> 2) If I've installed plugins (i.e. Kepler modules) and they are referenced
> in my workspace (.kepler) but then I attempt to run a version of Eclipse
> (Kepler) that lacks a plugin that error is my fault - I can either install
> the plugin (Kepler module) or use a different workspace (.kepler)
>
> We'd of course lose the ability for different installations to share
> across .kepler directories, but that simplifies things for us.
>
> At the very least, I think it would be a good feature [that potentially
> saves us a lot of trouble] to specify an alternative .kepler location on
> startup.
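
A minimal sketch of the "alternative .kepler location on startup" idea above. It is written in Python only for brevity and is not Kepler code; the flag name --kepler-dir and the environment variable KEPLER_DIR are hypothetical names chosen for illustration. The lookup order (command line, then environment, then the default under the home directory) is an assumption as well.

```python
from pathlib import Path


def resolve_kepler_dir(argv, env, home):
    """Pick the .kepler directory: command-line flag, then environment, then default."""
    # Hypothetical flag: --kepler-dir=<path>
    for arg in argv:
        if arg.startswith("--kepler-dir="):
            return Path(arg.split("=", 1)[1])
    # Hypothetical environment variable: KEPLER_DIR
    if env.get("KEPLER_DIR"):
        return Path(env["KEPLER_DIR"])
    # Fall back to the conventional location in the user's home directory.
    return Path(home) / ".kepler"
```

Passing argv, env, and home in explicitly (rather than reading sys.argv and os.environ inside the function) keeps the selection logic easy to test.
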
>
> -ben (expanded from Oliver's input)
>
> On Nov 18, 2009, at 7:23 PM, David Welker wrote:
>
>  If we want a more sophisticated solution, as Ben is suggesting, I think a
>> good place to start thinking about the data that we interact with in Kepler
>> is this chart.
>>
>> Basically, the problem that Ben is getting at is what if we have
>> persistent data that is inconsistent across versions? There are two ways to
>> handle this problem:
>>
>> 1.) Make sure that there is no persistent data that is incompatible across
>> versions.
>> 2.) Store inconsistent persistent data in different folders.
>>
>> The issue with solution 2 is that it is probably unwieldy. What constitutes a
>> different version of Kepler? I would argue that a different version of
>> Kepler does not constitute merely a release, but rather any unique
>> combination of modules. That is, if I am running Kepler 2.0 with the WRP
>> suite, that is a different version than Kepler 2.0, which is a different
>> version than Kepler 2.0 with the tagging suite, which is different than
>> Kepler 2.0 with the PPOD suite, which is different than Kepler 2.0 with a
>> custom ad hoc suite, which is different than Kepler 1.0 with the WRP
>> suite... You get the idea. The number of different versions is unwieldy. For
>> that reason, I feel that solution 2.) is unworkable (although at one time I
>> did lean towards that myself).
>>
>> So, we need to implement solution 1. Basically, we should make sure that
>> no input files from .kepler are capable of causing any version of Kepler to
>> crash. To do this, we might include meta-data within the data files that
>> identifies the sort of data that is stored in a particular file or section
>> of a file. Then, a particular version of Kepler could be made to be smart
>> enough to know what sorts of data it can handle and which sort of data it
>> can't. This also has the advantage of avoiding the situation where multiple
>> versions of Kepler are unable to share persistent data that is in fact
>> compatible.
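
A rough sketch of the metadata idea in solution 1: each persistent file declares what kind of data it holds, and a given Kepler version skips anything it does not recognize instead of crashing. Everything here is an assumption made for illustration, including the JSON encoding, the field names kind, format_version, and payload, and the contents of SUPPORTED; it is Python, not Kepler code.

```python
import json

# Hypothetical set of (kind, format_version) pairs this build understands.
SUPPORTED = {("provenance", 1), ("provenance", 2), ("configuration", 1)}


def load_if_supported(text, supported=SUPPORTED):
    """Return the payload if the declared kind/version is supported, else None."""
    try:
        doc = json.loads(text)
    except ValueError:
        return None  # unreadable file: skip it rather than crash
    if not isinstance(doc, dict):
        return None
    kind = doc.get("kind")
    version = doc.get("format_version")
    # Reject missing or malformed metadata before looking it up.
    if not isinstance(kind, str) or not isinstance(version, int):
        return None
    if (kind, version) not in supported:
        return None
    return doc.get("payload")
```

Two installations with different module sets could then share one directory, each reading only the files whose declared kind it supports.
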
>>
>> Back in the day, Tim and I made a chart that explores the sort of data
>> that we interact with in Kepler. It might be useful to think about. Check
>> that out here:
>>
>>
>>
>> https://kepler-project.org/developers/teams/framework/classification-of-persistent-system-state
>>
>>
>> On Nov 18, 2009, at 8:41 PM, ben leinfelder wrote:
>>
>>  A few questions:
>>> 1) What other files in .kepler will be considered "persistent"?
>>> 2) What is the plan for different versions of Kepler running with the
>>> same .kepler?
>>>        a) do we have another directory branch for each version:
>>>        ~/.kepler/kepler-2.0/...
>>>        ~/.kepler/kepler-2.1/...
>>> 3) When thinking about multiple instances of Kepler on one machine are we
>>> aiming to support:
>>>        a) different versions being run at different times
>>>        b) the same version being run concurrently
>>>        c) different versions being run concurrently
>>>
>>> Depending on what we support, we'll have to revisit the embedded HSQLDB
>>> that is used for the cache and provenance in addition to these directory
>>> structures.
>>>
>>> -ben
>>>
>>> On Nov 18, 2009, at 2:08 PM, Derik Barseghian wrote:
>>>
>>>  Kepler devs,
>>>>
>>>> After some discussion with Aaron, Ben, Dan, and Chad, I'm wondering if
>>>> anyone objects to dividing .kepler into two different areas -- there would
>>>> be areas for 1) persistent items (e.g. provenance database) and 2) temporary
>>>> items (e.g. cache). This would make it more apparent which things could be
>>>> deleted without serious ramification (temp/), and the idea would be items in
>>>> persistent/ should stick around and be dealt with during Kepler upgrades for
>>>> backwards compatibility.
>>>>
>>>> Also, I think we should utilize the InstanceAuthNamespace in these
>>>> paths, so that items from different Kepler instances are separated.
>>>>
>>>> This could look like (imagine multiple namespace dirs):
>>>>
>>>> a)
>>>> .kepler/persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>> .kepler/temp/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>>
>>>> or b)
>>>> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/persistent/
>>>> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/temp/
>>>>
>>>> or c)
>>>> .kepler_temp/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>> .kepler_persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
>>>>
>>>> I prefer a).
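
For concreteness, layout a) can be expressed as a small path helper. This is a sketch only, in Python rather than Kepler's own code; the function name instance_dirs is hypothetical, and the namespace string is simply the example value from the listing above.

```python
from pathlib import Path


def instance_dirs(kepler_root, namespace):
    """Build the per-instance directories for layout a)."""
    root = Path(kepler_root)
    return {
        # Items that must survive upgrades (e.g. the provenance database).
        "persistent": root / "persistent" / namespace,
        # Items that can be deleted without serious ramification (e.g. cache).
        "temp": root / "temp" / namespace,
    }
```

With this layout, everything under temp/ can be cleared for all instances at once without touching anything under persistent/.
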
>>>>
>>>> This partly came out of a discussion of bug 4514. I think the
>>>> configuration files could be stored beneath these new paths, probably in
>>>> persistent.
>>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=4514
>>>>
>>>> A better solution might be to just have a .kepler to store temporary
>>>> things, and to store persistent items in an OS-appropriate location, but I
>>>> think this might be a larger change than we want to take on at the moment,
>>>> as we try to get 2.0 out of the door.
>>>>
>>>> Let me know what you think,
>>>> Derik
>>>> _______________________________________________
>>>> Kepler-dev mailing list
>>>> Kepler-dev at kepler-project.org
>>>> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-dev
>>>>