[kepler-dev] proposed changes to .kepler
Aaron Schultz
aschultz at nceas.ucsb.edu
Thu Nov 19 22:13:02 PST 2009
Hi Matt,
Yes, definitely. The persistent data directory would need to be
upgradeable between versions, but the transient directory would not.
-Aaron
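
As a rough sketch of what "upgradeable" could mean in practice: the persistent directory carries a version marker and is stepped forward by migration functions, while the transient directory is simply thrown away. The VERSION marker file, the MIGRATIONS table, and the function names are all hypothetical and are not existing Kepler code.

```python
import shutil
from pathlib import Path


def migrate_2_0_to_2_1(persistent: Path) -> str:
    # Hypothetical placeholder for real upgrade work (schema changes,
    # file moves, ...). Returns the version the directory is now at.
    return "2.1"


# Hypothetical table: version found on disk -> function that upgrades it.
MIGRATIONS = {"2.0": migrate_2_0_to_2_1}


def prepare_dirs(persistent: Path, transient: Path, running: str) -> None:
    """Upgrade the persistent directory in place; reset the transient one."""
    persistent.mkdir(parents=True, exist_ok=True)
    marker = persistent / "VERSION"
    current = marker.read_text().strip() if marker.exists() else running

    # Apply migrations one step at a time until we reach the running version.
    while current != running:
        if current not in MIGRATIONS:
            raise RuntimeError(f"no upgrade path from {current} to {running}")
        current = MIGRATIONS[current](persistent)
        marker.write_text(current)  # record progress after each step
    marker.write_text(running)

    # Transient data is never migrated: delete it and start empty.
    shutil.rmtree(transient, ignore_errors=True)
    transient.mkdir(parents=True)
```

Recording the version after each step means an interrupted upgrade can resume from the last completed migration rather than starting over.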
On Nov 19, 2009, at 9:45 PM, Matt Jones <jones at nceas.ucsb.edu> wrote:
> I concur with Bertram. We need to do the engineering so that these
> decisions need not be made by users, so that upgrades occur
> seamlessly, and so upgrades automatically migrate work that they
> have done in the older version so that it is compatible in the newer
> version. If we are doing our jobs right, users should not end up
> with work (actors, workflow, metadata, user-defined ontologies,
> etc.) stranded in an older version, and they should lose no
> information, including the Kepler instance information, if they
> completely replace their old install with a new install.
>
> Matt
>
> On Thu, Nov 19, 2009 at 8:27 AM, Aaron Schultz <aschultz at nceas.ucsb.edu> wrote:
> Hi Ben, I would agree that this is the most sensible approach:
> have a user-defined directory for persistent data that is
> version-specific, and have a .kepler directory that is transient.
>
> -Aaron
>
>
> On Nov 19, 2009, at 8:05 AM, ben leinfelder
> <leinfelder at nceas.ucsb.edu> wrote:
>
> In response to "What constitutes a version of Kepler" --
> What if we offloaded the burden of figuring that out so that the
> "correct" .kepler is selected by the person actually running an
> instance of Kepler?
> Think about how Eclipse uses workspaces: It's up to me to select the
> appropriate one at startup.
> 1) When I upgrade Eclipse (i.e. Kepler) I can either create a new
> workspace (.kepler) or opt to "convert" the existing one
> (upgrade .kepler)
> 2) If I've installed plugins (i.e. Kepler modules) and they are
> referenced in my workspace (.kepler), but then I attempt to run a
> version of Eclipse (Kepler) that lacks a plugin, that error is my
> fault - I can either install the plugin (Kepler module) or use a
> different workspace (.kepler)
>
> We'd of course lose the ability for different installations to
> share across .kepler directories, but that simplifies things for us.
>
> At the very least, I think it would be a good feature [that
> potentially saves us a lot of trouble] to specify an
> alternative .kepler location on startup.
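
One way the "alternative .kepler location on startup" idea could be sketched, assuming a command-line option first, then an environment variable, then the default in the home directory. The option name --kepler-dir and the variable KEPLER_DIR are invented for illustration; Kepler does not necessarily define either.

```python
import argparse
import os
from pathlib import Path


def resolve_kepler_dir(argv, environ=os.environ):
    """Pick the .kepler directory: CLI option, then env var, then default."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--kepler-dir")  # hypothetical option name
    # parse_known_args ignores any other startup arguments.
    args, _unknown = parser.parse_known_args(argv)

    chosen = args.kepler_dir or environ.get("KEPLER_DIR")  # hypothetical variable
    if chosen:
        return Path(chosen).expanduser()
    return Path.home() / ".kepler"
```

Called as resolve_kepler_dir(sys.argv[1:]), this leaves existing installs on ~/.kepler unless the user explicitly points somewhere else.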
>
> -ben (expanded from Oliver's input)
>
> On Nov 18, 2009, at 7:23 PM, David Welker wrote:
>
> If we want a more sophisticated solution, as Ben is suggesting, I
> think a good place to start thinking about the data that we interact
> with in Kepler is this chart.
>
> Basically, the problem that Ben is getting at is what if we have
> persistent data that is inconsistent across versions? There are two
> ways to handle this problem:
>
> 1.) Make sure that there is no persistent data that is incompatible
> across versions.
> 2.) Store inconsistent persistent data in different folders.
>
> The issue with solution 2 is that it is probably unwieldy. What
> constitutes a different version of Kepler? I would argue that a
> different version of Kepler does not constitute merely a release,
> but rather any unique combination of modules. That is, if I am
> running Kepler 2.0 with the WRP suite, that is a different version
> than Kepler 2.0, which is a different version than Kepler 2.0 with
> the tagging suite, which is different than Kepler 2.0 with the PPOD
> suite, which is different than Kepler 2.0 with a custom ad hoc
> suite, which is different than Kepler 1.0 with the WRP suite... You
> get the idea. The number of different versions is unwieldy. For that
> reason, I feel that solution 2.) is unworkable (although at one time
> I did lean towards that myself).
>
> So, we need to implement solution 1. Basically, we should make sure
> that no input files from .kepler are capable of causing any version
> of Kepler to crash. To do this, we might include meta-data within
> the data files that identifies the sort of data that is stored in a
> particular file or section of a file. Then, a particular version of
> Kepler could be made to be smart enough to know what sorts of data
> it can handle and which sort of data it can't. This also has the
> advantage of avoiding the situation where multiple versions of
> Kepler are unable to share persistent data that is in fact compatible.
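
The metadata idea in solution 1 might look something like the following minimal sketch. It assumes, purely for illustration, a JSON file whose sections each declare a "kind" and a "format_version"; the file layout, the field names, and the SUPPORTED set are hypothetical. A given build reads the sections it recognizes and sets the rest aside.

```python
import json

# Hypothetical: the (kind, format_version) pairs this build understands.
SUPPORTED = {("provenance", 1), ("provenance", 2), ("ontology", 1)}


def load_sections(text, supported=SUPPORTED):
    """Split a file's sections into (readable, skipped) for this build."""
    try:
        sections = json.loads(text)["sections"]
    except (ValueError, KeyError, TypeError):
        # Unparseable file or missing "sections": ignore it, do not fail.
        return [], []
    if not isinstance(sections, list):
        return [], []

    readable, skipped = [], []
    for section in sections:
        kind = section.get("kind") if isinstance(section, dict) else None
        version = section.get("format_version") if isinstance(section, dict) else None
        # Only plain string/int tags can match an entry in `supported`.
        if isinstance(kind, str) and isinstance(version, int) and (kind, version) in supported:
            readable.append(section)
        else:
            skipped.append(section)  # left untouched for a build that knows it
    return readable, skipped
```

Because the decision is made per section rather than per directory, two builds with different module combinations can still share whatever sections both of them declare support for.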
>
> Back in the day, Tim and I made a chart that explores the sort of
> data that we interact with in Kepler. It might be useful to think
> about. Check that out here:
>
> https://kepler-project.org/developers/teams/framework/classification-of-persistent-system-state
>
>
> On Nov 18, 2009, at 8:41 PM, ben leinfelder wrote:
>
> A few questions:
> 1) What other files in .kepler will be considered "persistent"?
> 2) What is the plan for different versions of Kepler running with
> the same .kepler?
> a) do we have another directory branch for each version:
> ~/.kepler/kepler-2.0/...
> ~/.kepler/kepler-2.1/...
> 3) When thinking about multiple instances of Kepler on one machine
> are we aiming to support:
> a) different versions being run at different times
> b) the same version being run concurrently
> c) different versions being run concurrently
>
> Depending on what we support, we'll have to revisit the embedded
> HSQLDB that is used for the cache and provenance in addition to
> these directory structures.
>
> -ben
>
> On Nov 18, 2009, at 2:08 PM, Derik Barseghian wrote:
>
> Kepler devs,
>
> After some discussion with Aaron, Ben, Dan, and Chad, I'm wondering
> if anyone objects to dividing .kepler into two different areas --
> there would be areas for 1) persistent items (e.g. provenance
> database) and 2) temporary items (e.g. cache). This would make it
> more apparent which things could be deleted without serious
> ramification (temp/), and the idea would be items in persistent/
> should stick around and be dealt with during Kepler upgrades for
> backwards compatibility.
>
> Also, I think we should utilize the InstanceAuthNamespace in these
> paths, so that items from different Kepler instances are separated.
>
> This could look like (imagine multiple namespace dirs):
>
> a)
> .kepler/persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
> .kepler/temp/gamma.msi.ucsb.edu.OpenAuth.1278/
>
> or b)
> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/persistent/
> .kepler/gamma.msi.ucsb.edu.OpenAuth.1278/temp/
>
> or c)
> .kepler_temp/gamma.msi.ucsb.edu.OpenAuth.1278/
> .kepler_persistent/gamma.msi.ucsb.edu.OpenAuth.1278/
>
> I prefer a).
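
Layout a) can be sketched as a small path helper. The function name instance_dirs and the idea of passing the namespace in as a plain string are assumptions made for illustration; how Kepler actually exposes the InstanceAuthNamespace is not shown in this thread.

```python
from pathlib import Path


def instance_dirs(base: Path, instance_namespace: str) -> dict:
    """Return the persistent and temp directories for one instance, layout a)."""
    return {
        "persistent": base / "persistent" / instance_namespace,
        "temp": base / "temp" / instance_namespace,
    }


# Example using the namespace from the proposal above.
dirs = instance_dirs(Path.home() / ".kepler", "gamma.msi.ucsb.edu.OpenAuth.1278")
```

With this layout, clearing all temporary data for every instance is a single delete of the temp/ subtree, which is the property that motivates keeping the two areas apart at the top level.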
>
> This partly came out of a discussion of bug 4514. I think the
> configuration files could be stored beneath these new paths,
> probably in persistent.
> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=4514
>
> A better solution might be to just have a .kepler to store temporary
> things, and to store persistent items in an OS-appropriate location,
> but I think this might be a larger change than we want to take on at
> the moment, as we try to get 2.0 out of the door.
>
> Let me know what you think,
> Derik
> _______________________________________________
> Kepler-dev mailing list
> Kepler-dev at kepler-project.org
> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-dev