[kepler-dev] Meeting about .kepler and related issues

Aaron Schultz aschultz at nceas.ucsb.edu
Wed Jun 3 10:35:11 PDT 2009


David, thanks for clearing that up. I now understand what you mean by a 
Kepler installation being in a user-unwritable area.

Previously I gave three solutions for determining the location of the 
run-time module writable areas:

(1) creating a subdirectory in the project root folder for modules to 
write to
(2) storing the path to the run-time module writable area somewhere 
within the project root folder
(3) having a database that each Kepler installation connects to in 
order to select the path to its run-time module writable area

For installations in user-unwritable areas, solution (1) is not a 
viable option.
Solution (2) is viable as long as the file storing the path never needs 
to be updated by the user.
Solution (3) is viable and could even be made user-specific by storing 
the username along with the path information in a database that is 
common to all Kepler instances.
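
A minimal sketch of how solution (2) could behave on a read-only 
installation, assuming the path is kept as plain text in a small file in 
the project root. The class name WritableAreaLocator, the file name 
"writable-area.path" and the fallback to a .kepler directory under the 
user's home are hypothetical and are not taken from the Kepler code:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WritableAreaLocator {
    // Hypothetical name for the file that stores the path (not a real Kepler file).
    static final String POINTER_FILE = "writable-area.path";

    /**
     * Returns the run-time writable area for the installation rooted at projectRoot.
     * The project root is only read, never written, so this also works when
     * the installation lives in a read-only area.
     */
    static Path resolve(Path projectRoot, Path userHome) throws IOException {
        Path pointer = projectRoot.resolve(POINTER_FILE);
        if (Files.isRegularFile(pointer)) {
            String stored = new String(Files.readAllBytes(pointer), StandardCharsets.UTF_8).trim();
            if (!stored.isEmpty()) {
                return Paths.get(stored);
            }
        }
        // Assumed default when the pointer file is absent or empty.
        return userHome.resolve(".kepler");
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println(resolve(root, home));
    }
}
```

Because the lookup only reads from the project root, an administrator 
could write the pointer file once at install time and leave the 
installation read-only afterwards.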

Currently solution (2) is implemented: the InstanceAuthNamespace file is 
written to the project root folder the first time Kepler is run.
This file never needs to be updated by the user.  This solution also 
lets the administrator choose whether multiple Kepler installations 
should use the same run-time module writable areas (by copying the 
InstanceAuthNamespace file to the project root folder of each installed 
instance of Kepler).
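
The sharing behaviour described above can be sketched as follows. This 
is an illustration under stated assumptions, not the existing 
implementation: the class InstanceId and its methods are hypothetical, 
and the id is stored as plain text here, whereas the real 
InstanceAuthNamespace file holds a Java serialized string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

public class InstanceId {
    // File name taken from this thread; plain-text content is an assumption of this sketch.
    static final String ID_FILE = "InstanceAuthNamespace";

    /** Reads the id from the project root, creating it with a random UUID if it is missing. */
    static String loadOrCreate(Path projectRoot) throws IOException {
        Path idFile = projectRoot.resolve(ID_FILE);
        if (!Files.exists(idFile)) {
            // Only possible while the project root is still writable.
            Files.write(idFile, UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
        }
        return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
    }

    /** Maps an installation to its own subdirectory of a shared base directory. */
    static Path writableArea(Path projectRoot, Path baseDir) throws IOException {
        return baseDir.resolve(loadOrCreate(projectRoot));
    }
}
```

Two installations with different id files resolve to different 
subdirectories of the base directory; copying one installation's id file 
into the other makes both resolve to the same subdirectory.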

Solution (3) would require multiple installations of the hsql database 
engine: one common hsql installation to store the path information (and 
potentially the user information) and one hsql installation for each 
Kepler instance on the machine (and potentially more hsql installations 
for each user-specified run-time module writable area).

Aaron


David Welker wrote:
> Hi Aaron,
>
> It is more clear to me what you are thinking now. You are thinking the 
> reason we cannot write to parts of Kepler is because those parts will 
> be packaged as jars. But that isn't the issue. The issue is that 
> sometimes Kepler is going to be installed in a read-only area. This 
> has been done at Stanford and here at UC Davis when Bertram had Kepler 
> installed for his students. In a recent bug report, this was 
> the case for someone affiliated with the European Southern 
> Observatory. Allowing system administrators to install Kepler in a 
> read-only area is a really important installation scenario.
>
> So, now do you see the reason you cannot write things to the 
> kepler.modules directory? You cannot assume you have write access. It 
> has nothing to do with jar files.
>
> I am completely in agreement with you that the cache should be a real 
> cache. But keep in mind that the .kepler directory does not need to be 
> the cache. Instead, it could be a directory that happens to store the 
> cache in one subdirectory and other data in another subdirectory. The 
> cache could be stored in a separate subdirectory from where other user 
> data associated with Kepler is stored. Alternatively, we could have 
> .kepler-cache store the cache, while .kepler-data stores data or 
> something like that.
>
> David
>>
>>
>> Sure sounds good, 2pm to 3pm is fine for me.  I will have to leave at 
>> 3pm.
>>
>> Any solution is fine with me as long as each module of each Kepler 
>> installation has a place on disk that it can write to that does not 
>> clash with other modules from other installations.
>>
>> Whether that solution is creating a subdirectory in the project root 
>> folder for modules to write to,
>> storing the path to a location on disk that a module can write to (as 
>> the InstanceAuthNamespace file is essentially doing now),
>> having a database that each Kepler installation connects to in order 
>> to select the directory name that it is meant to use for writing,
>> or some other solution....
>>
>> whether or not the cache is stored in that directory, just depends on 
>> what you want your cache to be doing,
>>
>> is it really the library where you want to keep information that 
>> exists nowhere else?
>> or is it just a useful tool for making the program run faster?
>> If it is just there to make things faster then it is simply a 
>> reflection of what is on disk
>> and therefore can be rebuilt and deleted without effort (eliminating 
>> duplication)
>>
>> Regardless of the cache, each module in each Kepler installation 
>> needs a unique location on disk it can use for writing and reading 
>> that is not inside its own module directory (since this will someday 
>> be jarred and become read-only through the publication process, as I 
>> currently understand it).
>>
>> In Eclipse the writable directory for modules (aka bundles) is the 
>> "eclipse/configuration/<bundle name>/" directory.
>> The modules (aka bundles) themselves are stored in
>> "eclipse/plugins/<bundle name>.jar",
>> where only the jar is a read-only entity, preferably signed with a 
>> security certificate.
>>
>> I would point out that the core configuration file for an eclipse 
>> instance does not reside inside a module but inside the project root 
>> folder
>> then each bundle can contribute configuration information to the 
>> application as a whole.
>>
>> I see no reason why we cannot write things to the project root 
>> directory at the base system level.
>>
>> Aaron
>>
>>
>> Matt Jones wrote:
>>> Hi All,
>>>
>>> I can not make a meeting this week, but I will be very interested in
>>> the proposal that will come out of this discussion, and will comment
>>> on it when I see a concrete proposal.  In the meantime, I'd second the
>>> idea that we should not be writing new or temporary files to the
>>> kepler installation directories -- any temporary or instance-specific
>>> files that need to be preserved across upgrades probably should reside
>>> in .kepler.  I don't fully understand what is being proposed yet, but
>>> with my current limited perspective I can see some issues with having
>>> multiple subdirectories within .kepler for each version -- for one
>>> thing it would represent a lot of unnecessary duplication.  For
>>> example, there would be multiple copies of the cache database and
>>> of the cached data files. Looking forward to seeing your
>>> proposal when I return next week.
>>>
>>> Matt
>>>
>>> On Tue, Jun 2, 2009 at 3:52 PM, David Welker 
>>> <david.v.welker at gmail.com> wrote:
>>>  
>>>> Matt, Tim, Chad, Aaron and other interested parties:
>>>>
>>>> Would you be available to meet on Thursday at 2 pm to discuss the 
>>>> issues
>>>> identified below?
>>>>
>>>> David
>>>>   
>>>>> Feel free to set up a meeting.
>>>>>
>>>>> David Welker wrote:
>>>>>     
>>>>>> A couple of points:
>>>>>>
>>>>>> (1) The question of what support we are going to have for
>>>>>> multiple installations of Kepler versus having one installation 
>>>>>> of Kepler
>>>>>> that is configurable has yet to be fully developed via 
>>>>>> discussion. Your
>>>>>> design assumes that we are supporting multiple installations of 
>>>>>> Kepler on
>>>>>> the same machine and that each of these multiple installations 
>>>>>> has its
>>>>>> own cache. Those are all reasonable ideas, but these ideas have 
>>>>>> yet to be
>>>>>> discussed and agreed upon by Kepler management or the various 
>>>>>> development
>>>>>> teams. This is not a minor implementation decision, and it has 
>>>>>> implications
>>>>>> that need to be discussed.
>>>>>>
>>>>>> (2) The implications of the fact that Kepler could be installed in a
>>>>>> read-only area are far reaching. It certainly affects the build 
>>>>>> system.
>>>>>> Right now, modules.txt in the build-area is regularly rewritten 
>>>>>> with every
>>>>>> issuance of the change-to command. But, what if Kepler is 
>>>>>> installed in a
>>>>>> read-only area? It sounds like in that case, we are going to have 
>>>>>> to think
>>>>>> about what we can do if modules.txt cannot be written. One 
>>>>>> option is
>>>>>> having a version of modules.txt in the .kepler directory that is 
>>>>>> read from
>>>>>> instead. So, this is definitely not unrelated to the general 
>>>>>> work you
>>>>>> are doing with .kepler right now. Also, we need to think about 
>>>>>> where kars
>>>>>> are built. Originally, kars were built in the common module. 
>>>>>> Then, they were
>>>>>> built in the kepler.modules directory. Then, they were built in 
>>>>>> the module
>>>>>> where they came from. However, if Kepler is installed in a read-only
>>>>>> location, none of these solutions for where kars are built would be
>>>>>> acceptable. We would probably want to build kars in the .kepler 
>>>>>> or similar
>>>>>> directory in the user directory.
>>>>>>
>>>>>> (3) There are also implications here for the configuration 
>>>>>> system. In
>>>>>> general, the system default configurations could be in a 
>>>>>> read-only area.
>>>>>> Therefore, user changes to configuration need to be stored in a 
>>>>>> different
>>>>>> location than the original configuration files, which would 
>>>>>> contain default
>>>>>> configuration options.
>>>>>>
>>>>>> (4) For development purposes, I do not think we need to support 
>>>>>> caches
>>>>>> for multiple installations of Kepler. However, for your testing 
>>>>>> purposes, it
>>>>>> would be acceptable to have the build system generate these sorts 
>>>>>> of files,
>>>>>> just as the installer would. But, before that, we need to think 
>>>>>> about what
>>>>>> sort of support we are going to provide for multiple 
>>>>>> installations of Kepler
>>>>>> and only then go forward with an implementation or strategy.
>>>>>>
>>>>>> (5) Kepler itself should not try to write files that in the 
>>>>>> normal course
>>>>>> of events should be written by the installer only. Instead, the 
>>>>>> system
>>>>>> should have a sensible default behavior that it utilizes when 
>>>>>> such files are
>>>>>> absent.
>>>>>>
>>>>>> (6) If you delete "only one file" in an installation of Kepler or 
>>>>>> any
>>>>>> other software, there is a good chance that this will in fact 
>>>>>> mess up the
>>>>>> software. Of course, the more robust we can make the system, the 
>>>>>> better.
>>>>>> However, there is no reason to think that people will be likely 
>>>>>> to delete
>>>>>> this file. Especially if it is written to a less visible location 
>>>>>> in a
>>>>>> subdirectory of the build-area folder. Of course, I must admit 
>>>>>> that when I
>>>>>> see a file like this in kepler.modules, I am in fact tempted to 
>>>>>> delete it
>>>>>> because it doesn't feel like it belongs there.
>>>>>>
>>>>>> (7) In general, we should avoid storing non-module files in
>>>>>> kepler.modules to reduce the probability of name clashes.
>>>>>>
>>>>>> (8) Also, if the purpose of the class is to uniquely identify a
>>>>>> particular installation of Kepler, I think a better name than
>>>>>> InstanceAuthNamespace should be selected. It is not at all clear 
>>>>>> from the
>>>>>> file names what the function of this file is -- although one gets 
>>>>>> a sense of
>>>>>> how it does whatever it does do (i.e. with namespaces). I think 
>>>>>> in general,
>>>>>> it would be better to select file names thinking about "what" 
>>>>>> rather than
>>>>>> "how," especially as implementations are liable to change in the 
>>>>>> future.
>>>>>>
>>>>>> Anyway, I think we need to have a discussion about these issues that
>>>>>> includes Tim and Matt and other interested parties at a minimum, 
>>>>>> and not
>>>>>> just you, me, and Chad. We really need to think about the 
>>>>>> implications of
>>>>>> these decisions before moving forward.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>       
>>>>>>> I think it's ok to have that file in the project root, alongside 
>>>>>>> all of
>>>>>>> the read-only module jars, the installer could initially 
>>>>>>> generate the file,
>>>>>>> but we still need the runtime mechanism for development 
>>>>>>> purposes.  It also
>>>>>>> increases the reliability quite a bit, imagine if I had to rerun my
>>>>>>> installer just because I deleted that one file...
>>>>>>>
>>>>>>> Aaron
>>>>>>>
>>>>>>>
>>>>>>> David Welker wrote:
>>>>>>>         
>>>>>>>> If you are going to have a file that uniquely identifies an
>>>>>>>> installation and that file is going to be written to 
>>>>>>>> kepler.modules (I think
>>>>>>>> a better location would be in a subfolder of the build-area 
>>>>>>>> folder where it
>>>>>>>> is not likely to be seen very often), then it must be written by 
>>>>>>>> the installer
>>>>>>>> and not by any other process. In general, Kepler is likely to 
>>>>>>>> be installed
>>>>>>>> in a read-only area of disk, so such a file must be generated 
>>>>>>>> only once at
>>>>>>>> installation time.
>>>>>>>>
>>>>>>>> David
>>>>>>>>           
>>>>>>>>> I thought that would work (and actually moved it there 
>>>>>>>>> yesterday) but
>>>>>>>>> the problem arises that if you have two different 
>>>>>>>>> installations of Kepler on
>>>>>>>>> the same machine they both end up using the 
>>>>>>>>> .kepler/InstanceAuthNamespace
>>>>>>>>> file and therefore the same cache.  Now this may or may not be 
>>>>>>>>> what you
>>>>>>>>> want.  Say I have a Kepler base configuration 
>>>>>>>>> installation
>>>>>>>>> and a Kepler WRP installation, there are cases where I may 
>>>>>>>>> want them to use
>>>>>>>>> the same cache or I may not want to use the same cache.  If the
>>>>>>>>> InstanceAuthNamespace is in .kepler then they both MUST use 
>>>>>>>>> the same cache.
>>>>>>>>>  If the InstanceAuthNamespace file is stored at the project 
>>>>>>>>> root then they
>>>>>>>>> can both use different caches or they can both use the same 
>>>>>>>>> cache (by
>>>>>>>>> copying the InstanceAuthNamespace file to both project root 
>>>>>>>>> directories).
>>>>>>>>>
>>>>>>>>> Aaron
>>>>>>>>>
>>>>>>>>> Chad Berkley wrote:
>>>>>>>>>             
>>>>>>>>>> Could it go in the root of the .kepler directory so the 
>>>>>>>>>> system will
>>>>>>>>>> know where to find it?  Or maybe in some 'common' directory 
>>>>>>>>>> in .kepler that
>>>>>>>>>> doesn't rely on the unique id system?
>>>>>>>>>>
>>>>>>>>>> chad
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Aaron Schultz wrote:
>>>>>>>>>>               
>>>>>>>>>>> Hi Chad,
>>>>>>>>>>>
>>>>>>>>>>> The InstanceAuthNamespace file contains a java serialized 
>>>>>>>>>>> string for
>>>>>>>>>>> the unique authority and namespace of any given kepler 
>>>>>>>>>>> installation (which
>>>>>>>>>>> is either retrieved from a webservice or assigned via UUID).
>>>>>>>>>>>
>>>>>>>>>>> For each installation of kepler on a given machine there 
>>>>>>>>>>> would be
>>>>>>>>>>> (actually there now is) a subdirectory in the .kepler 
>>>>>>>>>>> directory that was
>>>>>>>>>>> named using the authority and namespace.
>>>>>>>>>>> Each of these .kepler/instance directories would contain 
>>>>>>>>>>> its own
>>>>>>>>>>> cache as we discussed yesterday....
>>>>>>>>>>>
>>>>>>>>>>> So we need that one file to find all the other files for any 
>>>>>>>>>>> given
>>>>>>>>>>> installation of kepler...
>>>>>>>>>>>
>>>>>>>>>>> Aaron
>>>>>>>>>>>
>>>>>>>>>>> Chad Berkley wrote:
>>>>>>>>>>>                 
>>>>>>>>>>>> Hi Aaron,
>>>>>>>>>>>>
>>>>>>>>>>>> I've been seeing this file appear and wondering what it 
>>>>>>>>>>>> was.  I
>>>>>>>>>>>> think things like this should be written to .kepler or some 
>>>>>>>>>>>> other user
>>>>>>>>>>>> directory.  I agree with David that the kepler project 
>>>>>>>>>>>> directory should
>>>>>>>>>>>> probably not be written to in general.  Maybe this type of 
>>>>>>>>>>>> file should be
>>>>>>>>>>>> written to the cache so that it could be programmatically 
>>>>>>>>>>>> purged if and when
>>>>>>>>>>>> it needs to be.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>> chad
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> David Welker wrote:
>>>>>>>>>>>>                   
>>>>>>>>>>>>> Hi Aaron,
>>>>>>>>>>>>>
>>>>>>>>>>>>> In general, you need to avoid writing anything to the
>>>>>>>>>>>>> kepler.modules project root directory. In the general 
>>>>>>>>>>>>> case, this directory
>>>>>>>>>>>>> is likely to be stored in a read-only area of disk.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>                     
>>>>>>>>>>>>>> You will want to delete the InstanceAuthNamespace file in 
>>>>>>>>>>>>>> your
>>>>>>>>>>>>>> project root directory.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise you will have troubles getting proper object 
>>>>>>>>>>>>>> ids for
>>>>>>>>>>>>>> LSIDs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Aaron
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Kepler-dev mailing list
>>>>>>>>>>>>>> Kepler-dev at kepler-project.org
>>>>>>>>>>>>>> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-dev 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                         
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>                       
>>>>>>>>>>>                   
>>>>>>>>>               
>>>>>>>           
>>>>>       
>>>>
>>>>     
>>>
>>>
>>>
>>>   
>>
>>
>
>


