EML questions at HFR
Peter McCartney
peter.mccartney at asu.edu
Tue Apr 8 09:17:19 PDT 2003
Well that's a pretty significant qualifier that I think clearly negates
relying on a simple synchronized system that doesn't have the ability to
archive and manage links among various versioned documents. My point was
simply: whether you produce EML dynamically and give it a new packageId
every time you do so, or whether you religiously assign new version numbers
whenever an edit is made, you have to make some management decisions that
control when older content should or should not be changed. The solution
that NCEAS proposed of whole and fractional identifiers that are all
variants on a single "concept" (to borrow a term from the taxonomy people)
is one way of doing it, but it's just one way of managing it - it's not built
into EML.
EML is a standard...not a set of rules!
Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental-Studies
Arizona State University
-----Original Message-----
From: Matt Jones [mailto:jones at nceas.ucsb.edu]
Sent: Tuesday, April 08, 2003 8:06 AM
To: Emery R. Boose
Cc: Peter McCartney; 'David Blankman'; Iml; Eml-Dev (E-mail); Jeanine
Subject: Re: EML questions at HFR
Actually, there are those that would argue (me :) that EML content
should definitely change dynamically as the management database changes.
However, I would qualify that by saying that those changes should be
easily recognizable by also changing the identifier for the EML
documents so that external parties can see that there was a change. I
don't know of anyone who would argue that metadata should be static --
rather, it should track the data it is describing, but those changes
should be easily recognizable as new revisions of content published earlier.
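
As a rough illustration of "change the identifier whenever the content changes", here is a minimal Python sketch. It assumes an identifier of the form scope.number.revision and a digest taken over the generated content with the identifier itself left out; both are assumptions for illustration, not something EML prescribes, and the function names and the "hfr.1.4" identifier are hypothetical.

```python
import hashlib


def content_digest(eml_body: str) -> str:
    """Return a SHA-256 digest of the EML content, ignoring surrounding whitespace."""
    return hashlib.sha256(eml_body.strip().encode("utf-8")).hexdigest()


def next_package_id(package_id: str, published_body: str, regenerated_body: str) -> str:
    """Bump the trailing revision number only when the content actually changed.

    Assumes (hypothetically) an identifier shaped like "<scope>.<number>.<revision>".
    """
    if content_digest(published_body) == content_digest(regenerated_body):
        return package_id  # nothing changed, keep the published identifier
    prefix, revision = package_id.rsplit(".", 1)
    return f"{prefix}.{int(revision) + 1}"


published = "<dataset><title>Met station</title></dataset>"
regenerated = "<dataset><title>Met station, 2003 update</title></dataset>"
print(next_package_id("hfr.1.4", published, regenerated))  # prints hfr.1.5
```

The earlier revision would still need to be archived somewhere so that external parties can compare it with the new one.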
I think the easiest way to sync with an external source of responsible
party info is to provide the userId field pointing at your directory
system. Thus, when I download your data and metadata and store it on my
disk, and in six years I read it and get an out-of-service message for
Joe Scientist at 804-678-9870, I can look at the userId and directory
and look up his current phone and address (assuming your directory is
still in place and working as it was originally when the metadata were
distributed, which is a bit of a stretch but possible).
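
A small Python sketch of that lookup, under stated assumptions: the DIRECTORY dictionary is a hypothetical stand-in for a live directory service, and the directory URL, the "jscientist" userId, and the replacement phone number are made up for the example. Only the userId element with its directory attribute comes from EML.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a directory service, keyed by (directory, userId).
DIRECTORY = {
    ("ldap://directory.example.org", "jscientist"): {"phone": "804-555-0100"},
}

# A party fragment as it might sit on disk years after it was downloaded.
STORED_PARTY = """
<creator>
  <individualName><surName>Scientist</surName></individualName>
  <phone>804-678-9870</phone>
  <userId directory="ldap://directory.example.org">jscientist</userId>
</creator>
"""


def current_phone(party_xml: str, directory: dict) -> str:
    """Prefer the directory's phone number; fall back to the one stored in the metadata."""
    party = ET.fromstring(party_xml)
    stored = party.findtext("phone")
    user = party.find("userId")
    if user is None:
        return stored
    entry = directory.get((user.get("directory"), (user.text or "").strip()))
    return entry["phone"] if entry and "phone" in entry else stored


print(current_phone(STORED_PARTY, DIRECTORY))  # prints 804-555-0100
```

If the directory is gone, the function falls back to the stale number from the stored metadata, which is the "bit of a stretch" case.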
Matt
Emery R. Boose wrote:
> Interesting ... We have numerous ongoing projects that will require
> EML
> updates at least annually. And we'd like to provide up-to-date contact
> information (address, phone, email) and publications for all projects,
> though that information could be provided in other ways and not
> necessarily as part of the EML.
>
>
> At 09:16 AM 4/7/2003 -0700, Peter McCartney wrote:
>
>> Well, there are those that would argue that once published, EML
>> content
>> shouldn't be allowed to change automatically according to changes in a
>> current management database. But there are two sides to the coin. When
>> area codes change (as they do in Arizona every 5 years or so), you
>> don't want all your data contacts broken. However, as project
>> participants work their way up the hierarchy, you don't want their roles
>> on past projects to be changed to reflect their roles on current ones.
>> So it's really an issue for everyone deciding when and when not to
>> update information in EML.
>>
>>
>> Peter McCartney (peter.mccartney at asu.edu)
>> Center for Environmental-Studies
>> Arizona State University
>>
>>
>> -----Original Message-----
>> From: Emery R. Boose [mailto:boose at fas.harvard.edu]
>> Sent: Sunday, April 06, 2003 4:32 PM
>> To: Peter McCartney; 'David Blankman'
>> Cc: Iml; Eml-Dev (E-mail); Jeanine
>> Subject: RE: EML questions at HFR
>>
>> Hi Peter & David,
>>
>> Thanks for such rapid and helpful responses!
>>
>> Sounds like synchronization may be a critical issue for those
>> intending to use EML for content storage (and not just exchange)
>> ...
>>
>> Best, Emery
>>
>>
>>
>> At 03:04 PM 4/4/2003 -0700, Peter McCartney wrote:
>>
>>> What we were striving for in EML was to ensure that there was
>>> a place to describe the research activities that were
>>> directly responsible for the production of a dataset. In my
>>> opinion a core area program or an LTER site project is
>>> probably a related project to the specific project. But from
>>> the data management workshop, it became obvious that many
>>> sites don't have this concept of specific research projects,
>>> so we removed the methods discussion from the project
>>> description to ensure that the specific activities related to
>>> this dataset could be expressed even though they were not
>>> part of the project description. So the bottom line is:
>>> describe the smallest resolution research activity as the
>>> project, and then link related project descriptions for any
>>> "parent" projects.
>>>
>>>
>>> EML requires that the original content for any block of
>>> information that is referenced with <references> must appear
>>> once in the document. That means that if you want to have a
>>> unique list of content items that you only have to enter
>>> once, you must manage that list separately and then draw from
>>> it when you create your EML files. One obvious way is to have
>>> a database of these things (for example - you already have
>>> the LTER persBIB database from which you could draw personnel
>>> descriptions) and then write some tool that pastes that
>>> content into your EML files as you build them up. Another way
>>> would be to manage an XML file with party fragments in it as you
>>> describe, and copy and paste from that into your EML files as
>>> you go. There is really no effective way to maintain the
>>> synchronization between your EML files and the external
>>> source. The system and scope attributes can be used to
>>> indicate that this content is identified in an external
>>> source, but you'd have to write your own custom software to
>>> manage the relationship. We're discovering that this can lead
>>> to key violations since the XML parsers don't consider the
>>> scope attribute when enforcing the unique constraint on IDs,
>>> so we're now looking at writing our own metadata into
>>> additionalMetadata to manage the relationship between content
>>> blocks and our relational database.
>>>
>>> You can enter discontinuous polygons in geographic coverage
>>> if you want to represent the study areas of two LTER sites as
>>> the study area of a project. Bear in mind that geographic
>>> coverage at the resource level is primarily discovery
>>> information. There is a separate section under methods where
>>> you can define the extent of your study area with respect to
>>> its spatial sampling universe. In principle these are the
>>> same, but entries in /dataset/coverage tend to be imprecise
>>> bounding boxes.
>>>
>>>
>>> Peter McCartney (peter.mccartney at asu.edu)
>>> Center for Environmental-Studies
>>> Arizona State University
>>
> _______________________________________________ eml-dev mailing list
> eml-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/eml-dev
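
On the key-violation point in the oldest quoted message (parsers enforce uniqueness on id without looking at scope), a quick pre-check along these lines might catch collisions before validation. This is only a sketch: the SAMPLE fragment and its id values are invented, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(eml_text: str) -> list[str]:
    """Return id values that occur more than once, ignoring the scope attribute."""
    root = ET.fromstring(eml_text)
    counts = Counter(el.get("id") for el in root.iter() if el.get("id") is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Invented fragment: the same id is reused with different scope values.
SAMPLE = """
<dataset>
  <creator id="23" scope="system"><organizationName>Site A</organizationName></creator>
  <contact id="23" scope="document"><organizationName>Site B</organizationName></contact>
  <contact id="24" scope="document"><organizationName>Site C</organizationName></contact>
</dataset>
"""

print(duplicate_ids(SAMPLE))  # prints ['23']
```

Anything this reports would be rejected by a validating parser regardless of scope, so database keys pasted in as ids would need a prefix or some other disambiguation.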