[LTER-emlbestpractices] RE: [LTER-im] [Fwd: [Fwd: Re: FW: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
Duane Costa
dcosta at lternet.edu
Mon Sep 20 16:25:48 PDT 2004
I haven't received any feedback on this since I sent the original message
out ten days ago.
To recap, we would like to know how the business rules, as enforced by the
EML parser, should handle the case described below. If the system attribute
in the <references> element is considered required in this case, the
programming enhancement in the EML parser would be easier to implement, but
it makes things a little less flexible in composing the EML.
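
To make the two candidate rules concrete, here is a minimal Python sketch. It
is not the EML parser's actual code: the function name check_references and
the require_system flag are made up for illustration, and the document is the
example from the original message with the elided parts left out.

```python
import xml.etree.ElementTree as ET

# The example from the original message, with the elided parts left out.
DOC = """
<dataset>
  <title>Sample dataset Description</title>
  <creator id="23445" scope="system" system="abc">
    <individualName><surName>Smith</surName></individualName>
  </creator>
  <contact>
    <references>23445</references>
  </contact>
</dataset>
"""


def check_references(root, require_system):
    """Return a list of error strings for the <references> elements."""
    # Index every element that carries an id attribute by its id value.
    by_id = {}
    for elem in root.iter():
        if "id" in elem.attrib:
            by_id.setdefault(elem.get("id"), []).append(elem)

    errors = []
    for ref in root.iter("references"):
        ref_id = (ref.text or "").strip()
        ref_system = ref.get("system")
        candidates = by_id.get(ref_id, [])
        if ref_system is not None:
            # An explicit system narrows the candidates to that system.
            matches = [e for e in candidates if e.get("system") == ref_system]
        elif require_system and any(e.get("system") for e in candidates):
            # Strict rule: the target declares a system, so the reference must too.
            errors.append(f"references {ref_id}: system attribute required")
            continue
        else:
            # Lenient rule: fall back to matching on id alone.
            matches = candidates
        if not matches:
            errors.append(f"references {ref_id}: no matching id")
        elif len(matches) > 1:
            errors.append(f"references {ref_id}: ambiguous, {len(matches)} matches")
    return errors


root = ET.fromstring(DOC)
print(check_references(root, require_system=False))
print(check_references(root, require_system=True))
```

Under the lenient rule the example document passes; under the strict rule the
bare <references> is reported as missing its system attribute. The lenient
branch is the one that needs the extra ambiguity check, which is where the
added complexity would come from.
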
Knowing how this case should be handled will help us produce a more accurate
estimate of the effort required for the EML Parser enhancement that is
needed to support the recent schema change made by Peter (i.e. the schema
change that allows two elements with the same id value, but different system
values, to exist within the same document).
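
For reference, the uniqueness rule implied by that schema change could be
sketched as follows. This assumes that uniqueness is judged on the
(id, system) pair and that a missing system attribute is treated as an empty
string; both are assumptions for illustration only, and the function name
duplicate_keys is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_keys(root):
    """Return the (id, system) pairs that occur more than once, sorted."""
    counts = Counter(
        # A missing system attribute is treated as "" (an assumption).
        (elem.get("id"), elem.get("system", ""))
        for elem in root.iter()
        if "id" in elem.attrib
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Same id, different system values: allowed under the assumed rule.
ok = ET.fromstring(
    '<eml><dataset id="30" system="ces_dataset"/>'
    '<creator id="30" system="ces_party"/></eml>'
)
# Same id and same system on two elements: still a duplicate.
bad = ET.fromstring(
    '<eml><creator id="30" system="ces_party"/>'
    '<contact id="30" system="ces_party"/></eml>'
)
print(duplicate_keys(ok))
print(duplicate_keys(bad))
```
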
Thanks,
Duane
> -----Original Message-----
> From: eml-dev-admin at ecoinformatics.org [mailto:eml-dev-
> admin at ecoinformatics.org] On Behalf Of Duane Costa
> Sent: Friday, September 10, 2004 3:58 PM
> To: eml-dev at ecoinformatics.org
> Subject: RE: [LTER-emlbestpractices] RE: [LTER-im] [Fwd: [Fwd: Re: FW:
> Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
>
> This is the follow-on email Mark promised below. Following Matt's example, I
> am sending it to eml-dev rather than only to the original recipients.
>
> The main issue Mark and I have come across is that the business rules will
> need to clarify the following case:
>
> An element is specified with both an id attribute and a system attribute.
> The id is unique (i.e. its value is unique) within the document. A
> subsequent element references the id but does not contain a system
> attribute. Is this legal?
>
> One could make the case that since the id is unique within the document, the
> reference to the id does not need to specify the system attribute because
> there is no chance of ambiguity.
>
> Here is a concrete example to illustrate this case:
>
> <dataset>
> <title>Sample dataset Description</title>
> <creator id="23445" scope="system" system="abc">
> <individualName>
> <surName>Smith</surName>
> </individualName>
> </creator>
> .
> .
> .
> <contact>
> <references>23445</references>
> </contact>
>
> Should the <references> element be considered valid if it does not include
> the appropriate system attribute (as above), provided that the id is unique
> within the document, or should the system attribute be required (as below)
> regardless of whether or not the id is unique within the document?
>
> .
> .
> .
> <contact>
> <references system="abc">23445</references>
> </contact>
>
> We see the advantages and disadvantages as follows:
>
> (1) If the system attribute is optional, it gives the user more flexibility
> in composing a valid document, but it complicates the error checking logic
> in the EML parser.
>
> (2) If the system attribute is required, the user is more restricted, and
> existing documents might be invalidated (we're not able to judge how real
> a problem this might be). The advantage of this
> approach is that it simplifies the new logic that needs to be added to the
> EML parser. It also enforces symmetry between the referenced element and the
> referencing element: if the referenced element specifies that it is in the
> scope of a given system, then the referencing element should also specify
> that it is in the scope of that same system.
>
>
> Duane
>
>
> > -----Original Message-----
> > From: Mark Servilla [mailto:servilla at lternet.edu]
> > Sent: Friday, September 10, 2004 12:44 PM
> > To: Peter McCartney; Matt Jones
> > Cc: James Brunt; Duane Costa
> > Subject: Re: [LTER-emlbestpractices] RE: [LTER-im] [Fwd: [Fwd: Re: FW:
> > Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
> >
> > Peter/Matt,
> >
> > Duane and I have discussed some of the EML Parser modifications required
> > by the schema change. A rough estimate would be 1-2 weeks of an FTE.
> > There are, however, some unknowns depending on how one would view the
> > business rule logic for determining EML validity. Duane will summarize
> > this (by examples) in a follow-on email.
> >
> > Sincerely,
> > Mark
> >
> > ---
> > Mark Servilla, Ph.D.
> >
> > LTER Network Office
> > Department of Biology
> > MSC 03 2020
> > 1 University of New Mexico
> > Albuquerque, NM 87131-0001
> >
> > servilla at lternet.edu
> > Office (505) 277-2619
> > Cell (505) 453-8593
> >
> >
> > Peter McCartney wrote:
> > > Ok thanks Matt. I see you already created a 2.02 milestone. So I think
> > > we need some feedback on 1662 and we can probably move forward.
> > >
> > > Peter McCartney (peter.mccartney at asu.edu)
> > > Center for Environmental-Studies
> > > Arizona State University
> > >
> > >
> > >
> > >
> > >>-----Original Message-----
> > >>From: Matt Jones [mailto:jones at nceas.ucsb.edu]
> > >>Sent: Thursday, September 02, 2004 9:50 AM
> > >>To: Peter McCartney
> > >>Cc: eml-dev at ecoinformatics.org; emlbestpractices at lternet.edu;
> > >>im at lternet.edu
> > >>Subject: Re: [LTER-im] [Fwd: [Fwd: Re: FW: Report from
> > >>Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
> > >>
> > >>
> > >>Hi Peter,
> > >>
> > >>Peter McCartney wrote:
> > >>
> > >>>That probably depends on whether a 2.02 should only address this
> > >>>issue (in which case I think a month could handle it) or more. I did
> > >>>not pay a great deal of attention to the 2.01 process, so I don't know
> > >>>the procedural details - did a branch or tag get created for 2.01 in
> > >>>cvs?
> > >>
> > >>New development is done on the HEAD of cvs. Then a tag is created to
> > >>mark the files as released for a particular version (e.g., the tag
> > >>RELEASE_EML_2_0_1 was the most recent). If fixes need to happen to a
> > >>release, then the tagged files are forked into a branch and patched --
> > >>so far we haven't needed to do that.
> > >>
> > >>
> > >>>Bugzilla does not seem to have a version tag for 2.01 so how were bugs
> > >>>related to that kept separate from other bugs?
> > >>
> > >>We create a target milestone for each release, so bugs are targeted at
> > >>that. If you 'Change columns' in your bugzilla display and add 'Target
> > >>Milestone' and sort by that field then things make more sense. Official
> > >>'versions' get created when each release is made so that bugs can be
> > >>filed against that version. I usually create a tracker bug as the
> > >>release nears that lists all of the odds and ends that need to be done
> > >>to get the release out the door (e.g., see bug 1195
> > >>http://bugzilla.ecoinformatics.org/show_bug.cgi?id=1195)
> > >>
> > >>When bugs are initially filed, they default to the milestone
> > >>'Unspecified'. Someone then needs to choose a target milestone for that
> > >>bug, at which point it enters the list of TODO's before the release gets
> > >>released. Obviously, due to various time constraints some bug targets
> > >>are changed to a later target in order to release in a timely manner.
> > >>
> > >>>The few discussions I've participated in would indicate that a possible
> > >>>roadmap out there goes something like this:
> > >>>
> > >>>2.02 - support for scoping id's to system
> > >>> ? Support for multiple authentication systems within eml-access.
> > >>>I've talked with Matt about this, but I don't think there is a bug
> > >>>entered yet.
> > >>>
> > >>>2.1? - support for updatable, online dictionaries for enumerated content
> > >>>(file format, connection schemas, units, projections, etc) -
> > >>>similar to virus definition files.
> > >>> ? New modules for resource types - we are working on an
> > >>>eml-model candidate under our ITR grant and are about to send
> > >>>out invitations for a meeting on that this fall.
> > >>>3.0? - probably major restructuring to better support semantic
> > >>>extensions...
> > >>
> > >>Sounds good to me, and is in line with the issues that I've seen
> > >>discussed. We have target milestones for each of those versions, but
> > >>don't have bug descriptions for all of those tasks. I'm not sure about
> > >>what 'multiple authentication systems' would involve, but it's certainly
> > >>worth creating a bug and discussing what to do about it.
> > >>
> > >>Cheers,
> > >>Matt
> > >>
> > >>
> > >>>Peter McCartney (peter.mccartney at asu.edu)
> > >>>Center for Environmental-Studies
> > >>>Arizona State University
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>>-----Original Message-----
> > >>>>From: Mark Servilla [mailto:servilla at lternet.edu]
> > >>>>Sent: Wednesday, September 01, 2004 4:00 PM
> > >>>>To: Peter McCartney
> > >>>>Cc: Matt Jones; jbrunt at LTERnet.edu;
> > >>>>eml-dev at ecoinformatics.org; emlbestpractices at lternet.edu;
> > >>>>im at lternet.edu
> > >>>>Subject: Re: [LTER-im] [Fwd: [Fwd: Re: FW: Report from
> > >>>>Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
> > >>>>
> > >>>>
> > >>>>Matt/Peter,
> > >>>>
> > >>>>Duane and I will evaluate the level of effort necessary for the changes
> > >>>>to the EML-parser based on Peter's schema mods. I hope to have a LOE
> > >>>>defined by next week. Assuming it is not too great (and with agreement
> > >>>>from our management), we will then enter the task into our schedule. In
> > >>>>addition, we would be glad to take a crack at reviewing/updating the
> > >>>>documentation.
> > >>>>
> > >>>>What is (in your opinions) the overall urgency of this task (i.e., what
> > >>>>would be a reasonable target date for EML-2.0.2)?
> > >>>>--------------
> > >>>>
> > >>>>Matt,
> > >>>>
> > >>>>Would you please add both Duane and myself to the eml-cvs list service.
> > >>>>
> > >>>>Is the EML-parser within the Metacat cvs or a separate cvs? If
> > >>>>separate, Duane will need update permission.
> > >>>>
> > >>>>Thanks!
> > >>>>
> > >>>>Sincerely,
> > >>>>Mark
> > >>>>
> > >>>>Peter McCartney wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>>I will.
> > >>>>>On Tue, 2004-08-31 at 14:42, Matt Jones wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>>Yeah, I think there might be essentially full agreement on the right
> > >>>>>>approach here -- minor differences maybe in what we emphasize. In the
> > >>>>>>interest of moving forward, is anyone willing to take the lead on
> > >>>>>>developing the schema changes and other changes needed for a 2.0.2
> > >>>>>>release that would deal with Mark's #2 proposal? They should be pretty
> > >>>>>>minor, but I'm feeling kind of swamped, and the 2.0.1 release was enough
> > >>>>>>of a burden that I'm not real excited to start right back up on it given
> > >>>>>>other priorities.
> > >>>>>>
> > >>>>>>Matt
> > >>>>>>
> > >>>>>>Peter McCartney wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>>Careful. I never said that to solve the example James just described
> > >>>>>>>one should take the first node. I said that in the case where
> > >>>>>>>you have duplicated content and have given both of them the same id
> > >>>>>>>and system, you can take the first, or any, node and it doesn't
> > >>>>>>>matter. In the case of James's example, Mark's fix #2 applies - I
> > >>>>>>>think we are all in agreement on that.
> > >>>>>>>
> > >>>>>>>The suggestion that we just don't include ids for things we know are
> > >>>>>>>duplicating will of course solve the problem and that is probably
> > >>>>>>>what we will do for now. However, it has the unfortunate side effect
> > >>>>>>>that it takes away our ability to maintain a relationship within EML
> > >>>>>>>back to the original source content (because all of the content in
> > >>>>>>>our EML files is just a copy of the original record in our database
> > >>>>>>>anyway). This is very useful when loading EML files into a relational
> > >>>>>>>database through the xanthoria put method. But that's our problem...
> > >>
> > >>>>>>>
> > >>>>>>>On Tue, 2004-08-31 at 13:43, Matt Jones wrote:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>>Hi James,
> > >>>>>>>>
> > >>>>>>>>Yes, that's exactly the problem. Peter is proposing to solve it by
> > >>>>>>>>taking the *first* of the redundant trees. But, which is first depends
> > >>>>>>>>on whether you traverse the document in breadth-first order or
> > >>>>>>>>depth-first order. That, to me, is just asking for trouble -- we'd be
> > >>>>>>>>asking people to remember to put the subtree they want referenced in the
> > >>>>>>>>"depth-first" first node, which can change as the structure of the tree
> > >>>>>>>>changes. Hard to do and harder to maintain.
> > >>>>>>>>
> > >>>>>>>>Also, if we do it this way, we should probably check to be sure that
> > >>>>>>>>two subtrees that have identical id's also have identical content, which
> > >>>>>>>>is not a trivial programming task (assuming they are identical could
> > >>>>>>>>easily lead to conflicting information).
> > >>>>>>>>
> > >>>>>>>>I would far prefer to keep the links unambiguous (ie, references always
> > >>>>>>>>can be resolved to one and only one id). If someone doesn't want to
> > >>>>>>>>deal with that stuff, they can always omit the ids and just duplicate
> > >>>>>>>>the content, which is why we made the ids optional originally.
> > >>>>>>>>
> > >>>>>>>>Matt
> > >>>>>>>>
> > >>>>>>>>James W Brunt wrote:
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>>Just a clarification...The specific error example we have been
> > >>>>>>>>>discussing is concerning two identical ids with different content...
> > >>>>>>>>>
> > >>>>>>>>><dataset id="30" system="ces_dataset"> ... Is different from
> > >>>>>>>>><creator id="30" system="ces_party"> ....
> > >>>>>>>>>
> > >>>>>>>>>Admittedly, were the content the same we would still get the error
> > >>>>>>>>>(if the parser is written to the spec). However, if there were (in
> > >>>>>>>>>this case there wasn't) a
> > >>>>>>>>>
> > >>>>>>>>><references>30</references>
> > >>>>>>>>>
> > >>>>>>>>>it would be ambiguous. Correct?
> > >>>>>>>>>
> > >>>>>>>>>James
> > >>>>>>>>>
> > >>>>>>>>>Peter McCartney wrote:
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>>On Tue, 2004-08-31 at 11:35, Matt Jones wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>>>It would really help me justify the extra work involved in
> > >>>>>>>>>>>>managing ids and references if someone could give me a concrete
> > >>>>>>>>>>>>example of why it would be bad to have a document contain two
> > >>>>>>>>>>>>elements with identical ids and identical content.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>Like in other relational systems, the key (id) acts as a surrogate
> > >>>>>>>>>>>for the content. So, references should resolve to one (and only one)
> > >>>>>>>>>>>id. It is far harder to validate that the content is the same between
> > >>>>>>>>>>>two nodes with identical keys than it is to validate that no key is
> > >>>>>>>>>>>duplicated. I think they got this right in the relational model, and
> > >>>>>>>>>>>we should follow that lead. If you allow duplicate ids, then I am
> > >>>>>>>>>>>sure this situation will arise:
> > >>>>>>>>>>>
> > >>>>>>>>>>><a id="1">foo</a>
> > >>>>>>>>>>><a id="1">bar</a>
> > >>>>>>>>>>><b><references>1</references></b>
> > >>>>>>>>>>>
> > >>>>>>>>>>>What is the value of <b>? foo, or bar? It is indeterminate. And
> > >>>>>>>>>>>this is precisely why this is a problem.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>I agree this would be bad, but this is not what is happening. The
> > >>>>>>>>>>documents that are being rejected have:
> > >>>>>>>>>>
> > >>>>>>>>>><a id="1">foo</a>
> > >>>>>>>>>><a id="1">foo</a>
> > >>>>>>>>>>
> > >>>>>>>>>>Typically, when this happens, the code is obviously not bothering with
> > >>>>>>>>>>references tags, so we aren't likely to create broken or ambiguous
> > >>>>>>>>>>reference tags. Even if we did throw in a
> > >>>>>>>>>><b><references>1</references></b>, it really wouldn't be a problem. In
> > >>>>>>>>>>some of our files where attributes are repeated in view entities, we are
> > >>>>>>>>>>also getting this:
> > >>>>>>>>>>
> > >>>>>>>>>><a id="1">foo</a>
> > >>>>>>>>>><a id="2">foo</a>
> > >>>>>>>>>>
> > >>>>>>>>>>but your parser hasn't spotted that one yet :) and again, even
> > >>>>>>>>>>though it violates the spec, I would contend that this causes no
> > >>>>>>>>>>problem.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>>>If my xpath returns one or several nodes and they
> > >>>>>>>>>>>>are all identical, why is it so bad to just assume that the rule
> > >>>>>>>>>>>>is: "identical id (and system) means identical content" and just
> > >>>>>>>>>>>>use the first one in the list?
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>Because relational models have shown that this never works. I think
> > >>>>>>>>>>>that such an assumption will result in lots of broken docs.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>I think it is no more work to write parsers to
> > >>>>>>>>>>>>check for differences between nodes with similar ids than it is
> > >>>>>>>>>>>>to check for duplicate ids in the first place, but it makes
> > >>>>>>>>>>>>generating valid eml a LOT simpler.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>Generating valid eml with only one copy of a subtree is easy -- just
> > >>>>>>>>>>>track whether you've already inserted it, and reference it
> > >>>>>>>>>>>thereafter. I don't understand at all why this is hard. However, I
> > >>>>>>>>>>>do understand the problem with system not being included in the
> > >>>>>>>>>>>assessment of the uniqueness of the ID. So I like the idea of
> > >>>>>>>>>>>pursuing Mark's suggestion (2).
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>I also like pursuing 2 regardless of how we debate over 3 and
> > >>>>>>>>>>would support a hasty 2.02 to revise the spec documentation and
> > >>>>>>>>>>add an optional system attribute as such:
> > >>>>>>>>>>
> > >>>>>>>>>><references system="ces_dataset">201</references>.
> > >>>>>>>>>>
> > >>>>>>>>>>Keep in mind that when people hear you (or me, or anyone...) say
> > >>>>>>>>>>"it's not that hard" they are thinking "sure, if you have a team of
> > >>>>>>>>>>Java programmers!". So perhaps it would help to provide some code
> > >>>>>>>>>>samples that can be adapted to the kind of approaches people are
> > >>>>>>>>>>taking with more off-the-shelf tools so that people don't feel
> > >>>>>>>>>>like the only way to work with valid eml is to use one set of
> > >>>>>>>>>>tools from one shop. For example, the approach we take in
> > >>>>>>>>>>Xanthoria for converting from RDBMS to xml is actually a fairly
> > >>>>>>>>>>common one that appears in Cocoon, XML spy's RDBMS mapping tool,
> > >>>>>>>>>>and many other vendor-specific DB->xml modules. Specifically, the
> > >>>>>>>>>>rdbms content is exported to a generic, denormalized xml and then
> > >>>>>>>>>>transformed with xsl to map to the desired schema. So for most
> > >>>>>>>>>>cases, the place where this tracking needs to be done is likely to
> > >>>>>>>>>>be in XSL. While we have found it relatively easy when parsing EML
> > >>>>>>>>>>in XSL to follow references to find the content, we have also
> > >>>>>>>>>>found tracking things within xsls when writing out eml to be
> > >>>>>>>>>>a cumbersome process, let alone making sure that each time we do
> > >>>>>>>>>>it it is going to come out consistent. So if there is some xsl
> > >>>>>>>>>>sample that we can easily add to xanthoria style sheets to solve
> > >>>>>>>>>>this problem, then that's cool. Otherwise, I really think it would
> > >>>>>>>>>>be folly to hang too long on this when we (LTER that is) have
> > >>>>>>>>>>bigger fish to fry. Namely, building a better search interface for
> > >>>>>>>>>>searching LTER data via eml. The query interface is what the CC
> > >>>>>>>>>>spent hours talking about in Fairbanks, so if we come back in
> > >>>>>>>>>>Miami with the ID problem solved but no improved query system, I'd
> > >>>>>>>>>>prefer not be the one to give that powerpoint.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>>Matt
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>>Peter McCartney (peter.mccartney at asu.edu)
> > >>>>>>>>>>>>Center for Environmental-Studies
> > >>>>>>>>>>>>Arizona State University
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>-----Original Message-----
> > >>>>>>>>>>>>>From: owner-im at lternet.edu [mailto:owner-im at lternet.edu] On
> > >>>>>>>>>>>>>Behalf Of James W Brunt
> > >>>>>>>>>>>>>Sent: Monday, August 30, 2004 2:57 PM
> > >>>>>>>>>>>>>To: eml-dev at ecoinformatics.org; emlbestpractices at lternet.edu;
> > >>>>>>>>>>>>>im at lternet.edu
> > >>>>>>>>>>>>>Subject: [LTER-im] [Fwd: [Fwd: Re: FW: Report from Metacat
> > >>>>>>>>>>>>>Harvester: Wed Aug 25 11:00:36 MDT 2004]]
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Peter, et. al,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Mark's email to me (below) has reinforced my own conclusion
> > >>>>>>>>>>>>>about the id, system, references question. There are at least 2,
> > >>>>>>>>>>>>>possibly 3, issues (bugs if you will) here to be dealt with:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>1. The eml normative documentation needs to reflect the real
> > >>>>>>>>>>>>>intent and use of the system attribute. Read (Can O Worms).
> > >>>>>>>>>>>>>Options as I see them:
> > >>>>>>>>>>>>> a. deprecate the system attribute until it can be better
> > >>>>>>>>>>>>>defined - ignore 2 and 3 below (Mark goes even further on this one
> > >>>>>>>>>>>>>below).
> > >>>>>>>>>>>>> b. clearly define the system attribute and make the changes in
> > >>>>>>>>>>>>>2 and 3 below.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>2. <references> tag needs to be made system/scope aware
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>3. EMLparser needs to enforce the final outcome of 1 and 2.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Currently, the documentation introduces system but its
> > >>>>>>>>>>>>>definition does not supersede the unique ID requirement within
> > >>>>>>>>>>>>>a document, references is not system aware, and EMLparser is
> > >>>>>>>>>>>>>enforcing exactly what the documentation says.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Turning off the ID checking as Peter has suggested (different
> > >>>>>>>>>>>>>thread) would result in uninterpretable EML documents were the
> > >>>>>>>>>>>>>references tag to be used (although, in all but one case in the
> > >>>>>>>>>>>>>example below there were no references to the IDs). I don't see
> > >>>>>>>>>>>>>this as an intermediate solution.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>The intent as I remember all that long discussion ago was to
> > >>>>>>>>>>>>>create a way to get around having to completely duplicate
> > >>>>>>>>>>>>>content in a document, thus creating a more compact document
> > >>>>>>>>>>>>>and one that would be more easily maintained for someone not
> > >>>>>>>>>>>>>generating the documents from a database. I'm sure I can be
> > >>>>>>>>>>>>>clarified some here by others that were present. I realize the
> > >>>>>>>>>>>>>difficulty in tracking a document ID map for every document you
> > >>>>>>>>>>>>>automatically generate; however, I really don't understand why
> > >>>>>>>>>>>>>you wouldn't completely duplicate the content. However, the
> > >>>>>>>>>>>>>inclusion of a second qualifying attribute that has to be checked
> > >>>>>>>>>>>>>for every id tag is doable, but before we begin something like
> > >>>>>>>>>>>>>this it must be clearly spelled out and agreeable to the group(s).
> > >>>>>>>>>>>>>We'd like to hear from eml-dev, eml-bestpractices, and im as well
> > >>>>>>>>>>>>>as individual stakeholders.
> > >>>>>>>>>>>>>
> > >>
> > >>>>>>>>>>>>>Thanks,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>James
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>--
> > >>>>>>>>>>>>>James W. Brunt
> > >>>>>>>>>>>>>Associate Director for Information Management
> > >>>>>>>>>>>>>Long Term Ecological Research Network Office Department of
> > >>>>>>>>>>>>>Biology University of New Mexico
> > >>>>>>>>>>>>>Albuquerque, NM 87131-1091
> > >>>>>>>>>>>>>505 272 7085
> > >>>>>>>>>>>>>jbrunt at lternet.edu
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>-------- Original Message --------
> > >>>>>>>>>>>>>From: Mark Servilla <servilla at lternet.edu>
> > >>>>>>>>>>>>>To: James Brunt <jbrunt at lternet.edu>
> > >>>>>>>>>>>>>Subject: [Fwd: Re: FW: Report from Metacat Harvester: Wed Aug
> > >>>>>>>>>>>>>25 11:00:36 MDT 2004]
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>James,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>After reviewing the EML specification documents, it appears to
> > >>>>>>>>>>>>>me that duplicate IDs within a single instance document are not
> > >>>>>>>>>>>>>valid EML, and therefore (IMHO), the EML Parser is behaving
> > >>>>>>>>>>>>>correctly. I cannot see how setting either the SYSTEM or SCOPE
> > >>>>>>>>>>>>>attribute can be used by the REFERENCES element to distinguish
> > >>>>>>>>>>>>>duplicate IDs within a single document (perhaps someone in
> > >>>>>>>>>>>>>eml-dev can help answer how SYSTEM/SCOPE are used in this
> > >>>>>>>>>>>>>context).
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Some possible solutions are:
> > >>>>>>>>>>>>>(1) Deprecate SYSTEM/SCOPE attributes in this context, update
> > >>>>>>>>>>>>>the specification to reflect such change, and do not allow
> > >>>>>>>>>>>>>duplicate IDs.
> > >>>>>>>>>>>>>(2) Modify the specification to allow SYSTEM/SCOPE to narrow
> > >>>>>>>>>>>>>the ID scope, thereby allowing duplicate IDs when qualified by
> > >>>>>>>>>>>>>either SYSTEM/SCOPE -- and, modify the specification for
> > >>>>>>>>>>>>>REFERENCES to make use of such change.
> > >>>>>>>>>>>>>(3) Deprecate REFERENCES completely and force repeated content.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Just my thoughts - thanks!
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Mark
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>-------- Original Message --------
> > >>>>>>>>>>>>>Subject: Re: FW: Report from Metacat Harvester: Wed Aug 25
> > >>>>>>>>>>>>>11:00:36 MDT 2004
> > >>>>>>>>>>>>>Date: Mon, 30 Aug 2004 09:26:13 -0600
> > >>>>>>>>>>>>>From: Mark Servilla <servilla at lternet.edu>
> > >>>>>>>>>>>>>To: 'Corinna Gries' <corinna at asu.edu>
> > >>>>>>>>>>>>>CC: James Brunt <jbrunt at lternet.edu>, Duane Costa
> > >>>>>>>>>>>>><dcosta at lternet.edu>
> > >>>>>>>>>>>>>References: <E1C0TNQ-00066I-00 at lternet.lternet.edu>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Hi Corinna,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>I have been discussing this issue of ID attributes with James
> > >>>>>>>>>>>>>and Duane here at LNO. Please correct me if I am wrong, but
> > >>>>>>>>>>>>>the section on Reusable Content (below or
> > >>>>>>>>>>>>>http://knb.ecoinformatics.org/software/eml/eml-2.0.1/index.html#reusableContent)
> > >>>>>>>>>>>>>states that "two identical ids cannot exist in a single
> > >>>>>>>>>>>>>document".
> > >>>>>>>>>>>>>It appears that the "SYSTEM" attribute only allows identical ids in
> > >>>>>>>>>>>>>multiple documents within the system (that is, only if the repeated
> > >>>>>>>>>>>>>ids reference the exact same object) - something like globalizing
> > >>>>>>>>>>>>>the id'ed object to the system for repeated reference in one or
> > >>>>>>>>>>>>>more documents, but not necessarily allowing identical ids within a
> > >>>>>>>>>>>>>single document by changing the SYSTEM attribute value. I am not
> > >>>>>>>>>>>>>really sure how one would take advantage of the SYSTEM attribute
> > >>>>>>>>>>>>>for reusable content. And, I don't know the provenance of this
> > >>>>>>>>>>>>>particular issue (the documentation could certainly be more clear),
> > >>>>>>>>>>>>>but if we were to follow the documentation as we interpret it, would
> > >>>>>>>>>>>>>this still be a bug in the Harvester/Metacat software?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Sincerely,
> > >>>>>>>>>>>>>Mark
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>3.3. Reusable Content
> > >>>>>>>>>>>>>EML allows the reuse of previously defined structured content
> > >>>>>>>>>>>>>(DOM sub-trees) through the use of key/keyRef type references.
> > >>>>>>>>>>>>>In order for an EML package to remain cohesive and to allow
> > >>>>>>>>>>>>>for the cross platform compatibility of packages, the
> > >>>>>>>>>>>>>following rules with respect to packaging must be followed.
> > >>>>>>>>>>>>>1. An ID is required on the eml root element.
> > >>>>>>>>>>>>>2. IDs are optional on all other elements.
> > >>>>>>>>>>>>>3. If an ID is not provided, that content must be interpreted
> > >>>>>>>>>>>>>as representing a distinct object.
> > >>>>>>>>>>>>>4. If an ID is provided for content then that content is
> > >>>>>>>>>>>>>distinct from all other content except for that content that
> > >>>>>>>>>>>>>references its ID.
> > >>>>>>>>>>>>>5. If a user wants to reuse content to indicate the repetition
> > >>>>>>>>>>>>>of an object, a reference must be used. Two identical ids
> > >>>>>>>>>>>>>cannot exist in a single document.
> > >>>>>>>>>>>>>6. "Document" scope is defined as identifiers unique only to a
> > >>>>>>>>>>>>>single instance document (if a document does not have a system
> > >>>>>>>>>>>>>attribute or if scope is set to 'document' then all IDs are
> > >>>>>>>>>>>>>defined as distinct content).
> > >>>>>>>>>>>>>7. "System" scope is defined as identifiers unique to an
> > >>>>>>>>>>>>>entire data management system (if two documents share a system
> > >>>>>>>>>>>>>string, then any IDs in those two documents that are identical
> > >>>>>>>>>>>>>refer to the same object).
> > >>>>>>>>>>>>>8. If an element references another element, it must not have
> > >>>>>>>>>>>>>an ID itself.
> > >>>>>>>>>>>>>9. All EML packages must have the 'eml' module as the root.
> > >>>>>>>>>>>>>10. The system and scope attribute are always optional except
> > >>>>>>>>>>>>>for at the 'eml' module where the scope attribute is fixed as
> > >>>>>>>>>>>>>'system'. The scope attribute defaults to 'document' for all
> > >>>>>>>>>>>>>other modules.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>Duane Costa wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>>Could anyone comment as to whether the EML error reported by
> > >>>>>>>>>>>>>>Metacat below is a genuine EML error versus a bug in Metacat
> > >>>>>>>>>>>>>>or the EML validator program? The issue is whether the id
> > >>>>>>>>>>>>>>value for <dataset> must be unique from the id value for
> > >>>>>>>>>>>>>><creator>.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>Thanks,
> > >>>>>>>>>>>>>>Duane
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>-----Original Message-----
> > >>>>>>>>>>>>>>From: Corinna Gries [mailto:corinna at asu.edu]
> > >>>>>>>>>>>>>>Sent: Thursday, August 26, 2004 3:48 PM
> > >>>>>>>>>>>>>>To: dcosta at lternet.edu
> > >>>>>>>>>>>>>>Subject: RE: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>Hi Duane,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>I am trying to fix these problems with our eml files. Some
> > >>>>>>>>>>>>>>are easy because they are actual errors in our files, but
> > >>>>>>>>>>>>>>there is one where I wonder if the ID checking is right. I
> > >>>>>>>>>>>>>>understood IDs should be unique within the system, that is
> > >>>>>>>>>>>>>>for example:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>><dataset id="30" system="ces_dataset"> ... Is different from
> > >>>>>>>>>>>>>><creator id="30" system="ces_party"> ....
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>However, your harvester complains that they are the same:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>*****************************************************************************
> > >>>>>>>>>>>>>>*
> > >>>>>>>>>>>>>>* METACAT HARVESTER REPORT: Wed Aug 25 11:00:36 MDT 2004
> > >>>>>>>>>>>>>>*
> > >>>>>>>>>>>>>>* A TOTAL OF 22 ERRORS WERE DETECTED.
> > >>>>>>>>>>>>>>* Please see the log entries below for additonal details.
> > >>>>>>>>>>>>>>*
> > >>>>>>>>>>>>>>*****************************************************************************
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>*****************************************************************************
> > >>>>>>>>>>>>>>*
> > >>>>>>>>>>>>>>* harvestLogID: 5549
> > >>>>>>>>>>>>>>* harvestDate: Wed Aug 25 11:00:36 MDT 2004
> > >>>>>>>>>>>>>>* status: 1
> > >>>>>>>>>>>>>>* message:
> > >>>>>>>>>>>>>>* harvestOperationCode: InsertDocError
> > >>>>>>>>>>>>>>* description: Error inserting EML document to Metacat
> > >>>>>>>>>>>>>>* detailLogID: 383
> > >>>>>>>>>>>>>>* errorMessage: MetacatException: <?xml version="1.0"?>
> > >>>>>>>>>>>>>><error>
> > >>>>>>>>>>>>>>Error running xpath expression:
> > >>>>>>>>>>>>>>//dateTimeDomain|//nonNumericDomain|//numericDomain|//access|
> > >>>>>>>>>>>>>>//attributeList|//constraint|//coverage|//temporalCoverage|
> > >>>>>>>>>>>>>>//geographicCoverage|//taxonomicCoverage|/dataset|/eml/dataset|
> > >>>>>>>>>>>>>>//dataSource|//dataTable|//otherEntity|//citation|//address|
> > >>>>>>>>>>>>>>//conferenceLocation|//party|//originator|//creator|//contact|
> > >>>>>>>>>>>>>>//publisher|//editor|//recipient|//performer|//institution|
> > >>>>>>>>>>>>>>//metadataProvider|//associatedParty|//personnel|//physical|
> > >>>>>>>>>>>>>>//connectionDefinition|//distribution|//researchProject|
> > >>>>>>>>>>>>>>//project|//relatedProject|//software|//spatialRaster|
> > >>>>>>>>>>>>>>//spatialReference|//spatialVector|//storedProcedure|//view|
> > >>>>>>>>>>>>>>//protocol|//additionalMetadata : Error in xml document. This EML
> > >>>>>>>>>>>>>>document is not valid because the id 30 occurs more than once.
> > >>>>>>>>>>>>>>IDs must be unique. </error>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>* scope: ces_dataset
> > >>>>>>>>>>>>>>* identifier: 30
> > >>>>>>>>>>>>>>* revision: 1
> > >>>>>>>>>>>>>>* documentType: eml://ecoinformatics.org/eml-2.0.0
> > >>>>>>>>>>>>>>* documentURL: http://seinet.asu.edu/DataCatalog/getXanthoriaRecord.jsp?source=ces_dataset_mohave&id=30
> > >>>>>>>>>>>>>>*
> > >>>>>>>>>>>>>>*****************************************************************************
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>What do you think?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>Corinna
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>_______________________________________________
> > >>>>>>>>>>>>>>eml-dev mailing list
> > >>>>>>>>>>>>>>eml-dev at ecoinformatics.org
> > >>>>>>>>>>>>>>http://www.ecoinformatics.org/mailman/listinfo/eml-dev
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>--
> > >>>>>>>>>>>>>Mark Servilla, Ph.D.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>LTER Network Office
> > >>>>>>>>>>>>>Department of Biology
> > >>>>>>>>>>>>>MSC 03 2020
> > >>>>>>>>>>>>>1 University of New Mexico
> > >>>>>>>>>>>>>Albuquerque, NM 87131-0001
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>servilla at lternet.edu
> > >>>>>>>>>>>>>Office (505) 277-2619
> > >>>>>>>>>>>>>Cell (505) 453-8593
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>--
> > >>>>>>>>>>>>>James W. Brunt
> > >>>>>>>>>>>>>Associate Director for Information Management
> > >>>>>>>>>>>>>Long Term Ecological Research Network Office
> > >>>>>>>>>>>>>Department of Biology
> > >>>>>>>>>>>>>University of New Mexico
> > >>>>>>>>>>>>>Albuquerque, NM 87131-1091
> > >>>>>>>>>>>>>505 272 7085
> > >>>>>>>>>>>>>jbrunt at lternet.edu
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>-------------------------------------------------
> > >>>>>>>>>>>>>Long-Term Ecological Research Network Mailing List
> > >>>>>>>>>>>>>im at LTERnet.edu
> > >>>>
> > >>>>http://sql.lternet.edu/cgi/mailgroups_view.pl?> im
> > >>>>
> > >>>>
> > >>>>--
> > >>>>Mark Servilla, Ph.D.
> > >>>>
> > >>>>LTER Network Office
> > >>>>Department of Biology
> > >>>>MSC 03 2020
> > >>>>1 University of New Mexico
> > >>>>Albuquerque, NM 87131-0001
> > >>>>
> > >>>>servilla at lternet.edu
> > >>>>Office (505) 277-2619
> > >>>>Cell (505) 453-8593
> > >>>
> > >>>
> > >>--
> > >>-------------------------------------------------------------------
> > >>Matt Jones jones at nceas.ucsb.edu
> > >>http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
> > >>National Center for Ecological Analysis and Synthesis (NCEAS)
> > >>University of California Santa Barbara
> > >>Interested in ecological informatics? http://www.ecoinformatics.org
> > >>-------------------------------------------------------------------
> > >>
> > >
> > >
> > > -------------------------------------------------
> > > Long-Term Ecological Research Network Mailing List
> > > emlbestpractices at LTERnet.edu
> > > http://longterm.lternet.edu/groups/members.php?groupid0
>
> _______________________________________________
> eml-dev mailing list
> eml-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/eml-dev
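
For anyone following this thread in the archive, the two readings of the
uniqueness rule discussed above can be contrasted with a small Python
sketch. This is not the Metacat or EML parser code. The function name
duplicate_keys, the sample document, and the simplifications (no EML
namespace, no <references> handling) are assumptions made purely for
illustration. It only shows that keying on the id alone flags id 30 as a
duplicate, while keying on the (id, system) pair does not.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment modeled on the <dataset>/<creator> example in this
# thread; element content is omitted and no EML namespace is used.
EML = """
<eml packageId="ces_dataset.30.1" system="ces_dataset">
  <dataset id="30" system="ces_dataset">
    <creator id="30" system="ces_party"/>
  </dataset>
</eml>
"""

def duplicate_keys(xml_text, use_system):
    """Return the keys that occur more than once among elements with an id.

    With use_system=False the key is the id alone (the strict reading of
    "two identical ids cannot exist in a single document"). With
    use_system=True the key is the (id, system) pair.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        elem_id = elem.get("id")
        if elem_id is None:
            continue  # elements without an id are always distinct content
        # A missing system attribute is treated as the empty string here.
        key = (elem_id, elem.get("system", "")) if use_system else (elem_id,)
        counts[key] += 1
    return sorted(key for key, n in counts.items() if n > 1)

by_id = duplicate_keys(EML, use_system=False)
by_pair = duplicate_keys(EML, use_system=True)
print(by_id)    # [('30',)]
print(by_pair)  # []
```

Treating a missing system attribute as an empty string is only one possible
choice; how such elements and their references should be handled is exactly
the open business-rule question raised in this thread.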