[LTER-im] [Fwd: [Fwd: Re: FW: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
Peter McCartney
peter.mccartney at asu.edu
Thu Sep 2 08:25:48 PDT 2004
That probably depends on whether a 2.0.2 should only address this issue
(in which case I think a month could handle it) or more. I did not pay a
great deal of attention to the 2.0.1 process, so I don't know the
procedural details - did a branch or tag get created for 2.0.1 in cvs?
Bugzilla does not seem to have a version tag for 2.0.1, so how were bugs
related to that kept separate from other bugs? The few discussions I've
participated in would indicate that a possible roadmap out there goes
something like this:
2.0.2 - support for scoping ids to system
      ? Support for multiple authentication systems within eml-access.
        I've talked with Matt about this, but I don't think there is a bug
        entered yet.
2.1?  - support for updatable, online dictionaries for enumerated content
        (file format, connection schemas, units, projections, etc.) -
        similar to virus definition files.
      ? New modules for resource types - we are working on an
        eml-model candidate under our ITR grant and are about to send
        out invitations for a meeting on that this fall.
3.0?  - probably major restructuring to better support semantic
        extensions...
Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental-Studies
Arizona State University
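
For illustration only, here is a rough sketch of what "scoping ids to
system" could mean for a parser check. It is not the EML parser's code.
The function names and the sample document are hypothetical, and the
system attribute on <references> is the change proposed in this thread,
not something EML 2.0.1 defines. The idea is that uniqueness is tested on
the (system, id) pair instead of on id alone, and a reference must
resolve to exactly one element.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: the same id value used under two different systems.
SAMPLE = """
<eml>
  <dataset id="30" system="ces_dataset">
    <creator id="30" system="ces_party"><name>A</name></creator>
    <contact><references system="ces_party">30</references></contact>
  </dataset>
</eml>
"""

def index_ids(root):
    """Map (system, id) -> list of elements carrying that pair.

    A missing system attribute is treated as the empty string (an assumption).
    """
    index = defaultdict(list)
    for elem in root.iter():
        if elem.tag != "references" and "id" in elem.attrib:
            index[(elem.get("system", ""), elem.get("id"))].append(elem)
    return index

def duplicate_keys(index):
    """Return the (system, id) pairs that occur more than once."""
    return sorted(key for key, elems in index.items() if len(elems) > 1)

def resolve(index, ref):
    """Resolve a <references> element to exactly one target, or raise."""
    key = (ref.get("system", ""), (ref.text or "").strip())
    targets = index.get(key, [])
    if len(targets) != 1:
        raise ValueError(f"reference {key} matches {len(targets)} elements")
    return targets[0]

root = ET.fromstring(SAMPLE)
index = index_ids(root)
print(duplicate_keys(index))                              # []
print(resolve(index, root.find(".//references")).tag)     # creator
```

With id-only checking the sample above would be rejected, because "30"
appears twice. With the pair as the key it passes, while two elements
that share both system and id are still reported as duplicates.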
> -----Original Message-----
> From: Mark Servilla [mailto:servilla at lternet.edu]
> Sent: Wednesday, September 01, 2004 4:00 PM
> To: Peter McCartney
> Cc: Matt Jones; jbrunt at LTERnet.edu;
> eml-dev at ecoinformatics.org; emlbestpractices at lternet.edu;
> im at lternet.edu
> Subject: Re: [LTER-im] [Fwd: [Fwd: Re: FW: Report from
> Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004]]
>
>
> Matt/Peter,
>
> Duane and I will evaluate the level of effort necessary for the changes
> to the EML-parser based on Peter's schema mods. I hope to have an LOE
> defined by next week. Assuming it is not too great (and with agreement
> from our management), we will then enter the task into our schedule. In
> addition, we would be glad to take a crack at reviewing/updating the
> documentation.
>
> What is (in your opinions) the overall urgency of this task (i.e., what
> would be a reasonable target date for EML-2.0.2)?
> --------------
>
> Matt,
>
> Would you please add both Duane and myself to the eml-cvs list service.
>
> Is the EML-parser within the Metacat cvs or a separate cvs? If
> separate, Duane will need update permission.
>
> Thanks!
>
> Sincerely,
> Mark
>
> Peter McCartney wrote:
>
> > I will.
> > On Tue, 2004-08-31 at 14:42, Matt Jones wrote:
> >
> >>Yeah, I think there might be essentially full agreement on the right
> >>approach here -- minor differences maybe in what we emphasize. In the
> >>interest of moving forward, is anyone willing to take the lead on
> >>developing the schema changes and other changes needed for a 2.0.2
> >>release that would deal with Mark's #2 proposal? They should be pretty
> >>minor, but I'm feeling kind of swamped, and the 2.0.1 release was enough
> >>of a burden that I'm not real excited to start right back up on it given
> >>other priorities.
> >>
> >>Matt
> >>
> >>Peter McCartney wrote:
> >>
> >>>Careful. I never said that to solve the example James just described
> >>>one should take the first node. I said that in the case where
> >>>you have duplicated content and have given both copies the same id
> >>>and system, you can take the first, or any, node and it doesn't
> >>>matter. In the case of James's example, Mark's fix #2 applies - I
> >>>think we are all in agreement on that.
> >>>
> >>>The suggestion that we just don't include ids for things we know are
> >>>duplicated will of course solve the problem, and that is probably
> >>>what we will do for now. However, it has the unfortunate side effect
> >>>that it takes away our ability to maintain a relationship within EML
> >>>back to the original source content (because all of the content in
> >>>our EML files is just a copy of the original record in our database
> >>>anyway). This is very useful when loading EML files into a relational
> >>>database through the Xanthoria put method. But that's our problem...
> >>>
> >>>
> >>>On Tue, 2004-08-31 at 13:43, Matt Jones wrote:
> >>>
> >>>
> >>>>Hi James,
> >>>>
> >>>>Yes, that's exactly the problem. Peter is proposing to solve it by
> >>>>taking the *first* of the redundant trees. But which is first depends
> >>>>on whether you traverse the document in breadth-first order or
> >>>>depth-first order. That, to me, is just asking for trouble -- we'd be
> >>>>asking people to remember to put the subtree they want referenced in the
> >>>>"depth-first" first node, which can change as the structure of the tree
> >>>>changes. Hard to do and harder to maintain.
> >>>>
> >>>>Also, if we do it this way, we should probably check to be sure that two
> >>>>subtrees that have identical ids also have identical content, which is
> >>>>not a trivial programming task (assuming they are identical could easily
> >>>>lead to conflicting information).
> >>>>
> >>>>I would far prefer to keep the links unambiguous (i.e., references always
> >>>>can be resolved to one and only one id). If someone doesn't want to
> >>>>deal with that stuff, they can always omit the ids and just duplicate
> >>>>the content, which is why we made the ids optional originally.
> >>>>
> >>>>Matt
> >>>>
> >>>>James W Brunt wrote:
> >>>>
> >>>>
> >>>>>Just a clarification... The specific error example we have been
> >>>>>discussing is concerning two identical ids with different content...
> >>>>>
> >>>>><dataset id="30" system="ces_dataset"> ... Is different from
> >>>>><creator id="30" system="ces_party"> ....
> >>>>>
> >>>>>Admittedly, were the content the same we would still get the error (if
> >>>>>the parser is written to the spec). However, if there were (in this case
> >>>>>there wasn't) a
> >>>>>
> >>>>><references>30</references>
> >>>>>
> >>>>>it would be ambiguous. Correct?
> >>>>>
> >>>>>James
> >>>>>
> >>>>>Peter McCartney wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>On Tue, 2004-08-31 at 11:35, Matt Jones wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>It would really help me justify the extra work involved in
> >>>>>>>>managing ids and references if someone could give me a concrete
> >>>>>>>>example of why it would be bad to have a document contain two
> >>>>>>>>elements with identical ids and identical content.
> >>>>>>>
> >>>>>>>
> >>>>>>>Like in other relational systems, the key (id) acts as a surrogate
> >>>>>>>for the content. So, references should resolve to one (and only one)
> >>>>>>>id. It is far harder to validate that the content is the same between
> >>>>>>>two nodes with identical keys than it is to validate that no key is
> >>>>>>>duplicated. I think they got this right in the relational model, and
> >>>>>>>we should follow that lead. If you allow duplicate ids, then I am
> >>>>>>>sure this situation will arise:
> >>>>>>>
> >>>>>>><a id="1">foo</a>
> >>>>>>><a id="1">bar</a>
> >>>>>>><b><references>1</references></b>
> >>>>>>>
> >>>>>>>What is the value of <b>? foo, or bar? It is indeterminate. And
> >>>>>>>this is precisely why this is a problem.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>I agree this would be bad, but this is not what is happening. The
> >>>>>>documents that are being rejected have:
> >>>>>>
> >>>>>><a id="1">foo</a>
> >>>>>><a id="1">foo</a>
> >>>>>>
> >>>>>>Typically, when this happens, the code is obviously not bothering with
> >>>>>>references tags, so we aren't likely to create broken or ambiguous
> >>>>>>reference tags. Even if we did throw in a
> >>>>>><b><references>1</references></b>, it really wouldn't be a problem. In
> >>>>>>some of our files where attributes are repeated in view entities, we are
> >>>>>>also getting this:
> >>>>>>
> >>>>>><a id="1">foo</a>
> >>>>>><a id="2">foo</a>
> >>>>>>
> >>>>>>but your parser hasn't spotted that one yet :) and again, even
> >>>>>>though it violates the spec, I would contend that this causes no
> >>>>>>problem.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>If my xpath returns one or several nodes and they
> >>>>>>>>are all identical, why is it so bad to just assume that the rule
> >>>>>>>>is: "identical id (and system) means identical content" and just
> >>>>>>>>use the first one in the list?
> >>>>>>>
> >>>>>>>Because relational models have shown that this never works. I think
> >>>>>>>that such an assumption will result in lots of broken docs.
> >>>>>>>
> >>>>>>>>I think it is no more work to write parsers to
> >>>>>>>>check for differences between nodes with similar ids than it is
> >>>>>>>>to check for duplicate ids in the first place, but it makes
> >>>>>>>>generating valid eml a LOT simpler.
> >>>>>>>
> >>>>>>>Generating valid eml with only one copy of a subtree is easy -- just
> >>>>>>>track whether you've already inserted it, and reference it
> >>>>>>>thereafter. I don't understand at all why this is hard. However, I
> >>>>>>>do understand the problem with system not being included in the
> >>>>>>>assessment of the uniqueness of the ID. So I like the idea of
> >>>>>>>pursuing Mark's suggestion (2).
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>I also like pursuing 2 regardless of how we debate over 3, and
> >>>>>>would support a hasty 2.0.2 to revise the spec documentation and
> >>>>>>add an optional system attribute as such:
> >>>>>>
> >>>>>><references system="ces_dataset">201</references>
> >>>>>>
> >>>>>>Keep in mind that when people hear you (or me, or anyone...) say
> >>>>>>"it's not that hard" they are thinking "sure, if you have a team of
> >>>>>>Java programmers!" So perhaps it would help to provide some code
> >>>>>>samples that can be adapted to the kind of approaches people are
> >>>>>>taking with more off-the-shelf tools, so that people don't feel
> >>>>>>like the only way to work with valid eml is to use one set of
> >>>>>>tools from one shop.
> >>>>>>
> >>>>>>For example, the approach we take in
> >>>>>>Xanthoria for converting from RDBMS to xml is actually a fairly
> >>>>>>common one that appears in Cocoon, XMLSpy's RDBMS mapping tool,
> >>>>>>and many other vendor-specific DB->xml modules. Specifically, the
> >>>>>>rdbms content is exported to a generic, denormalized xml and then
> >>>>>>transformed with xsl to map to the desired schema. So for most
> >>>>>>cases, the place where this tracking needs to be done is likely to
> >>>>>>be in XSL. While we have found it relatively easy when parsing EML
> >>>>>>in XSL to follow references to find the content, we have also
> >>>>>>found tracking things within xsls when writing out eml to be
> >>>>>>a cumbersome process, let alone making sure that each time we do
> >>>>>>it it is going to come out consistent.
> >>>>>>
> >>>>>>So if there is some xsl
> >>>>>>sample that we can easily add to Xanthoria style sheets to solve
> >>>>>>this problem, then that's cool. Otherwise, I really think it would
> >>>>>>be folly to hang too long on this when we (LTER that is) have
> >>>>>>bigger fish to fry. Namely, building a better search interface for
> >>>>>>searching LTER data via eml. The query interface is what the CC
> >>>>>>spent hours talking about in Fairbanks, so if we come back in
> >>>>>>Miami with the ID problem solved but no improved query system, I'd
> >>>>>>prefer not to be the one to give that powerpoint.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>Matt
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>Peter McCartney (peter.mccartney at asu.edu)
> >>>>>>>>Center for Environmental-Studies
> >>>>>>>>Arizona State University
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>-----Original Message-----
> >>>>>>>>>From: owner-im at lternet.edu [mailto:owner-im at lternet.edu] On
> >>>>>>>>>Behalf Of James W Brunt
> >>>>>>>>>Sent: Monday, August 30, 2004 2:57 PM
> >>>>>>>>>To: eml-dev at ecoinformatics.org; emlbestpractices at lternet.edu;
> >>>>>>>>>im at lternet.edu
> >>>>>>>>>Subject: [LTER-im] [Fwd: [Fwd: Re: FW: Report from Metacat
> >>>>>>>>>Harvester: Wed Aug 25 11:00:36 MDT 2004]]
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>Peter, et. al,
> >>>>>>>>>
> >>>>>>>>>Mark's email to me (below) has reinforced my own conclusion
> >>>>>>>>>about the id, system, references question. There are at least 2,
> >>>>>>>>>possibly 3, issues (bugs if you will) here to be dealt with:
> >>>>>>>>>
> >>>>>>>>>1. The eml normative documentation needs to reflect the real
> >>>>>>>>>intent and use of the system attribute. Read (Can O Worms).
> >>>>>>>>>Options as I see them:
> >>>>>>>>> a. deprecate the system attribute until it can be better
> >>>>>>>>>defined - ignore 2 and 3 below (Mark goes even further on this one
> >>>>>>>>>below).
> >>>>>>>>> b. clearly define the system attribute and make the changes in
> >>>>>>>>>2 and 3 below.
> >>>>>>>>>
> >>>>>>>>>2. <references> tag needs to be made system/scope aware
> >>>>>>>>>
> >>>>>>>>>3. EMLparser needs to enforce the final outcome of 1 and 2.
> >>>>>>>>>
> >>>>>>>>>Currently, the documentation introduces system but its
> >>>>>>>>>definition does not supersede the unique ID requirement within
> >>>>>>>>>a document, references is not system aware, and EMLparser is
> >>>>>>>>>enforcing exactly what the documentation says.
> >>>>>>>>>
> >>>>>>>>>Turning off the ID checking as Peter has suggested (different
> >>>>>>>>>thread) would result in uninterpretable EML documents were the
> >>>>>>>>>references tag to be used (although in all but one case in the
> >>>>>>>>>example below there were no references to the IDs). I don't see
> >>>>>>>>>this as an intermediate solution.
> >>>>>>>>>
> >>>>>>>>>The intent, as I remember all that long discussion ago, was to
> >>>>>>>>>create a way to get around having to completely duplicate
> >>>>>>>>>content in a document, thus creating a more compact document
> >>>>>>>>>and one that would be more easily maintained for someone not
> >>>>>>>>>generating the documents from a database. I'm sure I can be
> >>>>>>>>>clarified some here by others that were present. I realize the
> >>>>>>>>>difficulty in tracking a document ID map for every document you
> >>>>>>>>>automatically generate; however, I really don't understand why you
> >>>>>>>>>wouldn't completely duplicate the content. However, the inclusion
> >>>>>>>>>of a second qualifying attribute that has to be checked for every
> >>>>>>>>>id tag is doable, but before we begin something like this it must
> >>>>>>>>>be clearly spelled out and agreeable to the group(s). We'd like to
> >>>>>>>>>hear from eml-dev, eml-bestpractices, and im as well as individual
> >>>>>>>>>stakeholders.
> >>>>>>>>>
> >>>>>>>>>Thanks,
> >>>>>>>>>
> >>>>>>>>>James
> >>>>>>>>>
> >>>>>>>>>--
> >>>>>>>>>James W. Brunt
> >>>>>>>>>Associate Director for Information Management
> >>>>>>>>>Long Term Ecological Research Network Office
> >>>>>>>>>Department of Biology
> >>>>>>>>>University of New Mexico
> >>>>>>>>>Albuquerque, NM 87131-1091
> >>>>>>>>>505 272 7085
> >>>>>>>>>jbrunt at lternet.edu
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>-------- Original Message --------
> >>>>>>>>>From: Mark Servilla <servilla at lternet.edu>
> >>>>>>>>>To: James Brunt <jbrunt at lternet.edu>
> >>>>>>>>>Subject: [Fwd: Re: FW: Report from Metacat Harvester: Wed Aug
> >>>>>>>>>25 11:00:36 MDT 2004]
> >>>>>>>>>
> >>>>>>>>>James,
> >>>>>>>>>
> >>>>>>>>>After reviewing the EML specification documents, it appears to
> >>>>>>>>>me that duplicate IDs within a single instance document are not
> >>>>>>>>>valid EML, and therefore (IMHO), the EML Parser is behaving
> >>>>>>>>>correctly. I cannot see how setting either the SYSTEM or SCOPE
> >>>>>>>>>attribute can be used by the REFERENCES element to distinguish
> >>>>>>>>>duplicate IDs within a single document (perhaps someone in
> >>>>>>>>>eml-dev can help answer how SYSTEM/SCOPE are used in this
> >>>>>>>>>context).
> >>>>>>>>>
> >>>>>>>>>Some possible solutions are:
> >>>>>>>>>(1) Deprecate SYSTEM/SCOPE attributes in this context, update
> >>>>>>>>>the specification to reflect such change, and do not allow
> >>>>>>>>>duplicate IDs.
> >>>>>>>>>(2) Modify the specification to allow SYSTEM/SCOPE to narrow
> >>>>>>>>>the ID scope, thereby allowing duplicate IDs when qualified by
> >>>>>>>>>either SYSTEM/SCOPE -- and modify the specification for
> >>>>>>>>>REFERENCES to make use of such change.
> >>>>>>>>>(3) Deprecate REFERENCES completely and force repeated content.
> >>>>>>>>>
> >>>>>>>>>Just my thoughts - thanks!
> >>>>>>>>>
> >>>>>>>>>Mark
> >>>>>>>>>
> >>>>>>>>>-------- Original Message --------
> >>>>>>>>>Subject: Re: FW: Report from Metacat Harvester: Wed Aug 25
> >>>>>>>>>11:00:36 MDT 2004
> >>>>>>>>>Date: Mon, 30 Aug 2004 09:26:13 -0600
> >>>>>>>>>From: Mark Servilla <servilla at lternet.edu>
> >>>>>>>>>To: 'Corinna Gries' <corinna at asu.edu>
> >>>>>>>>>CC: James Brunt <jbrunt at lternet.edu>, Duane Costa
> >>>>>>>>><dcosta at lternet.edu>
> >>>>>>>>>References: <E1C0TNQ-00066I-00 at lternet.lternet.edu>
> >>>>>>>>>
> >>>>>>>>>Hi Corinna,
> >>>>>>>>>
> >>>>>>>>>I have been discussing this issue of ID attributes with James
> >>>>>>>>>and Duane here at LNO. Please correct me if I am wrong, but
> >>>>>>>>>the section on Reusable Content (below or
> >>>>>>>>>http://knb.ecoinformatics.org/software/eml/eml-2.0.1/index.html#reusableContent)
> >>>>>>>>>states that "two identical ids cannot exist in a single
> >>>>>>>>>document". It appears that the "SYSTEM" attribute only allows
> >>>>>>>>>identical ids in multiple documents within the system (that is,
> >>>>>>>>>only if the repeated ids reference the exact same object) -
> >>>>>>>>>something like globalizing the id'ed object to the system for
> >>>>>>>>>repeated reference in one or more documents, but not necessarily
> >>>>>>>>>allowing identical ids within a single document by changing the
> >>>>>>>>>SYSTEM attribute value. I am not really sure how one would take
> >>>>>>>>>advantage of the SYSTEM attribute for reusable content. And, I
> >>>>>>>>>don't know the provenance of this particular issue (the
> >>>>>>>>>documentation could certainly be more clear), but if we were to
> >>>>>>>>>follow the documentation as we interpret it, would this still be
> >>>>>>>>>a bug in the Harvester/Metacat software?
> >>>>>>>>>
> >>>>>>>>>Sincerely,
> >>>>>>>>>Mark
> >>>>>>>>>
> >>>>>>>>>3.3. Reusable Content
> >>>>>>>>>EML allows the reuse of previously defined structured content
> >>>>>>>>>(DOM sub-trees) through the use of key/keyRef type references. In
> >>>>>>>>>order for an EML package to remain cohesive and to allow for the
> >>>>>>>>>cross platform compatibility of packages, the following rules with
> >>>>>>>>>respect to packaging must be followed.
> >>>>>>>>>1. An ID is required on the eml root element.
> >>>>>>>>>2. IDs are optional on all other elements.
> >>>>>>>>>3. If an ID is not provided, that content must be interpreted as
> >>>>>>>>>representing a distinct object.
> >>>>>>>>>4. If an ID is provided for content then that content is distinct
> >>>>>>>>>from all other content except for that content that references
> >>>>>>>>>its ID.
> >>>>>>>>>5. If a user wants to reuse content to indicate the repetition of
> >>>>>>>>>an object, a reference must be used. Two identical ids cannot
> >>>>>>>>>exist in a single document.
> >>>>>>>>>6. "Document" scope is defined as identifiers unique only to a
> >>>>>>>>>single instance document (if a document does not have a system
> >>>>>>>>>attribute or if scope is set to 'document' then all IDs are
> >>>>>>>>>defined as distinct content).
> >>>>>>>>>7. "System" scope is defined as identifiers unique to an entire
> >>>>>>>>>data management system (if two documents share a system string,
> >>>>>>>>>then any IDs in those two documents that are identical refer to
> >>>>>>>>>the same object).
> >>>>>>>>>8. If an element references another element, it must not have an
> >>>>>>>>>ID itself.
> >>>>>>>>>9. All EML packages must have the 'eml' module as the root.
> >>>>>>>>>10. The system and scope attribute are always optional except for
> >>>>>>>>>at the 'eml' module where the scope attribute is fixed as
> >>>>>>>>>'system'. The scope attribute defaults to 'document' for all
> >>>>>>>>>other modules.
> >>>>>>>>>
> >>>>>>>>>Duane Costa wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>Could anyone comment as to whether the EML error reported by
> >>>>>>>>>>Metacat below is a genuine EML error versus a bug in Metacat or
> >>>>>>>>>>the EML validator program? The issue is whether the id value for
> >>>>>>>>>><dataset> must be unique from the id value for <creator>.
> >>>>>>>>>>
> >>>>>>>>>>Thanks,
> >>>>>>>>>>Duane
> >>>>>>>>>>
> >>>>>>>>>>-----Original Message-----
> >>>>>>>>>>From: Corinna Gries [mailto:corinna at asu.edu]
> >>>>>>>>>>Sent: Thursday, August 26, 2004 3:48 PM
> >>>>>>>>>>To: dcosta at lternet.edu
> >>>>>>>>>>Subject: RE: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004
> >>>>>>>>>>
> >>>>>>>>>>Hi Duane,
> >>>>>>>>>>
> >>>>>>>>>>I am trying to fix these problems with our eml files. Some are
> >>>>>>>>>>easy because they are actual errors in our files, but there is
> >>>>>>>>>>one where I wonder if the ID checking is right. I understood IDs
> >>>>>>>>>>should be unique within the system, that is for example:
> >>>>>>>>>>
> >>>>>>>>>><dataset id="30" system="ces_dataset"> ... Is different from
> >>>>>>>>>><creator id="30" system="ces_party"> ....
> >>>>>>>>>>
> >>>>>>>>>>However, your harvester complains that they are the same:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>**********************************************************************
> >>>>>>>>>>*
> >>>>>>>>>>* METACAT HARVESTER REPORT: Wed Aug 25 11:00:36 MDT 2004
> >>>>>>>>>>*
> >>>>>>>>>>* A TOTAL OF 22 ERRORS WERE DETECTED.
> >>>>>>>>>>* Please see the log entries below for additonal details.
> >>>>>>>>>>*
> >>>>>>>>>>**********************************************************************
> >>>>>>>>>>**********************************************************************
> >>>>>>>>>>*
> >>>>>>>>>>* harvestLogID: 5549
> >>>>>>>>>>* harvestDate: Wed Aug 25 11:00:36 MDT 2004
> >>>>>>>>>>* status: 1
> >>>>>>>>>>* message:
> >>>>>>>>>>* harvestOperationCode: InsertDocError
> >>>>>>>>>>* description: Error inserting EML document to Metacat
> >>>>>>>>>>* detailLogID: 383
> >>>>>>>>>>* errorMessage: MetacatException: <?xml version="1.0"?>
> >>>>>>>>>><error>
> >>>>>>>>>>Error running xpath expression:
> >>>>>>>>>>//dateTimeDomain|//nonNumericDomain|//numericDomain|//access|
> >>>>>>>>>>//attributeList|//constraint|//coverage|//temporalCoverage|
> >>>>>>>>>>//geographicCoverage|//taxonomicCoverage|/dataset|/eml/dataset|
> >>>>>>>>>>//dataSource|//dataTable|//otherEntity|//citation|//address|
> >>>>>>>>>>//conferenceLocation|//party|//originator|//creator|//contact|
> >>>>>>>>>>//publisher|//editor|//recipient|//performer|//institution|
> >>>>>>>>>>//metadataProvider|//associatedParty|//personnel|//physical|
> >>>>>>>>>>//connectionDefinition|//distribution|//researchProject|//project|
> >>>>>>>>>>//relatedProject|//software|//spatialRaster|//spatialReference|
> >>>>>>>>>>//spatialVector|//storedProcedure|//view|//protocol|
> >>>>>>>>>>//additionalMetadata : Error in xml
> >>>>>>>>>>document. This EML document is not valid because the id 30
> >>>>>>>>>>occurs more than once. IDs must be unique. </error>
> >>>>>>>>>>
> >>>>>>>>>>* scope: ces_dataset
> >>>>>>>>>>* identifier: 30
> >>>>>>>>>>* revision: 1
> >>>>>>>>>>* documentType: eml://ecoinformatics.org/eml-2.0.0
> >>>>>>>>>>* documentURL:
> >>>>>>>>>>http://seinet.asu.edu/DataCatalog/getXanthoriaRecord.jsp?source=ces_dataset_mohave&id=30
> >>>>>>>>>>*
> >>>>>>>>>>**********************************************************************
> >>>>>>>>>>
> >>>>>>>>>>What do you think?
> >>>>>>>>>>
> >>>>>>>>>>Corinna
> >>>>>>>>>>
> >>>>>>>>>>_______________________________________________
> >>>>>>>>>>eml-dev mailing list
> >>>>>>>>>>eml-dev at ecoinformatics.org
> >>>>>>>>>>http://www.ecoinformatics.org/mailman/listinfo/eml-dev
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>--
> >>>>>>>>>Mark Servilla, Ph.D.
> >>>>>>>>>
> >>>>>>>>>LTER Network Office
> >>>>>>>>>Department of Biology
> >>>>>>>>>MSC 03 2020
> >>>>>>>>>1 University of New Mexico
> >>>>>>>>>Albuquerque, NM 87131-0001
> >>>>>>>>>
> >>>>>>>>>servilla at lternet.edu
> >>>>>>>>>Office (505) 277-2619
> >>>>>>>>>Cell (505) 453-8593
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>-------------------------------------------------
> >>>>>>>>>Long-Term Ecological Research Network Mailing List
> >>>>>>>>>im at LTERnet.edu
> >>>>>>>>>http://sql.lternet.edu/cgi/mailgroups_view.pl?im
> >>>>>>>>>
>
> --
> Mark Servilla, Ph.D.
>
> LTER Network Office
> Department of Biology
> MSC 03 2020
> 1 University of New Mexico
> Albuquerque, NM 87131-0001
>
> servilla at lternet.edu
> Office (505) 277-2619
> Cell (505) 453-8593
>