FW: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004
Peter McCartney
peter.mccartney at asu.edu
Fri Aug 27 09:55:49 PDT 2004
I didn't see this before entering my bug in Bugzilla on this. Based on
what Matt says, it appears that NCEAS intended their EMLParser
software to enforce unique ids only within a system, allowing documents to
contain a mix of ids based on different systems. This is not reflected
in the EML documentation currently online, which is what I based my
comments on. Sorry for my confusion. I may also have been confused about
the presence of keyRef statements in eml.xsd to enforce these rules, as I
am unable to find them now. I have a vivid recollection of them being
there at one point, but the last time they were discussed in the CVS log
is somewhere between rc1 and rc2. If this is the case, then the wording
of my bug should be updated to reflect that the problem is not in the EML
xsd but in the wording of the documentation, which still states
that IDs are unique across the document and that a document can
name only one system.
However, this only partly addresses the issues I raise in the bug, and
Matt's fix to EMLParser will only partly solve our rejected-files
problem. There is still the problem of id conflicts involving EML embedded in
the methods section, and the fact that enforcing the use of references
tags to normalize content makes programming for EML more complicated
than it has to be and, at the very least, doesn't help in what is
turning into a very slow adoption of EML.
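
To make the two readings of the id rule concrete, here is a minimal
sketch in Python. It is not the EMLParser code: the function name
duplicate_ids, the per_system flag, and the stripped-down sample
document (no namespaces, only the two elements from Corinna's example)
are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified document based on Corinna's example.
SAMPLE = """<eml>
  <dataset id="30" system="ces_dataset">
    <creator id="30" system="ces_party"/>
  </dataset>
</eml>"""


def duplicate_ids(xml_text, per_system=True):
    """Return the sorted (system, id) keys that occur more than once.

    per_system=True  -> ids only need to be unique within one system.
    per_system=False -> ids must be unique across the whole document.
    """
    counts = Counter()
    for elem in ET.fromstring(xml_text).iter():
        id_value = elem.get("id")
        if id_value is None:
            continue  # element carries no id, nothing to check
        # Elements without a system attribute share the "" bucket.
        system = elem.get("system", "") if per_system else ""
        counts[(system, id_value)] += 1
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(duplicate_ids(SAMPLE, per_system=True))
    print(duplicate_ids(SAMPLE, per_system=False))
```

Under the per-system reading the sample has no duplicates; under the
whole-document reading, which is what the harvester message below
implies, id 30 is reported twice.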
On Thu, 2004-08-26 at 16:45, Matt Jones wrote:
> Hi Duane,
>
> I think it's a bug in the EMLParser -- it appears to be ignoring system,
> when in fact it should do as Corinna suggests and make sure that all IDs
> within a system are unique. Want to fix this bug?
>
> Matt
>
> Duane Costa wrote:
> > Could anyone comment as to whether the EML error reported by Metacat below
> > is a genuine EML error versus a bug in Metacat or the EML validator program?
> > The issue is whether the id value for <dataset> must be unique from the id
> > value for <creator>.
> >
> > Thanks,
> > Duane
> >
> > -----Original Message-----
> > From: Corinna Gries [mailto:corinna at asu.edu]
> > Sent: Thursday, August 26, 2004 3:48 PM
> > To: dcosta at lternet.edu
> > Subject: RE: Report from Metacat Harvester: Wed Aug 25 11:00:36 MDT 2004
> >
> > Hi Duane,
> >
> > I am trying to fix these problems with our eml files. Some are easy
> > because they are actual errors in our files, but there is one where I
> > wonder if the ID checking is right. I understood IDs should be unique
> > within the system, that is for example:
> >
> > <dataset id="30" system="ces_dataset"> ... Is different from
> > <creator id="30" system="ces_party"> ....
> >
> > However, your harvester complains that they are the same:
> >
> > ************************************************************************
> > *****
> > *
> > * METACAT HARVESTER REPORT: Wed Aug 25 11:00:36 MDT 2004
> > *
> > * A TOTAL OF 22 ERRORS WERE DETECTED.
> > * Please see the log entries below for additonal details.
> > *
> > ************************************************************************
> > *****
> > ************************************************************************
> > *****
> > *
> > * harvestLogID: 5549
> > * harvestDate: Wed Aug 25 11:00:36 MDT 2004
> > * status: 1
> > * message:
> > * harvestOperationCode: InsertDocError
> > * description: Error inserting EML document to Metacat
> > * detailLogID: 383
> > * errorMessage: MetacatException: <?xml version="1.0"?>
> > <error>
> > Error running xpath expression:
> > //dateTimeDomain|//nonNumericDomain|//numericDomain|//access|//attribute
> > List|//constraint|//coverage|//temporalCoverage|//geographicCoverage|//t
> > axonomicCoverage|/dataset|/eml/dataset|//dataSource|//dataTable|//otherE
> > ntity|//citation|//address|//conferenceLocation|//party|//originator|//c
> > reator|//contact|//publisher|//editor|//recipient|//performer|//institut
> > ion|//metadataProvider|//associatedParty|//personnel|//physical|//connec
> > tionDefinition|//distribution|//researchProject|//project|//relatedProje
> > ct|//software|//spatialRaster|//spatialReference|//spatialVector|//store
> > dProcedure|//view|//protocol|//additionalMetadata : Error in xml
> > document. This EML document is not valid because the id 30 occurs more
> > than once. IDs must be unique. </error>
> >
> > * scope: ces_dataset
> > * identifier: 30
> > * revision: 1
> > * documentType: eml://ecoinformatics.org/eml-2.0.0
> > * documentURL:
> > http://seinet.asu.edu/DataCatalog/getXanthoriaRecord.jsp?source=ces_data
> > set_mohave&id=30
> > *
> > ************************************************************************
> > *****
> >
> > What do you think?
> >
> > Corinna
> >
> > _______________________________________________
> > eml-dev mailing list
> > eml-dev at ecoinformatics.org
> > http://www.ecoinformatics.org/mailman/listinfo/eml-dev
--
Peter McCartney(peter.mccartney at asu.edu)
Center for Environmental Studies
480-965-6791