validation of eml sample documents
Matt Jones
jones at nceas.ucsb.edu
Tue Aug 20 13:35:19 PDT 2002
Peter,
I've taken the time to look through your attached file and I found all
of the validation errors and fixed them. There were two minor datatype
definition bugs with the eml-spatialRepresentation.xsd file, which I
have fixed and checked into CVS. There were 7 types of problems that
were encountered in your files (repeated many, many times because there
were so many data tables). I list them below. Finding these issues was
mainly a matter of putting in the minor effort to systematically track
down each error that Xerces reported and fix it.
I think validation is critical because it prevents drift among sites
that each do things slightly differently. If we're going to build
applications that rely on EML, we need to be sure that EML being
exchanged conforms to a single standard. That doesn't mean that you
have to validate everything coming and going, only that the systems we
build need to be checked when they are built so that they produce valid
EML for others to utilize.
I've attached some example files derived from yours that do in fact
validate. The files I have attached include a stripped-down, simplest
eml document possible (simplest-eml.xml), a validating version of your
original file that only contains one dataTable (cap-example.xml), and
the same example but with use of references to make sure the keyref
stuff is working (cap-example-with-references.xml). I validated all of
these files successfully using Xerces 2 and the attached
SAXValidate.java program against the current version in CVS (because of
the bug I just fixed in eml-spatialRepresentation.xsd). You can compile
and run the program if an appropriate version of Xerces is on your
classpath (make sure you use the "-s" option to the program to turn on
schema validation).
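For readers without the attachment, here is a minimal sketch of this kind of schema check. It is not the attached SAXValidate.java: it assumes the standard JAXP javax.xml.validation API rather than Xerces-specific options, and the class name SchemaCheck, the toy schema, and the two sample documents are hypothetical stand-ins, not EML. It collects every error the parser reports instead of stopping at the first one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class SchemaCheck {
    // Toy schema (NOT EML): a dataTable holds entityName, then entityDescription.
    static final String TOY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='dataTable'><xs:complexType><xs:sequence>"
      + "<xs:element name='entityName' type='xs:string'/>"
      + "<xs:element name='entityDescription' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Hypothetical instance documents: one valid, one with a misnamed element.
    static final String VALID_DOC =
        "<dataTable><entityName>t1</entityName>"
      + "<entityDescription>plots</entityDescription></dataTable>";
    static final String INVALID_DOC =
        "<dataTable><entityName>t1</entityName>"
      + "<description>plots</description></dataTable>";

    /** Validates xml against xsd; returns all reported errors (empty means valid). */
    static List<String> validate(String xsd, String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator =
                factory.newSchema(new StreamSource(new StringReader(xsd))).newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) {
                    // Record the error and keep going so later problems are also listed.
                    errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed: the parser cannot continue
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            // Schema compile failures and fatal parse errors end up here.
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("valid doc errors:   " + validate(TOY_XSD, VALID_DOC));
        System.out.println("invalid doc errors: " + validate(TOY_XSD, INVALID_DOC));
    }
}
```

Collecting errors rather than throwing on the first one matches the workflow of tracking down each reported error in turn and fixing it.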
Here's the list of validation problems in your example file, with the
fix I made for each:
1) Added an xmlns:eml namespace declaration and used it on the eml root
element (and deleted several other xmlns that were not needed, but this
was optional)
2) it appears that all ids (id and packageId) must begin with a letter
or underscore because we have defined them as xs:ID, so a purely numeric
id is rejected. We might want to consider redefining them as xs:string
because of this limitation, and just rely on the key for uniqueness.
Entered as Bug #563.
3) renamed your "description" element to "entityDescription"
4) added the required "unit" element
5) moved the "physical" element to the proper location in the sequence
(following constraint)
6) added the required "measurementScale" element
7) added the required "attributeDomain" element
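
The xs:ID limitation in item 2 can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: the class name IdTypeCheck, the one-element toy schema, and the sample id values are hypothetical and not taken from the EML schemas. It declares the same "id" attribute once as xs:ID and once as xs:string and validates a numeric id against each. An xs:ID value must be an NCName, which cannot start with a digit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class IdTypeCheck {
    /** Builds a toy schema (NOT EML) whose 'id' attribute has the given type. */
    static String schemaWithIdType(String type) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='dataset'><xs:complexType>"
             + "<xs:attribute name='id' type='" + type + "'/>"
             + "</xs:complexType></xs:element></xs:schema>";
    }

    /** Returns true if xml validates against xsd; any SAX error counts as invalid. */
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator =
                factory.newSchema(new StreamSource(new StringReader(xsd))).newValidator();
            // With no ErrorHandler set, the first validation error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory input
        }
    }

    public static void main(String[] args) {
        String asId = schemaWithIdType("xs:ID");
        String asString = schemaWithIdType("xs:string");
        // Numeric id under xs:ID: rejected, since an NCName cannot start with a digit.
        System.out.println(isValid(asId, "<dataset id='563'/>"));
        // Id starting with a letter under xs:ID: accepted.
        System.out.println(isValid(asId, "<dataset id='cap.563'/>"));
        // Same numeric id under xs:string: accepted.
        System.out.println(isValid(asString, "<dataset id='563'/>"));
    }
}
```

Note that switching to xs:string drops the uniqueness guarantee that xs:ID provides, so uniqueness would then have to come from a key constraint in the schema, as suggested in item 2.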
Hope this helps!
Matt
Peter McCartney wrote:
>
> Thanks for the comments Scott. A couple clarifications about my comments
> for the discussion:
>
> 1) I've attached a file generated by one of our tools for reverse
> engineering metadata from an RDBMS. The file is not very complete, but
> should be valid as near as I can tell by manually inspecting it.
> However, I am unable to validate it with Excelon Stylus Studio, XML
> Spy, or Forte (with different errors reported in each). While I could
> easily believe that one or another of these has less than perfect
> support for schema, the fact that I can't validate with any of the
> three (two of which are using the Xerces parser) is significant. By
> comparison, I had no problem validating instance files against the
> various NCEAS and ASU drafts prior to beta9. If anyone has a separate
> tool for validating, please either send it or try this file and let me
> know what's wrong. Alternately, send me an instance file that you have
> been able to validate, and I'll try it here. To be honest, validation
> isn't all that important to us - we'd prefer to have our applications
> attempt to use the metadata and try to trap for errors rather than give
> up just because it didn't validate - but I'd like to know why I'm
> having such a problem with beta 9 when no one else is....
[snip]
>
> Peter McCartney (peter.mccartney at asu.edu)
> Center for Environmental Studies
> Arizona State University
> 480-965-6791
--
*******************************************************************
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simplest-eml.xml
Type: text/xml
Size: 566 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/simplest-eml.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cap-example.xml
Type: text/xml
Size: 2850 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/cap-example.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cap-example-with-references.xml
Type: text/xml
Size: 2954 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/cap-example-with-references.xml
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: SAXValidate.java
Url: http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/SAXValidate.bat