validation of eml sample documents

Matt Jones jones at nceas.ucsb.edu
Tue Aug 20 13:35:19 PDT 2002


Peter,

I've looked through your attached file, found all of the validation 
errors, and fixed them.  There were two minor datatype definition bugs 
in the eml-spatialRepresentation.xsd file, which I have fixed and 
checked into CVS.  Your files had 7 types of problems (repeated many, 
many times because there were so many data tables), which I list 
below.  Finding them was mainly a matter of systematically tracking 
down each error that Xerces reported and fixing it.

I think validation is critical because it prevents drift among sites 
that each do things slightly differently.  If we're going to build 
applications that rely on EML, we need to be sure that the EML being 
exchanged conforms to a single standard.  That doesn't mean that you 
have to validate everything coming and going, only that the systems we 
build need to be checked when they are built so that they produce valid 
EML for others to use.

I've attached some example files derived from yours that do in fact 
validate: the simplest EML document possible (simplest-eml.xml), a 
validating version of your original file that contains only one 
dataTable (cap-example.xml), and the same example but using references 
to make sure the keyref stuff is working 
(cap-example-with-references.xml).  I validated all of these files 
successfully using Xerces 2 and the attached SAXValidate.java program 
against the current version of the schemas in CVS (because of the bugs 
I just fixed in eml-spatialRepresentation.xsd).  You can compile and 
run the program if an appropriate version of Xerces is on your 
classpath (make sure you use the "-s" option to the program to turn on 
schema validation).
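
For anyone reading without the attachment, here is a minimal sketch of 
schema validation that collects every reported error instead of 
stopping at the first one.  It is not the attached SAXValidate.java: it 
assumes a JDK that ships the javax.xml.validation API (rather than 
calling Xerces directly), and the schema, documents, and class name 
below are hypothetical toys, not EML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch; NOT the SAXValidate.java attached to this message.
class SchemaValidationSketch {

    // Toy schema (an assumption for illustration, not part of EML):
    // a "table" must contain "entityDescription" followed by "unit".
    static final String TOY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='table'><xs:complexType><xs:sequence>"
      + "<xs:element name='entityDescription' type='xs:string'/>"
      + "<xs:element name='unit' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static final String VALID_DOC =
        "<table><entityDescription>plots</entityDescription>"
      + "<unit>meter</unit></table>";

    // Uses the wrong element name and omits the required "unit".
    static final String INVALID_DOC =
        "<table><description>plots</description></table>";

    /** Returns every problem reported; an empty list means the document is valid. */
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<String>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) {
                    // Record the error and keep going so all errors are listed.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // e.g. the document is not well-formed
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add("fatal: " + e.getMessage());
        } catch (IOException e) {
            problems.add("I/O: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("valid doc problems:   " + validate(TOY_XSD, VALID_DOC));
        System.out.println("invalid doc problems: " + validate(TOY_XSD, INVALID_DOC));
    }
}
```

To check real files, the same approach should work by building the 
StreamSource objects from java.io.File instances pointing at the schema 
and at the instance document instead of from strings.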

Here's the list of validation errors in your example file:
1) Added an xmlns:eml namespace declaration and used it on the eml root 
element (and deleted several other xmlns that were not needed, but this 
was optional)
2) it appears that all ids (id and packageId) must begin with a letter 
or underscore (so they cannot be purely numeric) because we have defined 
them as xs:ID.  We might want to consider redefining them as xs:string 
because of this limitation, and just rely on the key for uniqueness. 
Entered as Bug #563.
3) renamed your "description" element to "entityDescription"
4) added the required "unit" element
5) moved the "physical" element to the proper location in the sequence 
(following constraint)
6) added the required "measurementScale" element
7) added the required "attributeDomain" element
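
The trade-off in item 2 can be reproduced in isolation.  The following 
is a minimal sketch, assuming a JDK with the javax.xml.validation API; 
the "items"/"item" schema, the key name, and the class name are 
hypothetical and are not taken from the EML schemas.  It compares an id 
attribute typed as xs:ID with one typed as xs:string plus an xs:key.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical sketch of the xs:ID vs. xs:string + xs:key trade-off.
class IdTypeSketch {

    // Identity constraint that makes @id unique among the item elements.
    static final String KEY =
        "<xs:key name='itemKey'><xs:selector xpath='item'/>"
      + "<xs:field xpath='@id'/></xs:key>";

    /** Builds a toy schema whose item/@id has the given type, optionally with a key. */
    static String schema(String idType, boolean withKey) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='items'><xs:complexType><xs:sequence>"
             + "<xs:element name='item' maxOccurs='unbounded'><xs:complexType>"
             + "<xs:attribute name='id' type='" + idType + "' use='required'/>"
             + "</xs:complexType></xs:element>"
             + "</xs:sequence></xs:complexType>"
             + (withKey ? KEY : "")
             + "</xs:element></xs:schema>";
    }

    /** Returns true if xml validates against xsd, false on any validation error. */
    static boolean isValid(String xsd, String xml) {
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
            // With no ErrorHandler set, the first error is thrown as an exception.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String numeric = "<items><item id='563'/></items>";
        String duplicate = "<items><item id='563'/><item id='563'/></items>";
        System.out.println("xs:ID, id='563':        " + isValid(schema("xs:ID", false), numeric));
        System.out.println("xs:string + key, '563': " + isValid(schema("xs:string", true), numeric));
        System.out.println("xs:string + key, dupes: " + isValid(schema("xs:string", true), duplicate));
    }
}
```

In this toy setup the xs:ID version rejects the purely numeric id, while 
the xs:string version accepts it and the key still rejects duplicate 
values, which is the behavior the proposed change would rely on.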

Hope this helps!
Matt

Peter McCartney wrote:
> 
> Thanks for the comments Scott. A couple clarifications about my comments 
> for the discussion:
> 
> 1) I've attached a file generated by one of our tools for reverse 
> engineering metadata from an RDBMS. The file is not very complete, but 
> should be valid as near as I can tell by manually inspecting it. 
> However, I am unable to validate it with Excelon Stylus Studio, XML 
> Spy, or Forte (with different errors reported in each). While I could 
> easily believe that one or another of these has less than perfect 
> support for schema, the fact that I can't validate with any of the 
> three (two of which are using the Xerces parser) is significant. By 
> comparison, I had no problem validating instance files against the 
> various nceas and asu drafts prior to beta9. If anyone has a separate 
> tool for validating, please either send it or try this file and let me 
> know what's wrong. Alternately, send me an instance file that you have 
> been able to validate, and I'll try it here. To be honest, validation 
> isn't all that important to us - we'd prefer to have our applications 
> attempt to use the metadata and try to trap for errors rather than give 
> up just because it didn't validate - but I'd like to know why I'm 
> having such a problem with beta 9 when no one else is....
[snip]
>  
> Peter McCartney (peter.mccartney at asu.edu)
> Center for Environmental Studies
> Arizona State University
> 480-965-6791


-- 
*******************************************************************
Matt Jones                                    jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439   Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)

Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simplest-eml.xml
Type: text/xml
Size: 566 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/simplest-eml.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cap-example.xml
Type: text/xml
Size: 2850 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/cap-example.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cap-example-with-references.xml
Type: text/xml
Size: 2954 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/cap-example-with-references.xml
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: SAXValidate.java
Url: http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20020820/ae5a3a72/SAXValidate.bat

