[eml-dev] [Bug 3232] New: - EML parser limitations
bugzilla-daemon@ecoinformatics.org
Thu Apr 17 11:32:27 PDT 2008
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3232
Summary: EML parser limitations
Product: EML
Version: 2.0.1
Platform: Other
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: eml-parser
AssignedTo: jones at nceas.ucsb.edu
ReportedBy: mob at icess.ucsb.edu
QAContact: eml-dev at ecoinformatics.org
This is just for the record. It seems that the EML parser could benefit from an
update, although its current behavior is perfectly legal.
It may be that bug 2054 appeared because the parser that comes with EML does
not enable schema-full-checking. My main resource (Walmsley 2002 book) says that
this is the Xerces feature that checks for non-deterministic content models
(which was the error in 2054). That feature doesn't appear to be set in the file
SAXValidate.java - at least not to my untrained eye.
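For anyone who wants to try this, here is a minimal sketch of turning the
feature on for a SAX reader. It is not taken from SAXValidate.java; the class
and method names (FullCheckingSketch, newFullCheckingReader) are hypothetical,
and it assumes the JAXP parser in use is Xerces or the Xerces-derived parser
bundled with the JDK, which recognize these feature URIs.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of the EML distribution.
public class FullCheckingSketch {
    // Standard SAX feature: turn validation on.
    static final String VALIDATION =
        "http://xml.org/sax/features/validation";
    // Xerces feature: validate against XML Schema rather than DTD only.
    static final String SCHEMA =
        "http://apache.org/xml/features/validation/schema";
    // Xerces feature: also run the expensive schema constraint checks.
    static final String FULL_CHECKING =
        "http://apache.org/xml/features/validation/schema-full-checking";

    static XMLReader newFullCheckingReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // setFeature throws SAXNotRecognizedException if the underlying
        // parser does not know the feature (i.e. it is not Xerces-based).
        reader.setFeature(VALIDATION, true);
        reader.setFeature(SCHEMA, true);
        reader.setFeature(FULL_CHECKING, true);
        return reader;
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newFullCheckingReader();
        System.out.println("schema-full-checking enabled: "
            + reader.getFeature(FULL_CHECKING));
    }
}
```

Whether this alone would have caught the schema error in bug 2054 would need
to be checked against the schema as it stood at the time.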
Bug 2703 seems to have come about because Xerces does not necessarily load all
the imported schemas. The content model for appinfo and documentation is a
wildcard, and can be validated laxly. So it's up to the validator to go looking
for element declarations, but it doesn't have to. This behavior is perfectly
legal.
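The effect of lax wildcard processing can be shown with a small, self-contained
sketch. It does not use the EML schemas: the "container" element, the
"undeclared" element and the urn:example:not-loaded namespace are made up for
illustration, and the standard javax.xml.validation API stands in for the EML
parser. The same instance is checked against a lax wildcard and a strict one.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demonstration class, not part of the EML distribution.
public class LaxWildcardSketch {
    // A toy schema whose only content model is a wildcard.
    static String schemaWith(String processContents) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='container'><xs:complexType><xs:sequence>"
            + "<xs:any processContents='" + processContents + "'"
            + " minOccurs='0' maxOccurs='unbounded'/>"
            + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    }

    // The child element has no declaration in any schema the validator knows.
    static final String INSTANCE =
        "<container><undeclared xmlns='urn:example:not-loaded'/></container>";

    static boolean isValid(String processContents) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(
            new StreamSource(new StringReader(schemaWith(processContents))));
        try {
            // With no ErrorHandler set, validation errors are thrown.
            schema.newValidator().validate(
                new StreamSource(new StringReader(INSTANCE)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("lax wildcard accepts instance:    " + isValid("lax"));
        System.out.println("strict wildcard accepts instance: " + isValid("strict"));
    }
}
```

Under lax processing the validator checks the child only if it happens to have
a declaration for it, so content from a schema that was never loaded passes
without complaint; a strict wildcard rejects it.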
So the parser can detect errors in instance documents, but it does not adequately
catch schema errors. Maybe this was always the intent, but was never quite clearly
stated. Or, maybe it's a simple matter to enable some other Xerces features, or
incorporate XSV instead - but not being a Java programmer, I don't know.