[eml-dev] [Bug 2479] New: - unable to validate eml.xsd and related

inigo san gil isangil at lternet.edu
Wed Jun 28 10:31:57 PDT 2006


Hi EML-Dev!

Well, I have trouble understanding the details of the problem at hand 
here, but I can add my two cents.

According to the last communication with some technical staff at Altova 
(XMLSpy, Mapforce), the EML 2.0.1 schema is wrong.

Here is Altova's message to Sabine (thanks to Sabine, LTER MCR's IM):

----------------------------------------------------------------

I have been in touch with the developer who analysed this issue and he 
has determined that in fact this is not a bug, and that the schema in 
question is *not* valid after all. Thus the validation result of the 
2006 version of XMLSpy is correct.

To explain: 'appinfo' uses an 'any' wildcard in its content model. This 
wildcard has processContents="lax" which will attempt to validate 
elements with known names. So, the Schema for Schemas demands validation 
of known declarations - even within the <appinfo> element. One might 
argue that this does not make much sense, but the Schema for Schemas is 
normative and, thus, binding for XMLSpy.

I hope this helps - please don't hesitate to let me know if there is 
anything further I can add.

Best regards,

.. Paul Rees
.. Support Engineer
.. Altova GmbH

-------------------------------------------------------------------------------------------------

If someone really understands the above explanation, please let me know.
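As far as I can read it, "lax" means: if the validator knows a 
declaration for an element name, it must validate that element; if it 
does not know the name, it lets the element pass. Here is a toy Python 
sketch of that rule only. It is not a schema validator and not what 
XMLSpy does internally. The names KNOWN and lax_check, the sample 
appinfo content, and the "needs name or ref" check are all made up for 
illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical stand-in for "known declarations": maps a qualified element
# name to a toy check that returns True when the element is acceptable.
KNOWN = {
    f"{{{XS}}}element": lambda el: "name" in el.attrib or "ref" in el.attrib,
}

def lax_check(appinfo):
    """Return the tags of children that fail under the toy 'lax' rule."""
    failures = []
    for child in appinfo:
        check = KNOWN.get(child.tag)
        if check is None:
            continue  # unknown name: lax processing lets it pass
        if not check(child):
            failures.append(child.tag)  # known name, so it must be valid
    return failures

# Made-up appinfo content: one element with no known declaration, and one
# xs:element that is "known" but incomplete.
SAMPLE = f"""
<xs:appinfo xmlns:xs="{XS}" xmlns:doc="urn:example:doc">
  <doc:summary>free-form documentation</doc:summary>
  <xs:element/>
</xs:appinfo>
"""

failures = lax_check(ET.fromstring(SAMPLE))
print(failures)
```

In this sketch doc:summary is skipped because nothing is declared for 
it, while the bare xs:element is reported because its name is known. 
That would explain how content inside <appinfo> can make a schema 
invalid even though appinfo is meant for application-specific notes.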

What I have done is roll back to the Altova 2005 suite, and everything 
works just fine. Well, yes, you *have* to do something about the 
<describes> element in eml/additionalMetadata/describes. You can change 
its cardinality, or eliminate it altogether. I have only seen some 
metadata files from Sevilleta using that feature.

So much for the EML "errors" problem mentioned in the original 
message. Now a few words about the FGDC to EML crosswalk:

I use Mapforce to help with the BDP->EML conversion, and I have such a 
stylesheet. I am attaching my progress so far, which is incomplete and 
also needs a "Perl script companion" (I am working on it!) for the 
reasons I explain below. The real problem is the well-known 
"granularity" differences, which are better tackled with the aid of 
another scripting language, such as Perl. In addition, the government 
standard is generally more lax than the EML standard, which in practice 
translates into not-so-rich metadata content. In the FGDC (BDP and the 
like) standard, only two sections are mandatory: the section that 
corresponds to EML's "metadata provider" info, and what we know as the 
"resource group", the basic info on "creator", "point of contact" and 
the like. All the rest is technically mandatory if the info is 
applicable, but optional in practice. To make matters worse for this 
crosswalk, the NBII standard and clearinghouse harvester make no 
systematic effort to validate records against the DTD/Schema. Sure, 
there are some neat tools to fix FGDC documents, such as MP (the 
metadata parser), and there are talented individuals who help the many 
sites fix their documents, but what I have seen in practice is, to 
put it nicely, "creative workarounds to fill somewhat ambiguous elements."
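To make "granularity" concrete, take one case I am assuming here as an 
example: FGDC keeps a contact person as a single string, while EML 
wants a separate givenName and surName. A stylesheet alone handles this 
poorly, so a script has to do the splitting. Below is a minimal sketch 
in Python (used in place of Perl only for illustration). The function 
name is hypothetical and the heuristic is deliberately naive.

```python
def split_person_name(full_name):
    """Split 'Given Sur' or 'Sur, Given' into (given, sur).

    Naive heuristic: the last word is taken as the surname unless a
    comma says otherwise. Multi-word surnames will be split wrongly.
    """
    text = " ".join(full_name.split())  # collapse runs of whitespace
    if "," in text:
        sur, _, given = text.partition(",")
        return given.strip(), sur.strip()
    given, _, sur = text.rpartition(" ")
    return given, sur

print(split_person_name("Jane Q. Doe"))
print(split_person_name("Doe, Jane Q."))
```

Records that the heuristic gets wrong still need a human check, which 
is part of why the crosswalk cannot be fully automatic.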

In short, I give you Altova's take on the EML 2.0.1 schema. In earlier 
correspondence (Feb 06), Altova classified this problem as a 
"2006 version bug", only to retract that in recent correspondence.

I am working on a tool to make the crosswalk from FGDC (and 
specifically, its BDP extension) to EML, but it is not finished, since 
it needs help from a more powerful text parser to granularize. I give 
you here a stylesheet that outputs valid EML from a BDP source. Note 
that there are unresolved issues such as "date converters", the 
"packageId" problem, and others that I document as comments. When I have 
a version that is publishable, I will post it here.
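On the "date converters" issue: FGDC calendar dates are written without 
separators (YYYYMMDD, YYYYMM or YYYY), while EML expects ISO-style 
dates such as YYYY-MM-DD. Here is a rough Python sketch of the kind of 
converter a companion script could provide. This is not the code in the 
attached stylesheet; the function name is hypothetical, and it only 
checks the digit pattern, not whether the month or day is in range.

```python
import re

def fgdc_date_to_iso(value):
    """Convert YYYYMMDD, YYYYMM or YYYY to hyphenated ISO 8601 text.

    Returns None for anything else (free text such as 'unknown'), so
    the caller can decide how to handle it.
    """
    text = value.strip()
    if not re.fullmatch(r"\d{4}(\d{2}){0,2}", text):
        return None
    parts = [text[:4], text[4:6], text[6:8]]
    return "-".join(p for p in parts if p)  # drop the empty pieces

for raw in ("20060628", "200606", "2006", "unknown"):
    print(raw, "->", fgdc_date_to_iso(raw))
```

Values that come back as None, and year-month values that EML may not 
accept as a calendar date, would still need a decision by hand.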

Cheers, Inigo

bugzilla-daemon at ecoinformatics.org wrote:


>>http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2479
>>
>>           Summary: unable to validate eml.xsd and related schemas with
>>                    XML*Spy and related suite of products
>>           Product: EML
>>           Version: 2.0.1
>>          Platform: PC
>>        OS/Version: All
>>            Status: NEW
>>          Severity: blocker
>>          Priority: P2
>>         Component: eml - general bugs
>>        AssignedTo: jones at nceas.ucsb.edu
>>        ReportedBy: john.cree at ec.gc.ca
>>         QAContact: eml-dev at ecoinformatics.org
>>
>>
>>As per question 10. in the FAQ's
>>"How can I get my existing metadata into EML?" ...  "Case 3: If your metadata
>>is already in XML but in some other form such as NBII or FGDC use the following
>>conversion method..." The suggested method is to write an XSLT script to do the
>>conversion.  Although this may be possible, I am trying to save some time by
>>using an Altova product (Mapforce) to do the conversion.  It will essentially
>>create the XSLT to do the conversion; however, it will only work with  valid
>>schema definitions or DTD's as input and output.  There is no problem using the
>> NBII DTD as input or output, however there are many errors when trying to
>>validate the EML schemas.  Is this a problem that has been observed by others
>>that is occurring only with the validation done using Altova products, or is the
>> problem with the eml schema definitions themselves?
>>_______________________________________________
>>Eml-dev mailing list
>>Eml-dev at ecoinformatics.org
>>http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev


-------------- next part --------------
A non-text attachment was scrubbed...
Name: bdp2eml.xslt
Type: text/xml
Size: 183024 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20060628/5ba4c565/bdp2eml-0001.xml

