[eml-dev] Invalid eml 201 xsd files in Metacat (fwd)

Jing Tao tao at nceas.ucsb.edu
Mon Apr 28 16:08:12 PDT 2008


Morpho should use the legitimate schema once Metacat 1.8.1 is out. However, the 
current setup would not hurt anything, since Morpho does NOT add the extra 
attribute and would NOT reject any valid document. I filed a bug 
for Morpho to use the legitimate schema: http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3244

So I think the patch approach (bug 3239 and bug 3241) is still a good 
solution for the Metacat 1.8.1 release.
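
To make the cleanup concrete, here is a rough in-memory sketch of what the
patch has to do. It is only an illustration, not the actual patch: the real
program would work on the xml_nodes and xml_index tables, whose layout is not
shown here. The class name StripReferencesSystem, its methods, and the sample
document are hypothetical, and the sketch assumes that only a "system"
attribute whose value is exactly "document" on a "references" element should
be removed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: removes the spurious system="document" attribute
// from every <references> element of an in-memory EML document.
class StripReferencesSystem {

    // Parse an XML string into a DOM document (no schema validation).
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Remove system="document" from all <references> elements and
    // return how many attributes were removed.
    static int stripSystemAttribute(Document doc) {
        NodeList refs = doc.getElementsByTagName("references");
        int removed = 0;
        for (int i = 0; i < refs.getLength(); i++) {
            Element ref = (Element) refs.item(i);
            // Assumption: only the nonsense default value is stripped.
            if ("document".equals(ref.getAttribute("system"))) {
                ref.removeAttribute("system");
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment, not a complete EML document.
        String xml = "<dataset>"
                + "<creator><references system=\"document\">c.1</references></creator>"
                + "<contact><references>c.1</references></contact>"
                + "</dataset>";
        Document doc = parse(xml);
        System.out.println("removed: " + stripSystemAttribute(doc));
    }
}
```

A "references" element that never received the attribute is left untouched,
so running the same cleanup twice should be harmless.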

Any suggestions or comments?

Thanks!

Jing



Jing Tao
National Center for Ecological
Analysis and Synthesis (NCEAS)
735 State St. Suite 204
Santa Barbara, CA 93101

On Mon, 28 Apr 2008, Margaret O'Brien wrote:

> Moving the RELEASE_EML_2_0_1 to something further up the head might be easy, 
> but that would confuse things forever. It means that all of a sudden, an 
> EML201 doc is supposed to look different, and that's no good.
>
> But I think that validating against the RELEASE_EML_2_0_1_UPDATE_x schema is 
> harmless for now (?), since it allows both the legitimate EML201 docs, and 
> the illegitimate. But the damage is done -- does this mean that Morpho can't 
> use the legitimate schema until Metacat 1.8.1 is out? And then when Morpho 
> tries to load an EML201 doc from a catalog that hasn't yet upgraded to 1.8.1, it 
> won't be able to.
>
> I don't think this helps the whole 201 validation mess at all -- but we could 
> put the offending change back in for EML2.1.0 (see bug 1662) so that at least 
> the element's structure could match the schema when docs were upgraded to 
> 2.1.0. But recall that a default was set for the extra attribute that was 
> nonsense (system="document"), and system should be just xs:string. So those 
> EML 201 docs would carry around that nonsense, which, although probably 
> harmless, looks a little lame.
>
> margaret
> -------------
>
> Jing Tao wrote:
>> Hi, Margaret:
>> 
>> Good point about Morpho. I double-checked and found that Morpho is using the 
>> wrong version of the eml201 schema too. (The reason Morpho doesn't add the 
>> extra attribute is that Morpho doesn't parse the EML document with a SAX 
>> parser, but Metacat does.)
>> 
>> Since Morpho uses the same schema that Metacat uses, Morpho wouldn't 
>> reject an invalid document downloaded from Metacat. Moreover, the eml201 
>> schema files were checked into the Morpho CVS repository, so I actually 
>> don't know which exact version they are. But apparently it is not 
>> RELEASE_EML_2_0_1.
>> 
>> Both Morpho and Metacat are using the RELEASE_EML_2_0_1_UPDATE_? schema 
>> rather than RELEASE_EML_2_0_1. The only difference between 
>> RELEASE_EML_2_0_1_UPDATE_? and RELEASE_EML_2_0_1 is in eml-resource.xsd 
>> (e.g. the references data type). Is it possible to move the 
>> RELEASE_EML_2_0_1 tag to point to the same eml-resource.xsd that 
>> RELEASE_EML_2_0_1_UPDATE_? points to? If we can, life will be easier.
>> 
>> Thanks,
>> 
>> Jing
>> 
>> 
>> 
>> On Fri, 25 Apr 2008, Margaret O'Brien wrote:
>> 
>>> Thanks Jing -
>>> I think your idea for a patch for metacat 1.8.1 that repairs the eml201 
>>> docs is a good one. Basically, it's transparent to any other installation, 
>>> since earlier metacat versions will still allow instance docs that haven't 
>>> had that spurious attribute added in. And waiting to update the schema in 
>>> the knb means that other metacats will still be able to replicate there.
>>> 
>>> The only downside that I can see is (as you say) that these invalid docs 
>>> are allowed to stay in metacat. So if someone were to download an EML201 
>>> document (e.g., with action=read&qformat=xml) from any metacat earlier 
>>> than v1.8.1, it would not validate against a local copy of the schema. In 
>>> fact, it seems odd that this has not already happened, since presumably 
>>> metacat has been behaving this way for quite some time. Maybe this means 
>>> that relatively few documents containing references are handled this way. 
>>> In any case, metacat managers should be aware that this could happen.
>>> 
>>> But what about Morpho? You demonstrated that morpho did not add the extra 
>>> attribute to its locally saved copy. Has anyone complained of morpho 
>>> subsequently rejecting a doc (with references) that it created and 
>>> inserted into metacat? When Morpho tried to reload that same doc, and 
>>> compared it to the legitimate schema, it should have been rejected. Maybe 
>>> Morpho is validating against an illegitimate schema, too.
>>> 
>>> Margaret
>>> -----------------
>>> 
>>> Jing Tao wrote:
>>>> By the way, Margaret:
>>>> 
>>>> I haven't updated the schema on the knb server yet.
>>>> 
>>>> 
>>>> ---------- Forwarded message ----------
>>>> Date: Fri, 25 Apr 2008 11:21:45 -0700 (PDT)
>>>> From: Jing Tao <tao at nceas.ucsb.edu>
>>>> To: jones at nceas.ucsb.edu
>>>> Cc: metacat-dev at ecoinformatics.org
>>>> Subject: Invalid eml 201 xsd files in Metacat
>>>> 
>>>> Hi, Matt:
>>>> 
>>>> I updated the eml201 schema on dev. See bug 
>>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3239
>>>> 
>>>> As we expected, updating an existing older EML document on the dev 
>>>> machine fails once the correct schema is in place.
>>>> 
>>>> This raised a question for me: if we update the eml201 xsd files on the 
>>>> knb production server, how many complaint emails will we get? So I 
>>>> haven't updated the eml201 schema on the production server yet.
>>>> 
>>>> Can we do it this way: leave the eml201 schema as it is for now, and in 
>>>> the 1.8.1 release include a patch that fixes the existing invalid EML 
>>>> documents in Metacat?
>>>> 
>>>> I entered a new bug for this patch: 
>>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3241
>>>> 
>>>> I targeted this bug for 1.8.1 and marked it critical. A Java program 
>>>> will be written to remove the extra attribute (system="document" under 
>>>> the "references" element) from the xml_nodes table and the extra xpath 
>>>> (references/@system) from the xml_index table. (I don't think we need to 
>>>> do anything about the xml_path_index table.)
>>>> 
>>>> So in the Metacat 1.8.1 release the correct schema will be in place and 
>>>> all invalid eml201 documents will be fixed. We will be all set.
>>>> 
>>>> The downside of not correcting the eml201 schema on knb now is that 
>>>> Metacat will continue generating invalid EML documents. But those new 
>>>> invalid documents will be fixed in 1.8.1 too, and the Java patch doesn't 
>>>> mind handling more documents :). The gain of this approach is that there 
>>>> will be no complaint emails and users won't have to delete the extra 
>>>> attributes manually.
>>>> 
>>>> Any thoughts or comments?
>>>> 
>>>> Thanks,
>>>> 
>>>> Jing
>>>> 
>>>> 
>>> 
>>> -- 
>>> 
>>> 
>>> ========================
>>> Margaret O'Brien
>>> Information Management
>>> Santa Barbara Coastal LTER Marine Science Institute
>>> University of California
>>> Santa Barbara, CA  93106-6150
>>> 
>>> 805-893-2071
>>> mob at msi.ucsb.edu
>>> http://sbc.lternet.edu
>>> ========================
>>> 
>>> 
>

