[eml-dev] EML 2.0.2 changes to text leaf nodes

inigo isangil at lternet.edu
Thu Mar 20 15:09:25 PDT 2008


Even though it is desirable to decouple the presentation
from the content, in some cases losing the original
format is a problem.

I think it is a beneficial compromise to allow more leaf
nodes to use the capabilities of the eml-text module AND
even additional markup-oriented tags (such as DocBook 5.*)
in EML. We would perhaps have to do more work,
but the content will not be distorted by such tags.

There are many instances where this will be beneficial.
<title> is one, but more critical is <geographicDescription>.
In principle, you can describe many aspects of a particular
site, so <section>s, <para>graphs, <emphasis>, itemizedLists
and the like would help preserve some of the formatting
therein. I can provide examples of LTER sites that lose the
ability to cleanly describe a site just because <geographicDescription>
is xs:string. Other tags that would benefit are:

<pubPlace> (may need paragraphs, if someone decides
  to provide something more than a name)
<instrumentation> (manufacturer, model numbers)
<maintenanceUpdateFrequency> (may need explanation,
  since it may not be constant through the temporal
  range covered/referred to in the metadata)
<entityName> (like <title>)
<entityDescription> (perhaps some feel like using paragraphs)
<objectName> (like <title>)
<attributeName>
<attributeDefinition>
<codeExplanation> (paragraphs?)
<definition> (within //dataset//nominal//textDomain/definition; paragraphs)
<attributeAccuracyReport> (paragraphs)
<attributeAccuracyExplanation> (paragraphs)
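
To make the <geographicDescription> case concrete, here is a rough
sketch of what such a description might look like if the element were
typed as txt:TextType. This is not valid EML 2.0.1, the site name and
list items are placeholders rather than a real LTER site, and the
nesting (lists inside <para>) follows my understanding of the eml-text
module, so treat it as an assumption to check against the schema.

```xml
<!-- Hypothetical sketch only: geographicDescription is xs:string in
     EML 2.0.1, so this would validate only under the proposed change.
     All site details below are placeholders. -->
<geographicDescription>
  <para>The <emphasis>Example Site</emphasis> (a placeholder name)
    comprises three study areas:
    <itemizedlist>
      <listitem><para>an upland forest plot</para></listitem>
      <listitem><para>a riparian corridor</para></listitem>
      <listitem><para>a coastal marsh transect</para></listitem>
    </itemizedlist>
  </para>
  <para>Notes on access and ownership would go in their own paragraph.</para>
</geographicDescription>
```

Flattened to xs:string, the same content becomes a single run of text
with no list structure and no emphasis.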

Also, this change suggests that we could expand the child and parent
elements of <para> (eml-text module) to include more of the
formatting-related tags available: not just adding <url> and
<citetitle> to avoid "critical errors" in the EML schema, but
adding the capability to better preserve the original user markup.

XSL would be in charge of rendering the markup appropriately.
More work? Absolutely, but many EML users would not
feel as disappointed as they do now when reading a document
whose original formatting was destroyed, whether for lack of
appropriate markup or because of defective stylesheet
interpretation (the <literalLayout> problem...).
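
As an illustration of the stylesheet side, here is a minimal XSLT 1.0
sketch that maps a few text-module tags to HTML. The mapping itself is
my assumption, not what any existing EML stylesheet does, and it
assumes the text elements are unqualified (no namespace). Rendering
<literalLayout> as a span with preserved whitespace is just one
possible choice.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: a few templates for rendering eml-text
     style markup as HTML. Not taken from an existing EML stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Paragraphs become HTML paragraphs; children are processed in turn. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Inline emphasis, subscript and superscript keep their meaning. -->
  <xsl:template match="emphasis">
    <em><xsl:apply-templates/></em>
  </xsl:template>

  <xsl:template match="subscript">
    <sub><xsl:apply-templates/></sub>
  </xsl:template>

  <xsl:template match="superscript">
    <sup><xsl:apply-templates/></sup>
  </xsl:template>

  <!-- Assumed handling: keep the author's line breaks and spacing. -->
  <xsl:template match="literalLayout">
    <span style="white-space: pre-wrap"><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

Plain text with no markup passes through XSLT's built-in text rule
unchanged, so documents without any of these tags would render as before.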

Then there is also the issue of adding MathML or similar
tags to document models/equations appropriately. It goes
along similar lines to this thread.

Cheers, Inigo

Other fields can also benefit from the formatting that the user
intended to start with.

James Brunt wrote:
> I've wrestled with this issue for a good while. My experience has been 
> primarily with bibliographic database entries in which this is a raging 
> problem. It comes down to the question, Can you express complex text in 
> a limited character set with no formatting? And unfortunately I see both 
> sides. Philosophically I prefer to keep things simple and decouple 
> formatting from text, but practically, how can you deal with things like 
> CO2 and species names in titles that are indicated in practice by 
> formatting? This is made more complex by the fact that you are trying to 
> represent a title that is fixed in the annals of history by something 
> that doesn't look at all like it.
>
> So I guess I'm in favor of the change for <title> elements but I'm not 
> sure about the implications of changing all xs:string types. Others will 
> have to decide.
>
> Also, I'm in agreement with Inigo that making the schema "clean" should 
> be a priority in this bug-fix release.
>
> James
>
> James W. Brunt
> Associate Director for Information Management
> Long Term Ecological Research Network Office
> Department of Biology MSC03 2020
> 1 University of New Mexico
> Albuquerque, NM 87131-0001
> 505 277 2535
> jbrunt at LTERnet.edu
>
>
> Christopher Jones wrote:
>   
>> Hi all,
>>
>> Margaret and I were discussing changes to EML slated for the 2.0.2  
>> bugfix release, and a frequent request that I have seen involves  
>> elements that are xs:string leaf nodes throughout the schema.  There  
>> are places within the EML schema that we consciously decided to type  
>> the leaf node as an eml-text node (txt:TextType) in order to provide  
>> DocBook-type formatting capabilities.  However, there have been many  
>> requests for formatting options in text leaf node elements where it is  
>> not allowed.
>>
>> The proposal is to convert all leaf nodes in the EML schema that are  
>> currently typed as xs:string to be of type txt:TextType so they may  
>> all take advantage of the formatting options.
>>
>> A required change in eml-text.xsd is:
>>
>> <xs:complexType name="TextType"> becomes <xs:complexType name="TextType"  
>> mixed="true">
>>
>> An example would be:
>>
>> EML 2.0.1 title element:
>> <xs:element name="title" type="xs:string" maxOccurs="unbounded">
>>
>> EML 2.0.2 proposed title element:
>> <xs:element name="title" type="txt:TextType" maxOccurs="unbounded">
>>
>>
>> This would allow for backwards compatible markup such as:
>> <eml>
>>    <dataset>
>>      <title>My Title Text</title>
>>      ...
>>    </dataset>
>> </eml>
>>
>> and also:
>>
>> <eml>
>>    <dataset>
>>      <title><emphasis>My Title Text</emphasis></title>
>>      ...
>>    </dataset>
>> </eml>
>>
>> This change should be backward compatible with EML 2.0.1 in that an  
>> element of type txt:TextType can take a plain xs:string without any  
>> other markup.
>>
>> So, we wanted to open this type of change up to the community for  
>> comment since it would affect all of the text nodes in the EML schema,  
>> even though the changes aren't immense.
>>
>> Can anyone see downsides to this type of change?
>>
>> Other comments?
>>
>> Cheers,
>> Chris
>> _________________________________________________________________
>> christopher jones       cjones at msi.ucsb.edu      (805) 680-5946
>> marine science institute  university of california, santa barbara
>> _________________________________________________________________
>>
>>
>>
>>
>> _______________________________________________
>> Eml-dev mailing list
>> Eml-dev at ecoinformatics.org
>> http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>>     