[eml-dev] EML 2.0.2 changes to text leaf nodes

Christopher Jones cjones at msi.ucsb.edu
Thu Mar 20 16:09:07 PDT 2008


Hi all,

I strongly agree that content and presentation should ideally be
kept separate, with stylesheets handling the latter.  I'm struggling
a bit, though, with what constitutes 'content'.  A structural tag
such as <title> lends 'meaning' to the contained text, at least in
English.  A <b> tag in HTML seems much more presentational: it
doesn't add meaning, merely emphasis.  However, when formatting
conventions in scientific domains lend 'meaning' to text, like
italicizing species binomials, it seems that we need to provide a
facility for this, lest we lose semantic information.

I agree with Wade that we walk a fine line here between expressing  
semantics and presenting.  Cluttered EML docs could abound.  Is the  
preservation of 'meaning' worth the trade-off?

On Mar 20, 2008, at 3:06:43 PM, Wade Sheldon wrote:
> I think your casual example makes this point very well - what real  
> use is preserving <emphasis> markup in a data set title? That's what  
> XSL is for. If this is a legacy issue for some metadata providers,  
> then I think they should be encouraged (or helped) to offload  
> embedded display markup when porting to EML.

True, my example was a bit simple.  A better example would be the  
species binomial case:

<title>
   Acetylene reduction and 15N2 uptake rates for
   <emphasis>Alnus tenuifolia</emphasis> and
   <emphasis>Alnus crispa</emphasis>
   in six different successional habitats
</title>

where the stylesheet renders emphasis elements nested in a title
element in italics.  This certainly is a presentation issue, but one
that imparts meaning based on known conventions.  Notice how the 15N2
also seems to lose meaning in this title without appropriate
formatting (the 15 should be a superscript and the 2 a subscript).
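
To make that concrete, here is a rough sketch in Python (standing in
for the XSL a real stylesheet would use) of rendering emphasis inside
a title as italics.  The function name title_to_html and the choice of
HTML <i> output are assumptions for illustration only, not anything
defined by EML or its stylesheets.

```python
import xml.etree.ElementTree as ET
from html import escape


def title_to_html(title_xml: str) -> str:
    """Render a <title> fragment as HTML, italicizing <emphasis> children.

    Illustrative only: assumes un-namespaced XML and one level of nesting.
    """
    title = ET.fromstring(title_xml)
    parts = [escape(title.text or "")]
    for child in title:
        text = escape("".join(child.itertext()))
        # Only <emphasis> gets presentational treatment; other tags pass through as text.
        parts.append(f"<i>{text}</i>" if child.tag == "emphasis" else text)
        parts.append(escape(child.tail or ""))
    # Collapse the line breaks and indentation of the source into single spaces.
    return " ".join("".join(parts).split())


SAMPLE = """<title>
   Acetylene reduction and 15N2 uptake rates for
   <emphasis>Alnus tenuifolia</emphasis> and
   <emphasis>Alnus crispa</emphasis>
   in six different successional habitats
</title>"""

print(title_to_html(SAMPLE))
```

The point is only that the markup carries the information; how it is
displayed stays a stylesheet decision.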

Perhaps there is another way to deal with this, though?  It seems too
big a job to infer meaning from plain xs:string word combinations
(such as Alnus tenuifolia) and then apply the right markup for
presentation.

On Mar 20, 2008, at 3:22:06 PM, inigo wrote:
> Margaret O'Brien and myself with help of Mark Servilla, and to some
> extent J. Brunt and Corinna Gries worked on this minor fix. In it,
> we addressed the bug that Chris is talking about, yet the workaround
> that Chris is proposing does not fix the fact that there are
> DocBook 4.* Schema tags present in the documentation module of EML
> not declared in the text-module of EML. Examples are <url> and
> <citetitle>. By redefining the types, we address these errors
> partially, yet some stringent XML editors (the XML Spy 2007, 2008)
> will call on the existence of these undeclared tag, critical errors.
> This makes the schema rather unprofessional.

On Mar 20, 2008, at 3:39:10 PM, James Brunt wrote:
> Also, I'm in agreement with Inigo that making the schema "clean"  
> should be a priority in this bug-fix release.

Fair enough.  Consistent and complete support for either DocBook 4.x  
or DocBook 5.x throughout the EML schemas (in the eml-text module and  
the documentation tags in every module) seems like a good goal, and  
one that isn't particularly onerous.  Likewise, an audit of the  
documentation tags is in order to ensure completeness.
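
A minimal sketch of what such an audit could look like, assuming the
documentation fragments can be parsed as plain, un-namespaced XML.
The DECLARED set below is a hypothetical placeholder, not the actual
list of elements declared in the eml-text module, and undeclared_tags
is just an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list; the real one would be derived from the eml-text schema.
DECLARED = {"section", "para", "title", "emphasis"}


def undeclared_tags(doc_xml: str, declared=DECLARED) -> list:
    """Return the sorted names of elements in doc_xml that are not in `declared`."""
    root = ET.fromstring(doc_xml)
    # root.iter() walks the root element and every descendant.
    return sorted({el.tag for el in root.iter() if el.tag not in declared})


SAMPLE_DOC = """<section>
  <title>Example</title>
  <para>See <citetitle>Some Book</citetitle> at <url>http://example.org</url>.</para>
</section>"""

print(undeclared_tags(SAMPLE_DOC))
```

Run over the documentation blocks in every module, a report like this
would show which tags still need declaring (or removing).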

Questions -

Have the EML-2.0.2 proposed fixes stated in the "Community opinion on  
minor revision of EML" post been implemented in a branch in the  
Ecoinformatics EML repository? If so, are they tagged?

Besides bugs #2054 and #2073, have the other 11 bullets in that
post been entered into the ecoinfo Bugzilla?

Cheers,
Chris
_________________________________________________________________
christopher jones       cjones at msi.ucsb.edu      (805) 680-5946
marine science institute  university of california, santa barbara
_________________________________________________________________





