[eml-dev] Proposed revision to eml-literature module...

Matt Jones jones at nceas.ucsb.edu
Tue Aug 9 16:20:10 PDT 2005


Mark,

I think I agree with you -- the proposed bibliography element is a 
container for lists of citations, which is very useful and should be a 
direct part of EML.  I don't think long lists of elements are a real 
problem, as it's just an XML document, and judicious use of an event 
parser like SAX lets one handle even the largest XML documents (DOM or 
JDOM, which build the whole tree in memory, can definitely hurt 
performance in a situation like this).
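
To illustrate the event-parser point, here is a minimal sketch in Python 
using the standard-library xml.sax module.  The bibliography and citation 
element names follow the proposal; the nested title element and the sample 
document are assumptions made for illustration and are not taken from the 
revised schema.

```python
import io
import xml.sax


class CitationHandler(xml.sax.ContentHandler):
    """Counts <citation> elements and records their titles as events arrive."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.titles = []
        self._in_citation = False
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "citation":
            self._in_citation = True
            self.count += 1
        elif name == "title" and self._in_citation:
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title" and self._in_title:
            self.titles.append("".join(self._buf).strip())
            self._in_title = False
        elif name == "citation":
            self._in_citation = False


# Hypothetical sample input; a real file would be opened from disk instead.
SAMPLE = b"""<bibliography>
  <citation><title>First paper</title></citation>
  <citation><title>Second paper</title></citation>
</bibliography>"""

handler = CitationHandler()
xml.sax.parse(io.BytesIO(SAMPLE), handler)
print(handler.count, handler.titles)
```

The handler only keeps the titles here so the result can be inspected; for 
a very large file one would process each citation as it closes and discard 
it, so memory use stays flat regardless of document size.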

I haven't had a chance to review the proposal fully yet (I will do so 
when I return), but at first glance it seemed like a beneficial change.

Matt

Mark Servilla wrote:
> Hi Peter,
> 
> Thank you for your thoughts.  I've added some additional comments below.
> 
> Sincerely,
> Mark
> 
> Peter McCartney wrote:
> 
>>I've thought about this since it was presented last week and I have to
>>say I don't believe it's necessary. The purpose of EML is to provide a
>>standard for describing an information resource. We discussed the issue
>>of using it as a container for many documents early on and decided this
>>was not appropriate. Early experiments using this type of schema with
> 
> 
> For clarity, what we have proposed is not a container for multiple 
> documents, but only for multiple document citations - similar to how the 
> dataset, software, and protocol modules allow for multiple entries 
> within each of their respective modules.  I realize that there is a 
> concern for the volume that could be generated within a "bibliography" 
> module, but similar constraints are not enforced within the other 
> modules, and the volume of in-line data could certainly far outweigh 
> multiple citation entries (especially, any remote sensing imagery).  In 
> such cases, asynchronous communication issues should be addressed at a 
> different level of the application.
> 
> 
>>Xanthoria revealed that the file could potentially grow very large with
>>no warning, resulting in timeouts and hangs.
>>
>>I think an equivalent solution that does not introduce any backward
>>compatibility problems is to define a new schema called "bibliography"
>>and import the eml-literature.xsd, using the citation element as a
>>repeatable element within that schema. We have done this a lot in our
>>xylopia project where we wanted to define a schema for one purpose or
>>another that contained within it some eml document. Any application that
>>reads such a document can take each individual citation element and write
>>it out as a valid EML document on the receiving end simply by generating
>>a new <eml> tag and inserting the entire <citation> or <dataset> tag
> 
> 
> But isn't this really a workaround for shortcomings in eml?  Wouldn't 
> correcting eml be a more appealing fix, thus not requiring each domain 
> to develop an eml workaround - and, making the correction part of the 
> standard?
> 
> 
>>inside that. An even better solution is to simply use the harvest
>>document format used for metacat uploads that contains only pointers to
>>the individual documents so they can be retrieved at a pace that the
>>ingesting service can determine. SEINet uses bibliography files that
>>look like this for managing users' bibliographies. I've attached a
>>sample.
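
Peter's first suggestion above, writing each citation out as its own 
document by wrapping it in a new eml element, might be sketched roughly as 
follows in Python.  The sample input, the id attributes, and the title 
child are assumptions for illustration, and the bare eml wrapper omits the 
namespace and attributes a valid EML document would need.

```python
import xml.etree.ElementTree as ET

# Hypothetical multi-citation file; the element content is illustrative only.
BIBLIOGRAPHY = """<bibliography>
  <citation id="c1"><title>First paper</title></citation>
  <citation id="c2"><title>Second paper</title></citation>
</bibliography>"""


def split_into_eml(bibliography_xml):
    """Return one serialized <eml> document per <citation> child."""
    root = ET.fromstring(bibliography_xml)
    documents = []
    for citation in root.findall("citation"):
        # Assumption: a real EML root also needs its namespace declarations
        # and required attributes, which are left out of this sketch.
        eml = ET.Element("eml")
        citation.tail = None  # drop inter-element whitespace from the source
        eml.append(citation)
        documents.append(ET.tostring(eml, encoding="unicode"))
    return documents


docs = split_into_eml(BIBLIOGRAPHY)
for doc in docs:
    print(doc)
```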
> 
> 
> Agreed, if I understand what you are saying.  The proposed change only 
> contains references to the citation (not the actual document).  If 
> changes to eml include an external referencing mechanism (wasn't this 
> once implemented?), then this should be a no-brainer.
> 
> 
>>Peter McCartney(peter.mccartney at asu.edu)
>>International Institute for Sustainability
>>Arizona State University
>>480-965-6791
>>
>>
>>
>>
>>>-----Original Message-----
>>>From: eml-dev-bounces at ecoinformatics.org 
>>>[mailto:eml-dev-bounces at ecoinformatics.org] On Behalf Of Mark Servilla
>>>Sent: Tuesday, August 09, 2005 11:34 AM
>>>To: eml-dev at ecoinformatics.org
>>>Cc: Margaret O'Brien
>>>Subject: [eml-dev] Proposed revision to eml-literature module...
>>>
>>>
>>>Hello EML Community,
>>>
>>>The LTER Network Office and Santa Barbara Coastal LTER site would like 
>>>to propose a change to the eml-literature module.  The proposed change 
>>>is to move the "citation" element subtree, currently at the top level 
>>>of the module (where its cardinality is 1), inside a new top-level 
>>>element, "bibliography", where the cardinality of citation would be 1 
>>>to infinity.  The goal of this change is to better reflect management 
>>>of publication-style citation lists as opposed to a single citation for 
>>>each eml document instance.  Note that a single citation is still very 
>>>possible.
>>>
>>>We have also added the "contact" subtree within the "bibliography" at 
>>>the same level as "citation", in addition to adding "contact" within the 
>>>actual "citation" subtree.  The first, "bibliography/contact", would be 
>>>used to denote the manager of the bibliography, whereas the 
>>>"citation/contact" would reference the manager of the actual citation. 
>>>The following link is to the revised schema within our public CVS 
>>>(http://cvs.lternet.edu/cgi-bin/viewcvs.cgi/NIS/projects/bibliography/eml-2.0.1bib/). 
>>>I have also attached a simple "png" view of the proposed change in 
>>>XMLSpy graphical notation as a quick reference.
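
As a rough picture of the structure described in the proposal above, the 
following Python sketch builds a bibliography instance with several 
citations and both kinds of contact.  Only the bibliography, citation, and 
contact names come from the proposal; the title and organizationName 
children, the element ordering, and the build_bibliography helper are 
assumptions for illustration and have not been checked against the revised 
schema.

```python
import xml.etree.ElementTree as ET


def build_bibliography(manager, citations):
    """Build a <bibliography> holding one or more <citation> elements.

    `citations` is a list of (title, citation_manager) pairs.
    """
    if not citations:
        # The proposal gives citation a cardinality of 1 to infinity.
        raise ValueError("a bibliography needs at least one citation")
    bibliography = ET.Element("bibliography")
    for title, citation_manager in citations:
        citation = ET.SubElement(bibliography, "citation")
        ET.SubElement(citation, "title").text = title
        # citation/contact: manager of this particular citation
        contact = ET.SubElement(citation, "contact")
        ET.SubElement(contact, "organizationName").text = citation_manager
    # bibliography/contact: manager of the bibliography as a whole
    contact = ET.SubElement(bibliography, "contact")
    ET.SubElement(contact, "organizationName").text = manager
    return bibliography


root = build_bibliography(
    "LTER Network Office",
    [("First paper", "Santa Barbara Coastal LTER"),
     ("Second paper", "Santa Barbara Coastal LTER")],
)
print(ET.tostring(root, encoding="unicode"))
```

A single-citation bibliography is just the one-element case of the same 
call, which matches the note that a single citation remains possible.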
>>
>>We were also discussing the merit of changing the "title" element in the 
>>eml-resource module from a simple element to a complex element, 
>>with a structure within the title subtree similar to the "section" 
>>and "para" elements (found within "abstract"), for those more complicated 
>>titles that include text-based style and formatting.  We did not, 
>>however, modify the test schema to include this change (at least at 
>>this point).
>>
>>We realize that any such change to the current EML-2.0.1 standard would 
>>certainly break backward compatibility.  However, it may be acceptable 
>>if/when the next major eml release would potentially have the same effect.
>>
>>  Your thoughts are most welcome on this proposed change.
>>
>>Sincerely,
>>Mark
>>
>>
>>------------------------------------------------------------------------
>>
>><bibliography creationDate="Mar 8, 2004" id="1078769397263">
>>  <name>peter</name>
>>  <item id="101 " schema="EML Dataset" src="ces_dataset"/>
>>  <item id="102 " schema="EML Dataset" src="ces_dataset"/>
>>  <item id="801" schema="EML Literature" src="ces_literature"/>
>>  <item id="805" schema="EML Literature" src="ces_literature"/>
>></bibliography>
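
A consumer of a pointer-style file like the attached sample could read the 
item entries and then retrieve the referenced documents in batches at its 
own pace.  The Python sketch below uses the sample exactly as attached; the 
list_pointers and batches helpers are hypothetical, and the actual 
retrieval step is left out because the thread does not describe how src and 
id resolve to a document.

```python
import xml.etree.ElementTree as ET

# The attached sample, reproduced verbatim (including the trailing
# spaces inside two of the id values).
SAMPLE = (
    '<bibliography creationDate="Mar 8, 2004" id="1078769397263">'
    '<name>peter</name>'
    '<item id="101 " schema="EML Dataset" src="ces_dataset"/>'
    '<item id="102 " schema="EML Dataset" src="ces_dataset"/>'
    '<item id="801" schema="EML Literature" src="ces_literature"/>'
    '<item id="805" schema="EML Literature" src="ces_literature"/>'
    '</bibliography>'
)


def list_pointers(xml_text):
    """Return (src, id, schema) for every <item>, with ids whitespace-trimmed."""
    root = ET.fromstring(xml_text)
    return [
        (item.get("src"), item.get("id").strip(), item.get("schema"))
        for item in root.iter("item")
    ]


def batches(pointers, size):
    """Yield pointers in groups so the ingesting side controls the pace."""
    for start in range(0, len(pointers), size):
        yield pointers[start:start + size]


pointers = list_pointers(SAMPLE)
for group in batches(pointers, 3):
    # Placeholder for whatever retrieval call the ingesting service uses.
    print(group)
```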
> 
> 

-- 
-------------------------------------------------------------------
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------

