[eml-dev] quick question on adding citation information

Wade Sheldon sheldon at uga.edu
Fri Dec 6 06:55:21 PST 2013


Chris and Carl,

I agree with Chris' interpretation of the specification, and would not recommend putting a general literature citation under methods/methodStep/citation or under software. Those elements are best used to link to protocols and other documents specific to the parent elements, and viewers would not think to look for citations referencing the entire data set there.

And no, we did not specifically address this issue in the first LTER EML Best Practices document in 2004, and I don't recall that issue being addressed in version 2 in 2011 either (but Margaret can correct me if I'm wrong).

In my opinion, it makes less sense to embed citations to publications in data than citations to data in publications, so we do not attempt to shoe-horn citations into EML documents using additionalMetadata or other approaches. The data used in a publication is fixed at the time of publication, whereas a published data set will ideally be used and cited many times (hey - I'm an optimist), so literature citations in data sets would quickly go stale and require ongoing document maintenance to keep current.

I think the best approach to this issue, long-term, is to rely on data registries and link-outs associated with journals to provide this type of association.

Regards,

Wade Sheldon
GCE-LTER Information Manager


On 12/6/2013 8:18 AM, Christopher Jones wrote:
> Hi Carl,
>
> I'd say what you're trying to do is pretty fundamental, and should be straightforward.  However, I think you've highlighted an issue that is a consequence of a design transition that happened in the EML schemas quite a while ago.
>
> In the first designs of EML, the last of which was EML 2 Beta 6 <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/>, each of the EML modules was linkable to the others in an RDF-like syntax (see the <triple> tag in eml-resource <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/eml-resource.png>).  And so what you're describing would entail creating an EML Dataset document, then an EML Citation document, and then linking the two with a relationship ("citation.1.1" "is citation for" "dataset.1.1").
>
> This obviously provided plenty of flexibility, but also a degree of complexity, and so the community decided to move toward a hierarchical structure with a top-level eml.xsd schema.  In doing so, some of the module relationships were hard-coded into the EML schema hierarchy, and at the top level, datasets and citations were encoded as top-level choices, and as you point out, are mutually exclusive.
>
> As I do a quick scan of the schemas for references to eml-literature CitationType, the module is used in the following other modules:
>
> eml.xsd
> eml-attribute.xsd
> eml-coverage.xsd
> eml-methods.xsd
> eml-physical.xsd
> eml-project.xsd
>
> Given the history above, I think that the intention is to document your paper at the /eml/citation level.  Hard-coded links to citations in eml-attribute, eml-coverage, eml-methods, eml-physical, and eml-project all describe links to citations that are very specific in scope (e.g., in eml-methods, the citation is intended to document a specific procedure used).
>
> So, to me, the use of a citation in these sub-modules to describe a dataset-level citation doesn't quite fit.  I'd love to hear what others are doing to satisfy this need.  In particular, did the LTER's EML Best Practices group touch on this subject while writing that document?
>
> As one alternative suggestion, I should point out the /eml/additionalMetadata <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_1_1/eml.png> field, which sits alongside /eml/dataset.  This structure was added after EML 2 Beta 6 in order to retain some of the flexibility that the triple structure used to provide.  This element has a <metadata> child, which in theory could contain a full /eml/citation document, and the sibling <describes> element could point to the id attribute value of your /eml/dataset element.
>
> I'd like to hear what others think about this solution.  Obviously, the <describes> link is a more semantically vague link from the citation documentation to the dataset documentation than a predicate like "isCitationFor", but it at least provides a link.
>
> If the community hasn't already come to a consensus on this issue, I think this thread might help in getting there.  Thanks for bringing it up, Carl.
>
> Cheers,
> Chris
>
> On Dec 5, 2013, at 4:43 PM, Carl Boettiger wrote:
>
>> Sorry if this is a bit elementary.
>>
>> Let's say I have an EML file describing data that is published as part of the supplemental materials of a paper.  It seems reasonable to add a citation to that paper in the metadata.  Is the best place for such a citation:
>>
>>     eml/dataset/methods/methodStep/citation
>>
>> preceded by a
>>
>>     eml/dataset/methods/methodStep/description
>>
>> explaining that the citation refers to the paper in which the data was first published, etc?
>>
>>
>> In a related question, if the EML was documenting software instead, e.g. eml/software, where would such a citation go?  I don't see <http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-software.png> `citation` as child of anything under `eml/software`.
>>
>>
>> Or am I missing the boat entirely here and the natural thing to do is have a separate EML file, eml/citation, and somehow reference that file?  I don't really understand the motivation for having an eml file that consists only of eml/citation, but as I understand, eml/citation and eml/dataset etc are exclusive, right?
>>
>> Thanks for the help!
>>
>> - Carl
>>
>> -- 
>> Carl Boettiger
>> UC Santa Cruz
>> http://carlboettiger.info/
>> _______________________________________________
>> Eml-dev mailing list
>> Eml-dev at ecoinformatics.org <mailto:Eml-dev at ecoinformatics.org>
>> http://lists.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>
>
>
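
For reference, the additionalMetadata arrangement Chris describes in the quoted message can be sketched in Python with the standard library's xml.etree.ElementTree. This is only a structural illustration, not a schema-valid EML document: required children such as creator and contact are left out, the packageId, system, ids and titles are made-up placeholders, and the helper name build_eml_with_citation is hypothetical. It assumes additionalMetadata is a child of the root eml element (a sibling of dataset), as in the EML 2.1.1 schema.

```python
import xml.etree.ElementTree as ET

# EML 2.1.1 namespace; only the root element is namespace-qualified.
EML_NS = "eml://ecoinformatics.org/eml-2.1.1"
ET.register_namespace("eml", EML_NS)


def build_eml_with_citation(dataset_id, dataset_title, citation_title):
    """Hypothetical helper: skeleton EML tree with a citation in additionalMetadata.

    Not schema-valid; it only shows where the pieces sit relative to each other.
    """
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": "example.1.1", "system": "example"},  # placeholder values
    )

    # The dataset carries the id that <describes> will point to.
    dataset = ET.SubElement(root, "dataset", {"id": dataset_id})
    ET.SubElement(dataset, "title").text = dataset_title

    # additionalMetadata is a sibling of dataset under the root element.
    additional = ET.SubElement(root, "additionalMetadata")
    ET.SubElement(additional, "describes").text = dataset_id

    # <metadata> can hold arbitrary XML; here, a bare-bones citation.
    metadata = ET.SubElement(additional, "metadata")
    citation = ET.SubElement(metadata, "citation")
    ET.SubElement(citation, "title").text = citation_title
    return root


if __name__ == "__main__":
    tree = build_eml_with_citation(
        "dataset.1.1", "Example data set", "Example paper describing the data"
    )
    print(ET.tostring(tree, encoding="unicode"))
```

The only tie between the citation and the data set here is that the text of <describes> matches the id attribute on <dataset>, which is the semantically loose link Chris mentions.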

-- 
____________________________________

  Wade M. Sheldon
  GCE-LTER Information Manager
  School of Marine Programs
  University of Georgia
  Athens, GA 30602-3636
  Email: sheldon at uga.edu
  WWW: http://gce-lter.marsci.uga.edu/bios/wsheldon


