[eml-dev] quick question on adding citation information

Matt Jones jones at nceas.ucsb.edu
Fri Dec 6 07:51:58 PST 2013


+1 on Chris' and Wade's comments so far.  I think it's a shortcoming of EML
that there is no explicitly labeled field for this, and I've gotten this
question from many EML users.  I also agree with Wade that these linkages
quickly become stale.  Nevertheless, it would be good to have a way to link
to citations that use the data.  Chris' solution to add it to
additionalMetadata is certainly a workaround, but not very satisfying
because there won't be consistency across users as to how it's embedded.  We
could consider adding an optional top-level field to eml-dataset to provide
this, possibly something like:

/eml/dataset/dataUsageCitation which would be of type CitationType
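
To make the comparison concrete, here is a rough Python sketch of the two
shapes under discussion: the additionalMetadata workaround (a <describes>
element pointing at the dataset id, next to a <metadata> element wrapping a
citation) and the proposed dataUsageCitation field.  Several things here are
assumptions, not settled design: dataUsageCitation does not exist in any
released schema, its position among the other dataset children is undecided,
all titles, names, and ids are placeholders, and additionalMetadata is assumed
to sit directly under the root eml element as in the EML 2.1.1 top-level
schema.  The fragments are deliberately incomplete and are not validated
against the schema.

```python
import xml.etree.ElementTree as ET

EML_NS = "eml://ecoinformatics.org/eml-2.1.1"
ET.register_namespace("eml", EML_NS)


def build_citation(parent, tag):
    """Append a minimal CitationType-style element named `tag` to `parent`.

    Only a handful of children are filled in; all values are placeholders.
    """
    cit = ET.SubElement(parent, tag)
    ET.SubElement(cit, "title").text = "Placeholder title of a paper using the data"
    creator = ET.SubElement(cit, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = "Placeholder"
    article = ET.SubElement(cit, "article")
    ET.SubElement(article, "journal").text = "Placeholder Journal"
    ET.SubElement(article, "volume").text = "1"
    ET.SubElement(article, "pageRange").text = "1-10"
    return cit


def new_eml_root():
    """Create a bare eml root with one dataset (placeholder ids and title)."""
    root = ET.Element("{" + EML_NS + "}eml",
                      {"packageId": "example.1.1", "system": "example"})
    dataset = ET.SubElement(root, "dataset", {"id": "dataset.1"})
    ET.SubElement(dataset, "title").text = "Placeholder data set title"
    return root, dataset


def workaround_doc():
    """Workaround: citation inside additionalMetadata/metadata, linked to the
    dataset only through the generic <describes> reference."""
    root, dataset = new_eml_root()
    extra = ET.SubElement(root, "additionalMetadata")
    ET.SubElement(extra, "describes").text = dataset.get("id")
    metadata = ET.SubElement(extra, "metadata")
    build_citation(metadata, "citation")
    return root


def proposed_doc():
    """Proposal: a HYPOTHETICAL dataUsageCitation child of dataset."""
    root, dataset = new_eml_root()
    build_citation(dataset, "dataUsageCitation")
    return root


if __name__ == "__main__":
    print(ET.tostring(workaround_doc(), encoding="unicode"))
    print(ET.tostring(proposed_doc(), encoding="unicode"))
```

In the first form a reader has to know by convention what the embedded
citation means; in the second the element name itself carries the meaning,
which is the consistency argument above.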

I added a feature request ticket in Redmine to track this issue:
    https://projects.ecoinformatics.org/ecoinfo/issues/6283

Thoughts?  Is such a change worth a revision of the EML schemas?  This issue
has been on my radar for a long time, but has never reached the critical
point of triggering a version change, which has widespread impact.  There
are several other outstanding schema changes that might raise the need for
a new release, including fixing internationalization issues
<https://projects.ecoinformatics.org/ecoinfo/issues/5728>, compatibility
with ISO issues <https://projects.ecoinformatics.org/ecoinfo/issues/5998>
raised by GBIF, key/keyref parser checking
<https://projects.ecoinformatics.org/ecoinfo/issues/5731>, and other
items.  So maybe now's the right time?  I think these could be done with
backwards-compatible changes (all EML 2.1.1 documents would be valid under
the new schema with only a namespace change).
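
As a rough illustration of what "only a namespace change" could mean for
existing documents, the Python sketch below rewrites the EML 2.1.1 namespace
URI to a new one.  The target URI "eml://ecoinformatics.org/eml-NEXT" and the
function name are hypothetical placeholders, since no new version has been
named.  The sketch only touches element names in the EML namespace and
attribute values that contain the old URI (such as a schemaLocation hint); it
does not validate anything.

```python
import xml.etree.ElementTree as ET

OLD_NS = "eml://ecoinformatics.org/eml-2.1.1"
NEW_NS = "eml://ecoinformatics.org/eml-NEXT"  # hypothetical placeholder URI


def retarget_namespace(xml_text, old_ns=OLD_NS, new_ns=NEW_NS):
    """Return `xml_text` with elements moved from `old_ns` to `new_ns`."""
    root = ET.fromstring(xml_text)
    old_prefix = "{" + old_ns + "}"
    for elem in root.iter():
        # Qualified element names are stored as "{namespace}localname".
        if isinstance(elem.tag, str) and elem.tag.startswith(old_prefix):
            elem.tag = "{" + new_ns + "}" + elem.tag[len(old_prefix):]
        # Attribute values (e.g. xsi:schemaLocation) may repeat the old URI.
        for key, value in list(elem.attrib.items()):
            if old_ns in value:
                elem.set(key, value.replace(old_ns, new_ns))
    # Keep the conventional "eml" prefix when serializing.
    ET.register_namespace("eml", new_ns)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" '
        'packageId="example.1.1" system="example">'
        '<dataset id="dataset.1"><title>Placeholder</title></dataset>'
        '</eml:eml>'
    )
    print(retarget_namespace(sample))
```

If the new schema only adds optional elements, a document converted this way
should not need any other edits, but that would still have to be confirmed by
validating against the released schema.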

Matt





On Fri, Dec 6, 2013 at 5:55 AM, Wade Sheldon <sheldon at uga.edu> wrote:

> Chris and Carl,
>
> I agree with Chris' interpretation of the specification, and would not
> recommend putting a general literature citation under
> methods/methodStep/citation or under software. Those elements are best used
> to link to protocols and other documents specific to the parent elements,
> and viewers would not think to look for citations referencing the entire
> data set there.
>
> And no, we did not specifically address this issue in the first LTER EML
> Best Practices document in 2004, and I don't recall that issue being
> addressed in version 2 in 2011 either (but Margaret can correct me if I'm
> wrong).
>
> In my opinion, it makes less sense to embed citations to publications in
> data than citations to data in publications, so we do not attempt to
> shoe-horn citations into EML documents using additionalMetadata or other
> approaches. The data used in a publication is fixed at the time of
> publication, whereas a published data set will ideally be used and cited
> many times (hey - I'm an optimist), so literature citations in data sets
> would quickly go stale and require ongoing document maintenance to keep
> current.
>
> I think the best approach to this issue, long-term, is to rely on data
> registries and link-outs associated with journals to provide this type of
> association.
>
> Regards,
>
> Wade Sheldon
> GCE-LTER Information Manager
>
>
>
> On 12/6/2013 8:18 AM, Christopher Jones wrote:
>
>> Hi Carl,
>>
>> I'd say what you're trying to do is pretty fundamental, and should be
>> straightforward.  However, I think you've highlighted an issue that is a
>> consequence of a design transition that happened in the EML schemas quite a
>> while ago.
>>
>> In the first designs of EML, the last of which was EML 2 Beta 6
>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/>,
>> each of the EML modules was linkable to the others in an RDF-like syntax
>> (see the <triple> tag in eml-resource
>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/eml-resource.png>).
>> And so what you're describing would entail creating an EML Dataset
>> document, then an EML Citation document, and then linking the two with a
>> relationship ("citation.1.1" "is citation for" "dataset.1.1").
>>
>>
>> This obviously provided plenty of flexibility, but also a degree of
>> complexity, and so the community decided to move toward a hierarchical
>> structure with a top-level eml.xsd schema.  In doing so, some of the module
>> relationships were hard-coded into the EML schema hierarchy, and at the top
>> level, datasets and citations were encoded as top-level choices, and as you
>> point out, are mutually exclusive.
>>
>> As I do a quick scan of the schemas for references to eml-literature
>> CitationType, the module is used in the following other modules:
>>
>> eml.xsd
>> eml-attribute.xsd
>> eml-coverage.xsd
>> eml-methods.xsd
>> eml-physical.xsd
>> eml-project.xsd
>>
>> Given the history above, I think that the intention is to document your
>> paper at the /eml/citation level.  Hard-coded links to citations in
>> eml-attribute, eml-coverage, eml-methods, eml-physical, and eml-project all
>> describe links to citations that are very specific in scope (e.g., in
>> eml-methods, the citation is intended to document a specific procedure
>> used).
>>
>> So, to me, the use of a citation in these sub-modules to describe a
>> dataset-level citation doesn't quite fit.  I'd love to hear what others are
>> doing to satisfy this need.  In particular, did the LTER's EML Best
>> Practices group touch on this subject while writing that document?
>>
>> As one alternative suggestion, I should point out the /eml/additionalMetadata
>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_1_1/eml.png>
>> field.  This structure was added after EML 2 Beta 6 in order to retain some
>> of the flexibility that the triple structure used to provide.  This element
>> has a <metadata> child, which in theory could contain a full /eml/citation
>> document, and the sibling <describes> element could point to the id
>> attribute value of your /eml/dataset element.
>>
>>
>> I'd like to hear what others think about this solution.  Obviously, the
>> <describes> link is a more semantically vague link from the citation
>> documentation to the dataset documentation than a predicate like
>> "isCitationFor", but it at least provides a link.
>>
>> If the community hasn't already come to a consensus on this issue, I
>> think this thread might help in getting there.  Thanks for bringing it up,
>> Carl.
>>
>> Cheers,
>> Chris
>>
>> On Dec 5, 2013, at 4:43 PM, Carl Boettiger wrote:
>>
>>> Sorry if this is a bit elementary.
>>>
>>> Let's say I have an EML file describing data that is published as part
>>> of the supplemental materials of a paper.  It seems reasonable to add a
>>> citation to that paper in the metadata.  Is the best place for such a
>>> citation:
>>>
>>>     eml/dataset/methods/methodStep/citation
>>>
>>> preceded by a
>>>
>>>     eml/dataset/methods/methodStep/description
>>>
>>> explaining that the citation refers to the paper in which the data was
>>> first published, etc?
>>>
>>>
>>> In a related question, if the EML was documenting software instead, e.g.
>>> eml/software, where would such a citation go?  I don't see `citation` as a
>>> child of anything under `eml/software`
>>> <http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-software.png>.
>>>
>>>
>>>
>>> Or am I missing the boat entirely here and the natural thing to do is
>>> have a separate EML file, eml/citation, and somehow reference that file?  I
>>> don't really understand the motivation for having an eml file that consists
>>> only of eml/citation, but as I understand, eml/citation and eml/dataset etc
>>> are exclusive, right?
>>>
>>> Thanks for the help!
>>>
>>> - Carl
>>>
>>> --
>>> Carl Boettiger
>>> UC Santa Cruz
>>> http://carlboettiger.info/
>>> _______________________________________________
>>> Eml-dev mailing list
>>> Eml-dev at ecoinformatics.org <mailto:Eml-dev at ecoinformatics.org>
>>> http://lists.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>>>
>>
>>
>>
>
> --
> ____________________________________
>
>  Wade M. Sheldon
>  GCE-LTER Information Manager
>  School of Marine Programs
>  University of Georgia
>  Athens, GA 30602-3636
>  Email: sheldon at uga.edu
>  WWW: http://gce-lter.marsci.uga.edu/bios/wsheldon
>
>