[eml-dev] quick question on adding citation information

Mark Servilla servilla at lternet.edu
Thu Dec 12 10:31:34 PST 2013


Hi Matt/et al,

The new citation element (dataset/citation) is fine from my
perspective - it is obviously benign regarding backward compatibility
and presence in the hierarchy, and very straightforward in
implementation. It would also appear to meet the requirements outlined
in this thread.
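
As a rough illustration of where the proposed element would sit, the
Python sketch below parses a simplified, non-validated fragment and
reads any dataset/citation entries. The child elements shown (title,
pubDate), the sample titles, and the helper name usage_citation_titles
are assumptions for illustration only; namespaces and required EML
elements are omitted, and the real structure is whatever the ticket
and the checked-in schema define.

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified, non-validated fragment. Namespace declarations and required
# EML elements are left out, and the children of <citation> are assumptions.
DOC = """
<eml packageId="knb-lter-lno.1.2">
  <dataset>
    <title>Example data package</title>
    <citation>
      <title>Hypothetical paper that uses this data package</title>
      <pubDate>2013</pubDate>
    </citation>
  </dataset>
</eml>
"""


def usage_citation_titles(xml_text: str) -> List[str]:
    """Return the title of every dataset/citation element in the document."""
    root = ET.fromstring(xml_text.strip())
    return [c.findtext("title", default="") for c in root.findall("dataset/citation")]


print(usage_citation_titles(DOC))
```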

For groups that use a strict identifier-to-instance model
(non-serial), it is still unclear whether this element would be of
any use.

For example, if I publish my data package with identifier
knb-lter-lno.1.1 and then subsequently publish a paper referencing
this specific data package by using its identifier, I would not be
allowed to update the knb-lter-lno.1.1 data package metadata with the
paper citation; in our system, the citation would have to be entered
into the metadata of a new data package revision, knb-lter-lno.1.2.
Unfortunately, the paper references the previous revision.  It seems a
little out of sync here.
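
The mismatch can be sketched in a few lines of Python, assuming
identifiers follow a scope.identifier.revision pattern as in
knb-lter-lno.1.1. PackageId, parse_package_id, and next_revision are
hypothetical helpers written for this example, not part of any LTER
software.

```python
from typing import NamedTuple


class PackageId(NamedTuple):
    """A data package identifier, assumed to be scope.identifier.revision."""
    scope: str
    identifier: int
    revision: int

    def __str__(self) -> str:
        return f"{self.scope}.{self.identifier}.{self.revision}"


def parse_package_id(text: str) -> PackageId:
    # Split from the right so only the last two dots act as separators.
    scope, identifier, revision = text.rsplit(".", 2)
    return PackageId(scope, int(identifier), int(revision))


def next_revision(pkg: PackageId) -> PackageId:
    # Under a strict identifier-to-instance model, a metadata change
    # cannot be written to the existing revision; it needs a new one.
    return pkg._replace(revision=pkg.revision + 1)


cited_by_paper = parse_package_id("knb-lter-lno.1.1")
carries_citation = next_revision(cited_by_paper)
print(f"paper cites {cited_by_paper}; citation is recorded in {carries_citation}")
```

The two identifiers differ, which is the out-of-sync condition
described above.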

I suppose there is some benefit to anyone who finds the most recent
revision of the data package (i.e., knb-lter-lno.1.2): they would see
that a paper cited this data package or, on closer inspection of the
paper, an earlier revision of it.  I also see that the citation
element becomes more relevant within the data package when the data
package series is referenced in lieu of the specific revision.

Sincerely,
Mark

---
Mark Servilla, Ph.D.

LTER Network Office
Department of Biology
MSC 03 2020
1 University of New Mexico
Albuquerque, NM 87131-0001

servilla at LTERnet.edu
(505) 750-3226


On Wed, Dec 11, 2013 at 11:48 AM, Matt Jones <jones at nceas.ucsb.edu> wrote:
> I have made a proposal for a new /eml/dataset/citation field to satisfy the
> needs described in this conversation.  The new field is described in EML
> Ticket # 6283 and I have checked it into the trunk of SVN (r2344).  Please
> review and comment on whether:
>
> 1) You think this will solve the needs described in this thread, and
> 2) If you would like to see any changes in field name, structure, or
> documentation.
>
> Thanks,
>
> Matt
>
>
>
> On Sat, Dec 7, 2013 at 11:18 PM, David Blankman <dblankman1 at gmail.com>
> wrote:
>>
>> I would certainly be in favor of improving compatibility with ISO and
>> other internationalization issues, since I am now working primarily in a
>> European context.
>>
>> David
>>
>>
>>
>> David Blankman
>> Chair, ILTER Information Management Committee
>> Director, Information Management, Israel LTER
>>
>> 972-77-442-1951
>> 972-54-685-9345 (mobile)
>> 1-505-349-5680 (Skype)
>> dblankman (Skype)
>>
>>
>> On Fri, Dec 6, 2013 at 5:51 PM, Matt Jones <jones at nceas.ucsb.edu> wrote:
>>>
>>> +1 on Chris' and Wade's comments so far.  I think it's a shortcoming of
>>> EML that there is no explicitly labeled field for this, and I've gotten this
>>> question from many EML users.  I also agree with Wade that these linkages
>>> quickly become stale.  Nevertheless, it would be good to have a way to link
>>> to citations that use the data.  Chris' solution to add it to
>>> additionalMetadata is certainly a workaround, but not very satisfying
>>> because there won't be consistency across users as to how it's embedded.  We
>>> could consider adding an optional top level field to eml-dataset to provide
>>> this, possibly something like:
>>>
>>> /eml/dataset/dataUsageCitation which would be of type CitationType
>>>
>>> I added a feature request ticket in Redmine to track this issue:
>>>     https://projects.ecoinformatics.org/ecoinfo/issues/6283
>>>
>>> Thoughts?  Is such a change worth a revision of the EML schemas?  This
>>> issue has been on my radar for a long time, but has never reached the
>>> critical point of triggering a version change, which has widespread impact.
>>> There are several other outstanding schema changes that might raise the need
>>> for a new release, including fixing internationalization issues,
>>> compatibility with ISO issues raised by GBIF, key/keyref parser checking,
>>> and other items.  So maybe now's the right time?  I think these could be
>>> done with backwards-compatible changes (all EML 2.1.1 documents would be
>>> valid under the new schema with only a namespace change).
>>>
>>> Matt
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 6, 2013 at 5:55 AM, Wade Sheldon <sheldon at uga.edu> wrote:
>>>>
>>>> Chris and Carl,
>>>>
>>>> I agree with Chris' interpretation of the specification, and would not
>>>> recommend putting a general literature citation under
>>>> methods/methodStep/citation or under software. Those elements are best used
>>>> to link to protocols and other documents specific to the parent elements,
>>>> and viewers would not think to look for citations referencing the entire
>>>> data set there.
>>>>
>>>> And no, we did not specifically address this issue in the first LTER EML
>>>> Best Practices document in 2004, and I don't recall that issue being
>>>> addressed in version 2 in 2011 either (but Margaret can correct me if I'm
>>>> wrong).
>>>>
>>>> In my opinion, it makes less sense to embed citations to publications in
>>>> data than citations to data in publications, so we do not attempt to
>>>> shoe-horn citations into EML documents using additionalMetadata or other
>>>> approaches. The data used in a publication is fixed at the time of
>>>> publication, whereas a published data set will ideally be used and cited
>>>> many times (hey - I'm an optimist), so literature citations in data sets
>>>> would quickly go stale and require ongoing document maintenance to keep
>>>> current.
>>>>
>>>> I think the best approach to this issue, long-term, is to rely on data
>>>> registries and link-outs associated with journals to provide this type of
>>>> association.
>>>>
>>>> Regards,
>>>>
>>>> Wade Sheldon
>>>> GCE-LTER Information Manager
>>>>
>>>>
>>>>
>>>> On 12/6/2013 8:18 AM, Christopher Jones wrote:
>>>>>
>>>>> Hi Carl,
>>>>>
>>>>> I'd say what you're trying to do is pretty fundamental, and should be
>>>>> straightforward.  However, I think you've highlighted an issue that is a
>>>>> consequence of a design transition that happened in the EML schemas quite a
>>>>> while ago.
>>>>>
>>>>> In the first designs of EML, the last of which was EML 2 Beta 6
>>>>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/>,
>>>>> the EML modules were linkable to each other in an RDF-like syntax
>>>>> (see the <triple> tag in eml-resource
>>>>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/eml-resource.png>).
>>>>> And so what you're describing would entail creating an EML Dataset document,
>>>>> then an EML Citation document, and then linking the two with a relationship
>>>>> ("citation.1.1" "is citation for" "dataset.1.1").
>>>>>
>>>>>
>>>>> This obviously provided plenty of flexibility, but also a degree of
>>>>> complexity, and so the community decided to move toward a hierarchical
>>>>> structure with a top-level eml.xsd schema.  In doing so, some of the module
>>>>> relationships were hard-coded into the EML schema hierarchy, and at the top
>>>>> level, datasets and citations were encoded as top-level choices, and as you
>>>>> point out, are mutually exclusive.
>>>>>
>>>>> As I do a quick scan of the schemas for references to eml-literature
>>>>> CitationType, the module is used in the following other modules:
>>>>>
>>>>> eml.xsd
>>>>> eml-attribute.xsd
>>>>> eml-coverage.xsd
>>>>> eml-methods.xsd
>>>>> eml-physical.xsd
>>>>> eml-project.xsd
>>>>>
>>>>> Given the history above, I think that the intention is to document your
>>>>> paper at the /eml/citation level.  Hard-coded links to citations in
>>>>> eml-attribute, eml-coverage, eml-methods, eml-physical, and eml-project all
>>>>> describe links to citations that are very specific in scope (e.g., in
>>>>> eml-methods, the citation is intended to document a specific procedure
>>>>> used).
>>>>>
>>>>> So, to me, the use of a citation in these sub-modules to describe a
>>>>> dataset-level citation doesn't quite fit.  I'd love to hear what others are
>>>>> doing to satisfy this need.  In particular, did the LTER's EML Best
>>>>> Practices group touch on this subject while writing that document?
>>>>>
>>>>> As one alternative suggestion, I should point out the
>>>>> /eml/dataset/additionalMetadata
>>>>> <https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_1_1/eml.png>
>>>>> field.  This structure was added after EML 2 Beta 6 in order to retain some
>>>>> of the flexibility that the triple structure used to provide.  This element
>>>>> has a <metadata> child, which in theory could contain a full /eml/citation
>>>>> document, and the sibling <describes> element could point to the id
>>>>> attribute value of your /eml/dataset element.
>>>>>
>>>>>
>>>>> I'd like to hear what others think about this solution.  Obviously, the
>>>>> <describes> link is a more semantically vague link from the citation
>>>>> documentation to the dataset documentation than a predicate like
>>>>> "isCitationFor", but it at least provides a link.
>>>>>
>>>>> If the community hasn't already come to a consensus on this issue, I
>>>>> think this thread might help in getting there.  Thanks for bringing it up,
>>>>> Carl.
>>>>>
>>>>> Cheers,
>>>>> Chris
>>>>>
>>>>> On Dec 5, 2013, at 4:43 PM, Carl Boettiger wrote:
>>>>>
>>>>>> Sorry if this is a bit elementary.
>>>>>>
>>>>>> Let's say I have an EML file describing data that is published as part
>>>>>> of the supplemental materials of a paper.  It seems reasonable to add a
>>>>>> citation to that paper in the metadata.  Is the best place for such a
>>>>>> citation:
>>>>>>
>>>>>>     eml/dataset/methods/methodStep/citation
>>>>>>
>>>>>> preceded by a
>>>>>>
>>>>>>     eml/dataset/methods/methodStep/description
>>>>>>
>>>>>> explaining that the citation refers to the paper in which the data was
>>>>>> first published, etc?
>>>>>>
>>>>>>
>>>>>> In a related question, if the EML was documenting software instead,
>>>>>> e.g. eml/software, where would such a citation go?  I don't see
>>>>>> <http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-software.png>
>>>>>> `citation` as child of anything under `eml/software`.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Or am I missing the boat entirely here and the natural thing to do is
>>>>>> have a separate EML file, eml/citation, and somehow reference that file?  I
>>>>>> don't really understand the motivation for having an eml file that consists
>>>>>> only of eml/citation, but as I understand, eml/citation and eml/dataset etc
>>>>>> are exclusive, right?
>>>>>>
>>>>>> Thanks for the help!
>>>>>>
>>>>>> - Carl
>>>>>>
>>>>>> --
>>>>>> Carl Boettiger
>>>>>> UC Santa Cruz
>>>>>> http://carlboettiger.info/
>>>>>> _______________________________________________
>>>>>> Eml-dev mailing list
>>>>>> Eml-dev at ecoinformatics.org <mailto:Eml-dev at ecoinformatics.org>
>>>>>> http://lists.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ____________________________________
>>>>
>>>>  Wade M. Sheldon
>>>>  GCE-LTER Information Manager
>>>>  School of Marine Programs
>>>>  University of Georgia
>>>>  Athens, GA 30602-3636
>>>>  Email: sheldon at uga.edu
>>>>  WWW: http://gce-lter.marsci.uga.edu/bios/wsheldon
>>>>
>>>>

