[eml-dev] Storing a DOI for a dataset in EML
Kyle Braak
kbraak at gbif.org
Tue Dec 2 09:14:57 PST 2014
Dear Ben and Margaret,
Thank you very much for taking the time to reply to my question.
Regarding the use of alternateIdentifier, I thought I would just point out that the documentation adds a bit of confusion since it states:
"The primary identifier belongs in the "id" attribute, but additional identifiers that are used to label this entity, possibly from different data management systems, can be listed here." - https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-resource.html#alternateIdentifier
This makes it seem like Option 4, the dataset "id" attribute, is the better choice for the primary identifier.
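For comparison, here is a rough sketch of what Option 1 would look like next to our current packageId. It reuses the example values from my original message. The title is a placeholder, the namespace assumes EML 2.1.1, and required elements such as creator and contact are left out, so this illustrates placement only and is not a schema-valid record:

```xml
<!-- Sketch only: placeholder title; required elements (creator, contact, ...) omitted. -->
<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
         packageId="6f08304a-43e6-41bd-a3a9-f9bbf93984b3/v1.7"
         system="http://gbif.org" scope="system">
  <dataset>
    <!-- Option 1: the DOI is carried as an additional identifier -->
    <alternateIdentifier>doi:10.5886/1bft7W5f</alternateIdentifier>
    <title>Placeholder title</title>
  </dataset>
</eml:eml>
```

With this layout the packageId remains the primary identifier, and the DOI is one of possibly several alternateIdentifier entries.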
Anyway, all this feedback will help a lot when the time comes to make a final decision.
Thanks once more,
Kyle
> Hi Kyle -
> I agree with Ben - that the alternateIdentifier element is the best place. This is where we put it.
>
> As far as knowing which is "primary" -- in any group of alternateIdentifier tags you couldn't tell which should prevail. You could infer some precedence from the ordering, but that's pretty flimsy. For an EML data record, probably the best place for its primary identifier is in your "option 3", the packageId attribute.
>
> I have a dataset (an EML record and data object) that also happens to have been contributed to Dryad, so they assigned a DOI too. So you could say that this data object has 2 DOIs, but actually they are distinct records (one in Dryad and one in LTER). We wouldn't assign a DOI more than once to one EML record (revision). But since we already knew about the Dryad contribution, we added the Dryad DOI to our EML record with the alternateIdentifier tag.
>
> Best,
> Margaret
>
>
> -----------
> Margaret O'Brien
> Information Management
> Santa Barbara Coastal LTER
> Marine Science Institute, UCSB
> Santa Barbara, CA 93106
> 805-893-2071 (voice)
> http://sbc.lternet.edu
On 27 Nov 2014, at 19:04, Ben Leinfelder <leinfelder at nceas.ucsb.edu> wrote:
> Hi Kyle,
> It seems like your first option is the best one since the DOI represents an identifier for the EML package you are creating. I'm not sure I understand your concern with multiple DOIs since each version of the EML package should only have one corresponding DOI, but please explain further if I've misinterpreted.
> Thanks,
> -ben
>
> On Nov 27, 2014, at 4:04 AM, Kyle Braak <kbraak at gbif.org> wrote:
>
>> Option 1:
>>
>> Storing the DOI inside alternateIdentifier.
>>
>> The problem I see with this option, however, arises when there are multiple DOI alternateIdentifiers. How to know which is the primary/latest DOI?
>>
>
>
>> On 11/27/14 4:04 AM, Kyle Braak wrote:
>>> Hi eml-dev list,
>>>
>>> I’m trying to figure out the best way to store a DOI for a dataset in EML.
>>>
>>> The DOI resolves to a page where both the data and metadata are available for download, and is intended to be used as a unique identifier for the dataset for example when citing it.
>>>
>>> Currently in our EML, we use a packageId on the <eml:eml> element defined in the ID-System “http://gbif.org”. For example:
>>>
>>> <eml:eml packageId="6f08304a-43e6-41bd-a3a9-f9bbf93984b3/v1.7" system="http://gbif.org" scope="system">
>>>
>>> Below my signature are the *4 options* we’re considering, and I’d greatly appreciate any guidance or feedback on which is best.
>>>
>>> Kind regards,
>>>
>>> Kyle Braak
>>> Developer
>>> Secretariat of the Global Biodiversity Information Facility (GBIF)
>>> Universitetsparken 15, DK-2100 Copenhagen Ø
>>> Denmark, Europe
>>> www.gbif.org <http://www.gbif.org>
>>>
>>>
>>> *Option 1:*
>>>
>>> Storing the DOI inside *alternateIdentifier*.
>>>
>>> The problem I see with this option, however, arises when there are multiple DOI alternateIdentifiers. How to know which is the primary/latest DOI?
>>>
>>> *Option 2:*
>>>
>>> Storing the DOI using the *citation “identifier” attribute*.
>>>
>>> The problem with this option, however, is that the DOI isn’t very prominent when it is hidden away in this location.
>>>
>>> *Option 3:*
>>>
>>> Storing the DOI using the *“packageId” attribute*. It could include the dataset version number so that each version is always unique. For example:
>>>
>>> <eml:eml packageId="doi:10.5886/1bft7W5f_v1.7" scope="system" system="http://doi.org/">
>>>
>>> *Option 4:*
>>>
>>> Storing the DOI using the *dataset “id” attribute*, in combination with “system” and “scope” attributes to specify that the identifier is defined in the ID-system “http://doi.org/”. For example:
>>>
>>> <dataset id="doi:10.5886/1bft7W5f" scope="system" system="http://doi.org/">
>>>
>>> I suspect, however, that this is not best practice, assuming “system” should only be defined once on the <eml:eml> element, and scope=“document” should be used on all other elements.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Eml-dev mailing list
>>> Eml-dev at ecoinformatics.org
>>> http://lists.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>>