<div dir="ltr">Thanks all for the excellent explanations. My own more naive thoughts as I process these suggestions:<div><br></div><div>It does sound like additionalMetadata is a reasonable home under the current schema. If I understand correctly, this section is rather flexible and I could put an EML citation node under additionalMetadata/metadata. Given this flexibility, I might be tempted to write the citation data out in RDFa with something more widely used such as the PRISM vocabulary (indeed I could copy that from the HTML headers of most publishers), or Crossref's XML -- which perhaps only proves Matt's point about why additionalMetadata isn't ideal. </div>
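<div><br></div><div>To make the additionalMetadata idea concrete, here is a minimal sketch in Python (standard-library ElementTree only). It is not schema-valid EML: I'm assuming additionalMetadata sits beside the dataset under the root with describes and metadata children, I've left out namespaces and all required elements, and the function name, the use of alternateIdentifier to carry the DOI, the title, and the DOI itself are placeholders of my own.</div>
<pre>
```python
import xml.etree.ElementTree as ET


def build_eml_with_citation(dataset_id, title, doi):
    """Build a simplified, namespace-free EML-like tree (not schema-valid)."""
    eml = ET.Element("eml")
    # The dataset being described; only its id matters for this sketch.
    ET.SubElement(eml, "dataset", {"id": dataset_id})

    # Assumption: additionalMetadata is a sibling of dataset under the root.
    extra = ET.SubElement(eml, "additionalMetadata")
    # describes points back at the id of the dataset element.
    ET.SubElement(extra, "describes").text = dataset_id

    # metadata is the free-form container; here it holds a citation node.
    metadata = ET.SubElement(extra, "metadata")
    citation = ET.SubElement(metadata, "citation")
    # Assumption: the DOI is carried in alternateIdentifier.
    ET.SubElement(citation, "alternateIdentifier").text = doi
    ET.SubElement(citation, "title").text = title
    return eml


# Placeholder values, purely for illustration.
doc = build_eml_with_citation(
    "dataset.1.1", "Placeholder article title", "doi:10.1234/placeholder"
)
print(ET.tostring(doc, encoding="unicode"))
```
</pre>
<div>The only link between the citation and the dataset in this layout is the describes value matching the dataset id, which is exactly the semantically vague part.</div>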
<div><br></div><div>On extending EML, it's not clear to me that `dataUsageCitation` (or whatever term is chosen) should be an element of `dataset` rather than of `eml`. Personally I would have put it at `eml`, as a sister node to the `dataset`, `protocol` or `software` that the document might be describing. If I wanted to publish an article and an EML metadata file describing a piece of software, I'd use eml/software I think, but then I'd need somewhere other than eml/dataset/dataUsageCitation to link my citation. </div>
<div><br></div><div><div>I appreciate seeing Margaret's examples, though I think it is important for a machine using the EML file to be able to extract the bibliographic information (at least the DOI, if available) for the work that should be cited when that data is used, so having the citation data either in the abstract or only at a 'package' level above the metadata itself seems non-ideal. </div>
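<div><br></div><div>As a rough illustration of what I mean by machine extraction, here is a sketch in Python (standard-library ElementTree only). It assumes a simplified, namespace-free layout in which a citation sits under additionalMetadata/metadata and its describes sibling names the dataset id; the function name, the use of alternateIdentifier for the DOI, and the DOI value are hypothetical.</div>
<pre>
```python
import xml.etree.ElementTree as ET


def citation_ids_for(eml, dataset_id):
    """Collect identifiers of citations whose describes names dataset_id."""
    found = []
    for extra in eml.findall("additionalMetadata"):
        targets = [d.text for d in extra.findall("describes")]
        if dataset_id not in targets:
            continue  # this block describes some other element
        for ident in extra.findall("metadata/citation/alternateIdentifier"):
            found.append(ident.text)
    return found


# A tiny stand-in document (not schema-valid EML, placeholder DOI).
eml = ET.Element("eml")
ET.SubElement(eml, "dataset", {"id": "dataset.1.1"})
extra = ET.SubElement(eml, "additionalMetadata")
ET.SubElement(extra, "describes").text = "dataset.1.1"
metadata = ET.SubElement(extra, "metadata")
citation = ET.SubElement(metadata, "citation")
ET.SubElement(citation, "alternateIdentifier").text = "doi:10.1234/placeholder"

print(citation_ids_for(eml, "dataset.1.1"))
```
</pre>
<div>This only works if everyone embeds the citation the same way. A citation buried in the abstract, or kept only at the package level, would never be found by a lookup like this.</div>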
</div><div><br></div><div>Thanks for the background on the shift from multiple RDF-linked EML files to a single hierarchical file. On this topic, I wonder if there might be any other places where exclusive elements limit expression in a single file: for instance, it is not obvious to me that I can describe both a dataTable and a spatialRaster in the same file (imagining I use both in the same analysis). Of course I can serialize these into different EML files, but I'm not sure what the best practice would be. </div>
<div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Dec 6, 2013 at 7:51 AM, Matt Jones <span dir="ltr"><<a href="mailto:jones@nceas.ucsb.edu" target="_blank">jones@nceas.ucsb.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">+1 on Chris' and Wade's comments so far. I think it's a shortcoming of EML that there is no explicitly labeled field for this, and I've gotten this question from many EML users. I also agree with Wade that these linkages quickly become stale. Nevertheless, it would be good to have a way to link to citations that use the data. Chris' solution to add it to additionalMetadata is certainly a workaround, but not very satisfying because there won't be consistency across users as to how it's embedded. We could consider adding an optional top-level field to eml-dataset to provide this, possibly something like:<div>
<br></div><div>/eml/dataset/dataUsageCitation which would be of type CitationType</div><div><br></div><div>I added a feature request ticket in Redmine to track this issue:</div><div> <a href="https://projects.ecoinformatics.org/ecoinfo/issues/6283" target="_blank">https://projects.ecoinformatics.org/ecoinfo/issues/6283</a></div>
<div><br></div><div>Thoughts? Is such a change worth a revision of the EML schemas? This issue has been on my radar for a long time, but has never reached the critical point of triggering a version change, which has widespread impact. There are several other outstanding schema changes that might raise the need for a new release, including fixing <a href="https://projects.ecoinformatics.org/ecoinfo/issues/5728" target="_blank">internationalization issues</a>, <a href="https://projects.ecoinformatics.org/ecoinfo/issues/5998" target="_blank">compatibility with ISO issues</a> raised by GBIF, <a href="https://projects.ecoinformatics.org/ecoinfo/issues/5731" target="_blank">key/keyref parser checking</a>, and other items. So maybe now's the right time? I think these could be done with backwards-compatible changes (all EML 2.1.1 documents would be valid under the new schema with only a namespace change).</div>
<span class="HOEnZb"><font color="#888888">
<div><br></div><div>Matt</div><div><br><div>
<br></div><div><br></div></div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Dec 6, 2013 at 5:55 AM, Wade Sheldon <span dir="ltr"><<a href="mailto:sheldon@uga.edu" target="_blank">sheldon@uga.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Chris and Carl,<br>
<br>
I agree with Chris' interpretation of the specification, and would not recommend putting a general literature citation under methods/methodStep/citation or under software. Those elements are best used to link to protocols and other documents specific to the parent elements, and viewers would not think to look for citations referencing the entire data set there.<br>
<br>
And no, we did not specifically address this issue in the first LTER EML Best Practices document in 2004, and I don't recall that issue being addressed in version 2 in 2011 either (but Margaret can correct me if I'm wrong).<br>
<br>
In my opinion, it makes less sense to embed citations to publications in data than citations to data in publications, so we do not attempt to shoe-horn citations into EML documents using additionalMetadata or other approaches. The data used in a publication is fixed at the time of publication, whereas a published data set will ideally be used and cited many times (hey - I'm an optimist), so literature citations in data sets would quickly go stale and require ongoing document maintenance to keep current.<br>
<br>
I think the best approach to this issue, long-term, is to rely on data registries and link-outs associated with journals to provide this type of association.<br>
<br>
Regards,<br>
<br>
Wade Sheldon<br>
GCE-LTER Information Manager<div><br>
<br>
<br>
On 12/6/2013 8:18 AM, Christopher Jones wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
Hi Carl,<br>
<br>
I'd say what you're trying to do is pretty fundamental, and should be straightforward. However, I think you've highlighted an issue that is a consequence of a design transition that happened in the EML schemas quite a while ago.<br>
<br></div>
In the first designs of EML, the last of which was EML 2 Beta 6 <<a href="https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/" target="_blank">https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/</a>>, each of the EML modules was linkable to the others in an RDF-like syntax (see the &lt;triple&gt; tag in eml-resource <<a href="https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/eml-resource.png" target="_blank">https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_0_0_BETA_6/eml-resource.png</a>>). And so what you're describing would entail creating an EML Dataset document, then an EML Citation document, and then linking the two with a relationship ("citation.1.1" "is citation for" "dataset.1.1").<div>
<br>
<br>
This obviously provided plenty of flexibility, but also a degree of complexity, and so the community decided to move toward a hierarchical structure with a top-level eml.xsd schema. In doing so, some of the module relationships were hard-coded into the EML schema hierarchy, and at the top level, datasets and citations were encoded as top-level choices, and as you point out, are mutually exclusive.<br>
<br>
As I do a quick scan of the schemas for references to eml-literature CitationType, the module is used in the following other modules:<br>
<br>
eml.xsd<br>
eml-attribute.xsd<br>
eml-coverage.xsd<br>
eml-methods.xsd<br>
eml-physical.xsd<br>
eml-project.xsd<br>
<br>
Given the history above, I think that the intention is to document your paper at the /eml/citation level. Hard-coded links to citations in eml-attribute, eml-coverage, eml-methods, eml-physical, and eml-project all describe links to citations that are very specific in scope (e.g., in eml-methods, the citation is intended to document a specific procedure used).<br>
<br>
So, to me, the use of a citation in these sub-modules to describe a dataset-level citation doesn't quite fit. I'd love to hear what others are doing to satisfy this need. In particular, did the LTER's EML Best Practices group touch on this subject while writing that document?<br>
<br></div>
As one alternative suggestion, I should point out the /eml/dataset/additionalMetadata <<a href="https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_1_1/eml.png" target="_blank">https://code.ecoinformatics.org/code/eml/tags/RELEASE_EML_2_1_1/eml.png</a>> field. This structure was added after EML 2 Beta 6 in order to retain some of the flexibility that the triple structure used to provide. This element has a &lt;metadata&gt; child, which in theory could contain a full /eml/citation document, and the sibling &lt;describes&gt; element could point to the id attribute value of your /eml/dataset element.<div>
<br>
<br>
I'd like to hear what others think about this solution. Obviously, the &lt;describes&gt; link is a more semantically vague link from the citation documentation to the dataset documentation than a predicate like "isCitationFor", but it at least provides a link.<br>
<br>
If the community hasn't already come to a consensus on this issue, I think this thread might help in getting there. Thanks for bringing it up, Carl.<br>
<br>
Cheers,<br>
Chris<br>
<br>
On Dec 5, 2013, at 4:43 PM, Carl Boettiger wrote:<br>
<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
Sorry if this is a bit elementary.<br>
<br>
Let's say I have an EML file describing data that is published as part of the supplemental materials of a paper. It seems reasonable to add a citation to that paper in the metadata. Is the best place for such a citation:<br>
<br>
eml/dataset/methods/methodStep/citation<br>
<br>
preceded by a<br>
<br>
eml/dataset/methods/methodStep/description<br>
<br>
explaining that the citation refers to the paper in which the data was first published, etc?<br>
<br>
<br></div>
In a related question, if the EML were documenting software instead, e.g. eml/software, where would such a citation go? I don't see <<a href="http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-software.png" target="_blank">http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-software.png</a>> `citation` as a child of anything under `eml/software`.<div>
<br>
<br>
<br>
Or am I missing the boat entirely here and the natural thing to do is have a separate EML file, eml/citation, and somehow reference that file? I don't really understand the motivation for having an eml file that consists only of eml/citation, but as I understand, eml/citation and eml/dataset etc are exclusive, right?<br>
<br>
Thanks for the help!<br>
<br>
- Carl<br>
<br>
-- <br>
Carl Boettiger<br>
UC Santa Cruz<br>
<a href="http://carlboettiger.info/" target="_blank">http://carlboettiger.info/</a><br>
______________________________<u></u>_________________<br>
Eml-dev mailing list<br>
</div><a href="mailto:Eml-dev@ecoinformatics.org" target="_blank">Eml-dev@ecoinformatics.org</a> <mailto:<a href="mailto:Eml-dev@ecoinformatics.org" target="_blank">Eml-dev@<u></u>ecoinformatics.org</a>><br>
<a href="http://lists.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev" target="_blank">http://lists.nceas.ucsb.edu/<u></u>ecoinformatics/mailman/<u></u>listinfo/eml-dev</a><br>
</blockquote><div>
<br>
<br>
<br>
</div></blockquote><span><font color="#888888">
<br>
-- <br>
______________________________<u></u>______<br>
<br>
Wade M. Sheldon<br>
GCE-LTER Information Manager<br>
School of Marine Programs<br>
University of Georgia<br>
Athens, GA 30602-3636<br>
Email: <a href="mailto:sheldon@uga.edu" target="_blank">sheldon@uga.edu</a><br>
WWW: <a href="http://gce-lter.marsci.uga.edu/bios/wsheldon" target="_blank">http://gce-lter.marsci.uga.edu/bios/wsheldon</a></font></span><div><div><br>
<br>
</div></div></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Carl Boettiger<br>UC Santa Cruz<br><a href="http://carlboettiger.info/" target="_blank">http://carlboettiger.info/</a><br></div>
</div>