[Bug 492] - eml-literature changes needed

bugzilla-daemon at ecoinformatics.org
Tue Sep 3 07:55:36 PDT 2002


http://bugzilla.ecoinformatics.org/show_bug.cgi?id=492





------- Additional Comments From scott.chapal at jonesctr.org  2002-09-03 07:55 -------

I'm trying hard to follow this bug's issues, but I'm struggling.

Running diffs on eml-literature.xsd 1.32 against 1.33, the entire file
diffs because 1.33 uses CR-LF EOLs.  Is one file format or the other
standard in the archive?  Most seem to be UNIX format, w/ LF EOLs.

So, I converted the CR-LFs to LFs before diffing.
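
For anyone repeating that comparison, here is a minimal sketch of the
idea in Python.  It is not the command actually used for this comment;
the function names and the revision labels passed to difflib are
illustrative assumptions, and the strings in the usage note are
made-up sample text, not the contents of eml-literature.xsd.

```python
import difflib


def normalize_eols(text):
    """Convert CR-LF (and any stray CR) line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def diff_ignoring_eols(old, new, old_name="1.32", new_name="1.33"):
    """Return unified-diff lines for two texts after EOL normalization.

    The default names are just labels for the diff header (hypothetical).
    """
    old_lines = normalize_eols(old).splitlines(keepends=True)
    new_lines = normalize_eols(new).splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name
        )
    )
```

With this, two texts that differ only in line endings, such as
"a\nb\n" and "a\r\nb\r\n", produce an empty diff, so only the real
content changes remain.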

See other comments below:

> So how do i cite a paper or poster presented at xxth annual meeting
> of the Ecological Society of America, Spokane, August 1999 that is
> not published in any proceedings volume?

> I'd be happy to change cardinality in conferenceProceedings and
> elsewhere.  As I was going through literature I felt that
> cardinality was somewhat arbitrary, but for the most part I didn't
> make changes except where your notes indicated explicitly to relax
> the restrictions.  I'd prefer that everything be optional in
> literature unless it is absolutely fundamental to the particular
> reference type.  Is that reasonable?  But, what is fundamental in
> each reference type?  In EndNote every field is optional, so we
> could go that route and I think it would be fine.  I'm open to
> suggestions.

Why don't you consult some other well-established sources, in addition
to EndNote?  BibTeX uses the following Entry-Types, for example:

See:
http://www.ecst.csuchico.edu/~jacobsd/bib/formats/bibtex.html
for a brief summary, including description of fields.

@article -  An article from a journal or magazine.

@book - A book with an explicit publisher.

@booklet - A work that is printed and bound, but without a named
publisher or sponsoring institution.

@inbook - A part of a book, which may be a chapter (or section or
whatever) and/or a range of pages.

@incollection - A part of a book having its own title.

@inproceedings - An article in a conference proceedings.

@manual - Technical documentation.

@mastersthesis - A Master's thesis.

@misc - Use this type when nothing else fits.

@phdthesis - A PhD thesis.

@proceedings - The proceedings of a conference.

@techreport - A report published by a school or other institution,
usually numbered within a series.

@unpublished - A document having an author and title, but not formally
published.

BibTeX has been in use for almost 20 years and is used in a broad
array of science and humanities disciplines.  I'm not advocating the
use of BibTeX, just pointing out that the data structures have been
debated and determined...the point is, lots of people have thought
about citation issues, so why do we have to re-invent the wheel in
EML?  BibTeX is a standard output format of EndNote.  Also, there are
nascent XML representations of BibTeX, and BibTeXML <-> DocBook
efforts, e.g.:
http://bibtexml.sourceforge.net/

Look at the distinctions (regarding required elements) between the
Book, Proceedings and InProceedings data structures (OPT fields are
optional, ALT are alternate, others are required).

@Book{,
  ALTauthor = 	 {},
  ALTeditor = 	 {},
  title = 	 {},
  publisher = 	 {},
  year = 	 {},
  OPTkey = 	 {},
  OPTvolume = 	 {},
  OPTnumber = 	 {},
  OPTseries = 	 {},
  OPTaddress = 	 {},
  OPTedition = 	 {},
  OPTmonth = 	 {},
  OPTnote = 	 {},
  OPTannote = 	 {}
}


@Proceedings{,
  title = 	 {},
  year = 	 {},
  OPTkey = 	 {},
  OPTeditor = 	 {},
  OPTvolume = 	 {},
  OPTnumber = 	 {},
  OPTseries = 	 {},
  OPTaddress = 	 {},
  OPTmonth = 	 {},
  OPTorganization = {},
  OPTpublisher = {},
  OPTnote = 	 {},
  OPTannote = 	 {}
}


@InProceedings{,
  author = 	 {},
  title = 	 {},
  booktitle = 	 {},
  OPTcrossref =  {},
  OPTkey = 	 {},
  OPTpages = 	 {},
  OPTyear = 	 {},
  OPTeditor = 	 {},
  OPTvolume = 	 {},
  OPTnumber = 	 {},
  OPTseries = 	 {},
  OPTaddress = 	 {},
  OPTmonth = 	 {},
  OPTorganization = {},
  OPTpublisher = {},
  OPTnote = 	 {},
  OPTannote = 	 {}
}
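
To make the required/ALT/OPT distinction concrete, here is a rough
sketch in Python of how those three templates could drive a
cardinality check.  The field sets are transcribed from the templates
above; the function name is hypothetical, and reading ALT as "at least
one of the alternates must be present" is an assumption, not something
the templates state.

```python
# Required fields per entry type, transcribed from the templates above.
# Everything marked OPT there is simply left out (treated as optional).
REQUIRED = {
    "book": {"title", "publisher", "year"},
    "proceedings": {"title", "year"},
    "inproceedings": {"author", "title", "booktitle"},
}

# ALT groups: assumed to mean "at least one of these must be given".
ALTERNATES = {
    "book": [{"author", "editor"}],
}


def missing_fields(entry_type, fields):
    """List the required fields (or ALT groups) an entry does not supply."""
    kind = entry_type.lower()
    if kind not in REQUIRED:
        raise ValueError("unknown entry type: %s" % entry_type)
    # A field counts as present only if it has a non-empty value.
    present = {name for name, value in fields.items() if value}
    problems = sorted(REQUIRED[kind] - present)
    for group in ALTERNATES.get(kind, []):
        if not group & present:
            problems.append(" or ".join(sorted(group)))
    return problems
```

Under these assumptions a Book with title, publisher and year but no
author or editor is reported as missing "author or editor", while an
InProceedings entry passes without a year because the template above
marks it OPT.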

Some of the issues which are addressed below are solved in BibTeX with
higher-granularity definitions, i.e. address, note or annote.

Similarly, the entry-types 'Unpublished' and 'Miscellaneous' cover the
broad array of unorthodox citations without specifying all the
details.  So, PersonalCommunication would simply be an instance of
'Miscellaneous' with the note field tagged 'Personal Communication';
likewise for audioVisual, etc. etc.
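
As an illustration only, the personal-communication case could be
generated like this in Python.  The helper names, the citation key and
the author in the usage note are hypothetical placeholders; the only
thing taken from the text above is the idea of a 'Miscellaneous' entry
whose note field reads 'Personal Communication'.

```python
def format_bibtex(entry_type, key, fields):
    """Render one BibTeX entry from a dict of field names to values."""
    body = ",\n".join(
        "  %s = {%s}" % (name, value) for name, value in fields.items()
    )
    return "@%s{%s,\n%s\n}" % (entry_type, key, body)


def personal_communication(key, author, year):
    """Express a personal communication as a Misc entry tagged via note."""
    return format_bibtex(
        "Misc",
        key,
        {"author": author, "year": year, "note": "Personal Communication"},
    )
```

Calling personal_communication("doe2002", "Doe, J.", "2002") yields an
@Misc entry with author, year and note fields; an audioVisual item
could be handled the same way with a different note.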

> e.
> Drop publicationPlace – the locational information is already in 
> publisher

This is called 'address' in BibTeX.  And apparently there are some
usage conventions:

  address

  Usually the address of the publisher or other type of
  institution. For major publishing houses, van Leunen recommends
  omitting the information entirely. For small publishers, on the
  other hand, you can help the reader by giving the complete address.

> Drop presentationPlace. Move the proceedings information from 
> conferenceProceedings to this module.
> 
> PRESENTATION type dropped altogether.  No corresponding type in EndNote.

Would fall under miscellaneous.

> g.
> Drop institution from report. Institutional affiliation of authors is 
> already in the RP information of the authors.

Institution is a standard field in 'Technical Report' in BibTeX.

@TechReport{,
  author = 	 {},
  title = 	 {},
  institution =  {},
  year = 	 {},
  OPTkey = 	 {},
  OPTtype = 	 {},
  OPTnumber = 	 {},
  OPTaddress = 	 {},
  OPTmonth = 	 {},
  OPTnote = 	 {},
  OPTannote = 	 {}
}

> h.
> Make report number optional. This may be part of the title or non-
> existent

> i.
> Drop publisher from thesis. If it is published then it is a book.

@PhdThesis{,
  author = 	 {},
  title = 	 {},
  school = 	 {},
  year = 	 {},
  OPTkey = 	 {},
  OPTtype = 	 {},
  OPTaddress = 	 {},
  OPTmonth = 	 {},
  OPTnote = 	 {},
  OPTannote = 	 {}
}

> regarding epiphany:

The decision on something like this ought to be determined on its
merits or lack thereof.  Although the difficulties addressed by Peter's
comments are real, there would be clear programmatic solutions to
those.

A better criterion might be:
Is EML's clarity or utility enhanced by this proposition or not?

  The eml-resource module contains general information that describes
  dataset resources, literature resources, collection resources, and
  software resources. It is intended to provide overview information
  about the resource, including title, abstract, keywords, contacts,
  and the links to associated metadata and data for the given
  resource.

But in the case of eml-literature, many of the bibliographic
references might be completely external to the data/projects being
documented in that instance.  The current structural relationship
between eml-resource and eml-literature presumes involvement of the
'creator' in the dataset, doesn't it?  A work being cited, however,
wouldn't necessarily be by a defined 'creator' in EML, but just as
likely could be an author of a relevant document, who has no reason to
be defined in EML beyond his authorship status in that particular
cited document (a methods paper; a seminal work defining a research
theme, etc.).  So, extending creator in this way to provide an
'author' in the various entry-types is kind of mutant.

Or am I totally misreading this issue?


