[tcs-lc] nameObjects, spellings, vernaculars, etc

Gregor Hagedorn G.Hagedorn at BBA.DE
Wed May 4 04:22:38 PDT 2005


I think others need to chime in here. A few additions/comments from me:

> Can you offer your own personal definition of a "NameObject" as you see it?

A name under a regulation such as ICBN, ICZN, etc., and in the future perhaps
also under other codes of nomenclature (BAYER/EPPO codes for pathogenic
species?). I think I agree with an earlier post on this list.

> Yes, but should that generic label include verbatim spelling as by the
> AccordingTo author?  Or the Code-correct spelling of the corresponding
> NameObject?  I would prefer both, unambiguously distinguished, as described
> above.

I think neither, just the concept-name-string under the canonicalization rules
preferred by the data provider. To my knowledge no code regulates how to
express a concept-specific name, so this needs to stay open.

> I even disagree that we need to keep track in the TCS schema variants of the
> *name* author ("Smith" above).  The exact orthography of the author name has no
> bearing either on the concept, or on the name. These, to me, are properties of a
> human (Agent), not of a ConceptObject or of a NameObject. You may want to model
> it in your local database, but I really don't think it belongs in the transfer
> schema.  I would like to read what others think of this.

I argue this not from a taxonomy-producer/editor standpoint, but from the
practical consumer standpoint: I would like to work with the data. And in all
the name/concept-based information I have, names are strings with nomenclatural
authors, and occasionally (<1%) with a concept suffix.
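
To make the shape of such strings concrete, here is a minimal Python sketch
that splits a name string into the part with nomenclatural authors and an
optional concept suffix. It assumes the suffix is introduced by "sec" or
"sec."; the function name split_concept_suffix is hypothetical and not part
of TCS.

```python
import re

# Assumption: a concept suffix is introduced by "sec" or "sec." surrounded
# by whitespace. This is an illustrative convention, not a TCS rule.
_SEC = re.compile(r"\s+sec\.?\s+", re.IGNORECASE)


def split_concept_suffix(name_string):
    """Return (name_with_authors, concept_suffix or None)."""
    parts = _SEC.split(name_string.strip(), maxsplit=1)
    name_with_authors = parts[0]
    concept_suffix = parts[1] if len(parts) == 2 else None
    return name_with_authors, concept_suffix
```

A string without a suffix simply comes back with None in the second
position, which matches the common case of names carrying authors only.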

This is not a matter for the local database, insofar as I would like to share
this huge work of connecting information based on strings with others, rather
than everybody doing it him or herself in a "local database". This is simply
where I hope GBIF could come in and really increase our efficiency.

> > Imagine the red book lists - how often do people try to copy the
> > spelling from
> > there. If the list contains concept-specific names (which I
> > believe most of us,
> > me included, hope will pick up in the future), would it not be
> > useful to be
> > able to give that spelling for all the sources re-using the concept?
> 
> Spelling of the genus/species/subspecies name components -- yes.  Spelling
> of the Code-regulated author(s) of the applied name -- maybe.  Spelling of
> the Author(s) of the concept definition -- no.

Without the spelling of the authors, a TCS data file would be close to
worthless to me. Or do your yes and no refer only to whether an option to
transfer spelling variants should be provided?

> - I do think it is important for TCS to record the fact that Hagedorn (2002)
> used a different spelling of the genus name from the other two pubs.

Why? Assuming that the purpose of names is to link two pieces of
information, why do you care about the spelling of the link target but not
about the spelling of the link originators, i.e. those publications that wish
to link to "xxx sec Hagedorn (2002)"?
 
> - I do not think it is important for TCS to record the fact that Hagedorn
> (2002) abbreviated "L." for the authorship of the name, "Migenus myspecies".
> 
> - I definitely do not think it is important for TCS to record the fact that
> Hagedorn (2002) abbreviated the SEC authorships of "L." and "C. & V.", nor that
> he misspelled "Pile" as a SEC author.

I agree that this is not important as "facts", and in the case of the "&" it
would be simple to solve this algorithmically. However, I think the name
variants are a good compromise between the need to be able to associate
name-strings

> > If I were to cite this concept, I would correct the name to the
> > ICBN-canonical
> > one, and would still add the "sec..." and I would mean to refer
> > to exactly your
> > concept.
> 
> I would like to think that you would preserve Pyle's actual spelling of the
> name-string in connection with his SEC. concept of it

In my database I like to do this, but I so often have to argue that this is
relevant that I believe most people prefer not to do so.

> spelling of the name.  If you are talking a paper-published presentation, most
> synonymy listings preserve exact spelling as used by each author.

I think this assumption is fundamental to your model, and I know of no such
tradition. The synonymy lists and checklists I know do NOT preserve the source
spellings but are corrected to the best current knowledge of the author. Can
you provide some details on the taxonomic areas in which people follow the
rule you outline?

> I can imagine tools that, using just the "GenusName (SubgenusName)
> speciesname subspeciesname varietyname", would narrow it down to the
> "correct" name pretty damn quickly, with only the occasional confusing
> homonym -- by which point a pair of human eyes should complete the link.

My own experience is that this is significant work worth sharing. Maybe that is
special to pathogenic fungi, which are known to have a much higher than average
share of homonyms (because of the tradition of naming new species "Genus
genitive-of-host-plant"). Of the 200 000 GLOPP names to connect to Index
Fungorum, 30% matched when authors were included. I then blindly connected all
those that had no homonym in Index Fungorum and had the same spelling of the
name without authors. Further, I generated a list of name variants
algorithmically, and got to 70%. The rest was still manual work, and it would
be good to check that piece and share it. Much of it is currently unchecked,
and quite a number of false connections are continuously being found.
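
The staged matching described above could be sketched roughly as follows in
Python. This is an illustration under stated assumptions, not the procedure
actually used for GLOPP: names are assumed to be binomials followed by an
author string, the single variant rule (-i versus -ii endings) is only an
example, and all function names are hypothetical.

```python
from collections import defaultdict


def strip_authors(name, n_parts=2):
    # Assumption: the first n_parts tokens are the name, the rest are authors.
    return " ".join(name.split()[:n_parts])


def spelling_variants(bare):
    # Illustrative rule only: alternate between -i and -ii genitive endings.
    variants = {bare}
    if bare.endswith("ii"):
        variants.add(bare[:-1])
    elif bare.endswith("i"):
        variants.add(bare + "i")
    return variants


def match_names(names, reference):
    """Map each name to (reference name or None, stage label)."""
    full = set(reference)
    by_bare = defaultdict(list)
    for ref in reference:
        by_bare[strip_authors(ref)].append(ref)

    result = {}
    for name in names:
        # Stage 1: exact match including the author string.
        if name in full:
            result[name] = (name, "with authors")
            continue
        bare = strip_authors(name)
        hits = by_bare.get(bare, [])
        # Stage 2: same spelling without authors, and no homonym.
        if len(hits) == 1:
            result[name] = (hits[0], "without authors")
            continue
        if len(hits) > 1:
            # Homonyms in the reference list: leave for a human.
            result[name] = (None, "manual")
            continue
        # Stage 3: generated spelling variants, accepted only if unambiguous.
        candidates = {
            ref
            for variant in spelling_variants(bare)
            for ref in by_bare.get(variant, [])
        }
        if len(candidates) == 1:
            result[name] = (candidates.pop(), "variant")
        else:
            result[name] = (None, "manual")
    return result
```

Keeping the stage label with each link makes it possible to review the
blind and variant-based connections separately, since those are where false
connections are most likely.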
 
I have no issue with delegating this to another standard, but I think the
effort to support this in TCS is very small. To me it is worth including in
1.0. All depends on what others do. However, I believe the GBIF situation is
such that GBIF wants to make exactly these connections now, not wait until the
nomenclators and concept bases are finished and GenBank finally uses them as a
standard.

Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn at bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203


