[tcs-lc] nameObjects, spellings, vernaculars, etc
Gregor Hagedorn
G.Hagedorn at BBA.DE
Wed May 4 04:22:38 PDT 2005
I think others need to chime in here. A few additions/comments from me:
> Can you offer your own personal definition of a "NameObject" as you see it?
A name governed by a code such as ICBN or ICZN, and in the future perhaps also
by other codes of nomenclature (BAYER/EPPO codes for pathogenic species?). I
think I agree with an earlier post on this list.
> Yes, but should that generic label include verbatim spelling as by the
> AccordingTo author? Or the Code-correct spelling of the corresponding
> NameObject? I would prefer both, unambiguously distinguished, as described
> above.
I think neither, just the concept-name-string under the canonicalization rules
preferred by the data provider. To my knowledge no code regulates how to
express a concept-specific name, so this needs to remain open.
> I even disagree that we need to keep track in the TCS schema variants of the
> *name* author ("Smith" above). The exact orthography of the author name has no
> bearing either on the concept, or on the name. These, to me, are properties of a
> human (Agent), not of a ConceptObject or of a NameObject. You may want to model
> it in your local database, but I really don't think it belongs in the transfer
> schema. I would like to read what others think of this.
I argue this not from a taxonomy-producer/editor standpoint, but from the
practical standpoint of a consumer who would like to work with the data. And in
all the name/concept-based information I have, names are strings with
nomenclatural authors, and occasionally (<1%) with a concept suffix.
This does not belong only in a local database, because I would like to share
this huge work of connecting string-based information with others, rather than
have everybody do it him or herself in a "local database". This is where I
hope GBIF could come in and really increase our efficiency.
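
As an illustration of what "a string with nomenclatural authors and occasionally
a concept suffix" means for a consumer, here is a minimal sketch in Python. It
assumes the concept suffix is introduced by "sec." (or "sec"), as in the
"xxx sec Hagedorn (2002)" example in this thread; the function name
split_concept_suffix is hypothetical and is not part of TCS.

```python
import re

# Assumption: the concept reference is introduced by a standalone "sec" or
# "sec." token. Real data may use other markers (e.g. "sensu"); they are not
# handled here.
_SEC = re.compile(r"\s+sec\.?\s+", re.IGNORECASE)


def split_concept_suffix(name_string):
    """Split a name string into (name_with_authors, concept_reference).

    concept_reference is None when the string carries no "sec." suffix.
    """
    parts = _SEC.split(name_string.strip(), maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return parts[0], None


if __name__ == "__main__":
    print(split_concept_suffix("Migenus myspecies L. sec. Hagedorn (2002)"))
    print(split_concept_suffix("Migenus myspecies L."))
```

The sketch keeps the nomenclatural authors attached to the name, since that is
how the strings usually arrive; separating name from authors is a harder
problem and is deliberately left out.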
> > Imagine the red book lists - how often do people try to copy the
> > spelling from
> > there. If the list contains concept-specific names (which I
> > believe most of us,
> > me included, hope will pick up in the future), would it not be
> > useful to be
> > able to give that spelling for all the sources re-using the concept?
>
> Spelling of the genus/species/subspecies name components -- yes. Spelling
> of the Code-regulated author(s) of the applied name -- maybe. Spelling of
> the Author(s) of the concept definition -- no.
Without the spelling of the authors, a TCS data file would be close to
worthless to me. Or do your "yes" and "no" refer only to whether an option to
transfer spelling variants should be provided?
> - I do think it is important for TCS to record the fact that Hagedorn (2002)
> used a different spelling of the genus name from the other two pubs.
Why? Assuming that the purpose of names is to link two pieces of
information - why do you care about the spelling of the link target but not
about the spelling used by the link originators, i.e. those publications that
wish to link to "xxx sec Hagedorn (2002)"?
> - I do not think it is important for TCS to record the fact that Hagedorn
> (2002) abbreviated "L." for the authorship of the name, "Migenus myspecies".
>
> - I definitely do not think it is important for TCS to record the fact that
> Hagedorn (2002) abbreviated the SEC authorships of "L." and "C. & V.", nor that
> he misspelled "Pile" as a SEC author.
I agree that this is not important as "facts", and in the case of the "&" it
would be simple to solve this algorithmically. However, I think the name
variants are a good compromise between the need to be able to associate
name-strings
> > If I were to cite this concept, I would correct the name to the
> > ICBN-canonical
> > one, and would still add the "sec..." and I would mean to refer
> > to exactly your
> > concept.
>
> I would like to think that you would preserve Pyle's actual spelling of the
> name-string in connection with his SEC. concept of it
In my database I like to do this, but I so often have to argue that this is
relevant that I believe most people prefer not to do so.
> spelling of the name. If you are talking about a paper-published presentation,
> most synonymy listings preserve exact spelling as used by each author.
I think this assumption is fundamental to your model, and I know of no such
tradition. The synonymy lists and checklists I know do NOT preserve the source
spellings; they are corrected to the best current knowledge of the author. Can
you provide some details on the taxonomic areas in which people follow the rule
you outline?
> I can imagine tools that, using just the "GenusName (SubgenusName)
> speciesname subspeciesname varietyname", would narrow it down to the
> "correct" name pretty damn quickly, with only the occasional confusing
> homonym -- by which point a pair of human eyes should complete the link.
My own experience is that this is significant work worth sharing. Maybe that is
special to pathogenic fungi, which are known to have a much higher than average
share of homonyms (because of the tradition of naming new species Genus
genitive-of-host-plant). Of the 200 000 GLOPP names to connect to Index
Fungorum, 30% matched with authors. I then blindly connected all those that had
no homonym in Index Fungorum and had the same spelling of the name without
authors. Further, I algorithmically generated a list of name variants, and got
to 70%. The rest was still manual work, and it would be good to check that
piece and share it. Much of it is currently unchecked, and quite a number of
false connections are continually being found.
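
The staged procedure described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code used for the
GLOPP data: the function names, the tuple layout, and in particular the single
variant rule (-i versus -ii endings) are hypothetical, since the post does not
say which variant rules were applied.

```python
from collections import defaultdict


def variants(name):
    """Generate spelling variants of a name (hypothetical rule: -i vs -ii)."""
    out = set()
    if name.endswith("ii"):
        out.add(name[:-1])
    elif name.endswith("i"):
        out.add(name + "i")
    return out


def link_names(queries, reference):
    """Link query names to reference names in three stages.

    queries:   list of (name, authors)
    reference: list of (ref_id, name, authors)
    Returns {query_index: (ref_id, stage)}. Queries missing from the result
    are left for manual work.
    """
    by_full = {}
    by_name = defaultdict(list)
    for ref_id, name, authors in reference:
        by_full[(name, authors)] = ref_id
        by_name[name].append(ref_id)

    links = {}
    for i, (name, authors) in enumerate(queries):
        # Stage 1: name and authors both agree.
        if (name, authors) in by_full:
            links[i] = (by_full[(name, authors)], "with-authors")
            continue
        # Stage 2: same spelling without authors, accepted "blind" only
        # when the reference list holds no homonym.
        if name in by_name:
            if len(by_name[name]) == 1:
                links[i] = (by_name[name][0], "blind")
            continue  # homonyms need a pair of human eyes
        # Stage 3: generated variants, accepted only if unambiguous.
        hits = {r for v in variants(name) for r in by_name.get(v, [])}
        if len(hits) == 1:
            links[i] = (hits.pop(), "variant")
    return links
```

Recording the stage alongside each link is a deliberate choice in this sketch:
links made blind or through variants are exactly the ones that remain
unchecked and are worth reviewing and sharing.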
I have no issue with delegating this to another standard, but I think the
effort to support this in TCS is very small. To me it is worth including in
1.0. All depends on what others do. However, I believe the GBIF situation is
such that GBIF wants to make exactly these connections now, not wait until the
nomenclators and concept bases are finished and GenBank finally uses them as a
standard.
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn at bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203