[tcs-lc] nameObjects, spellings, vernaculars, etc

Sally Hinchcliffe S.Hinchcliffe at rbgkew.org.uk
Fri May 6 00:58:04 PDT 2005


I wrote: 
> > From my memory of Christchurch, we agreed that the Simple name form
> > (which came under a number of different labels) would be a canonical
> > form, with space given elsewhere for the author's own particular
> > and correctable spelling of it.

and Rich replied:
> Yes, exactly -- but I thought the "Label" element of LC was created for this
> purpose, leaving me to assume that NameSimple (part of TCS, outside of LC)
> was intended for the verbatim spelling.  But I couldn't get confirmation.

If the NameSimple element of TCS IS intended for verbatim spelling 
then it needs another name, for clarity.  Otherwise it should be 
exactly the same as the Label element of LC and be in the canonical 
form, with another element there for the verbatim or as-published 
spelling. A well-designed schema should have element names which 'do 
exactly what it says on the tin'.

I also wrote:
> > As far as I am concerned, all other orthographic variants that might
> > have been published elsewhere won't be going into IPNI. (apart, of
> > course, from IPNI's own scanning errors, which we will be correcting
> > as we find them).
And Rich responded:
> But if you are recording the 'as published' name in your database for each
> usage, and you are linking those usages back to a proper canonical name,
> then you are, in effect, building a list of orthographic variants.  The main
> thing here, though (which comes back to my earlier comments to Gregor), is
> that the orthographic variants exist only in the context of their actual
> usage -- they do not exist as a list of variants disconnected from their
> usage instances.  At least, that's how I think the core LC/TCS should be set
> up.  Maybe defined lists of variants attached directly to a canonical name
> (disconnected from any usages) could be a later extension to the schema, but
> I don't see it as part of the core.
>
- OK - I see your point. As it happens, IPNI will only be recording 
the first publication of a name, so the number of orthographic 
variants is limited to the original spelling of the author, plus any 
corrections (or mistakes) made by IPNI rendering that into canonical 
form. But other databases of course will record more than the first 
use of a name. In that case each publication instance can come with 
only one orthographic variant (unless the author has been 
inconsistent within the article or book).

To me it seems simple (I know you will correct me on this point) - 
each concept will have one publication instance and hence one 
orthographic rendering, which may be reproducibly correctable to one 
canonical form. Therefore the LC part of the schema needs to have a 
place where the (single) 'as published' name goes, plus a place 
(Label) where the canonical form goes. I thought this was in the 
schema already. Multiple versions of the same name-object will be 
mapped onto each other by mapping concepts to concepts, because each 
version should have a publication-instance of some sort.
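
As a minimal sketch of the arrangement described above: each concept 
carries one 'as published' spelling plus one canonical Label, and any 
list of orthographic variants is derived from the usages rather than 
stored separately. The class and field names below (ConceptUsage, 
label, name_as_published, publication) are hypothetical illustrations, 
not element names taken from the TCS or LC schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptUsage:
    """One publication instance of a name (hypothetical structure)."""
    label: str              # canonical form, standing in for the LC 'Label'
    name_as_published: str  # verbatim spelling used in this publication
    publication: str        # the 'sec.' reference for this usage


def variants_by_label(usages):
    """Group verbatim spellings under their canonical label.

    The variants exist only as a by-product of the usages passed in;
    no free-standing variant list is kept anywhere.
    """
    variants = defaultdict(set)
    for usage in usages:
        variants[usage.label].add(usage.name_as_published)
    return dict(variants)


usages = [
    ConceptUsage("Mygenus myspecies", "Mygenus myspecies",
                 "Linnaeus, Species Plantarum 1753"),
    ConceptUsage("Mygenus myspecies", "Migenus myspecies",
                 "R.Pyle, Ladies Home Journal 2005"),
]
print(variants_by_label(usages))
```

A database such as IPNI that records only the first publication would 
simply hold one ConceptUsage per name; databases recording later usages 
would hold more, and the variant list grows accordingly.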

I think Rich and I are in agreement here ... 

As to what constitutes a name object, I leave that to the real 
taxonomists.

> > If we want to allow for users entering typos
> > (whether their own or somebody else's) into the search term, we'll
> > use fuzzy searching (like Google's 'did you mean') which should catch
> > all possible typos not just those that have been published.
> 
> That would be a great feature as well (not of concern to the schema,
> though).  But I think we ought to hard-encode verbatim "name as published"
> attached to each TaxonConcept instance. To me, that is a fundamental piece
> of information, from which a list of published variants can eventually be
> built.
> 
> > Of course if anyone else wanted to do the (sometimes considerable)
> > research involved  in working out that one orthographic variant was
> > actually the same as this other orthographic variant then that is
> > useful information and it should probably be stored somewhere (don't
> > uBio have an interest in this?) But isn't it the case that all of
> > these orthographic variants must have appeared in a publication
> > somewhere (otherwise why bring them up?) and so the mapping must
> > always be 'Migenus myspecies L. sec. R.Pyle, Ladies Home Journal
> 
> :-)
> 
> > 2005' = 'Mygenus myspecies L. sec Linnaeus Species Plantarum 1753' -
> > in which case we are mapping concepts to concepts are we not? Because
> > name objects in themselves don't have a publication other than the
> > original protologue that is recorded in the Original Taxon Concept.
> 
> I'm not sure I exactly understand your point here, but I certainly agree
> that name variants are a property of usage (=concept instance, in this
> context), not of the NameObject per se.
> 
> > My 2p worth. This discussion is rapidly getting over my head so if
> > I've opened  a whole can of worms that someone else had previously
> > closed, please just shout me down
> 
> No shouts from me!!
> 
> Also, the topic of orthographic variants by itself is not all that heady.
> The most heady thing we need to pin down is "What constitutes a NameObject?"
> The differences between Botany and Zoology on this question are greater than
> I originally thought.  As far as I can tell, the botany definition consists
> of a set of these parts:
> 
> Genus or Monomial Name-unit + [species Name-Unit + [tertiary Name-Unit +
> tertiary Name-Rank]]
> 
> "Name-Unit" has a 1:1 correlation with a protonym/protologue. For example,
> the Basionym "Mygenus myspecies" implies two protologues: one for "Mygenus",
> and one for "myspecies"; hence, two Name-Units.  Variants/misspellings do
> not count as a separate Name-unit from their Code-correct version, because
> both share the same protologue. For example, "Mygenus mispecies" consists of
> exactly the same two Name-Units as "Mygenus myspecies".
> 
> Bracketed items in the formula above are optional.  As rendered above, it
> implies that a "Genus or Monomial Name-unit" is required for all
> NameObjects. There may optionally be a species Name-Unit (binomials). There
> can only be a tertiary Name-Unit if there is also a species Name-Unit --
> hence the tertiary one is secondarily optional in the context of a provided
> species Name-Unit.
> 
> The part that was new to me (thanks to Paul Kirk's helpful clarification) is
> the "+ tertiary Name-Rank" bit.  I originally thought it was simply an issue
> of a set of from one to three Name-Units that defined a botanical
> NameObject, but given that treating the terminal Name-Unit as a subspecies
> as opposed to a variety changes the authorship (and therefore really
> represents a distinct NameObject), the terminal epithet rank is required to
> minimally distinguish a trinomial "NameObject" in botany.
> 
> For comparison, a "NameObject" in zoology is identical to a "Name-Unit", as
> defined above.
> 
> Clear as mud.
> 
> Aloha,
> Rich
> 
> 
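
The optionality rules in Rich's botanical formula quoted above can be 
sketched as a small validating structure. This is only an illustration 
under stated assumptions: the class name BotanicalNameObject and its 
fields are hypothetical, and plain strings stand in for Name-Units, 
which in the discussion are really identified by their protologue 
rather than by spelling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BotanicalNameObject:
    """Genus/monomial + [species + [tertiary + tertiary rank]] (hypothetical)."""
    genus_or_monomial: str               # required Name-Unit
    species: Optional[str] = None        # optional Name-Unit
    tertiary: Optional[str] = None       # optional, only with a species
    tertiary_rank: Optional[str] = None  # required whenever tertiary is given

    def __post_init__(self):
        if not self.genus_or_monomial:
            raise ValueError("a genus or monomial Name-Unit is required")
        if self.tertiary is not None and self.species is None:
            raise ValueError("a tertiary Name-Unit needs a species Name-Unit")
        if (self.tertiary is None) != (self.tertiary_rank is None):
            raise ValueError("tertiary Name-Unit and rank must be given together")


# The same three Name-Units at different ranks are distinct NameObjects.
as_subspecies = BotanicalNameObject("Mygenus", "myspecies", "myepithet", "subsp.")
as_variety = BotanicalNameObject("Mygenus", "myspecies", "myepithet", "var.")
print(as_subspecies == as_variety)
```

Under the zoological reading given above, a NameObject would instead be 
a single Name-Unit, so no such composite structure would be needed.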

*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk


