[tcs-lc] nameObjects, spellings, vernaculars, etc

Gregor Hagedorn G.Hagedorn at BBA.DE
Wed May 4 01:19:52 PDT 2005


Hi Rich

> So...if I understand you correctly, you're saying that name variants (sensu
> lato) should be treated as properties of Name Objects, rather than as properties
> of TaxonConcept (~=usage) instances?  How would you structure that -- something
> like:
> <VariantSpellings>
>   <VariantSpelling>Euonymus europaeus</VariantSpelling>
>   <VariantSpelling>Evonymus europaeeus</VariantSpelling>
>   <VariantSpelling>Evonymus europaeeus</VariantSpelling>
> </VariantSpellings>
> ...within a NameObject instance that has the code-correct name "Evonymus
> europaeus"?

We must be misunderstanding each other, or at least one of us is. I think that 
the above proposal is fine as a child of both a NameObject and a 
TaxonConceptObject. It may be desirable to have source and quality 
designations, but basically these would be optional.

<NameObject>
  <Label xml:lang="la" type="concise">Euonymus europaeus</Label>
  <Label xml:lang="de" type="concise">Pfaffenhütchen</Label>
  <VariantSpellings>
    <VariantSpelling source="doi:10.12812878" location="201"
                     revisionstatus="2">Evonymus europaeus</VariantSpelling>
    <VariantSpelling>Euonymus europaeeus</VariantSpelling>
  </VariantSpellings>
</NameObject>

> Or...would you treat each variant as a separate NameObject (with its own
> GUID, and its own set of LC elements for canonicalName, CanonicalAuthorship,
> original orthography, etc., etc.)?

No. I see orthographic variants on a distinctly different level. According to 
my other post, I actually believe that this is another point where most likely 
different groups are providing data. I.e. the nomenclators and concept 
databases may provide some data, but GBIF itself may generate such data.
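
A minimal sketch of what "a different level" could mean for a data consumer: 
the variants are not objects of their own, they all resolve to the one 
NameObject that lists them. The XML below is modelled on the fragment earlier 
in this message and is not a validated TCS instance; the function name 
index_variants and the choice of the Latin label as lookup target are 
assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment modelled on the example in this message;
# it is NOT a validated TCS instance.
NAME_OBJECT_XML = """
<NameObject>
  <Label xml:lang="la" type="concise">Euonymus europaeus</Label>
  <Label xml:lang="de" type="concise">Pfaffenhütchen</Label>
  <VariantSpellings>
    <VariantSpelling source="doi:10.12812878" location="201"
                     revisionstatus="2">Evonymus europaeus</VariantSpelling>
    <VariantSpelling>Euonymus europaeeus</VariantSpelling>
  </VariantSpellings>
</NameObject>
"""

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def index_variants(xml_text):
    """Map every known spelling (preferred or variant) to the Latin label."""
    root = ET.fromstring(xml_text.strip())
    preferred = next(
        label.text.strip()
        for label in root.findall("Label")
        if label.get(XML_LANG) == "la"
    )
    index = {preferred: preferred}
    for variant in root.iter("VariantSpelling"):
        # Source and quality attributes are optional and ignored here.
        index[variant.text.strip()] = preferred
    return index


index = index_variants(NAME_OBJECT_XML)
print(index["Evonymus europaeus"])
```

Because the index is built per NameObject, variant lists supplied by different 
parties (a nomenclator, a concept database, or GBIF itself) could be merged 
into it without creating additional name objects.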

> > Evonymus europaeus sec. Richard Pyle 2000 (= canonical),
> > Euonymus europaeus sec. Richard Pyle 2000,

I am sorry that I am at the moment not able to fully express that as a TCS 
example; perhaps someone can help. I assumed that a concept object would carry 
a recommended label (recommended by the provider; in the absence of a code of 
taxon-concept suffixes there cannot be a canonical form). Is there not such a 
label?

If, as a data consumer, I obtain concept data and want a user to pick from a 
list, do I have to create my own rules for what serves as the human-readable 
label of the object? That is not to say that a user interface may not create 
different labels from the atomic data, following local rules. But I believe a 
generic provided label would be hugely useful.

If there is such a thing, I see no problem in also providing spelling 
alternatives.
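
The consumer side of this could look roughly like the sketch below: use the 
provider's label when one is supplied, and fall back to a locally built label 
otherwise. The dictionary keys (recommended_label, name, according_to) and the 
"sec." formatting rule are hypothetical stand-ins, not TCS element names.

```python
def display_label(concept):
    """Prefer the provider's label; otherwise build one from atomic data."""
    provided = concept.get("recommended_label")
    if provided:
        return provided
    # Local fallback rule (an assumption): "<name> sec. <according to>".
    return f"{concept['name']} sec. {concept['according_to']}"


# Hypothetical records using strings quoted in this thread.
with_label = {
    "recommended_label": "Euonymus europaeus sec. Richard Pyle 2000",
    "name": "Evonymus europaeus",
    "according_to": "Richard Pyle 2000",
}
without_label = {
    "name": "Evonymus europaeus",
    "according_to": "Richard Pyle 2000",
}

print(display_label(with_label))
print(display_label(without_label))
```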

> In the above list, the variations of "Richard Pyle 2000" are are variations of
> the AccordingTo author (post-SEC.).  PLEASE don't tell me that you think that
> the schema needs to accomodate every possible way that every author who has ever
> cited a taxonomic name in a concept definition has or might be represented!!

Sorry, I do. "Accommodate" implies no contract with data providers to provide 
even a single such variant. But if the variants are there, they may be hugely 
useful, I believe. I may be wrong, if you believe that it is plainly impossible 
to map the strings used in checklists to denote a concept unambiguously to a 
specific concept.

Imagine the red book lists: how often do people try to copy the spelling from 
there? If the list contains concept-specific names (a practice that I believe 
most of us, me included, hope will pick up in the future), would it not be 
useful to be able to give that spelling for all the sources re-using the 
concept?

> I even have serious problems with designing the schema to accomodate
> variations of *name* authorships -- let alone concept authorships.

That is a good point, one I have been quietly thinking about myself. It makes 
significant sense to keep separate lists for name-without-author variation and 
authorship citation variation. The process that tries to map strings to name 
objects would then have to multiply them out. Or you just have a single list 
and try it the other way round.
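
"Multiplying out" two separate lists can be sketched as a simple cross product. 
The authorship variants, the single-space separator, and the identifier 
"name-object-1" are illustrative assumptions; only the name spellings come from 
this thread.

```python
from itertools import product

# Name spellings from this thread; authorship variants are illustrative.
name_variants = ["Euonymus europaeus", "Evonymus europaeus", "Euonymus europaeeus"]
author_variants = ["L.", "Linnaeus"]


def multiply_out(names, authors, separator=" "):
    """Return every name/authorship combination as one full string."""
    return [name + separator + author for name, author in product(names, authors)]


def build_lookup(names, authors, target):
    """Map each combined string to the single object it should resolve to."""
    return {full: target for full in multiply_out(names, authors)}


# "name-object-1" is a hypothetical identifier for the NameObject.
lookup = build_lookup(name_variants, author_variants, "name-object-1")
print(len(lookup))  # 3 names x 2 authorships = 6 strings
print(lookup["Evonymus europaeus Linnaeus"])
```

The two separate lists hold 3 + 2 entries, whereas a single flat list would 
have to store all 6 combined strings; the gap grows quickly as either list 
gets longer.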

> For the concepts, there should be only one instance for "Evonymus europaeus sec.
> Richard Pyle 2000", and the "Evonymus europaeus" part should be spelled exactly
> the way Pyle spelled it in his 2000 publication.

If I were to cite this concept, I would correct the name to the ICBN-canonical 
one, and would still add the "sec..." and I would mean to refer to exactly your 
concept. I think this may again be a different tradition in botany and zoology. 
In botany the importance of "original spelling" is relatively low.
 
> If datasets exist out there that record the concept to which they are
> mapping biological data as "Evonymus europaeus sec. R.Pyle 2000", that
> should *not* be something for TCS to accomodate.  That's a problem at the
> dataset side; not the transfer schema side.

Can you justify that statement? Why do you think it is NOT important to be able 
to find, e.g. GenBank molecular sequences for a taxon concept???

It may be placed in a separate standard, of course, but I think you cannot just 
push it onto the information provider (rather than the concept definition 
provider). I think the name variant issue is a natural extension of a GBIF 
name standard and I propose to place it there. It is optional in any case.

Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn at bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203


