[tcs-lc] nameObjects, spellings, vernaculars, etc

Richard Pyle deepreef at bishopmuseum.org
Fri May 6 04:21:52 PDT 2005


> If the NameSimple element of TCS IS intended for verbatim spelling
> then it needs another name, for clarity.

The history of "NameSimple" in TCS precedes LC/Christchurch.  I'm not sure
of its original intent, but I suspect it was created without a lot of
thought to the distinction between "Code-correct" and "verbatim"
name-strings (no criticism intended -- to be honest, I hadn't thought much
about it either before we really started discussing LC).  In Christchurch, I
think the LC breakout groups sort of assumed it would be a canonical
concatenation....but that was before "Label" was introduced as a root
element in LC.  I've asked the question a couple of times (what is the
specific function of NameSimple), and I even re-christened it "NameVerbatim"
in the version of TCS/LC that I sent (it didn't seem to get much
traction...), to achieve exactly what you are suggesting.

> Otherwise it should be
> exactly the same as the Label element of LC and be in the canonical
> form, with another element there for the verbatim or as-published
> spelling.

Why do there need to be two elements, with different names, in different
parts of the overall schema, that share exactly the same purpose?  One could
argue that NameSimple would serve as the canonical name in cases where one
only has the name, and has not (yet) established a link with a full
canonical name object.  But I would counter that such cases imply that one
has not identified the proper canonical name, and as such, what else would
one have, besides the verbatim name?

That said, I would STRONGLY advocate re-naming to "NameVerbatim", or
"VerbatimSpelling", or something like that.

> A well designed schema should have element names which 'do
> exactly what it says on the tin'

Actually, I think "NameSimple" leaves a lot of latitude for interpretation.

> - OK - I see your point. As it happens, IPNI will only be recording
> the first publication of a name, so the number of orthographic
> variants is limited to the original spelling of the author, plus any
> corrections (or mistakes) made by IPNI rendering that into canonical
> form. But other databases of course will record more than the first
> use of a name. In that case each publication instance can come with
> only one orthographic variant (unless the author has been
> inconsistent within the article or book).

There are cases (more than just a rare few) where a single author will use
more than one spelling in the same publication.  Sometimes this is clearly a
lapsus or printer's error, and can be safely ignored or mentioned in a
human-readable comment somewhere.  At the other extreme are cases where the
author used two different spellings where it's not so obviously a lapsus. I
can give examples, if you're interested.  But I don't think this is
something that needs to shape the structure of LC.  I would say it should
support only one verbatim spelling per AccordingTo publication/NameObject
instance.
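
To illustrate that constraint, here is a minimal sketch in Python. It is not
part of TCS or LC: the class name, method names, and identifier strings are
all hypothetical, and the only thing it shows is a store that accepts exactly
one verbatim spelling for each (AccordingTo publication, NameObject) pair.

```python
class VerbatimRegistry:
    """Holds one verbatim spelling per (AccordingTo, NameObject) pair.

    Hypothetical helper for illustration; not an element of TCS/LC.
    """

    def __init__(self):
        self._spellings = {}

    def record(self, according_to, name_id, verbatim):
        """Store a spelling, refusing a second, different one for the same pair."""
        key = (according_to, name_id)
        existing = self._spellings.get(key)
        if existing is not None and existing != verbatim:
            raise ValueError(
                f"{key} already has verbatim spelling {existing!r}"
            )
        self._spellings[key] = verbatim

    def spelling(self, according_to, name_id):
        """Return the recorded spelling, or None if nothing is recorded."""
        return self._spellings.get((according_to, name_id))


registry = VerbatimRegistry()
# Made-up publication, name identifier, and spelling.
registry.record("Smith 1900", "name-1", "Aus buus")
```

Under this sketch, a second spelling from the same publication would be
rejected by `record` and would have to go into a human-readable comment
instead.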

Also, when you say "original spelling of the author" -- can I safely assume
you mean "original spelling of the scientific name by the original author",
and not "original spelling of the author's name" (e.g., "L." vs.
"Linnaeus")?

> To me it seems simple (I know you will correct me on this point) -
> each concept will have one publication instance and hence one
> orthographic rendering, which may be reproducibly correctable to one
> canonical form.

No corrections!  This is exactly what I feel as well!

> Therefore the LC part of the schema needs to have a
> place where the (single) 'as published' name goes, plus a place
> (Label) where the canonical form goes.

In LC, I assume you mean the "as published" name is the verbatim name as it
appeared in the original description/protologue?  If so, yes!

> I thought this was in the schema already.

I thought that's what "OriginalOrthography" was for (an element I
wholeheartedly support, because this is a special-case "Verbatim" spelling,
separate from the concept instance).

> Multiple versions of the same name-object will be
> mapped onto each other by mapping concepts to concepts, because each
> version should have a publication-instance of some sort.

Yes, but if Names are treated as stand-alone objects (as in v0.95.5), then
the multiple verbatim renderings of the same "name" will also be
cross-linked to each other by virtue of the fact that all of these concept
instances will point to the same "NameObject" (LC instance).  Thus, the
name-links would exist even without the concept-concept mappings.
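
A rough sketch of that cross-linking, in Python. The class and field names
below are assumptions made for illustration (they are not element names from
v0.95.5), and the names and publications are made up. The point is only that
grouping concept instances by the NameObject they point to recovers all the
verbatim renderings without any concept-concept mapping.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NameObject:
    """Stand-alone name record (an LC instance); fields are hypothetical."""
    name_id: str
    label: str  # canonical form


@dataclass(frozen=True)
class ConceptInstance:
    """One AccordingTo usage of a name, with the spelling as published."""
    according_to: str
    verbatim: str
    name: NameObject


def verbatim_by_name(concepts):
    """Group verbatim spellings by the NameObject each concept points to."""
    grouped = defaultdict(list)
    for concept in concepts:
        grouped[concept.name.name_id].append(concept.verbatim)
    return dict(grouped)


# Made-up example data: two publications, one shared NameObject.
name = NameObject("name-1", "Aus bus")
concepts = [
    ConceptInstance("Smith 1900", "Aus bus", name),
    ConceptInstance("Jones 1950", "Aus buus", name),
]
links = verbatim_by_name(concepts)
```

Here `links` maps the single name identifier to both spellings, even though
the two concept instances were never mapped to each other directly.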

> I think Rich and I are in agreement here ...

As do I!! :-)

> as to what constitutes a name object, I leave that to the real
> taxonomists

So far (as in my previous message), it seems to be:

Botanical view:
"GenusOrMonomial Name-Unit [+ species Name-Unit [+ tertiary Name-Unit +
tertiary Name-Rank]]"

Zoological view:
"Name-Unit"

Just so everyone is clear, "Name-Unit" is not simply the string of
characters that form a single component of a scientific name.  Rather,
"Name-Unit" implies a well-defined "object", with multiple inherent
properties such as the creation event (=protologue), and many/most of the
elements in LC.  It's what I would call a "Protonym".
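
As a sketch of the two views, assuming a Name-Unit can be reduced to an
epithet plus a pointer to its protologue (a large simplification; the real
object would carry many/most of the LC elements), the structures might look
like this in Python. All class and field names are hypothetical, and the
example values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameUnit:
    """A "Protonym": an epithet together with its creation event."""
    epithet: str
    protologue: str  # reference to the creation event; hypothetical field


@dataclass(frozen=True)
class BotanicalName:
    """Botanical view: a combination of one to three Name-Units."""
    genus_or_monomial: NameUnit
    species: Optional[NameUnit] = None
    tertiary: Optional[NameUnit] = None
    tertiary_rank: Optional[str] = None

    def __post_init__(self):
        # Mirror the nesting of the bracket notation: a tertiary unit
        # requires a species unit, and always comes with its rank.
        if self.tertiary is not None and self.species is None:
            raise ValueError("tertiary Name-Unit requires a species Name-Unit")
        if (self.tertiary is None) != (self.tertiary_rank is None):
            raise ValueError("tertiary Name-Unit and tertiary rank go together")


# Zoological view: the name object is a single Name-Unit.
ZoologicalName = NameUnit

# Same character string, different creation events: two distinct objects.
first = NameUnit("alba", "protologue-A")
second = NameUnit("alba", "protologue-B")
```

Because equality covers the protologue as well as the epithet, `first` and
`second` compare as different objects, which is the sense in which a
Name-Unit is more than a string.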

Any other candidates to define a "NameObject"???

Aloha,
Rich



