[tcs-lc] Names as Objects

Tue Mar 8 19:03:00 PST 2005

> no - as you understand later -  they become original concepts
> (but all original concepts have an equivalent nominal concept
> According To NULL)
> the name correction would still be according to someone surely.....

Yes, but my point is that "Nomenclatural Act Credited To..." should be
treated separately from "Concept Circumscription According To...", because
two very different kinds of information are involved.  In one case, the
information pertains to a name object, and affects the string of characters
that should form a label attached to a concept.  In the other case, the
information pertains to the size/shape/position of a concept
circumscription.

> The nominal concept is there so that someone identifying
> something later, without saying what meaning they meant, can use
> the nominal version.

O.K., so why isn't this the *perfect* representation of a "name object"?  As
I understand it, we all agree that a Linnaean name, by itself, with no
reference to specimen objects other than the primary type, does imply a sort
of "fuzzy" concept that essentially boils down to the "sum of all concepts
that have been attached to this name, or have been attached to names that
have been treated as synonyms of this name".  Isn't that exactly what a
Nominal concept is intended to be used for?

> >  As
> >for "correct" -- there are two kinds of corrections -- those
> >related to name
> >objects, and those related to concept circumscriptions.  The former are
> >about "name" corrections as governed by Codes, and the latter are about
> >corrections in order to adhere to more modern concept definitions.
> >
>
> yes but they both result in a biologist having to use a name when
> they mean something i.e. a concept, so treating both as original
> concepts albeit one without a good definition.

O.K., maybe it would be helpful to refer back to the PowerPoint file from
the TCS Wiki:
http://www.soc.napier.ac.uk/doc/Demo_V2.ppt

and the instance document:
http://www.soc.napier.ac.uk/doc/Demo_V2.xml

To a nomenclaturalist, there are the following Name objects:

1. New genus-group name "Aus", originally described in a publication
authored by Linnaeus in 1758, with designated (by monotypy) type species
name, "Aus aus Linnaeus 1758".

2. New species-group name "aus", originally described in a publication
authored by Linnaeus in 1758, who combined it with the genus-group name "Aus
Linnaeus 1758".

3. New species-group name "bea", originally described in a publication
authored by Archer in 1965, who combined it with the genus-group name "Aus
Linnaeus 1758".

4. New species-group name "cea", originally described in a publication
authored by Fry in 1989, who combined it with the genus-group name "Aus
Linnaeus 1758".

5. New genus-group name "Xus", originally described in a publication
authored by Pargiter in 2003, with designated (by monotypy) type species
name, "Aus bea Archer 1965".

To a botanist, there is one additional name object:

6. Combination species-group name "Xus beus", established for the first time
in a publication authored by Pargiter in 2003, who combined it with the
genus-group name "Xus Pargiter 2003", with a basionym relationship to the
species-group name "Aus bea Archer 1965".

Two of these name-objects (#3 & #4) each have among their attributes an
orthographic variant:

3a. Orthographic variant epithet "beus", of the name-object "Aus bea Archer
1965".

4a. Orthographic variant epithet "ceus", of the name-object "Aus cea Fry
1989".

(Whether or not Pyle 1990 should be cited for these gender-matching
corrections is not clear, and needs further discussion.)

Note that neither ICZN, nor ICBN would treat 3a or 4a as distinct "name
objects", but rather as attributes (variants -- corrected, in this case) of
the name objects 3 & 4.
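
As a rough illustration only (this is not TCS, and every field name and key below is made up for the purpose), the six name objects listed above, with the two orthographic variants carried as attributes rather than as separate objects, could be written down something like this:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameObject:
    key: str                                # hypothetical local identifier
    rank_group: str                         # "genus-group" or "species-group"
    name: str                               # genus name or species epithet as established
    author: str
    year: int
    original_genus: Optional[str] = None    # key of the genus-group name it was combined with
    type_species: Optional[str] = None      # key of the type species name (genus-group only)
    basionym: Optional[str] = None          # key of the basionym (botanical combinations)
    orthographic_variants: List[str] = field(default_factory=list)


names: Dict[str, NameObject] = {
    "n1": NameObject("n1", "genus-group", "Aus", "Linnaeus", 1758, type_species="n2"),
    "n2": NameObject("n2", "species-group", "aus", "Linnaeus", 1758, original_genus="n1"),
    "n3": NameObject("n3", "species-group", "bea", "Archer", 1965, original_genus="n1",
                     orthographic_variants=["beus"]),   # 3a is an attribute, not a new object
    "n4": NameObject("n4", "species-group", "cea", "Fry", 1989, original_genus="n1",
                     orthographic_variants=["ceus"]),   # 4a likewise
    "n5": NameObject("n5", "genus-group", "Xus", "Pargiter", 2003, type_species="n3"),
    # Botanical view only: the new combination is a name object in its own right.
    "n6": NameObject("n6", "species-group", "beus", "Pargiter", 2003,
                     original_genus="n5", basionym="n3"),
}


def label(n: NameObject, table: Dict[str, NameObject]) -> str:
    """Build a simple display label; species-group names are prefixed by their original genus."""
    if n.rank_group == "species-group" and n.original_genus is not None:
        return f"{table[n.original_genus].name} {n.name} {n.author} {n.year}"
    return f"{n.name} {n.author} {n.year}"
```

With that in hand, `label(names[names["n5"].type_species], names)` gives back the type species of Xus without any string parsing of a combined name.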

So here is my question to you:

Given the instance document structured as you have it, how do I extract just
the nomenclatural information about these 6 name-objects (two of which
include one orthographic variant each)?

I see 8 "Original" concepts, and 8 "Nominal" concepts.  The Nominal concepts
(as they are formatted here) do not help me, because they do not establish
unambiguous relationships between a combination and its basionym, a genus
and its type species, any link to the original publication, etc.

Most of the nomenclatural information I need can be found among the set of
"Original" concepts, but the nomenclatural information is mixed in with the
concept information. For example:

- only one of the 6 listed vouchers for Aus aus L. is of nomenclatural
interest to me (the same applies to the other instances).

- the only way I can find out the original genus placement of a name like
"Aus bea Archer 1965" is to parse out the first part of the content of
<NameSimple>, and hope it's not a homonym.

- the only way I can tell for sure that "cy1" is not a distinct name object
itself (from a nomenclatural perspective) is to tunnel down and discover the
existence of <Relationship type="is validation of"> -- something I would
have to specifically filter for in order to transform this instance into an
orthographic variant of ca3, rather than a new name object.

- It would take a bit of processing overhead to examine the various
Relationships within cp5 to sort out that this instance represents the first
combination of the species-group name "Aus bea Archer 1965" with the
genus-group name "Aus Linnaeus 1758", rather than a new species-group name.

I have no doubt that we could come up with a set of logic rules to allow a
robust software tool to munch through this entire instance document and pull
out just the 6 name objects (with their nomenclatural connections) that are
of interest to a nomenclator. But the co-mingling of concept-relevant
information with name-relevant information, and the cumbersome need to
extract and digest the nomenclaturally important bits out of the concept
bits, make this schema very unappealing to the name camp.

What I am proposing (and hope to do so formally tonight, including a
re-working of the Demo_v2.xml instance document) is a compromise that would
solve a variety of needs simultaneously, without actually changing much of
the existing TCS structure (just some of the business rules).

> yes that's why we introduced nominal concepts but your nominal
> concept does have an according to - to the person who made the
> name change in the publication that name change was proposed.

I think this comes back to my point that "Nomenclatural Act Credited To..."
is fundamentally different from "Concept Circumscription According To..."

> >That's exactly the sort of "concept" we want name-only data to
> >default to.
> >
> yes I agree that name only data used in identification should
> default to nominal concepts i.e. scientific name AccordingTo NULL
> rather than just scientific name.

Yes, but then it's a separate step (finding the corresponding Original
Concept) that is needed to get the nomenclatural details.  When a
nomenclator passes a name object, the implied concept circumscription is the
one you would type as "Nominal".  I think what you are trying to do here is
establish "'Linnaeus 1758' as author of name-object 'Aus'", using the
"AccordingTo" structure that is normally used to mean "'Linnaeus 1758' as
definer of a concept circumscription to which he applied the name 'Aus'".

My point all along is that the label "Aus Linnaeus 1758 SEC. Linnaeus 1758"
has two copies of "Linnaeus 1758" for a reason.  The first "Linnaeus 1758"
is to disambiguate the genus name "Aus" from homonyms, and to point to the
original publication in which the relevant name-object "Aus" was first
established.  The second "Linnaeus 1758" serves the function of specifically
referencing the concept circumscription that Linnaeus intended his name
"Aus" to apply to.  From what it seems you are saying, you are trying to
capture both references to "Linnaeus 1758" in the concept-label "Aus
Linnaeus 1758 SEC. Linnaeus 1758", using only one "AccordingTo" pointer.
And I think that is not a wise approach to modeling these two different
pieces of information.
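
To make the "two copies of Linnaeus 1758" point concrete, here is a minimal sketch in which the two references live in two separate slots. The class and field names (`described_in`, `according_to`) are my own placeholders, not anything in the current TCS schema:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Publication:
    author: str
    year: int

    def cite(self) -> str:
        return f"{self.author} {self.year}"


@dataclass(frozen=True)
class ConceptLabel:
    name: str
    described_in: Publication             # nomenclatural: where the name object was established
    according_to: Optional[Publication]   # circumscription: whose concept; None means NULL

    def render(self) -> str:
        base = f"{self.name} {self.described_in.cite()}"
        if self.according_to is None:
            return base                    # name only, no stated circumscription
        return f"{base} SEC. {self.according_to.cite()}"


linnaeus_1758 = Publication("Linnaeus", 1758)
original = ConceptLabel("Aus", described_in=linnaeus_1758, according_to=linnaeus_1758)
name_only = ConceptLabel("Aus", described_in=linnaeus_1758, according_to=None)
```

Here `original.render()` yields the full "Aus Linnaeus 1758 SEC. Linnaeus 1758" label, and the two occurrences of "Linnaeus 1758" come from two different pointers that merely happen to reference the same publication.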

> but your nominal concepts would not be simple scientific names
> according to we don't whom they would have other information
> about them that the biologist using a name probably had no
> knowledge of or intention of referring to.

The *only* other information I would propose to include in a Nominal Concept
instance would be nomenclatural information (authorship attributes, pointers
to original description, pointer to basionym, perhaps a pointer to the
primary type specimen; etc.).  None of that information would in any way
alter the "fuzzy" nature of the Nominal Concept circumscription.  If the
biologist provided a Linnaean name as governed by one of the Codes, it
doesn't matter whether said biologist intended to refer to the author of the
name, or the year it was published, or its type species, or whatever -- but
the point is, these pieces of information are objectively tied to the name
that the biologist used.  It doesn't in any way affect the "Null"-ness of
the concept circumscription definition.

> >Yes, but you agree that the schema *structure* of
> >RelationshipAssertions
> >could be used for both purposes -- right?
[...]

> yes, but if I wanted to know how a taxonomist had defined his
> taxon then I would need to find and package the relationships
> with the same AccordingTo up with the rest of the TaxonConcept.
> So I would be modelling TaxonConcept as if the relationships
> weren't part of the definition. We were told by taxonomists that
> that's how some of them define their taxa so it seemed
> fundamental to capturing the semantics of a TaxonConcept.

Agreed!  And that's why I think it makes "elegant" sense to de-couple the
"definitive" Relationships from the "interpretive" Relationships, as TCS
currently does.  But the point is this:  you now have a bunch of taxonomists
(not just me) telling you that "Name" attributes aren't part of the concept
definition *either*.  To us, it seems fundamental to modularize the name
objects and their corresponding name-name relationships separately from the
concept circumscriptions and their concept-concept relationships.  So, just
as you have appeased the "other" taxonomists by separating the "definitive"
Relationships from the "interpretive" Relationships, several of us are
asking you (begging you) to find a way to separate the name attributes and
name-name relationships from the concept attributes and concept-concept
relationships.

The extreme approach (which some on this list advocate) is the earlier
"thought-experiment" approach of treating names as separate top-level
objects.  What I am personally proposing is a more moderate approach, one
that I increasingly believe serves both the TCS community and the LC
community more effectively than either the current TCS or the
names-as-top-level-objects approach: harnessing the intuitive "power" of
the Nominal concept. More on this later tonight.

> >> only if that unique string of characters was published in a
> >> taxonomic publication such as a monograph or as a result of a
> >> name change to satisfy one of the codes and therefore to be
> >> intended to be used by people in the future - not just any name
> >> string - I thought I made that quite clear.....
> >O.K., then let me ask this:  If someone defines a new concept
> >in a way that
> >doesn't conform to the taxonomic publication/monograph/etc. as
> >you scope it
> >in the quoted text above, and uses a name-string that is not
> >identical to
> >the corresponding name (i.e., a misspelling), then where in
> >TCS is the "Name
> >as spelled in this concept definition"?
>
> sorry about this but to be clear - do you mean someone creating a
> new concept in a publication and they published the name wrong
> (mis-spelled it)

Yes.

> or do you mean someone entering someone else's
> concpet (already described in a publication) into a database and
> therefore mis-entering the data

No.

I'm talking about a real concept definition (with vouchers, and character
diagnoses, and such), that was *NOT* "published in a taxonomic publication
such as a monograph or as a result of a name change to satisfy one of the
codes and therefore to be intended to be used by people in the future."
Where do we record the misspelling as used by the author of the new concept
definition?  Do we just ignore it? Or does it go somewhere specific in the
existing TCS?

> >- Nomenclaturalists have an unambiguous way to package the information
> >they're interested in (i.e., Nominal concepts)
>
> then we can't use Nominal concepts for what we meant them to be in
> the first place - which was to allow biologist who use scientific
> names in labelling things without saying what meaning they meant.

Why not???  Attaching purely nomenclatural information to a Nominal concept
instance does not say anything about what the concept "means" (other than
the fact that it includes the primary type specimen of the name itself).
Nothing about what I am proposing attaches *any* circumscriptional "meaning"
to the Nominal concept -- it still represents a fuzzy "sum of all concepts
that might have referred to this name".

> i.e. Aus bus AccordingTo NULL
> because your nominal concepts would have an AccordingTo surely? -

No!!!  "AccordingTo" would still be NULL.  Instead, somewhere within
"NameDetailed", would be something along the lines of "DescribedIn", which
would point to the Code-accepted publication instance in which the name
became nomenclaturally available.  This would unambiguously NOT be confused
with "AccordingTo", which involves a concept circumscription.

The problem I have with treating the "Original" concept instance as the
"bearer of the name", is that it's not clear whether "AccordingTo" means
"DescribedIn", or, "in the concept circumscription sense of".  I believe
that for "Original" Concept instances, it should always mean the latter.
The "DescribedIn" information should be buried somewhere among the
nomenclatural elements of the corresponding "Nominal" Concept.
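
The kind of business rule I have in mind could be sketched as below. This is only a guess at how it might be checked in software; the Python class names mirror the element names discussed here ("NameDetailed", "DescribedIn", "AccordingTo"), but the structure itself is an assumption and not the actual schema:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameDetailed:
    name: str
    described_in: Optional[str] = None    # citation of the Code-accepted publication
    basionym: Optional[str] = None
    primary_type: Optional[str] = None


@dataclass
class TaxonConcept:
    concept_id: str
    kind: str                             # "Nominal" or "Original"
    name_detailed: NameDetailed
    according_to: Optional[str] = None    # None stands for AccordingTo = NULL


def check_business_rules(c: TaxonConcept) -> List[str]:
    """Return a list of rule violations (empty if the instance is acceptable)."""
    problems = []
    if c.kind == "Nominal":
        if c.according_to is not None:
            problems.append("Nominal concept must have AccordingTo = NULL")
        if c.name_detailed.described_in is None:
            problems.append("Nominal concept should carry DescribedIn")
    elif c.kind == "Original":
        if c.according_to is None:
            problems.append("Original concept needs AccordingTo (circumscription sense)")
    return problems


nominal = TaxonConcept("n1", "Nominal", NameDetailed("Aus", described_in="Linnaeus 1758"))
original = TaxonConcept("o1", "Original", NameDetailed("Aus"), according_to="Linnaeus 1758")
confused = TaxonConcept("x1", "Nominal", NameDetailed("Aus"), according_to="Linnaeus 1758")
```

The third instance is the ambiguous case: a Nominal concept that tries to use "AccordingTo" to say "DescribedIn", which the check flags on both counts.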

> the name of the person who proposed the change - doesn't the
> Nomenclaturalists want to record/track that?

Perhaps -- depends on whether the change is a Nom. Nov/Replacement name
(governed by both codes), an official "New Combination" (ICBN), or just an
orthographic variant (though both Codes have rules about orthographic
agreement and such, I don't believe that either keeps track of which
publication instance first implemented the orthographic correction). Where
in the schema each attribute would go for these various cases would depend
on what this group feels is important to nomenclature.

> >- The same set of Name elements can be re-used and referenced
> >by more than
> >one concept object, and hence, smaller data package size for datasets
> >involving multiple concepts attached to the same name
>
> could be ok as long as names are never changed - if we want to
> maintain a true record of concepts as defined or indeed any data
> referencing one of these names.

My idea of "NameVerbatim" as an element of the concept would be used to
capture the actual name-string used in the concept definition.  The
pointer to the Nominal concept would only imply the "intended" name object
(i.e., "although he misspelled it as 'Aus bea', the author of this concept
was referring to 'Aus bus Linnaeus 1758'").  There is one nomenclaturally
correct representation of that name object, which may change over time as
nomenclatural acts are applied to it -- but the "NameVerbatim" element
within every TaxonConcept instance would preserve the "name as used in the
concept definition".
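
A small sketch of how that might behave, assuming a hypothetical `name_verbatim` field on each concept and a shared Nominal instance that concepts point to by id (none of these identifiers come from the existing schema):

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NominalConcept:
    concept_id: str
    current_name: str        # nomenclaturally correct form; may change over time


@dataclass(frozen=True)
class TaxonConcept:
    concept_id: str
    name_verbatim: str       # the name-string exactly as used in the concept definition
    nominal_ref: str         # id of the intended Nominal concept


def describe(concept: TaxonConcept, nominals: Dict[str, NominalConcept]) -> str:
    intended = nominals[concept.nominal_ref].current_name
    return f"'{concept.name_verbatim}' (intended: {intended})"


nominals = {"nom1": NominalConcept("nom1", "Aus bus Linnaeus 1758")}
misspelled = TaxonConcept("c1", name_verbatim="Aus bea", nominal_ref="nom1")
correct = TaxonConcept("c2", name_verbatim="Aus bus", nominal_ref="nom1")

before = describe(misspelled, nominals)

# A later nomenclatural act updates the shared name object once.
# The string below is a placeholder, not a real emendation.
nominals["nom1"].current_name = "Aus bus Linnaeus 1758 [emended]"
after = describe(misspelled, nominals)
```

Both concepts pick up the change to the shared name object, while each `name_verbatim` stays exactly as it was written in its own concept definition.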

> >- Minimal impact to overall TCS schema (relative to more radical,
> >hither-to-fore hypothetical, alternative approaches).
>
> I agree it's more in-line with TCS thinking except for the points
> I made about the difference between original and nominal concepts.

Right -- those are the points that require more discussion.  I think it
might help if I took the time tonight to re-jigger the Demo_v2.xml instance
document according to how I envision it.

> >As I tried to explain in earlier email, I think there are
> >unambiguous cases
> >of relationships that involve two names (e.g., "Is basionym
> >of", "Is type
> >species of", etc.), which are distinct from relationships that
> >involve two
> >concepts (e.g., "Includes", "Is Congruent", etc.)  And I think it is a
> >mistake to blend these two kinds of relationships together in
> >the same data
> >structure.
>
> I'm not sure you could enumerate these relationships for the
> different types of concept in XML - but I think some of this
> might be application rules.
>
> Anyway it would be easy to interpret them as such. If I have two
> concepts with a name-based relationship between them I clearly
> know it's talking about the name parts of the 2 concepts.

I know it *can* be done this way -- I just think it means a less modular way
of treating names vs. concepts, and therefore less useful to the
nomenclators and others who want to extract only names information.
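
For what it's worth, here is roughly what I mean by "modular", as a sketch under my own assumptions: the two enumerations contain only the relationship types mentioned in this thread, and the concept identifiers are hypothetical. With the two kinds of relationship typed separately, pulling out the names-only information is a one-line filter rather than a set of interpretation rules:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class NameRelation(Enum):
    IS_BASIONYM_OF = "Is basionym of"
    IS_TYPE_SPECIES_OF = "Is type species of"


class ConceptRelation(Enum):
    INCLUDES = "Includes"
    IS_CONGRUENT = "Is Congruent"


@dataclass(frozen=True)
class Assertion:
    source: str
    relation: Union[NameRelation, ConceptRelation]
    target: str


def name_relationships(assertions: List[Assertion]) -> List[Assertion]:
    """Keep only the name-name relationships a nomenclator cares about."""
    return [a for a in assertions if isinstance(a.relation, NameRelation)]


assertions = [
    Assertion("Aus bea Archer 1965", NameRelation.IS_BASIONYM_OF, "Xus beus Pargiter 2003"),
    Assertion("Aus aus Linnaeus 1758", NameRelation.IS_TYPE_SPECIES_OF, "Aus Linnaeus 1758"),
    # Concept-concept assertions; "concept-A" etc. are hypothetical identifiers.
    Assertion("concept-A", ConceptRelation.INCLUDES, "concept-B"),
    Assertion("concept-A", ConceptRelation.IS_CONGRUENT, "concept-C"),
]

names_only = name_relationships(assertions)
```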

I'll look at this more later tonight, and try to show you what I mean using
concrete examples.

Aloha,
Rich