[tcs-lc] Modularisation of standards

Richard Pyle deepreef at bishopmuseum.org
Tue Mar 8 03:10:17 PST 2005


Hi Donald,

> Assume a document in which two concepts refer to the same published name
> (using an abbreviated representation of TCS data):

[...]

> However will it ever matter to an application processing such a document
> that the two <Name> elements are the same?  Do we need a better way to
> indicate this than simply relying on the byte-identity of the XML content?

My answers: Yes, Yes, and Yes.

> An alternative representation (if this mattered) would be:
>
> <TaxonConcepts>
>   <TaxonConcept id="tc1">
>     <NameRef id="tn1">
>     <AccordingTo>Smith</AccordingTo>
>   </TaxonConcept>
>   <TaxonConcept id="tc2">
>     <NameRef id="tn1">
>     <AccordingTo>Jones</AccordingTo>
>   </TaxonConcept>
> </TaxonConcepts>
> <TaxonNames>
>   <TaxonName id="tn1">
>     <Label>Aus bus</Label>
>     <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
>   </TaxonName>
> </TaxonNames>

The version I've been mulling over would look something more like this:

<TaxonConcepts>
  <TaxonConcept id="tc0" type="Nominal">
    <Label>Aus bus</Label>
    <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
    [...and all the other LC bits]
  </TaxonConcept>
  <TaxonConcept id="tc1" type="Original">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bus Black, 1965</NameVerbatim>
    <AccordingTo>Black, 1965</AccordingTo>
  </TaxonConcept>
  <TaxonConcept id="tc2" type="Revision">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bus Black</NameVerbatim>
    <AccordingTo>Smith, 1970</AccordingTo>
  </TaxonConcept>
  <TaxonConcept id="tc3" type="Revision">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bea Blk., 1965</NameVerbatim>
    <AccordingTo>Jones, 1975</AccordingTo>
  </TaxonConcept>
</TaxonConcepts>

This approach would provide for modularization of full nomenclatural data
(i.e., restrict to TaxonConcept type="Nominal"), while simultaneously
forcing all name-only data sources to conform to a Nominal Concept (i.e., no
confusion about plugging into a name directly, vs. plugging into a concept).
It would also simultaneously disentangle "Name" relationships from "Concept"
relationships (the former would be available only in TaxonConcept instances
of Type "Nominal"). Note the "NameVerbatim" element to capture orthographic
variants that do not rise to the level of a "different name" (sensu both
Botany & Zoology). It's open to discussion whether this would include
nomenclatural authorship, but the point is it alleviates the need to create
a Nominal Concept for every single orthographic variant.
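To make the modularization point concrete, here is a minimal Python sketch of how an application might pull out only the Nominal concepts and collect the verbatim spellings applied to each name. It assumes a well-formed variant of the example (self-closing NameRef, the attribute spelled "type" throughout, and the elided LC bits left out); the function names are hypothetical and not part of TCS.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Well-formed variant of the example. Assumptions: NameRef is
# self-closing, the attribute is "type" everywhere, LC bits omitted.
DOC = """
<TaxonConcepts>
  <TaxonConcept id="tc0" type="Nominal">
    <Label>Aus bus</Label>
    <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
  </TaxonConcept>
  <TaxonConcept id="tc1" type="Original">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bus Black, 1965</NameVerbatim>
    <AccordingTo>Black, 1965</AccordingTo>
  </TaxonConcept>
  <TaxonConcept id="tc2" type="Revision">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bus Black</NameVerbatim>
    <AccordingTo>Smith, 1970</AccordingTo>
  </TaxonConcept>
  <TaxonConcept id="tc3" type="Revision">
    <NameRef id="tc0"/>
    <NameVerbatim>Aus bea Blk., 1965</NameVerbatim>
    <AccordingTo>Jones, 1975</AccordingTo>
  </TaxonConcept>
</TaxonConcepts>
"""


def nominal_concepts(root):
    """Map id -> element for every TaxonConcept of type "Nominal"."""
    return {
        tc.get("id"): tc
        for tc in root.iter("TaxonConcept")
        if tc.get("type") == "Nominal"
    }


def verbatim_by_name(root):
    """Map each referenced name id to the NameVerbatim strings that use it."""
    usages = defaultdict(list)
    for tc in root.iter("TaxonConcept"):
        ref = tc.find("NameRef")
        if ref is not None:
            usages[ref.get("id")].append(tc.findtext("NameVerbatim"))
    return dict(usages)


root = ET.fromstring(DOC)
names = nominal_concepts(root)
usages = verbatim_by_name(root)

for name_id, element in names.items():
    print(name_id, element.findtext("Label"), usages.get(name_id, []))
```

All three orthographic variants hang off the single Nominal concept tc0, so no extra Nominal concept is needed for "Aus bea Blk., 1965".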

The downside is that it *requires* the inclusion of a corresponding
"Nominal" concept for every different name used among the set of non-Nominal
TaxonConcept instances provided in a Dataset package.  If a package
contained a 1:1 relationship between Concepts and names, then there wouldn't
be any substantial reduction of overall package size (in fact, there would
be a slight increase).  Conversely, the package size would be decreased
(sometimes substantially) in cases where a DataSet included multiple
concepts that applied the same name (i.e., the name elements would need to
be provided only once for each name; not duplicated for every TaxonConcept
that applied a given name).  And, if Universal Registration is ever
implemented for taxonomic names (i.e., looking ahead ten years), it would be
a no-brainer to substitute the NameRef id="tc0" with a NameRef GUID.
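The requirement and the size trade-off described above can be sketched as two small checks in Python. This is only an illustration under assumptions: the XML is a trimmed, well-formed variant of the example (self-closing NameRef, lowercase "type" attribute), and the helper names are hypothetical rather than anything defined by TCS.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed variant of the example (assumed syntax).
DOC = """
<TaxonConcepts>
  <TaxonConcept id="tc0" type="Nominal">
    <Label>Aus bus</Label>
  </TaxonConcept>
  <TaxonConcept id="tc1" type="Original">
    <NameRef id="tc0"/>
  </TaxonConcept>
  <TaxonConcept id="tc2" type="Revision">
    <NameRef id="tc0"/>
  </TaxonConcept>
  <TaxonConcept id="tc3" type="Revision">
    <NameRef id="tc0"/>
  </TaxonConcept>
</TaxonConcepts>
"""


def unresolved_name_refs(root):
    """Return ids of concepts whose NameRef does not point at a Nominal concept."""
    nominal_ids = {
        tc.get("id")
        for tc in root.iter("TaxonConcept")
        if tc.get("type") == "Nominal"
    }
    missing = []
    for tc in root.iter("TaxonConcept"):
        ref = tc.find("NameRef")
        if ref is not None and ref.get("id") not in nominal_ids:
            missing.append(tc.get("id"))
    return missing


def name_copies(root):
    """Return (name copies if embedded per concept, name copies with NameRef)."""
    refs = [
        tc.find("NameRef").get("id")
        for tc in root.iter("TaxonConcept")
        if tc.find("NameRef") is not None
    ]
    return len(refs), len(set(refs))


root = ET.fromstring(DOC)
print(unresolved_name_refs(root))  # [] : every NameRef resolves
print(name_copies(root))           # (3, 1): three embedded copies vs. one shared
```

With a 1:1 relationship between concepts and names the two counts are equal, so the only difference is the added NameRef pointer, which matches the slight size increase noted above.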

> [Note that it is much harder reliably to assign and police meaningful
> identifiers for name elements if they are fully embedded.  There would
> certainly be no way to enforce a single consistent representation in all
> occurrences for the same <Name> across different <TaxonConcept> elements.]

How would this statement apply to the version I included above?

> Will there ever be any reasons why tools that process TCS data would be
> better served by the more normalised form?  Here I am less sure.
> Do we have any use cases that would drive us that way?

What about the day when Universal Taxon Name registration is implemented?
Following from Roger's point about considering where we might be and what we
might need ten years from now (perhaps sooner?), we shouldn't ignore the
value of ID references to name objects as an integral part of the schema.

Aloha,
Rich




