[tcs-lc] Minor modifications prior TDWG ratification vote

Nozomi Ytow nozomi at biol.tsukuba.ac.jp
Sat Sep 17 16:10:00 PDT 2005


Rich,

> When you said "Rotifera vodka", I assumed you were
> using them as Genusname speciesname, but perhaps you meant Rotifera as a
> Phylum, with uncertain placement within a higher taxon (i.e., not
> Ciliata)?

I meant genus name plus species epithet.  To avoid confusion, I will use
Rus. vodokus (looks nicer) instead of Rotifera vodka.

> I am also not sure whether you are saying that it would be used
> to flag a taxon without reference to a parent?

I meant recording a taxonomic assertion with an explicit statement
that the author or data provider is unsure of the placement of the taxon.
It may come with a reference to non-direct higher taxa.

> For example: "I don't
> believe that Rotifera is in Ciliata, but I do not know what parent it does
> belong to".

That is what I meant, except that I used rotifer because the author is
unsure of the placement of what [s]he threw out of the ciliates.

> If only "Rotifera" is provided, with no parent, then I do not
> consider it to be "incertae sedis".

That is the cause of your difficulty, I suspect, because the author didn't
give a name in the scenario.  When the author made his/her statement,
rotifer designated a group of organisms, including Rus vodkus, excluded
from the ciliates.

> I believe it can not be of "uncertain
> position" unless it is so within some parent envelope, even if that envelope
> is of rank "Domain".

Don't go into philosophy and the design of your data structure.  The issue
is whether TCS can pass a regression test.  If TCS requires XSLT as
intelligent as your discussion implies, we had better look for a
more practical schema for exchange.


> If only the name "Rotifera" is provided, with no
> indication of a parent, then I don't see it as having the property of
> "incertae sedis", because I think that property is tied to a parent-child
> *relationship*, not a name or taxon itself.

I mentioned it in my previous post as an alternative.  It is
logically straightforward, but it introduces the practical risk of
requiring a semantic check, which can be avoided by better design of the
schema and validation mechanism.

> Whether the data provider
> didn't know what parent the taxon should be classified under, or simply
> failed to provide it, seems to make no difference for the interpretation of
> the data.  If the data provider wants to explicitly convey "incertae sedis",
> and doesn't even know what Domain it belongs within, then the parent could
> be given as the "Superdomain" of "Life".

It would be right as a design principle for your database, but I think
it is too fundamental a position for providing a practical exchange schema
usable with XSLT.

Points you made in your previous post,
Message-ID: <IMEKKFHEGHHDDDHKIOJEAENNDEAA.deepreef at bishopmuseum.org>:

> You do not need the entire chain of higher taxa to
> determine incertae sedis status -- only a relationship between a child taxon
> and a parent taxon, where the rank difference exceeds the threshold.

There can be cases where we can't set an appropriate threshold.  See
also Paul's post.

> Without a parent-child relationship, "incertae sedis" has no
> meaning.

Disagreed.  There are cases where "incertae sedis" is stated because
the author can't give the higher taxon.

> Or, is it a case of "Unidientified Ciliata", misidentified as "Rotifera
> vodka"?

I don't think so.

> Or, am I not understanding the example?

I suspect so, as you wrote in a later post (see above).


> Suppose "we" (TDWG, TCS, whatever) define "incertae sedis" as I did in the
> previous message -- where one of the "root" rank levels has been skipped in
> a direct parent-child relationship (e.g., OrderName is parent of
> GenusName).

I can't suppose that, because you assume mandatory ranks.

> If we are given a TCS record involving a GenusName, with "is child of"
> relationship to a TCS record involving a OrderName, then by our definition,
> we recognize it as incertae sedis.

Why should the user provide NameObject?  Does the standard require
NameObject and CanonicalName to be mandatory for scientific names even
though minOccurs is zero?


> However, suppose the data source includes a list of many genera within one
> family, and all but one of these genera are linked to one of several
> subfamilies.  The data source might then choose to treat the one GenusName
> that is not linked to a subfamily as "incertae sedis within Familyname".
> Within the source context, it is an "incertae sedis" placement even though
> it does not skip a root rank, and therefore would not be recognized by a TCS
> parser as "incertae sedis" by logic rules.

It is what I said....
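
A minimal sketch in Python of the point in this exchange, assuming a
hypothetical list of "root" ranks and hypothetical function names (none of
this is defined by TCS 1.00).  It suggests that the rank-skipping rule quoted
above does not detect a genus placed directly under a family, so only an
explicit statement from the data provider can convey that case.

```python
# Hypothetical "root" ranks, highest to lowest.  This list is an assumption
# made for illustration; TCS 1.00 does not define it.
ROOT_RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def skips_root_rank(parent_rank, child_rank):
    """Return True if at least one root rank lies between parent and child.

    Raises ValueError for a rank outside ROOT_RANKS, which is exactly the
    "mandatory ranks" assumption questioned in this thread.
    """
    gap = ROOT_RANKS.index(child_rank) - ROOT_RANKS.index(parent_rank)
    return gap > 1


def is_incertae_sedis(parent_rank, child_rank, provider_flag=False):
    """An explicit provider statement wins; otherwise fall back to the rule."""
    return provider_flag or skips_root_rank(parent_rank, child_rank)


# Genus directly under order: the rule fires.
print(skips_root_rank("order", "genus"))
# Genus directly under family (no subfamily): the rule stays silent.
print(skips_root_rank("family", "genus"))
# The same case with the provider's explicit statement carried along.
print(is_incertae_sedis("family", "genus", provider_flag=True))
```

The explicit flag is checked first so that the provider's own statement is
never overridden by whatever rule the consumer happens to apply.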

> So, my question was, would the purpose of an explicit incertae sedis
> designation within a TCS record be intended to represent "treated as
> incertae sedis by the source data provider, even though by our definition it
> would not be flagged as incertae sedis"?

I don't know, because I don't think your logic rule and your smart
parser are any part of TCS 1.00.  Or did I miss something
specified in the schema?  Don't go beyond the schema for ratification.

> That is a reason why I could understand the need to capture incertae sedis
> explicitly (i.e., not everyone defines it the same way in all contexts, and
> therefore the TCS record should capture how the data provider defined it on
> a case-by-case basis).

I think now you're getting my point.


> Or, are we assuming one "standard" definition of incertae sedis, but it is
> too much processing requirement to calculate whether a given name is
> incertae sedis within its parent taxon, because it requires retrieving the
> rank of the parent taxon as well as the current taxon? In this case, my
> point is that incertae sedis has no meaning except in the context of a
> parent-child relationship, so flagging a name as such without also knowing
> details about the parent taxon (including its rank) doesn't provide us with
> useful information.

I think this pattern requires higher bandwidth, because it requires
all TaxonConcept instances to carry their higher taxon to assure
that the TaxonConcept instances returned are not incertae sedis.
"Don't mind XML, or use another way" is a reasonable attitude, but it is
an unacceptable statement in bandwidth-limited developing countries.
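
A rough sketch in Python of the difference, assuming a hypothetical record
type (ConceptRecord and its fields are illustrative names, not TCS
elements).  With an explicit flag, a record that carries no higher taxon can
still say "incertae sedis"; without it, the receiver can only tell that no
parent was supplied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConceptRecord:
    """Hypothetical stand-in for one exchanged record; not a TCS structure."""
    name: str
    parent: Optional[str] = None   # higher taxon, if the provider sent one
    incertae_sedis: bool = False   # explicit statement by the provider


def placement_status(rec):
    """Describe what the receiver can conclude from this record alone."""
    if rec.incertae_sedis:
        return "incertae sedis"
    if rec.parent is None:
        return "parent not supplied"
    return "placed in " + rec.parent


print(placement_status(ConceptRecord("Rus vodkus", incertae_sedis=True)))
print(placement_status(ConceptRecord("Rus vodkus")))
print(placement_status(ConceptRecord("Rus vodkus", parent="Ciliata")))
```

The second case is the one in dispute: without the flag, "did not know the
parent" and "did not send the parent" look the same to the receiver.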


<snip>
> However, a strict interpretation of the term does not necessarily
> conform to this definition, and indeed it is legitimate to call a genus
> "incertae sedis" within a family if all other genera within the family are
> classified within subfamilies.  If this more strict definition is assumed,
> then I can understand a desire to capture the data source's flagging of
> incertae sedis status.

It is unnecessary to assume it; we only need to accept or allow it.  TCS
does not prohibit it, at least not explicitly.

> But at another level, I don't really understand, from an information
> management perspective, what information value we can derive from the
> "incertae sedis" flag anyway -- whether or not it is defined
> algorithmically, or is explicitly indicated by the data provider/source.
> What does it tell us that is not already self-evident from the other data?

It tells us the fact even when the other data are not necessarily available.
Don't assume a perfect world.  Knitting fragmented data together is one of
the important applications of TCS.


> > I don't know SEEK-Taxon's requirement, but appropriate support of
> > vernacular names is definitly necessary for GBIF.
> 
> I'm basing my recollection on the first LinneanCore break-out group meeting
> we had in Christchurch (you, me, Chris, Anna, Sally, Gregor, Donald, etc.),
> which (I believe) was not attended by anyone from the TCS group or SEEK. I
> thought we were clear that LC (=NameObject) was limited to Code-governed
> names (and possibly Cultivars and Trade Names -- I don't remember where that
> ended up).

Yes, it is stated so in TCS v1.00.  We agreed LC/NameObject is
only for scientific names (it made Anna unhappy because of
misunderstandings...I remember it was chaired by Jessie), and hence
TaxonConcept is the place for vernacular names, at least in TCS v1.00.

> But maybe I remember things incorrectly.  I agree that GBIF needs to
> accomodate vernacular names, and I don't dispute that vernacular names need
> to be accomodated (they are in TCS), or even that they need robust structure
> with name-name relationships & such (not, I believe, well accomodated by
> TCS).  But if they do need more robust structure, then do you think it
> should be built within the Concept substructure, or within NameObject,
> or...?

I'm lost.  Why would a vernacular name fit into NameObject?  We have
TaxonConcept/Name as the container for vernacular names.  If it doesn't
fit, we may need something such as VernacularNameObject, but that would
indicate that the "name as concept" principle is broken.  I don't find
anything wrong with the principle, apart from differences of opinion about
details such as whether Code could be a kind of language or, more
generally, locale.

Cheers,
JMS

