[seek-kr-sms] algorithms and the owlfication of taxon

Shawn Bowers sbowers at ucdavis.edu
Wed Nov 2 17:32:29 PST 2005


Nico,

I am trying to catch up on this discussion.  I found this summary to
be very informative. Thanks.

My concern/question is whether the variants of "synonym" relations
Taxon is considering are well defined (i.e., in first-order logic or
some other formalism).  Is there a single list of such relations and
can you provide their definitions? Also, I would imagine that any
formal representation of these relations would also require some
formal representation for a taxonomy (in other words, the things that
are being related by these various forms of synonyms).  If this is so,
it would be useful to see that as well.
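
To make the question concrete, here is a minimal sketch of the kind of
definition I have in mind. It assumes (and this is only my assumption,
not something Taxon has stated) that a taxon concept can be modeled as
the set of specimens it includes, so that each relation reduces to a
set comparison. The names Relation and relate are hypothetical, and
"overlaps" and "is more inclusive than" are my additions to the terms
Nico lists. The specimen identifiers come from Rich Pyle's example
quoted below; the three extensions are toy data.

```python
from enum import Enum


class Relation(Enum):
    CONGRUENT = "is congruent with"
    LESS_INCLUSIVE = "is less inclusive than"
    MORE_INCLUSIVE = "is more inclusive than"  # assumed inverse, not in Nico's list
    OVERLAPS = "overlaps"                      # assumed, not in Nico's list
    EXCLUDES = "excludes"


def relate(a: frozenset, b: frozenset) -> Relation:
    """Classify how the extension of concept a relates to that of concept b."""
    if a == b:
        return Relation.CONGRUENT
    if a < b:   # proper subset
        return Relation.LESS_INCLUSIVE
    if a > b:   # proper superset
        return Relation.MORE_INCLUSIVE
    if a & b:   # some shared members, neither contains the other
        return Relation.OVERLAPS
    return Relation.EXCLUDES


# Toy extensions built from the type specimens in Rich Pyle's example.
calamus_1830 = frozenset({"MNHN 5565", "MNHN 5566", "MNHN A-8101", "MNHN 8664"})
calamus_1966 = frozenset({"MNHN A-8101"})
pennatula_1966 = frozenset({"MNHN 5565"})

print(relate(calamus_1966, calamus_1830).value)    # is less inclusive than
print(relate(calamus_1966, pennatula_1966).value)  # excludes
```

Under this reading exactly one of the five relations holds between any
two concepts. Whether specimen sets are the right thing to compare is
exactly the kind of question a formal definition would have to settle.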

In response to Serguei: I think that having second-order style
definitions (or allowing instances of instances as in RDF) is not such
a big deal, so long as the various other types of constraints in the
language do not make the whole logic undecidable. There is an
interesting paper by B. Motik titled "On the Properties of
Metamodeling in OWL" that raises some of these issues, which I have not
read, but intend to at some point.
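
For reference, the first-order alternative Serguei describes in the
message quoted below (every individual with phylum Arthropoda also has
kingdom Animals) can be sketched roughly as follows. This is plain
Python, not OWL, and it is a closed-world check over explicit records
rather than what a DL reasoner does; the helper names restriction and
subsumption_violations and the sample records are hypothetical.

```python
def restriction(prop, value):
    """Anonymous class: all individuals whose `prop` has the given value."""
    return lambda individual: individual.get(prop) == value


def subsumption_violations(individuals, sub, sup):
    """Return the individuals that belong to `sub` but not to `sup`."""
    return [i for i in individuals if sub(i) and not sup(i)]


arthropods = restriction("hasPhylum", "Arthropoda")
animals = restriction("hasKingdom", "Animals")

# Hypothetical records; the last one is deliberately wrong.
species = [
    {"name": "Apis mellifera", "hasPhylum": "Arthropoda", "hasKingdom": "Animals"},
    {"name": "Homo sapiens", "hasPhylum": "Chordata", "hasKingdom": "Animals"},
    {"name": "bad record", "hasPhylum": "Arthropoda", "hasKingdom": "Plants"},
]

bad = subsumption_violations(species, arthropods, animals)
print([i["name"] for i in bad])  # ['bad record']
```

Note that a record with no hasKingdom value at all is also reported
here, whereas an OWL reasoner working under the open-world assumption
would instead infer the missing value from the axiom.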

Thanks,
-shawn



Nico Franz wrote:
 > Hi all:
 >
 > I realize we've had this exchange before in some form and it's mostly
 > about playing catch-up on both sides. It's fun (for me too) to think
 > about representing taxonomy in an ontology format and think about the
 > potential services and benefits of such a move. I'll try to state my
 > current perspective in a way that might be helpful.
 >
 > For the Taxon group, the main challenge is actually not the
 > representation of any single classification with all its components
 > ("taxa"), their names and properties, their subcomponents, and
 > interrelationships (parent/child, etc.). We would gain next to nothing
 > from having a single-classification representation function in isolation.
 >
 > We're also not immediately charged with the task of merging parts of or
 > entire (multiple) classifications.
 >
 > Serguei told me yesterday that one of the main benefits of ontology
 > representation is "checking for internal (logical) consistency".
 > Discovery and correction of errors, etc. That is most certainly not what
 > Taxon is trying to do. We know that any given taxonomy is highly
 > idiosyncratic, implicit, and assumptive of a vaguely specified
 > background history involving select competent speakers. A single
 > taxonomic classification might not only turn out to be false in terms of
 > not representing the relationships or properties (composition) of taxa
 > correctly (as subsequent and more refined studies are bound to show).
 > The classifications are probably also highly inconsistent internally in
 > your sense of the word "inconsistent". Meaning, they will mention some
 > specimens but not all that went into the definition of a species, they
 > will mention some species but not all that go into the definition of a
 > genus, things will be left out here and there, partially contradict each
 > other, and so on.
 >
 > The issue here is that we are not charged immediately with improving
 > this state of affairs, i.e. helping taxonomists be better, more
 > transparent taxonomists from here on. Taxon has no "normative ambitions"
 > in terms of telling scientists how to produce classifications using a
 > more complete and formal approach (description logic rules, etc.).
 >
 > So what is Taxon's mandate? Basically, we're charged with building a
 > language and supporting infrastructure that will allow users and (in a
 > second phase) machines to make more sense of the semantic similarities
 > and differences between the components of multiple existing taxonomic
 > classifications - to a higher degree of precision than can be achieved
 > using name strings and conventional taxonomic synonymy relationships
 > alone. We're trying to build tools for taxonomic experts to do "brain
 > dumps" on what they know about the classificatory history of their
 > groups of expertise but wouldn't be able to express clearly and
 > comprehensively without our assistance.
 >
 > To that end, we need to be able to import fairly decent representations
 > of at least two hierarchical classifications into a graphic interface.
 > For our purposes those representations do not have to be any more
 > ontology-complying than the original classifications (which largely
 > weren't I would think).
 >
 > Then in a second step, we need to provide taxonomic experts with a more
 > powerful language than "is a synonym of" in order to assess the semantic
 > similarities and differences of elements ("taxa") defined in the two
 > classifications. That language will use terms like "is congruent with",
 > "excludes", "is less inclusive (taxonomically) than", etc. Those
 > assessments require the assessor to be intimately familiar with the
 > written and unwritten idiosyncrasies of the two classifications. We're
 > talking about people here whose lifetime work was exactly that -
 > learning the taxonomic history of a specific group as captured in the
 > literature and museum collections. And different experts may still come
 > up with different judgments when confronted with the same two
 > classifications.
 >
 > Then once we have those more semantically informative assessments of
 > interrelationship, we can reap benefits by constructing more powerful
 > searches on biological data, and make more informed choices about when to
 > integrate the information associated with taxonomic names, or when to
 > keep it separate. At that stage we would benefit from being very
 > explicit and consistent about how we handle searches and data
 > integration steps.
 >
 > Just for fun I've attached a blurb from Rich Pyle about a particular bit
 > of taxonomic history concerning a group of fishes. Let me know if this
 > was helpful.
 >
 > Cheers,
 >
 > Nico
 >
 > **********
 > Here's a case in fishes that might meet your needs (family Sparidae):
 >
 > Pagellus calamus Valenciennes in Cuvier & Valenciennes 1830 was
 > described on the basis of four syntypes (MNHN 5565, 5566, A-8101 &
 > 8664). As of 1966, apparently two of these (5566 & 8664) had been lost
 > or destroyed, so only two remained.
 >
 > Calamus pennatula Guichenot 1868 was apparently based on the same series
 > of type specimens as P. calamus -- which means that one would at first
 > assume pennatula to be an objective (homotypic) synonym of calamus.
 >
 > However...
 >
 > Randall & Caldwell (1966) examined the two existing syntypes of P.
 > calamus (MNHN 5565 & A-8101), and discovered that they represented two
 > different species. They selected one of them (A-8101) as the lectotype
 > of P. calamus, and selected the other (MNHN 5565) as the lectotype of
 > C. pennatula, thereby preserving both names.
 >
 > But there's more:
 >
 > Swainson (1839:171) described the genus-group name Calamus (as a
 > subgenus of Chrysophrys Quoy & Gaimard 1824), the type species of which
 > is Calamus megacephalus Swainson 1839:222 (by monotypy). However
 > according to Jordan & Gilbert (1884:18) and Randall & Caldwell
 > (1966:36), Swainson used the species epithet "megacephalus" only because
 > it was customary at the time to create new species epithets to avoid
 > tautonyms, and his "megacephalus" is treated as a junior synonym of
 > Pagellus calamus Valenciennes in Cuvier & Valenciennes 1830.
 >
 > So...here is a case of one series of syntypes, with two different names
 > based on that same series of syntypes, and two different species
 > represented among that same series. One of those species is the de facto
 > type species of a genus (although I doubt that anyone would ever split
 > the two species into separate genera).
 >
 > And as if that's not enough....
 >
 > Randall & Caldwell also describe a similar situation for Pagellus penna
 > Valenciennes in Cuvier & Valenciennes 1830:209. Among its three existing
 > syntypes, two are what are now considered to be Calamus penna (one of
 > which Randall & Caldwell designated as the lectotype), and the third is
 > identified as C. pennatula.
 > **********
 >
 > Serguei Krivov wrote:
 >
 >> There are many ways to represent biological taxonomies in OWL. The
 >> main problem here is how to avoid a second-order style of logic, i.e.
 >> assigning properties to classes rather than specifying properties of
 >> objects by defining classes. There is a temptation to use OWL as a
 >> meta-language of taxonomy rather than as the language of taxonomy
 >> (which it is intended to be) or, to put it metaphorically, to write
 >> an OWL interpreter for OWL.
 >>
 >> I believe this could be easily avoided. Here is how I would represent
 >> the part of taxonomies from Dave's design document:
 >>
 >> Each instance of class species would have attributes hasKingdom,
 >> hasPhylum, etc. One could also add hasAuthority, hasReference, etc.,
 >> and so we describe species exactly as humans do. Now the question is
 >> how to say that all Arthropoda are Animals and all Chordata are
 >> Animals. It is easy in OWL if we use subsumption axioms on anonymous
 >> classes:
 >>
 >> this states that the anonymous class hasPhylum:Arthropoda (a property
 >> value restriction) is a subclass of the anonymous class
 >> hasKingdom:Animals. Once the subsumption relation is established, one
 >> could use an OWL reasoner to check consistency.
 >>
 >> ciao,
 >>
 >> serguei
 >>
 >> 
-------------------------------------------------------------------------------------- 

 >>
 >> Serguei Krivov, Assist. Research Professor,
 >>
 >> Computer Science Dept. & Gund Inst. for Ecological Economics,
 >>
 >> University of Vermont; 590 Main St. Burlington VT 05405
 >>
 >> phone: (802)-656-2978
 >>
 >> -----Original Message-----
 >> From: dave thau [mailto:thau at learningsite.com]
 >> Sent: Wednesday, October 26, 2005 11:22 AM
 >> To: Serguei.Krivov at uvm.edu; bertram
 >> Subject: algorithms and the owlfication of taxon
 >>
 >> Hello,
 >>
 >> Attached are two documents you may find interesting. The first was the
 >> first assignment in my algorithms class. The puzzle I described yesterday
 >> is part II.
 >>
 >> Second, when I first started working on SEEK, I tried to pitch OWL as the
 >> most appropriate representation for the Taxon stuff, but didn't get too
 >> far. I did a little work doing a couple of representations, and a
 >> graduate student of Susan Gauch went further in documenting options. This
 >> dates from about 3 years ago, and we were all just learning OWL DL, so it
 >> may be poorly informed. But it'll give you a notion of the thinking at
 >> the time.
 >>
 >> Dave
 >>
 > _______________________________________________
 > Seek-kr-sms mailing list
 > Seek-kr-sms at ecoinformatics.org
 > http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/seek-kr-sms
 >


