[seek-kr-sms] OWL - taxonomy
Nico Franz
franz at nceas.ucsb.edu
Mon Oct 31 21:32:28 PST 2005
Hi all:
To continue our OWL-taxonomy exchange I've opted to cut out some
select passages from previous e-mails and also added some new points at
the end.
First, *Bertram* wrote in response to my estimate that taxonomies are
often inconsistent:
"In logic the term 'inconsistent' is quite different from 'incomplete'.
Some examples you refer to above seem to indicate that taxonomies are
often incomplete (which is common and often unavoidable in logic
formalizations) and maybe only occasionally inconsistent (which is much
more problematic in logic).
It would be interesting to see to what extent an individual taxonomy is
consistent with another one, with itself. Also notions of 'relative
completeness' or 'subsumption' might make some sense when applied to
taxonomies.
Here are my concrete questions:
How can we use TAXON within Kepler?
Are we "stuck" with the current use of taxon support in EML, or what can
we do beyond that?
Can we reuse some of the SMS infrastructure of Kepler to deal with TAXON
information?
For the latter, it might be helpful to capture some of the TAXON
information in a form that could be used by SMS.
Maybe we could drive this discussion by a specific use case that is
realistic both in the use of TAXON and in the use of data analysis
steps... Do we already have such a use case??"
**********
*My response to this:*
- I suppose I could use a helpful, exemplified account of what
"inconsistent" means in OWL-DL. But see also my discussion below.
- I almost want to punt on the Taxon/Kepler question. I think for the
moment, and also only if we want to, we should try to see to what extent
we can represent a single taxonomic classification in OWL/DL in general.
I think we all sense that something ought to be possible but haven't
gotten sufficiently specific yet. Once we do, then some useful
Taxon/Kepler options might emerge.
- Agreed?
**********
*Bertram* added in another e-mail, concerning the issue of real-life
taxonomic definitions working interchangeably as both classes and instances:
Maybe, or maybe not. Could one not distinguish, e.g., between an
"element as instance" and "element as class"? Things that hold for the
former may not hold for the latter and vice versa. We simply distinguish
between elements/terms/concepts when used at the instance vs. when used
at the class level.
Let me make a simplifying example: Say you've figured out a way to
represent all your information in the form of triples (X,Y,Z). If a term
t occurs in the X position (call it the instance position), it doesn't
say anything about it occurring in the Z position (call it the class
position, provided Y has some "class-valued property" say "hasClass").
So we can distinguish between 't as an instance' and 't as a class'. It
is up to some convention (or axiomatization) to establish a link between
these two uses of t.
Can we express this link that taxonomists make? Do they identify the two
uses of t? Is there never a distinction between a species (name?
concept? element?) when used in the instance sense vs. in the class sense?
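
To make the triple example concrete, here is a minimal sketch in plain Python (not OWL). The property name "hasClass" comes from the example above; the specimen identifier, the helper functions, and the CLASS_VALUED set are hypothetical names introduced only for illustration.

```python
# Hypothetical triples (X, Y, Z). "specimen_001" is an invented identifier.
triples = [
    ("Cyclanthura_pilosa", "hasClass", "Cyclanthura"),   # t in the instance (X) position
    ("specimen_001", "hasClass", "Cyclanthura_pilosa"),  # t in the class (Z) position
]

# Assumed set of "class-valued" properties.
CLASS_VALUED = {"hasClass"}


def used_as_instance(term, triples):
    """True if term occurs in the X position of a class-valued triple."""
    return any(x == term and y in CLASS_VALUED for x, y, _ in triples)


def used_as_class(term, triples):
    """True if term occurs in the Z position of a class-valued triple."""
    return any(z == term and y in CLASS_VALUED for _, y, z in triples)


t = "Cyclanthura_pilosa"
# Nothing in the triples themselves links the two uses of t; here the link
# is made only by the convention that the same string names both.
both_uses = used_as_instance(t, triples) and used_as_class(t, triples)
```

In this sketch the two uses of t are identified purely by sharing a name, which is one possible convention; an axiomatization could instead keep them as separate terms and relate them explicitly.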
**********
To which I respond:
YES, something like this has to happen I think. It does in fact happen
in real life that scientists will "decompose" a multi-natured taxonomic
definition (with e.g. 1. a type specimen, 2. other included specimens or
3. species, and 4. also a diagnosis of distinguishing features for the
organisms). In a particular situation they will refer to and reuse only
parts of that definition and ignore the rest, even and especially when
the rest doesn't really fit their current purpose. I therefore think the
"class" aspects and "instance" aspects of a taxonomic definition need to
be optionally dissociable.
**********
I've had a quick look at the "Representing Classes As Property Values"
document. It might well be on the right track. But my feeling is the
examples are still too far away from actual practice to help our case.
My view is that "you guys" know OWL-DL in and out but maybe we should
get a better understanding of how taxonomic definitions work, and what
issues are involved. For this purpose I've attached a 2-page PDF with
three hopefully useful examples.
The first page of the PDF shows a character-by-taxon matrix in which 22
species (species concepts, strictly speaking) are evaluated for the
presence or absence of 32 morphological features. Each species here is a
class (I assume also in OWL-speak) with a list of properties that
characterize it (and others that it doesn't have). In some cases the
properties are "not applicable" ("-" in the matrix), e.g. when the
question is "wing color red or green?" in a species that happens to be
wingless. Also, some properties have not yet been observed and are
marked as "?". From the set of properties, and using a
phylogeny-generating algorithm, the tree (and classification) on the
second page is inferred.
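
As a rough illustration of how such a matrix could be held in code, here is a small Python sketch. The "-" and "?" codes are the ones described above; coding presence as "1" and absence as "0" is an assumption, and the species names and all states are invented placeholders, not the data from the PDF.

```python
# State codes: "-" and "?" as described in the text; "1"/"0" are assumed.
PRESENT, ABSENT, NOT_APPLICABLE, UNKNOWN = "1", "0", "-", "?"

# Toy matrix: rows are species concepts, columns are character numbers.
# Names and states are invented for illustration only.
matrix = {
    "species_A": {1: "1", 2: "0", 3: "-"},
    "species_B": {1: "1", 2: "?", 3: "1"},
}


def features_present(species, matrix):
    """Character numbers scored as present for the given species."""
    return {c for c, state in matrix[species].items() if state == PRESENT}


def unscored(species, matrix):
    """Characters that are not applicable or not yet observed."""
    return {
        c
        for c, state in matrix[species].items()
        if state in (NOT_APPLICABLE, UNKNOWN)
    }
```

Keeping "-" and "?" distinct from "0" matters here: neither says that the feature is absent, so neither should be read as a negative statement about the species.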
So let us look at example 1 - the genus concept Cyclanthura (sec. Franz
in this humble scenario). The genus Cyclanthura may be defined by the 15
species it contains. One could also name any or all of the "subclades"
(groups of species that can be traced to one and the same origin in the
tree) and use those to make up the definition. I call this "ostensive
defining", or defining by "pointing at". The species are instances of
the genus. I think this is also called nominalism and probably something
else (in addition) in computer science.
On the other hand, note that the genus Cyclanthura has three distinctive
features - characters 12, 25, and 27. Those are postulated to have
jointly evolved at the time of the genus' origin according to the
distribution of these features in the 15 species and the phylogeny
algorithm used. So alternatively the genus is defined by those features,
which I call "intensional defining", or defining by "describing". The
definition could in principle be understood without any instances listed
(and vice versa). This is called essentialism (and probably something
else still in computer science).
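
The two ways of defining could be sketched side by side as follows. This is plain Python rather than OWL, and the class name, field names, and methods are hypothetical; only the genus name, the characters 12, 25, and 27, and the membership of Cyclanthura pilosa are taken from this message.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonConcept:
    name: str
    # Ostensive part: the included taxa that the definition "points at".
    members: set = field(default_factory=set)
    # Intensional part: the character numbers that "describe" the taxon.
    diagnosis: set = field(default_factory=set)

    def includes_ostensively(self, taxon_name):
        """Membership by being listed, regardless of features."""
        return taxon_name in self.members

    def includes_intensionally(self, present):
        """Membership by showing every diagnostic feature."""
        return self.diagnosis <= set(present)


cyclanthura = TaxonConcept(
    name="Cyclanthura sec. Franz",
    members={"Cyclanthura pilosa"},  # one of the 15 species; the rest omitted
    diagnosis={12, 25, 27},
)
```

Either field can be left empty without affecting the other, which is the sense in which the two parts of the definition can be used together or separately.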
Almost ANY taxonomic classification that is worthwhile representing -
from Aristotle to today - will have elements showing these two aspects -
ostensive and intensional. Scientists use them together OR separately in
a way that helps them make the most sense out of a particular situation.
They serve different purposes but are both indispensable for representing
the taxonomic information content.
Let's look at example 2 - Cyclanthura pilosa. This is a species concept
that works (1) as an instance of Cyclanthura (see above), but also (2) as a
class which has unique properties by itself. Note that Cyclanthura
pilosa does NOT have feature 27 present as predicated by the
higher-level definition (of the genus Cyclanthura). The phylogenetic
analysis postulates that it has "secondarily lost" that property in the
course of evolution (what we call "homoplasy"). Question - is this a
true inconsistency in the OWL-DL sense? The instance Cyclanthura pilosa was
supposed to have all distinguishing features of its class Cyclanthura
but in fact it does not (anymore). Species in particular have been
called "homeostatic property clusters", meaning that any of their
supposedly defining features could in fact be missing due to
evolutionary change, yet the properties in general are still needed to
define species.
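
The tension in example 2 can be spelled out in a small Python sketch. It is not an OWL-DL reasoner and does not settle the question above; it only contrasts a strict reading (every diagnostic feature is required) with a cluster-style reading (enough of them are required). Characters 12, 25, and 27 and the loss of 27 come from the text; that C. pilosa retains 12 and 25, the function names, and the threshold of 2 are assumptions made for illustration.

```python
GENUS_DIAGNOSIS = {12, 25, 27}   # distinctive features of Cyclanthura (from the text)
pilosa_present = {12, 25}        # assumption: 12 and 25 retained; 27 secondarily lost


def strict_member(present, diagnosis):
    """Strict reading: all diagnostic features must be present."""
    return diagnosis <= present


def missing_features(present, diagnosis):
    """Diagnostic features that the taxon does not show."""
    return diagnosis - present


def cluster_member(present, diagnosis, min_shared):
    """Cluster reading: at least min_shared diagnostic features are present."""
    return len(diagnosis & present) >= min_shared


strict_ok = strict_member(pilosa_present, GENUS_DIAGNOSIS)
lost = missing_features(pilosa_present, GENUS_DIAGNOSIS)
cluster_ok = cluster_member(pilosa_present, GENUS_DIAGNOSIS, min_shared=2)
```

Under the strict reading C. pilosa fails the genus diagnosis because of character 27, while under the cluster reading with this assumed threshold it still qualifies; which reading a formal representation should adopt is exactly the open question.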
And now example 3 - Ganglionus undulatus. This species concept is used
here to represent the entire genus concept Ganglionus. This
is called the "exemplar approach" and is of course very common. It is what I
mean by "incomplete". There are in fact five species of Ganglionus and
most taxonomists would understand me to know this even if I don't say
so. Every regionally or temporally restricted taxonomic summary will run
into this problem of leaving out things known or presumed to exist. Of
course I can afford to leave out the four other species here because I
am really only interested in the defining properties of the concept
Ganglionus, and one instance is sufficient in this case to illustrate them.
**********
In summary, I hope again that this was helpful. *I* would benefit from
someone in the OWL group telling me what vocabulary is preferred to name
the most significant aspects associated with a typical taxonomic
definition (see examples 1, 2, 3). Is my ostensive/intensional
terminology understandable and acceptable (even if we do not/need not
use it in the future)? How would *you* express the dual nature of
taxonomic definitions? Finally, I fully agree with Bertram's assessment
that class- and instance-characteristics of taxonomic concepts need to
be combinable as well as flexibly dissociable for us to make significant
progress. Feel free to ask any questions about this, and let me know if
it was too little or (more likely) too much at once. What other kinds of
examples might you find helpful?
Cheers,
Nico
Nico M. Franz, Ph.D.
Postdoctoral Research Fellow
National Center for Ecological Analysis and Synthesis
MSB, Room # 3411, University of California
Santa Barbara, CA 93106-6150
Phone: (805) 893-5934; Fax: (805) 893-8062; E-mail: franz at nceas.ucsb.edu
Website: http://www.nceas.ucsb.edu/~franz/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Cyclanthura-OWL.pdf
Type: application/pdf
Size: 2213316 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/seek-kr-sms/attachments/20051031/a46f5000/Cyclanthura-OWL-0001.pdf