[seek-kr-sms] OBOE discussion: current version

Wed Jun 21 21:31:12 PDT 2006

Expecting to become more informed than informant, and being a visitor, I 
will briefly offer the context in which I eagerly follow the discussion 
about observations. It concerns the ontology of digital images. If 
these are distractions long ago covered in your discussions, please 
forgive me and point me at those if possible.

[Images as observation sets: Events vs. outcomes; instrument calibration]
I believe one could call a digital image a set of spatially and 
temporally correlated observations. Most modern cameras automatically 
record metadata about the capture event, such as time and date, focal 
length, etc., and even GPS coordinates in some cases. In one view, an 
image is a set of measurements at discrete spatial intervals of stuff 
about the reflectance spectra of objects in the world. Furthermore, with 
enough digging in engineering specifications usually available from the 
manufacturer, it is even possible to know in detail the camera's 
methodology for assigning its measurement values. This involves a set of 
nonlinear functions of wavelength (plus a quantization algorithm), which 
are thus(?) probably beyond OWL. Maybe. But in the case of color, there 
are three particular, internationally agreed upon functions of this kind 
which serve as the basis of a three-dimensional color space (the CIE XYZ 
coordinate space) of such functions. For a particular device, the device 
manufacturer enshrines three coordinates of the transfer functions in 
some published metadata called the ICC Color Profile. (Well, they 
publish the \design/ coordinates. If you are really picky, you calibrate 
your device by \measuring/ the \device/ characteristics before you 
measure the world's characteristics. Where does instrument calibration 
fit in a measurement ontology???).

[Color as an observation; type specimens for colors]
Ferdinando remarked "I'd be interested to see examples of how color 
could be defined/described as a special kind of observation." One 
answer is that spectral distributions---one standard form of color as 
an observation---can also be defined in the CIE XYZ coordinate system 
and so given by three numbers: the coordinates in the standard XYZ 
basis. The problem is that this particular space is defined by 
properties of human color vision (namely color matching properties) 
and, as I remarked in an earlier post, it is usually difficult to 
compare it to other models. So Ferdinando's challenge can only be 
answered accompanied by "what is it that you are trying to do with the 
colors you observe?"---a kind of fitness for purpose question which, I 
suppose, pervades all observations based on a model. For any 
Observation, if you can't also express fitness for purpose, it seems to 
me you are left with very little ability to do anything with 
Observations but compare them to others made to the same model.
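
For reference, the "three numbers" above are conventionally obtained by 
integrating a spectral power distribution S against the three CIE color 
matching functions. This is the standard textbook form, not anything 
specific to OBOE; k is a normalizing constant whose choice depends on 
the convention in use, and the integrals run over the visible 
wavelengths:

```latex
X = k \int S(\lambda)\,\bar{x}(\lambda)\,d\lambda, \qquad
Y = k \int S(\lambda)\,\bar{y}(\lambda)\,d\lambda, \qquad
Z = k \int S(\lambda)\,\bar{z}(\lambda)\,d\lambda
```

For a reflecting object, S is the product of the object's reflectance 
spectrum and the spectrum of the illuminant, so the same object yields 
different coordinates under different lights. Many different spectra 
also map to the same (X, Y, Z) triple, which is one reason the fitness 
for purpose question matters.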

Maybe an abstract spectral distribution isn't an observation. But 
particular ones, e.g. the reflectance spectra of things in the world, 
are. So one could say that the reflectance spectrum of some particular 
thing defines some color, much as taxonomists say that the type specimen 
defines the species, despite phenotypic variation. In fact, this is the 
foundation of many color models, e.g. the highly standardized Munsell 
and Pantone color systems.

[Picture of what? Color of what?]
The thing we are really struggling with in work on LSID resolution for 
images is exactly what should be regarded as the data of a digital 
image. To me, it's not helpful to say that the data are the numbers the 
device offers you, even in the face of good knowledge of the transfer 
functions. That's because, in the final analysis, those numbers are an 
\encoding/ of the Observation. I find no merit to saying that two image 
representations that are losslessly convertible to one another represent 
two different Observations, any more than one would say that data 
expressed in kilometers defines a different Observation from the same 
data expressed in meters. So I am hoping that OBOE will offer me 
insights into this problem, or show me why it isn't a problem. Some 
disagree with me and assert that if a measuring device has units, the 
data expressed in those units is the only data. Everything else is an 
artifact. As a recovering algebraist, I'd really like to say that a 
picture is an equivalence class of all the things that can be losslessly 
converted to and from the one the capture device gave me. I don't know 
how to say that digitally.
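
One possible way to say it digitally, offered only as a sketch: pick a 
canonical form, decode every representation to it, and identify the 
Observation with (a hash of) that canonical form. Two representations 
are then in the same equivalence class exactly when their canonical 
forms agree. The two encodings below ("rows" and "rle") and the function 
names are hypothetical, invented for illustration; they are not real 
image formats and not part of OBOE or LSID.

```python
import hashlib
import json


def decode(encoding, payload):
    """Decode a payload to the canonical form: a tuple of tuples of pixel values.

    Both encodings are hypothetical: "rows" stores pixel rows directly,
    "rle" stores each row as [value, run_length] pairs.
    """
    if encoding == "rows":
        return tuple(tuple(row) for row in payload)
    if encoding == "rle":
        return tuple(
            tuple(value for value, run in row for _ in range(run))
            for row in payload
        )
    raise ValueError(f"unknown encoding: {encoding}")


def observation_id(encoding, payload):
    """Hash of the canonical form; equal for every member of one equivalence class."""
    canonical = json.dumps(decode(encoding, payload))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same 2x3 picture in two losslessly interconvertible encodings.
raw = [[0, 0, 7], [7, 7, 7]]
rle = [[[0, 2], [7, 1]], [[7, 3]]]
same = observation_id("rows", raw) == observation_id("rle", rle)
```

Here `same` is true because both payloads decode to the same pixels. 
The kilometers versus meters case is the same idea with "convert to a 
base unit" as the canonical form. The sketch says nothing about lossy 
conversions, which is where the real difficulty lies.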

Bob Morris

p.s. Computational neuroscience attempts to make similar models for all 
sensory (i.e. human observed) data. I don't know if there are presently 
any international standards for representations of smell, touch, taste, 
and sound as there are for vision, but I suppose this doesn't matter if 
one can appropriately define the model against which measurements are 
taken. Surely this is a standard problem for those who model any kind of 
Observation in finite-dimensional spaces of non-linear continuous functions.

=========================

Kennedy, Jessie wrote:
> Hi Folks
> 
>  
> 
> I've been following your recent discussion with interest (although not
> as fully as I'd like). As Bob Morris noted, TDWG are working on similar
> stuff. In addition to the specific SDD work on colours and descriptions
> I've been working with some members of the different TDWG subgroups to
> try and define a core ontology for TDWG. We started from the work on
> standards from the specimen based groups (Darwin Core and ABCD), the
> Taxon names and concepts group (TCS) and from the Descriptions group
> (SDD); in addition, the geospatial group with Read Beaman will be
> contributing. 
> 
>  
> 
> The interesting thing is that everyone is talking about observations,
> and of course there are many interpretations of what an observation
> means (as has also been exemplified on this list). We had events
> initially in our base ontology from which things like observation were
> subclassed but now we've gone for modelling the observation record
> rather than the event, in a similar way to what is being discussed here;
> however, we've not got into the measurements side as much as you have. If
> you're interested in where we've got to with our model (using UML for
> the time being for communicating our understanding on our wiki) please
> have a look at the TDWG ontology pages
> 
> http://wiki.tdwg.org/twiki/bin/view/TAG/TDWGOntology
> 
>  
> 
> We talked about observations as being a record of an identification to a
> taxon concept (possibly any concept) by someone at a given place and
> time  (and it may have an associated measurement like a count but we
> haven't put that in the core as we didn't think it was fundamental to an
> observation; maybe we should?). Observations may have associated
> gatherings (which can be independent from observation - again debatable
> by some) if a specimen is collected. The specimen may of course have
> descriptions taken about it (measurements of different kinds) and may be
> located in a collection somewhere. I was quite happy with the
> observation not being a thing (like a specimen) but you seem to be
> talking about taking measurements of an observation... is this a
> measurement of a specimen in the field that isn't collected as a
> voucher? Is it an aggregate measurement such as the average of a
> population in the field, or an average of say the length of the leaves
> on a specimen in the field? Do the differences between these things
> matter? I'd be interested to hear more about what you think an
> observation is...whatever I think we have to be clear what we mean
> because we could generalise it so much it could be anything we want it
> to be.
> 
>  
> 
> We've been trying to segment the existing standards as much as possible
> to try and determine the general classes being reused by the different
> schemas to offer as much reusability as possible. In trying to get
> agreement we quickly realised that as soon as we started putting
> properties on classes people would start to disagree with the definition
> of the class so we opted for putting a gloss on the class (this needs
> updated in the current version) and only defining possible relationships
> between classes (to give more sense to our meaning of the class).
> Which part of our user community we viewed things from would change
> whether or not a particular relationship was seen to be fundamental to
> the class. So we've kept it minimal, but
> multi-purpose I hope.
> 
>  
> 
> We're going to use the ontology in a mini project over the summer to try
> and convert an existing data provider's data to RDF using LSIDs to cross
> reference between the objects. We will extend the bdicore ontology to a
> domain ontology required for this purpose.
> 
>  
> 
> Re your discussion on quantitative and qualitative measurements, in our
> work on plant description ontology we had structure-property-value-unit
> for quantitative measurements and structure-property-state for
> qualitative descriptions. 
> 
> But we had other mechanisms for dealing with more complex 'measurements'
> such as ratios, e.g. leaf length to width.
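
A minimal sketch of the two description shapes mentioned above, 
structure-property-value-unit and structure-property-state, written as 
plain records. The class names, field names and the `ratio` helper are 
assumptions made for illustration; the message does not say how the 
plant description ontology actually handles ratios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantitative:
    """structure-property-value-unit, e.g. leaf / length / 12.0 / cm."""
    structure: str
    prop: str
    value: float
    unit: str


@dataclass(frozen=True)
class Qualitative:
    """structure-property-state, e.g. leaf / shape / ovate."""
    structure: str
    prop: str
    state: str


def ratio(numerator, denominator):
    """Hypothetical helper: derive a dimensionless ratio of two measurements."""
    if numerator.structure != denominator.structure:
        raise ValueError("ratio needs two measurements of the same structure")
    if numerator.unit != denominator.unit:
        raise ValueError("convert to a common unit before taking a ratio")
    name = f"{numerator.prop}/{denominator.prop}"
    # "1" is used here as the unit of a dimensionless quantity.
    return Quantitative(numerator.structure, name,
                        numerator.value / denominator.value, "1")


length = Quantitative("leaf", "length", 12.0, "cm")
width = Quantitative("leaf", "width", 4.0, "cm")
shape = Qualitative("leaf", "shape", "ovate")
length_to_width = ratio(length, width)
```

The ratio comes out as another quantitative record whose property name 
records what it was derived from, which is only one of several ways the 
"more complex" case could be represented.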
> 
>  
> 
> Well I'd love to talk more about this with you guys but have too many
> other things just now - most pressing is going off on holiday for a
> break :-)
> 
>  
> 
> Look forward to catching up with you when I get back...
> 
>  
> 
> Jessie
> 
>  
> 
> ________________________________
> 
> From: seek-kr-sms-bounces at ecoinformatics.org
> [mailto:seek-kr-sms-bounces at ecoinformatics.org] On Behalf Of Joshua
> Madin
> Sent: 21 June 2006 20:57
> To: seek-kr-sms at ecoinformatics.org
> Subject: Re: [seek-kr-sms] OBOE discussion: current version
> 
>  
> 
> Based on the comments this morning I have redrawn the core of oboe for
> discussion (attached pdf).  It seems to me that the unit-at-all-cost
> framework will greatly simplify what we are trying to deliver for
> improving data integration, but, as Ferdinando said, this framework will
> be hard to justify and may cause problems down the line.  In the
> attached ontology, I've tried to divide the different notions of
> "measurement".  All this does is restrict the properties that can be
> used on different types of observation.
> 
>  
> 
> I've also included "hasProcedure" as a property that acts on
> Observations.  This can also act on Measurements due to the subsumption
> hierarchy shown in Figure A.  I think that this is what Ferdinando
> meant, but I'm not sure.
> 
>  
> 
> Cheers.  BTW: I just received new emails from Ferdinando and Matt -- but
> I'll send this anyway.
> 
> Josh 
> 
> 
> 
> 
> 
>  > 1.      Observable is either Entity or Characteristic (at the
> moment).  Characteristic has only one subclass, Dimension, which defines
> the set of base quantities such as length, weight, etc. Dimension
> includes only things measured in quantities. Thus at the moment we are
> missing a specification for observations of such characteristics as color,
> smell, taste or anything which is measured on a qualitative scale.
> 
> 	This is a question that has come up a lot recently and really
> needs to be confronted with some good examples.  The idea was that
> nominal measurements would just be given unit "name" and a
> characteristic, such as "red".  This would mean having these
> characteristics in an extension ontology such as a "classification
> ontology" (which would plug into OBOE's characteristic).  
> 
> 	I don't think this is right. Simply, the values of that
> observation come from a finite set of color classes (or instances). Not
> a measurement, if we define measurement as comparison with a reference
> unit (meter of tree) using an abstract unit for the dimension (meter for
> length). It is a measurement if we define measurement to encompass
> assigning a class to an observable in a context as the result of
> measuring it. I'd rather call it a "Classification", a subclass of
> Observation and sibling of Measurement. And we could have "Ranking" as
> subclass of Classification, where classes must have an ordinal
> relationship. But stretching the definition to make it fit in the
> unit-at-all-costs framework and giving the characteristic the role of
> subsetting the value space doesn't sound right at all. This was the
> thought behind proposing an explicit value space. 
> 
> 	Ordinal measurements may not be as easy to deal with.  It might
> work in the same way as above, but use the unit "rank".  However, the
> ordinal ontology would need to contain constructs that deal with
> "direction" or "magnitude".  For example, "high" is distinct from and of
> greater magnitude than "low".  This ontology would have to be able to
> deal with arbitrary numbers of levels, similar to the way we dealt with
> Observation in OBOE for coping with experimental design.  The idea was
> to remove these kinds of things (i.e., characteristics) from the core
> ontology because the ways that people want to use them are so variable. 
> 
> 	 
> 
> 	Similar concerns, plus one: I don't think the ordinal
> relationship between classes such as {high, medium, low} has much of a
> chance to be captured in OWL. Nor do I think it should be, as you don't do
> much with it in workflows unless it's a real numeric scale (whose
> ordinal properties are also not expressed in OWL, so why bother?). If
> really necessary, we could make such classification hierarchies
> subclasses of "Rank" and use a numeric property for ordering such
> values, but all the logic necessary to do anything with it remains
> outside OBOE.
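
The hierarchy proposed in the quoted text (Classification as a subclass 
of Observation and sibling of Measurement, Ranking as a subclass of 
Classification with an explicit value space) can be sketched as ordinary 
classes. This is an illustration under assumptions, not OBOE itself: the 
field names are invented, and the ordering is carried by position in the 
value space, standing in for the "numeric property for ordering" 
mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A statement that an observable has been observed."""
    observable: str


@dataclass(frozen=True)
class Measurement(Observation):
    """Comparison with a reference unit."""
    value: float
    unit: str


@dataclass(frozen=True)
class Classification(Observation):
    """Assignment of one state from an explicit, finite value space."""
    state: str
    value_space: tuple

    def __post_init__(self):
        if self.state not in self.value_space:
            raise ValueError(f"{self.state!r} is not in the value space")


@dataclass(frozen=True)
class Ranking(Classification):
    """A Classification whose value space is ordered from lowest to highest."""

    def rank(self):
        # Position in the value space plays the role of the numeric ordering.
        return self.value_space.index(self.state)


levels = ("low", "medium", "high")
high = Ranking("canopy cover", "high", levels)
low = Ranking("canopy cover", "low", levels)
color = Classification("petal color", "red", ("red", "white", "yellow"))
```

Comparing `high.rank()` with `low.rank()` is logic that lives in the 
program rather than in the class definitions, which matches the point 
made above that the machinery for using the ordering stays outside the 
ontology.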
> 
> 	 
> 
> 	 
> 
> ----------------------------------------------------------
> 
> 
> 
> 	 
> 
> 	 
> 
> 	Our definition, if I remember correctly, was: Observation is a
> statement that an Observable has been observed. I think more than this
> is going to color OBOE with restrictions it does not need to have. By
> the way, we model the result of the observation, not the process of the
> observation, and the result is not an event. To annotate a dataset we
> don't need to know anything about the measurement except its results. 

-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466