[Fwd: Re: [seek-kr-sms] ontology/folder UI design in Kepler]

Tue Mar 1 09:36:11 PST 2005

Some comments:

Laura L. Downey wrote:
>>Shawn writes:
>>I think that this is (or at least was) exactly one of the "missions" in 
>>SEEK: to get scientists involved in creating and using *formal* ontologies.
> 
> 
> Using formal ontologies, yes.  I have definitely seen some excitement when
> semantic mediation has been talked about in a way that will make their jobs
> easier -- of finding other data sets they would not otherwise have found,
> when identifying actors that would be useful to them that they otherwise
> might not have identified etc.  And yes, creating the ontologies themselves
> too, because they know their domains better than we do, but formally
> specifying them so that machines can make use of them? I'm not so sure about
> that from what I've seen.  But again, remember I'm new to the project so
> bringing an outsider perspective and maybe one that needs to be more
> informed.

I think "formally specifying ontologies" is a loaded phrase ... it is 
being used to refer to the languages (such as OWL) and tools (such as 
Protege) that have known deficiencies not only for "domain scientists" 
but also in general for capturing knowledge. OWL is a W3C specification 
that is expressed in XML, which makes it overly verbose, and it is often 
misused. It is really just an interchange format, and not really a 
language unto itself (it's meant to encompass many languages so as to be 
a good middle-ground for tools that use disparate languages).  Protege 
is a tool that is still young and is just starting to be more widely 
used. It is, however, in many ways still designed for a very small, 
highly technical user group.

Ontology tools should be such that they present a sound and intuitive 
user model (i.e., the conceptual constructs used to create ontologies), 
shielding the user from the underlying interchange format. Most tools 
that are out there essentially present a low-level graphical version of 
the language, not of these higher-level conceptual constructs. A 
counterexample is CMAP; however, its model in my opinion is too 
unconstrained, and offers little support in terms of helping users to 
create well-designed and "consistent" ontologies.

I also think this notion that a domain scientist will "informally" 
construct an ontology and then pass it off to a "knowledge engineer" to 
"make it formal" is (a) not a scalable solution, (b) "passes the buck" 
to an unknown entity (i.e., the non-existent "knowledge engineers"), and 
(c) in general, is not always a sound approach.  (I'm not picking on you 
here, Laura -- these are just some observations; and I'm trying to 
stimulate some discussion as to what the approach should be for SEEK.)

I think in SEEK, this notion of a knowledge engineer has been used in 
place of providing useful tools to our users.  I think if anything, the 
"knowledge engineer" should be built into the tool -- which is starting 
to emerge in some other tools, including Protege.

I think that the challenge in defining a "formal ontology" for a 
particular domain is that as a user: (1) you need to have a clear 
understanding of the domain, the concepts, their definitions (very 
challenging in general), and (2) you need to understand how to represent 
this information in the knowledge representation language/tool.  If a 
domain scientist gives the knowledge engineer the first item (1), then 
the scientist could have just as well input the information in a 
well-designed ontology tool. If the scientist instead gives a vague and 
imprecise description of (1), then the knowledge engineer has no chance 
of doing (2).  My argument is that to "create ways for regular users to 
provide the appropriate input to the knowledge engineers so that items 
are formally specified" essentially means that the "regular users" have 
already specified the ontology -- and they don't need the KE (of course 
this could be an iterative process, where the KE "holds the hand" of the 
scientist through the process -- which is again not going to scale and 
is probably not that practical).

Of course, not only do we want to make (2) easy, we also want tools to 
help scientists/users get to (1). I think there are lots of ways to help 
users get to (1), e.g., by:

- describing a process/methodology, as in object-oriented analysis and 
design, that can help one go from a fuzzy conceptualization to a clearer 
model (we want to target scientists, however, instead of software 
designers/developers)

- providing tools to help people "sketch" out their ideas before 
committing to an ontology language (but make it explicit that they are 
doing the "sketch" as part of a process) ... e.g., by allowing some 
free-text definitions mixed with class and property defs, etc. 
Essentially, provide a tool that helps someone go from informal/unclear 
to formal/clear.

- adopting some known approaches for "cleaning up" an ontology (similar 
to OntoClean, e.g.)

- providing tools that can identify inconsistencies and possible 
"pitfalls" in the ontology (useful for getting to a clearer, more formal 
model)

- providing lots of examples of "well-defined" ontologies

- letting people edit and reuse existing well-formed ontologies (in 
fact, I think that once we have a basic framework, this will be the 
typical model of interaction for many scientists ...  )
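
To make the "identify inconsistencies and pitfalls" item above a bit more 
concrete, here is a minimal sketch of the kind of check a tool could run 
behind the scenes. Everything in it is an assumption for illustration: 
the function names, the dictionary-based representation of is-a links, 
and the example classes are hypothetical and are not taken from SEEK, 
Protege, or any existing tool. It only looks for two simple problems: 
is-a cycles, and a class placed under two classes declared disjoint.

```python
from itertools import combinations

def ancestors(cls, parents):
    """Return every class reachable from cls by following is-a links."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen

def find_pitfalls(parents, disjoint):
    """Report is-a cycles and classes that fall under two disjoint classes."""
    problems = []
    for cls in sorted(parents):
        up = ancestors(cls, parents)
        if cls in up:
            problems.append(f"cycle: {cls} is its own ancestor")
        kinds = up | {cls}
        for a, b in combinations(sorted(kinds), 2):
            if frozenset((a, b)) in disjoint:
                problems.append(f"unsatisfiable: {cls} is both {a} and {b}")
    return problems

# Hypothetical sketch of an ontology: child class -> set of parent classes.
parents = {
    "Plant": {"Organism"},
    "Animal": {"Organism"},
    "Coral": {"Animal", "Plant"},  # a deliberate modelling slip
}
disjoint = {frozenset(("Animal", "Plant"))}

report = find_pitfalls(parents, disjoint)
for line in report:
    print(line)
```

A real tool would lean on a proper reasoner rather than hand-written 
checks like these; the point is only that the user sees a plain-language 
warning about their model, not the underlying logic.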

In terms of "machine understandable ontologies", this really just means 
that the ontology is captured in one of these ontology languages, like 
OWL.  It doesn't mean that a scientist should have to literally put 
their ontology into this language -- that is the job of the tool. Our 
goal should be to help users specify ontologies using "structured" 
approaches.  That is, essentially in restricted languages that are not 
as ambiguous and not as unconstrained as natural language -- which is 
typically done using graphical tools (box and line diagrams).  Also, the 
user should be completely unaware that their definitions are being 
stored in these low-level languages; the existing tools expose those 
languages directly, which is why they fail for domain scientists / 
non-computer-science folks.
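
As a rough illustration of "that is the job of the tool", the sketch 
below takes the kind of input a box-and-line editor might collect (a 
list of class names and a list of is-a arrows) and writes out OWL in 
RDF/XML using only the Python standard library. The function name, the 
input shape, and the http://example.org/sketch# namespace are all 
hypothetical; this is not how any particular tool does it, just a 
minimal example of keeping the interchange format out of the user's 
sight.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/sketch#"  # hypothetical namespace for the example

# Ask ElementTree to use the conventional prefixes when serializing.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)

def to_owl(boxes, lines):
    """boxes: class names; lines: (child, parent) is-a arrows."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for name in boxes:
        cls = ET.SubElement(
            root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + name}
        )
        for child, parent in lines:
            if child == name:
                ET.SubElement(
                    cls,
                    f"{{{RDFS}}}subClassOf",
                    {f"{{{RDF}}}resource": BASE + parent},
                )
    return ET.tostring(root, encoding="unicode")

# What the user drew: two boxes and one arrow from Plant to Organism.
owl_xml = to_owl(["Organism", "Plant"], [("Plant", "Organism")])
print(owl_xml)
```

The scientist only ever deals with the boxes and arrows; the verbose XML 
is produced, stored, and exchanged by the tool.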

> Is the goal here to figure out a way to allow scientists with no formal
> ontology experience to easily specify formal ontologies in a way that
> machines can make use of them?  That seems like a daunting task to me -- and
> one that would require considerable time and resources.  Didn't I just read
> from Mark (in the IRC convo) that the knowledge engineers themselves have
> trouble with their own tools like Protégé?  Creating and specifying formal
> ontologies is a complex and challenging job even for those trained in it.
> 
> I agree that scientists understand their domains better than others, but
> that doesn't mean they understand how to formally represent that domain in a
> way that can be utilized by a machine.  They use their own experience,
> intuition, and knowledge to create ontologies.  They make decisions and
> understand possible exceptions.  But that is a different task than formally
> specifying that ontology to a rigid set of rules that can be utilized via
> machine processing.  I'm thinking that is still a task to be done by a
> trained knowledge engineer.
> 
> And if we create ways for regular users to provide the appropriate input to
> the knowledge engineers so that items are formally specified in such a way
> that the system can make use of them to the benefit of the regular users, I
> would see that as a definite win and demonstration of the power of semantic
> mediation to make scientists' jobs easier.
> 
> Laura L. Downey
> Senior Usability Engineer
> LTER Network Office
> Department of Biology, MSC03 2020
> 1 University of New Mexico
> Albuquerque, NM  87131-0001
> 505.277.3157 phone
> 505.277-2541 fax
> ldowney at lternet.edu