[seek-kr-sms] Notes from SWDB @ VLDB
Shawn Bowers
bowers at sdsc.edu
Tue Aug 31 08:17:05 PDT 2004
Hi all,
Here are some notes I took from the SWDB workshop. VLDB has started
today, so I will send notes from that as well...
I presented our paper on Sunday. It went well, and I talked with a few
people who were generally interested. I have also been pushing Kepler
for you Kepler folks.
(Note that the notes are only for talks, papers that seem somewhat
interesting/related.)
Shawn
----------
SWDB'04 NOTES:
--------------
* HCOME: A tool-supported methodology for engineering living ontologies
---------------------------------------------------------------------
Konstantinos Kotis
Pg. 147
A "personal space" tool; maybe something for UI for ontology building
* Data Semantics Revisited
------------------------
Keynote: John Mylopoulos and Alex Borgida
J. Mylopoulos starts:
=====================
"World semantics": Trying to capture the relation between the model
(information source) and the real-world things that are being
modeled by that source.
Semantic data models: Jean-Robert Abrial (workshop in Corsica);
Bracchi, Paolini, Pelagatti; Haniaut and Pirotte; Schmid and
Swenson; in 1975 Chen, Navathe, Roussopoulos and Mylopoulos
Role of semantic data models: Part of the DBMS technology (semantic
DBMSs); Used during/for design (part of design process); part of the
user interface to a database.
How does one use a db where semantics has been factored out? Rely on
a stable env. of users and app programs to know the
semantics. Downside: Legacy data! Hard to maintain, share, etc.
Factoring out semantics is bad in open, changing
environments (like the web)
Semantic Web: Sycara, ODBASE'03. Hypertext data are designed for
human consumption. Machine processable web data. Layered cake.
Data semantics take 2: Formal ontology. Web-page
annotations. Annotations used by browsers/search engines/e-service
composition.
Lots of work on the expressive languages. Not much on the annotation
and use (applications).
Some concerns: Hard to use technologies for computationally
demanding tasks, e.g., theorem provers, model checkers, deductive
databases, .... Scalability?? Practitioners find it hard to use
logical formal languages, e.g., Z, Datalog, ....
We have to carefully blend technologies with methodologies.
A. Borgida ends:
================
Towards other visions of data semantics (alter sem web vision)
What does data semantics mean? New angles on the problem.
The mapping continuum and semantic encapsulation. Intentional
aspects of data semantics.
mapping continuum and semantic encapsulation. Peter Ladkin 1997
(what is modeling in general?). A model is of a subject and built for
some purpose (implicit, but important to keep track of). The purpose
is often for answering questions of the model, so you don't have to
ask them of the subject. M is a model of subject S for a purpose P.
*** This paper sounds very interesting! Must get.
We need: methods for building and changing the model; asking and
answering questions in the model; a mapping to help translate
applicable questions about the subject matter into questions about
the model; a way to translate the results of querying the model into
answers about the subject. University database to model enrollment.
An interesting phenomenon: the model becomes the reality (you aren't
an employee if you aren't in the db).
Typology of models: E-models (extensional), set-theoretic,
relational e.g.; I-models (intensional), based on an entailment rel,
e.g., ontos, schemas, but also equations; C-models (computational),
query answering by running software, a simulation program (the
queries mean the result of running the tool; like OWL-DL parsers
... the language is defined by the parsers)
Terminology/intension (schema), Assertion/extension (specific
individuals)
Typology of subjects: Physical reality (tricky to define, see
philosophy); human's perception of reality (better); Another
Model!!! A database as a model of the conceptual model or ontology
(of some domain). This makes it possible to make precise the
mapping between model and subject.
Study of mappings. Query languages are usually infinite. So mappings
specified compositionally, at schema level. Form of mapping
specifications (correspondences, GAV/LAV/GLAV) involving queries over
the subject and the model. Correspondences between
individuals. Translating the queries to answer them, via the
mapping.
A correspondence continuum (B.C. Smith 87). Consider: a photo of a
landscape is a model of the landscape (its subject matter);
photocopy of the photo is a model of a model of the landscape; a
digitization of the photocopy, etc., etc. Mappings of mappings of
mappings, ...
Mapping graphs. the graph associated with each mapping continuum is
acyclic and has one or more "roots".
The complete meaning of data in a model includes "composition" of
the mappings to the subject.
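
A minimal sketch of the mapping-graph idea, using the photo example from the talk. All names here (MODELS_OF, EDGE_MAPS, the pixel/dot/grain items) are made up for illustration and do not come from the talk or any paper; it only shows an acyclic graph with a root, and the meaning of an item obtained by composing the mappings down to the subject.

```python
# Hypothetical mapping continuum: each model points at what it models.
MODELS_OF = {
    "digitization": ["photocopy"],
    "photocopy": ["photo"],
    "photo": ["landscape"],
    "landscape": [],  # a root: the ultimate subject matter
}

# Hypothetical item-level correspondences attached to each mapping edge.
EDGE_MAPS = {
    ("digitization", "photocopy"): {"pixel_17": "dot_17"},
    ("photocopy", "photo"): {"dot_17": "grain_17"},
    ("photo", "landscape"): {"grain_17": "oak tree"},
}


def roots(graph):
    """Nodes that are not models of anything else."""
    return sorted(node for node, subjects in graph.items() if not subjects)


def chains_to_roots(graph, node, seen=()):
    """All mapping paths from node to a root; rejects cyclic graphs."""
    if node in seen:
        raise ValueError("mapping graph must be acyclic")
    if not graph[node]:
        return [[node]]
    return [
        [node] + rest
        for subject in graph[node]
        for rest in chains_to_roots(graph, subject, seen + (node,))
    ]


def compose_along(chain, edge_maps, item):
    """Follow one item through every mapping on the chain."""
    for model, subject in zip(chain, chain[1:]):
        item = edge_maps[(model, subject)][item]
    return item


chain = chains_to_roots(MODELS_OF, "digitization")[0]
meaning = compose_along(chain, EDGE_MAPS, "pixel_17")
```

Here the "complete meaning" of pixel_17 in the digitization is whatever it corresponds to in the landscape after all three mappings are composed.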
Related work: data integration; ontology integration; model
management; peer data management; data provenance. The novelty here
is the emphasis of the semantic side, as opposed to the subject
side (?)
Whence mappings: Lineal mappings should be saved during
design. Other mappings derived.
View: Mappings between models. Mappings are easier to
formalize/discover than the concepts. Instead of annotating things
individually, define mappings and infer the semantics ... ?
An intensional dimension of data semantics. Traditionally data
semantics deals with "what (when)". You really need to understand
"how" and "why". (How is the object used? Why was the data
gathered?)
Answers and intensions. Tropos +i*. Highly speculative. Actors,
goals, and softgoals. Actors like Admin, Planning. Goal like
determine incoming. Goals have clear success criteria. Softgoals
aren't so, e.g., Maximize, Accurate (Determine Size). The design of
goals is determined/explained by the softgoals. You can state how
certain goals either positively/negatively contribute to softgoals.
FormalTropos is a temporal logic language for defining this stuff, a
formalized version of the diagrams shown in the
presentation. Related work: Hippocratic databases [Agrawal02], why
data provenance [Buneman], data semantics in systems involving
workflows and processes.
Conclusions (J. Mylopoulos): Data semantics will remain a core
problem for databases with/without web technologies. Current semweb
research addresses this with emph. on formal reasoning and
expressiveness. Models and mappings critical research. Ultimately,
the meaning of data needs to be tied down to the intentions of its
designers and users.
questions
=========
Val Tannen: Semantics is a religion. There is a continuum: what is
semantic enough to be called semantic and what is not. It is in the
eye of the beholder (i.e., the user). Claims that too much work is
focused on the complexity (the religion). The mathematics should be
the real religion: preciseness, derive algorithms, etc. Claims Clio
is a good example of getting from the religion.
* DOGMA Framework
LinkBase, a huge medical ontology for drugs
An associated database, National Drug Code Directory
RIDL (1979): constraint and conceptual update/query part ...
* Context Mediation in the Semantic Web (COIN paper)
--------------------------------------------------
Stuart Madnick
COIN: Focus on resolving semantic conflicts among heter. data
sources
SEMWEB: Focus on making web semantically clear
COIN: system for semantic interop. among heter. sources, COINL based
on FOL/Prolog, to model application ontology and context modifiers.
Context Interchange Architecture (very cool)
- Every source has a "Source Context"
- Shared Ontologies (e.g., Meters and Feet are Lengths)
- Receiver Context (assuming length is in Feet)
- Conversion Libraries (meters to feet)
- Context Mediator (mediates conversion libs, shared ontos, source
context, receiver context) to do context transformation from
Source to Receiver.
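
A rough sketch of the architecture above, just to fix the idea. This is not COIN's API (COINL is FOL/Prolog-based); the function and dictionary names are my own, and the single "length_unit" context entry is an assumed simplification.

```python
# Shared ontology: both units are kinds of Length (as in the example above).
SHARED_ONTOLOGY = {"meters": "Length", "feet": "Length"}

# Conversion library, keyed by (source unit, receiver unit).
CONVERSIONS = {
    ("meters", "feet"): lambda value: value / 0.3048,
    ("feet", "meters"): lambda value: value * 0.3048,
}


def mediate(value, source_context, receiver_context):
    """Transform a value from the source's context into the receiver's."""
    src = source_context["length_unit"]
    dst = receiver_context["length_unit"]
    # The shared ontology tells the mediator the two units are comparable.
    if SHARED_ONTOLOGY[src] != SHARED_ONTOLOGY[dst]:
        raise ValueError(f"cannot mediate between {src} and {dst}")
    if src == dst:
        return value  # contexts agree, no conversion needed
    return CONVERSIONS[(src, dst)](value)


source_context = {"length_unit": "meters"}
receiver_context = {"length_unit": "feet"}  # receiver assumes feet
in_feet = mediate(3.048, source_context, receiver_context)
```

The point is that neither the source nor the receiver changes; only the mediator knows about both contexts and picks the conversion.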
Two sides of COIN:
- OWL as COIN's application ontology representation
- COIN as 'meta-ontology' for OWL ontology interoperability
- RuleML for specifying transforms
Design Approach
- Preserve constraint programming engine in the eCOIN prototype
- 3-tier approach: ECOIN unchanged, ontologies in OWL, converted
to internal form
Not available in OWL: COIN modifiers (special type of attribute,
like it has currency, but any currency is okay)
There is an internal working report on this stuff too.
This is written in Prolog!!!
Really need to look at this stuff.
* Interesting Discussion with Alex Borgida and Stuart Madnick about
data conversion
It was mentioned that one can use concrete domains, n-ary predicates
that are defined essentially outside the reasoner, so that the
reasoner (such as FaCT) can "hand out" the reasoning task to handle
the case. This is useful if, e.g., you want to use the reasoner to
determine whether you need to do a transformation. Sounds like this
stuff has been worked out in the literature; but is an interesting
idea.
* Kenneth Ross paper on Faceted Databases
---------------------------------------
Faceted Hierarchies: entities in multiple classes. Invented by a
Librarian in 20's. Entities can have attributes:
Entity [ID]
hasType [type]
Context (type=context)
Object (type=object) [category,location]
Pot (category=pot) [capacity]
...
Searching faceted databases
Specify criteria from a variety of dimensions
E-commerce: search desired values for one of color, price, size, etc.
Answer set shrinks and can then be further searched
Flamenco: database of images, classified in many dimensions
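
A small sketch of the search style described above: pick a value along one dimension, the answer set shrinks, then search further. The catalog, facet names, and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical e-commerce catalog; each entity is classified along
# several independent dimensions (facets).
CATALOG = [
    {"id": 1, "color": "red", "size": "M", "price_band": "low"},
    {"id": 2, "color": "red", "size": "L", "price_band": "high"},
    {"id": 3, "color": "blue", "size": "M", "price_band": "low"},
    {"id": 4, "color": "blue", "size": "S", "price_band": "low"},
]


def refine(entities, facet, value):
    """Keep only the entities with the given value for the facet."""
    return [e for e in entities if e.get(facet) == value]


def facet_counts(entities, facet):
    """How many remaining entities fall under each value of a facet."""
    return Counter(e[facet] for e in entities if facet in e)


step1 = refine(CATALOG, "price_band", "low")  # answer set shrinks
step2 = refine(step1, "size", "M")            # and is searched further
remaining_colors = facet_counts(step1, "color")
```

The counts are what a UI would show next to each facet value so the user can see how a further choice narrows the result.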
Querying faceted databases
Design query lang. to allow more complex queries
Preserve "set of entities" abstraction
Compositionality
E.g., no joins
Low data complexity
Conceptually simple
Implementation
Trying to have a complete algebra: given a set of entities; return
a set of entities
Entity algebra
Operators: Selection, Union, Diff, Intersection, Semijoin
Entity sets may be heterogeneous: which attributes are avail?
Attributes in Entity Algebra
Compose queries one op at a time, using class and/or past query
results
At each step, the system determines which atts are available
Users do not have to figure this out themselves
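
A sketch of how I understand the entity algebra, to be checked against the paper. Every operator takes sets of entities and returns a set of entities. The representation (dicts keyed by ID) and the rule that the "available" attributes are those carried by every entity in the set are my assumptions, not the paper's definitions.

```python
def entity_set(*entities):
    """Index entities by ID so a set never holds duplicates."""
    return {e["ID"]: e for e in entities}


def select(es, attr, value):
    return {i: e for i, e in es.items() if e.get(attr) == value}


def union(a, b):
    return {**a, **b}


def diff(a, b):
    return {i: e for i, e in a.items() if i not in b}


def intersection(a, b):
    return {i: e for i, e in a.items() if i in b}


def semijoin(left, right, left_attr, right_attr):
    """Entities of left matching some entity of right; no new tuples."""
    right_values = {e[right_attr] for e in right.values() if right_attr in e}
    return {i: e for i, e in left.items() if e.get(left_attr) in right_values}


def available_attributes(es):
    """Assumption: an attribute is available if every entity has it."""
    keys = [set(e) for e in es.values()]
    return set.intersection(*keys) if keys else set()


contexts = entity_set({"ID": "c1", "type": "context"})
objects = entity_set(
    {"ID": "o1", "type": "object", "category": "pot",
     "location": "c1", "capacity": 2},
    {"ID": "o2", "type": "object", "category": "bead", "location": "c9"},
)

pots = select(objects, "category", "pot")
mixed = union(objects, contexts)  # heterogeneous entity set
located = semijoin(objects, contexts, "location", "ID")
```

With this data, capacity is available on pots but not on objects in general, and the mixed set only offers ID and type. The semijoin filters objects by their context without producing joined tuples, which keeps the "set of entities" abstraction.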
Queries with select, union, intersect, are sound and complete.
push selects to classes
... need to look at the paper ...
decidable constraint language (e.g., constraint=constant, but can
expand and does in paper)
Used in an NSF-sponsored Archeological Project and a Human Anatomy project
Same infrastructure for both projects
Presentation language separate from the query language
You describe what you want to display of an attribute set