[seek-kr-sms] Notes from SWDB @ VLDB

Shawn Bowers bowers at sdsc.edu
Tue Aug 31 08:17:05 PDT 2004


Hi all,

Here are some notes I took from the SWDB workshop.  VLDB has started
today, so I will send notes from that as well...

I presented our paper on Sunday. It went well, and I talked with a few
people who were generally interested.  I have also been pushing Kepler
for you Kepler folks.

(Note that the notes cover only the talks and papers that seemed
somewhat interesting/related.)

Shawn


----------


SWDB'04 NOTES:
--------------

* HCOME: A tool-supported methodology for engineering living ontologies
  ---------------------------------------------------------------------
  Konstantinos Kotis
  Pg. 147

  A "personal space" tool; maybe something for UI for ontology building


* Data Semantics Revisited
  ------------------------
  Keynote: John Mylopoulos and Alex Borgida

  J. Mylopoulos starts:
  =====================

  "World semantics": Trying to capture the relation between the model
  (information source) and the real-world things that are being
  modeled by that source.

  Semantic data models: Jean-Robert Abrial (workshop in Corsica);
  Bracchi, Paolini, Pelagatti; Haniaut and Pirotte; Schmid and
  Swenson; in 1975 Chen, Navathe, Roussopoulos and Mylopoulos

  Role of semantic data models: Part of the DBMS technology (semantic
  DBMSs); Used during/for design (part of design process); part of the
  user interface to a database.

  How does one use a db where semantics has been factored out? Rely on
  a stable env. of users and app programs to know the
  semantics. Downside: Legacy data! Hard to maintain, share, etc.

	     Factoring out semantics is bad in open, changing
	     environments (like the web)

  Semantic Web: Sycara, ODBASE'03. Hypertext data are designed for
  human consumption. Machine processable web data. Layered cake.

  Data semantics take 2: Formal ontology. Web-page
  annotations. Annotations used by browsers/search engines/e-service
  composition.

  Lots of work on the expressive languages. Not much on the annotation
  and use (applications).

  Some concerns: Hard to use technologies for computationally
  demanding tasks, e.g., theorem provers, model checkers, deductive
  databases, .... Scalability?? Practitioners find it hard to use
  logical formal languages, e.g., Z, Datalog, ....

  We have to carefully blend technologies with methodologies.

  A. Borgida ends:
  ================

  Towards other visions of data semantics (alter sem web vision)

  What does data semantics mean? New angles on the problem.

  The mapping continuum and semantic encapsulation. Intentional
  aspects of data semantics.

  Mapping continuum and semantic encapsulation. Peter Ladkin 1997
  (what is modeling in general?). A model is of a subject and built for
  some purpose (implicit, but important to keep track of). The purpose
  is often for answering questions of the model, so you don't have to
  ask them of the subject. M is a model of subject S for a purpose P.

     *** This paper sounds very interesting! Must get.

  We need: methods for building and changing the model; ways of asking
  and answering questions in the model; a mapping to help translate
  applicable questions about the subject matter into questions about
  the model; and a way to translate results of the query to the model
  into answers about the subject. Example: a university database to
  model enrollment. An interesting phenomenon: the model becomes the
  reality (you aren't an employee if you aren't in the db).
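
  (Sketch, not from the talk: a minimal Python illustration of the
  four ingredients above, using the university enrollment example.
  The data, the question vocabulary, and all function names are made
  up for illustration.)

```python
# Hypothetical sketch: a tiny "university database" acting as a model M
# of the subject S (actual enrollment) for the purpose P of answering
# enrollment questions. All names and data here are invented.

# The model: an extensional set of (student, course) facts.
ENROLLED = {("ann", "db101"), ("bob", "db101"), ("ann", "ai200")}


def translate_question(question):
    """Map a question about the subject into a query over the model."""
    kind, arg = question
    if kind == "who takes":
        return lambda facts: sorted(s for (s, c) in facts if c == arg)
    if kind == "what does X take":
        return lambda facts: sorted(c for (s, c) in facts if s == arg)
    raise ValueError(f"question not applicable to this model: {kind!r}")


def translate_answer(question, result):
    """Map a query result back into a statement about the subject."""
    kind, arg = question
    listing = ", ".join(result) or "none"
    if kind == "who takes":
        return f"Students taking {arg}: {listing}"
    return f"Courses taken by {arg}: {listing}"


def ask(question, facts=ENROLLED):
    """Question about the subject -> model query -> answer about the subject."""
    query = translate_question(question)
    return translate_answer(question, query(facts))


print(ask(("who takes", "db101")))  # prints "Students taking db101: ann, bob"
```

  Questions outside the model's purpose (anything translate_question
  does not recognize) are rejected rather than answered, which is one
  way to keep the purpose P explicit.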

  Typology of models: E-models (extensional), set-theoretic,
  relational e.g.; I-models (intensional), based on an entailment rel,
  e.g., ontos, schemas, but also equations; C-models (computational),
  query answering by running software, a simulation program (the
  queries mean the result of running the tool; like OWL-DL parsers
  ... the language is defined by the parsers)

  Terminology/intension (schema), Assertion/extension (specific
  individuals)

  Typology of subjects: Physical reality (tricky to define, see
  philosophy); human's perception of reality (better); Another
  Model!!! A database as a model of the conceptual model or ontology
  (of some domain). This makes it possible to make precise the
  mapping between model and subject.

  Study of mappings. Query languages are usually infinite. So mappings
  are specified compositionally, at schema level. Form of mapping
  specifications (correspondences, GAV/LAV/GLAV) involving queries over
  the subject and the model. Correspondences between
  individuals. Translating the queries to answer them, via the
  mapping.
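
  (Sketch, not from the talk: a minimal Python illustration of a GAV
  ("global as view") style mapping, where each global relation is
  defined as a query over the sources and a global query is answered
  by unfolding that definition. Source names, schemas, and data are
  invented.)

```python
# Hypothetical GAV sketch: each relation of the global schema is
# defined as a view (a query) over the source relations.
# Source names, schemas, and data are invented for illustration.

SOURCES = {
    "registrar": [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"}],
    "library": [{"card": 7, "holder": "Cy"}, {"card": 8, "holder": "Ann"}],
}

# Schema-level mapping: global relation name -> view over the sources.
GAV = {
    "person": lambda src: (
        {r["name"] for r in src["registrar"]}
        | {r["holder"] for r in src["library"]}
    ),
}


def answer(global_relation, predicate=lambda x: True, sources=SOURCES):
    """Answer a query over the global schema by unfolding the view."""
    view = GAV[global_relation]
    return sorted(x for x in view(sources) if predicate(x))


print(answer("person"))  # prints ['Ann', 'Bob', 'Cy']
```

  The mapping is written once at the schema level, so any selection
  over "person" can be answered without per-query mapping work.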

  A correspondence continuum (B.C. Smith 87). Consider: a photo of a
  landscape is a model of the landscape (its subject matter);
  photocopy of the photo is a model of a model of the landscape; a
  digitization of the photocopy, etc., etc. Mappings of mappings of
  mappings, ...

  Mapping graphs. the graph associated with each mapping continuum is
  acyclic and has one or more "roots".

  The complete meaning of data in a model includes "composition" of
  the mappings to the subject.
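
  (Sketch, not from the talk: a minimal Python illustration of a
  mapping graph for the photo example, with root detection, a cycle
  check, and composition of mappings down to the root subject. It
  assumes each model has exactly one subject, which is simpler than
  the general graph; the string-rewriting "mappings" are toy
  stand-ins.)

```python
# Hypothetical sketch of a mapping continuum as a graph. An entry
# model -> (subject, fn) records "model is a model of subject" and
# carries a toy mapping from model descriptions to subject descriptions.
# Simplifying assumption: every model has exactly one subject.

MAPPINGS = {
    "photo": ("landscape", lambda d: d.replace("photo of ", "")),
    "photocopy": ("photo", lambda d: d.replace("copy of ", "")),
    "scan": ("photocopy", lambda d: d.replace("scan of ", "")),
}


def roots(mappings):
    """Nodes that are subjects but never models: the ultimate subjects."""
    subjects = {subject for subject, _ in mappings.values()}
    return subjects - set(mappings)


def compose_to_root(model, mappings):
    """Follow the chain from `model` to its root, failing on a cycle."""
    chain, seen = [], set()
    node = model
    while node in mappings:
        if node in seen:
            raise ValueError(f"cycle through {node!r}: not a mapping continuum")
        seen.add(node)
        subject, fn = mappings[node]
        chain.append(fn)
        node = subject

    def composed(value):
        # Apply the mappings in order, from the model toward the root.
        for fn in chain:
            value = fn(value)
        return value

    return node, composed


root, meaning = compose_to_root("scan", MAPPINGS)
print(root, "|", meaning("scan of copy of photo of a valley"))
# prints "landscape | a valley"
```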

  Related work: data integration; ontology integration; model
  management; peer data management; data provenance. The novelty here
  is the emphasis of the semantic side, as opposed to the subject
  side (?)

  Whence mappings: Lineal mappings should be saved during
  design. Other mappings derived.

  View: Mappings between models. Mappings are easier to
  formalize/discover than the concepts. Instead of annotating things
  individually, define mappings and infer the semantics ... ?

  An intentional dimension of data semantics. Traditionally data
  semantics deals with "what (when)". You really need to understand
  "how" and "why". (How is the object used? Why was the data
  gathered?)

  Answers and intentions. Tropos + i*. Highly speculative. Actors,
  goals, and softgoals. Actors like Admin, Planning. Goals like
  determine incoming. Goals have clear success criteria. Softgoals
  aren't so, e.g., Maximize, Accurate (Determine Size). The design of
  goals is determined/explained by the softgoals. You can state how
  certain goals either positively/negatively contribute to softgoals.
  FormalTropos is a temporal logic language for defining this stuff, a
  formalized version of the diagrams shown in the
  presentation. Related work: Hippocratic databases [Agrawal02], why
  data provenance [Buneman], data semantics in systems involving
  workflows and processes.

  Conclusions (J. Mylopoulos): Data semantics will remain a core
  problem for databases with/without web technologies. Current semweb
  research addresses this with emph. on formal reasoning and
  expressiveness. Models and mappings are critical research areas. Ultimately,
  the meaning of data needs to be tied down to the intentions of its
  designers and users.

  questions
  =========

  Val Tannen: Semantics is a religion. There is a continuum: what is
  semantic enough to be called semantic and what is not. It is in the
  eye of the beholder (i.e., the user). Claims that too much work is
  focusing on the complexity (the religion). The mathematics should be
  the real religion: preciseness, derive algorithms, etc. Claims Clio
  is a good example of getting away from the religion.


* DOGMA Framework
  ---------------

  LinkBase, a huge medical ontology for drugs
  An associated database, National Drug Code Directory
  RIDL (1979): constraint and conceptual update/query part ...


* Context Mediation in the Semantic Web (COIN paper)
  --------------------------------------------------
  Stuart Madnick

  COIN: Focus on resolving semantic conflicts among heter. data
  sources

  SEMWEB: Focus on making web semantically clear

  COIN: system for semantic interop. among heter. sources, COINL based
  on FOL/Prolog, to model application ontology and context modifiers.

  Context Interchange Architecture (very cool)

    - Every source has a "Source Context"

    - Shared Ontologies (e.g., Meters and Feet are Lengths)

    - Receiver Context (assuming length is in Feet)

    - Conversion Libraries  (meters to feet)

    - Context Mediator (mediates conversion libs, shared ontos, source
      context, receiver context) to do context transformation from
      Source to Receiver.
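
  (Sketch, not from the talk: a minimal Python illustration of the
  five pieces listed above, using the meters/feet example. This is
  not COIN's actual API or COINL; the dictionaries, the function name
  mediate, and the "height" attribute are assumptions.)

```python
# Hypothetical sketch of context mediation in the spirit of the
# architecture noted above. Not COIN's real API: all names, contexts,
# and the conversion table are assumptions for illustration.

# Shared ontology: which units belong to which semantic type.
ONTOLOGY = {"meters": "Length", "feet": "Length", "usd": "Money"}

# Conversion library: (from_unit, to_unit) -> conversion function.
# One foot is exactly 0.3048 meters.
CONVERSIONS = {
    ("meters", "feet"): lambda v: v / 0.3048,
    ("feet", "meters"): lambda v: v * 0.3048,
}


def mediate(value, attribute, source_context, receiver_context):
    """Context mediator: convert `value` from source to receiver units."""
    src_unit = source_context[attribute]
    dst_unit = receiver_context[attribute]
    if src_unit == dst_unit:
        return value  # contexts agree, nothing to do
    if ONTOLOGY[src_unit] != ONTOLOGY[dst_unit]:
        raise ValueError(f"{src_unit} and {dst_unit} are not comparable")
    return CONVERSIONS[(src_unit, dst_unit)](value)


source_ctx = {"height": "meters"}   # source context: heights in meters
receiver_ctx = {"height": "feet"}   # receiver context: assumes feet

print(round(mediate(100.0, "height", source_ctx, receiver_ctx), 2))
# prints 328.08
```

  The receiver never names a conversion; the mediator picks it from
  the two contexts, which is the part that looked appealing.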

  Two sides of COIN:

    - OWL as COIN's application ontology representation
    - COIN as 'meta-ontology' for OWL ontology interoperability
    - RuleML for specifying transforms

  Design Approach

    - Preserve constraint programming engine in the eCOIN prototype
    - 3-tier approach: eCOIN unchanged, ontologies in OWL, converted
      to internal form

  Not available in OWL: COIN modifiers (special type of attribute,
  like it has currency, but any currency is okay)

  There is an internal working report on this stuff too.

  This is written in Prolog!!!

  Really need to look at this stuff.

* Interesting Discussion with Alex Borgida and Stuart Madnick about
data conversion

  It was mentioned that one can use concrete domains, n-ary predicates
  that are defined essentially outside the reasoner, so that the
  reasoner (such as FaCT) can "hand out" the reasoning task to handle
  the case. This is useful if, e.g., you want to use the reasoner to
  determine whether you need to do a transformation. Sounds like this
  stuff has been worked out in the literature; but is an interesting
  idea.


* Kenneth Ross paper on Faceted Databases
  ---------------------------------------

  Faceted Hierarchies: entities in multiple classes. Invented by a
  Librarian in 20's. Entities can have attributes:

  Entity [ID]
    hasType [type]
      Context (type=context)
      Object (type=object) [category,location]
        Pot (category=pot) [capacity]
      ...

  Searching faceted databases
    Specify criteria from a variety of dimensions
    E-commerce: search desired values for one of color, price, size, etc.
      Answer set shrinks and can then be further searched
    Flamenco: database of images, classified in many dimensions

  Querying faceted databases
    Design query lang. to allow more complex queries
    Preserve "set of entities" abstraction
      Compositionality
      E.g., no joins
    Low data complexity
    Conceptually simple
    Implementation
    Trying to have a complete algebra: given a set of entities; return
      a set of entities

  Entity algebra
    Operators: Selection, Union, Diff, Intersection, Semijoin
    Entity sets may be heterogeneous: which attributes are avail?

  Attributes in Entity Algebra
    Compose queries one op at a time, using class and/or past query
      results
    At each step, the system determines which atts are available
    Users do not have to figure this out themselves
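
  (Sketch, not from the paper: a minimal Python illustration of a
  closed "entity set in, entity set out" algebra over the
  pot/object/context example above. The operator semantics and the
  rule that an attribute is "available" only when every entity in the
  set has it are my assumptions; the data is invented.)

```python
# Hypothetical sketch of an entity algebra. Entity sets are plain sets
# of IDs, so every operator takes and returns a set of entities.
# Assumption: an attribute is "available" on a set only if every
# entity in the set carries it. Data is invented for illustration.

ENTITIES = {
    1: {"type": "object", "category": "pot", "location": "site A", "capacity": 3},
    2: {"type": "object", "category": "pot", "location": "site B", "capacity": 5},
    3: {"type": "object", "category": "bowl", "location": "site A"},
    4: {"type": "context"},
}


def select(ids, attr, value):
    """Entities in `ids` that have `attr` equal to `value`."""
    return {i for i in ids if attr in ENTITIES[i] and ENTITIES[i][attr] == value}


def union(a, b):
    return a | b


def diff(a, b):
    return a - b


def intersect(a, b):
    return a & b


def semijoin(left, right, attr):
    """Entities in `left` whose `attr` matches that of some entity in `right`."""
    values = {ENTITIES[j][attr] for j in right if attr in ENTITIES[j]}
    return {i for i in left if attr in ENTITIES[i] and ENTITIES[i][attr] in values}


def available_attributes(ids):
    """Attributes shared by every entity in a (possibly heterogeneous) set."""
    attr_sets = [set(ENTITIES[i]) for i in ids]
    return set.intersection(*attr_sets) if attr_sets else set()


objects = select(set(ENTITIES), "type", "object")
pots = select(objects, "category", "pot")
bowls = select(objects, "category", "bowl")

print(sorted(available_attributes(objects)))   # prints ['category', 'location', 'type']
print(sorted(available_attributes(pots)))      # prints ['capacity', 'category', 'location', 'type']
print(sorted(semijoin(objects, bowls, "location")))  # prints [1, 3]
```

  Because every operator returns a set of IDs, queries compose one
  step at a time, and available_attributes can be recomputed after
  each step so the user never has to work it out by hand.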

  Queries with select, union, intersect, are sound and complete.
    push selects to classes
    ... need to look at the paper ...
    decidable constraint language (e.g., constraint=constant, but can
      expand and does in paper)

  Used in an NSF-sponsored Archeological Project and a Human Anatomy project
    Same infrastructure for both projects
    Presentation language separate from the query language
      You describe what you want to display of an attribute set






