[seek-kr-sms] IMPORTANT information about upcoming SEEK BEAM/KR-SMS/TAXON meeting at Davis, CA, Mar. 7-10, 2005
Mark Schildhauer
schild at nceas.ucsb.edu
Thu Mar 3 14:44:38 PST 2005
Hello everyone,
Enclosed is some general information (Goals, Background, Preparatory
items, and Tentative Agenda) for participants in the upcoming meeting
of the BEAM (Biodiversity and Ecological Analysis and Modeling) group
with the KR/SMS (Knowledge Representation/Semantic Mediation) group of
the SEEK (Science Environment for Ecological Knowledge;
http://seek.ecoinformatics.org). Plus, we will also be covering some
interesting developments in the area of taxonomic name services, and
representatives of SEEK TAXON will be present. Whew, sorry for the
flurry of acronyms.
The main purpose of this meeting is to resume work on developing "Use
Cases" of real ecological relevance to guide, test, and otherwise inform
the technological development being carried out within the SEEK
Project. SEEK's primary approach (as many of you know) is to use
advanced Knowledge Representation techniques, in conjunction with rich
metadata, to clarify the structure, contents and semantics of
ecological data sets and analytical processes (KR/SMS), and to enable
ecological researchers to access tools and services (such as
distributed data and computing capability: EcoGRID) via a scientific
workflow environment (Kepler). This meeting, then, will consist of a
number of ecologists with specific interests in the relationship of
biodiversity and productivity, interacting with computer scientists,
programmers and technologists, to develop useful test cases and provide
feedback on these emerging technologies. While we have an agenda, we
will be flexible about it depending on how things go...
Shawn Bowers and Bertram Ludaescher will be our main hosts at Davis.
Both Bertram and Shawn are computer scientists with strong expertise in
the areas of semantic mediation, scientific database integration,
knowledge representation, and scientific workflow management. Deana
Pennington and I will be there as "scientific/technical" liaisons,
hopefully facilitating the interactions and translation from the
vernacular of computer science to ecological science and vice-versa.
Aimee Stewart and Nico Franz will also be joining us (Aimee on Mo/Tu,
Nico on We) to create connections between the TAXON Working Group and
the BEAM Use cases. *Aimee*-- would you be able to update us on the
major foci and developments in Taxon, especially where these are
relevant to taxonomic name resolution for an ecologist using field data
(e.g., with a taxonomic ID column in their data or metadata)? Nico will
also present on Wed some specific clarifications of the types of tools
ecologists and systematists might need to clarify, store, and use
taxonomic concept information.
Manu Jiyal (in conjunction with Peter McCartney) has been working on
developing ontologies for spatiotemporal phenomena, and we are hoping to
examine and learn from his (their) progress to date.
If any of the domain scientists (Chalcraft, Cleland, Cox, Suding, Waide,
Weiher) have any interesting papers on Biodiversity/Productivity to
share with the rest of us and establish finer focus for the domain side
of the meeting, please send these out!! Steve-- rumor has it that you
have been thinking about next step analyses, so we'll definitely call on
you to talk about these.
*A. Goals*
1) *high-level ecological ontologies:* develop detailed ontologies
for Biodiversity & Productivity-- in terms of relevant concepts and
relationships from the very general theoretical level, and drilling all
the way down to the operational level (algorithms and measurements used
in quantifying biodiversity and productivity).
2) *ontologies of data sets and analyses:* develop detailed ontologies
of the data sets and analyses needed for some "past"
Biodiversity/Productivity research, including clarifying the semantics
and data transformations carried out to merge/integrate/summarize data,
as well as "describing" the analytical components in ways that
sufficiently expose their Inputs/Outputs and "functions" in ways that
will facilitate discovery and re-use of these components in alternative
scientific workflows.
3) *ontologies of ecological methodologies:* develop detailed ontologies
that capture the essential features and differentiating nuances of the
field and other methodologies employed in capturing the data to be dealt
with in these investigations (overlaps with task 2 above). (We must
*not* lose sight of the need for "spatiotemporal ontologies", and the
specific capabilities these will provide us with regard to #1-3 here.)
4) *scientific workflows:* develop detailed scientific workflows for
some "past" Biodiversity/Productivity research, to formally capture at a
fine grain the steps and operations, with sufficient semantic annotation
to facilitate their transparency and reusability by other researchers.
5) *taxonomic capabilities:* identify the specific types of services
that "ecologists" might need on the select use case to improve the
ability to deal with taxonomic names, especially in the context of
historical, long-term and globally distributed biodiversity information.
6) *ecological research challenge:* conceive of a "challenging"
Biodiversity/Productivity analysis that could be enabled via scientific
workflow management (Kepler), that demonstrates the power of accessing
distributed data and computing resources, and that would otherwise be
highly inefficient or intractable for a "typical" individual
researcher.
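To make the second goal a bit more concrete, here is a minimal Python sketch of what "exposing Inputs/Outputs" of an analytical component might buy us. Everything in it is an assumption made for illustration: the concept names, the tiny is-a hierarchy, and the component names are invented and are not taken from any SEEK ontology or tool. The idea is only that, once inputs and outputs are annotated with ontology concepts, a tool could check whether one component can feed another.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each concept maps to its parent (None = root).
# These concept names are illustrative, not taken from any SEEK ontology.
IS_A = {
    "Measurement": None,
    "Biodiversity": "Measurement",
    "SpeciesRichness": "Biodiversity",
    "Productivity": "Measurement",
    "AbovegroundBiomass": "Productivity",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


@dataclass(frozen=True)
class Component:
    """An analytical step annotated with the concepts it consumes and produces."""
    name: str
    inputs: tuple   # concepts the component accepts
    outputs: tuple  # concepts the component produces


def can_connect(upstream, downstream):
    """True if every input of `downstream` is met by some output of `upstream`."""
    return all(
        any(is_a(out, needed) for out in upstream.outputs)
        for needed in downstream.inputs
    )


# Hypothetical components for a biodiversity/productivity analysis.
count_species = Component("count_species", ("Measurement",), ("SpeciesRichness",))
regress = Component("regress_on_diversity", ("Biodiversity",), ("Measurement",))
harvest = Component("clip_and_weigh", (), ("AbovegroundBiomass",))
```

In this sketch `can_connect(count_species, regress)` holds because species richness is a kind of biodiversity measure, while `can_connect(harvest, regress)` does not, because biomass is a productivity measure. A real system would of course reason over a proper OWL ontology rather than a hand-written dictionary.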
*B. Approaches*
We will use several ontology editing and creating tools, including
Protégé (http://protege.stanford.edu) and CMAP
(http://www.ihmc.us/users/phayes/CODE) to develop and review our
ontologies. We will also use a "declarative" method of creating
ontologies via a tool called "sparrow".
Kepler (http://www.kepler-project.org) is the scientific workflow
application that we are developing and testing under SEEK, which is
intended for ecologists to discover data and "stored analyses", and
assemble and run new "analyses" (or workflows). We will actively use
Kepler at this meeting to capture the sequence and flow of processes
and data of potentially several biodiversity/productivity analyses.
These will not be operational at this time, but we will attempt to
capture the logic such that the actual execution code can be developed
at a later point.
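As a rough illustration of "capturing the logic" before any execution code exists, here is a small Python sketch. It is not Kepler and does not use any Kepler API; the step names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) is simply a stand-in for a workflow engine. Each step lists the steps it depends on, and steps without an implementation are recorded as placeholders.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the set of steps it depends on.
WORKFLOW = {
    "load_plot_data": set(),
    "resolve_taxon_names": {"load_plot_data"},
    "compute_richness": {"resolve_taxon_names"},
    "compute_biomass": {"load_plot_data"},
    "fit_richness_vs_biomass": {"compute_richness", "compute_biomass"},
}


def execution_order(workflow):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())


def run(workflow, implementations=None):
    """Walk the workflow; steps with no implementation yet are only logged."""
    implementations = implementations or {}
    log = []
    for step in execution_order(workflow):
        action = implementations.get(step)
        if action is None:
            # The logic is captured, but the execution code comes later.
            log.append((step, "not implemented"))
        else:
            action()
            log.append((step, "done"))
    return log
```

Calling `run(WORKFLOW)` with no implementations just walks the steps in dependency order and marks each as not yet implemented, which is roughly the state we expect our workflows to be in at the end of this meeting.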
*C. What to bring*
1) KEY WORDS--Pointers to or digital copies of relevant thesauri,
controlled vocabulary lists, or books (ideally, digital format) from
which we can extract and assemble relevant sets of terms, concepts,
definitions, and their interrelationships-- bearing on biodiversity,
productivity, ecological methods and measurements, and specific
analytical approaches. This includes any pointers to stuff like GCMD,
or ESA study section controlled lists, etc.
2) DATA--Any datasets that you feel are particularly representative or
necessary for intended analyses to be accomplished within this SEEK
Project. (NOTE: the SEEK staff are interested in the tools and
approaches employed by the scientists, rather than the specific analyses
per se. So, we are happy to provide full confidentiality of the
ecologically relevant data and findings associated with this effort, as
long as we can report on the ways in which SEEK tools were used to
enhance the effectiveness of addressing those ecologically relevant
questions.)
3) METHODS and METADATA-- Please think about (and bring) whatever
special methods and other ancillary information are necessary to
interpret any of the above data.
*D. Background materials*
Notes from our last BEAM meeting in San Diego (Sep. 21-23, 2004) are here:
http://seek.ecoinformatics.org/Wiki.jsp?page=BeamKnowledgeRepSept04
ONTOLOGIES--
A classic and highly readable introduction to building formal
ontologies can be found here:
http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Another very short, classic introduction to ontologies by Thomas Gruber,
but somewhat technically phrased:
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
Rich Williams's set of ecological ontologies developed for SEEK (in OWL
format) can be found here:
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/seek/projects/kr-sms/OWLOntologies/
(hint-- click on file name, then "download" to actually get the OWL file)
SCIENTIFIC WORKFLOWS--
Quick paper about semantics and scientific workflows (SEEK Paper
submitted for SSDBM)
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/kepler/docs/pubs/kepler-SSDBM2005.pdf?rev=1.1&content-type=application/pdf
A draft of the Kepler User Guide, which explains a lot about Scientific
Workflows--
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/kepler/docs/user/KeplerEndUserDoc.pdf?content-type=application/pdf
*E. Tentative Agenda*
Monday
8:30-9:30 Introductions, Overview of Goals, Status Update (Shawn,
Deana, Mark)
9:30-10:00 Linkages with Taxon (Aimee, Deana, Mark)
10:00-10:30 Update from Manu about Spatiotemporal ontology work
10:30-10:45 Break
10:45-12:30 Work on Domain Ontologies (evaluate Rich's ontologies,
GrOWL, SPARROW)
12:30-1:30 Lunch
1:30-3:30 Cont. Domain Ontologies
3:30-4:00 Break
4:00-5:30 Discussion and Preview for tomorrow (past analyses
and data)
Tuesday
8:30-10:30 Develop Scientific Workflows
10:30-10:45 Break
10:45-12:30 Cont. with Scientific Workflows
12:30-1:30 Lunch
1:30-3:30 Develop Ontologies of Scientific Methods
3:30-4:00 Break
4:00-5:30 Discussion and Preview for tomorrow (new scientific
challenge; new analysis and data needs)
Wednesday
8:30-10:30 Cont. with Ontologies for Scientific Methods (including
presentation by Nico Franz about Use Case involving Taxonomic Name
Resolution)
10:30-10:45 Break
10:45-12:30 Data Integration Ontologies
12:30-1:30 Lunch
1:30-3:30 Conceive of new BiodivProd challenge addressable
through Kepler/KR/SMS
3:30-4:00 Break
4:00-5:30 Further define next step challenge
Thursday
8:30-10:30 Discussion (ontologies, workflows, next steps,
assignments)
10:30-10:45 Break
10:45-12:00 Continue wrapup, next meeting?, and adjourn
=======================================================================================
--
Mark P. Schildhauer, Ph.D. -- Director of Computing
NCEAS -- National Center for Ecological Analysis and Synthesis
735 State St., Suite 300 Santa Barbara, CA 93101-3351
Email: schild at nceas.ucsb.edu WEB: http://www.nceas.ucsb.edu
Phone: 805-892-2509 FAX: 805-892-2510