[seek-kr-sms] IMPORTANT information about upcoming SEEK BEAM/KR-SMS/TAXON meeting at Davis, CA, Mar. 7-10, 2005

Mark Schildhauer schild at nceas.ucsb.edu
Thu Mar 3 14:44:38 PST 2005


Hello everyone,

Enclosed is some general information (Goals, Background, Preparatory
items, and Tentative Agenda) for participants in the upcoming meeting
of the BEAM (Biodiversity and Ecological Analysis and Modeling) group
with the KR/SMS (Knowledge Representation/Semantic Mediation) group of
the SEEK project (Science Environment for Ecological Knowledge;
http://seek.ecoinformatics.org). We will also be covering some
interesting developments in the area of taxonomic name services, and
representatives of SEEK TAXON will be present. Whew, sorry for the
flurry of acronyms.

The main purpose of this meeting is to resume work on developing "Use
Cases" of real ecological relevance to guide, test, and otherwise inform
the technological development being carried out within the SEEK
Project. SEEK's primary approach (as many of you know) is to use
advanced approaches in Knowledge Representation, in conjunction with
rich metadata, to clarify the structure, contents and semantics of
ecological data sets and analytical processes (KR/SMS), and to enable
ecological researchers to access tools and services (such as
distributed data and computing capability: EcoGRID) via a scientific
workflow environment (Kepler). This meeting, then, will bring together
a number of ecologists with specific interests in the relationship
between biodiversity and productivity and a group of computer
scientists, programmers and technologists, to develop useful test cases
and provide feedback on these emerging technologies. While we have an
agenda, we will be flexible about it depending on how things go...

Shawn Bowers and Bertram Ludaescher will be our main hosts at Davis.  
Both Bertram and Shawn are computer scientists with strong expertise in 
the areas of semantic mediation, scientific database integration, 
knowledge representation, and scientific workflow management.  Deana 
Pennington and I will be there as "scientific/technical" liaisons, 
hopefully facilitating the interactions and translation from the 
vernacular of computer science to ecological science and vice-versa.

Aimee Stewart and Nico Franz will also be joining us (Aimee on Mon/Tue,
Nico on Wed) to create connections between the TAXON Working Group and
the BEAM Use Cases. *Aimee*-- would you be able to update us on the
major foci and developments in Taxon, especially where these are
relevant to taxonomic name resolution for an ecologist using field data
(e.g., with a taxonomic ID column in their data or metadata)? Nico is
also going to present on Wed some specific clarifications of the types
of tools ecologists and systematists might need to clarify, store, and
use taxonomic concept information.

Manu Jiyal (in conjunction with Peter McCartney) has been working on 
developing ontologies for spatiotemporal phenomena, and we are hoping to 
examine and learn from his (their) progress to date.

If any of the domain scientists (Chalcraft, Cleland, Cox, Suding, Waide,
Weiher) have any interesting papers on Biodiversity/Productivity to
share with the rest of us, to help establish a finer focus for the
domain side of the meeting, please send these out!! Steve-- rumor has it
that you have been thinking about next-step analyses, so we'll
definitely call on you to talk about these.

*A. Goals*

1) *high-level ecological ontologies:* develop detailed ontologies
for Biodiversity & Productivity-- in terms of relevant concepts and
relationships from the very general theoretical level, and drilling all
the way down to the operational level (algorithms and measurements used
in quantifying biodiversity and productivity).

2) *ontologies of data sets and analyses:* develop detailed ontologies
of the data sets and analyses needed for some "past"
Biodiversity/Productivity research, including clarifying the semantics
and data transformations carried out to merge/integrate/summarize data,
as well as "describing" the analytical components so that their
Inputs/Outputs and "functions" are sufficiently exposed to facilitate
discovery and re-use of these components in alternative scientific
workflows.

3) *ontologies of ecological methodologies:* develop detailed ontologies
that capture the essential features and differentiating nuances of the
field and other methodologies employed in capturing the data to be dealt
with in these investigations (overlaps with task 2 above). (We must
*not* lose sight of the need for "spatiotemporal ontologies", and the
specific capabilities these will provide us with regard to #1-3 here.)

4) *scientific workflows:* develop detailed scientific workflows for
some "past" Biodiversity/Productivity research, to formally capture at a
fine grain the steps and operations, with sufficient semantic annotation
to facilitate their transparency and reusability by other researchers.

5) *taxonomic capabilities:* identify the specific types of services
that "ecologists" might need for the selected use case to improve their
ability to deal with taxonomic names, especially in the context of
historical, long-term and globally distributed biodiversity information.

6) *ecological research challenge:* conceive of a "Challenging"
Biodiversity/Productivity analysis, which could be enabled via a
scientific workflow management system (Kepler), and that demonstrates
the power of accessing distributed data and computing resources, and
would otherwise be highly inefficient or intractable for a "typical"
individual researcher.

*B.  Approaches*

We will use several ontology creation and editing tools, including
Protege (http://protege.stanford.edu) and CMAP
(http://www.ihmc.us/users/phayes/CODE) to develop and review our
ontologies. We will also use a "declarative" method of creating
ontologies via a tool called "sparrow".

Kepler (http://www.kepler-project.org) is the scientific workflow
application that we are developing and testing under SEEK, which is
intended for ecologists to discover data and "stored analyses", and
assemble and run new "analyses" (or workflows). We will actively use
Kepler at this meeting to capture the sequence and flow of processes
and data of potentially several biodiversity/productivity analyses.
These workflows will not be operational at this time, but we will
attempt to capture the logic such that the actual execution code can be
developed at a later point.

*C. What to bring*

1) KEY WORDS-- Pointers to or digital copies of relevant thesauri,
controlled vocabulary lists, or books (ideally, digital format) from
which we can extract and assemble relevant sets of terms, concepts,
definitions, and their interrelationships-- bearing on biodiversity,
productivity, ecological methods and measurements, and specific
analytical approaches. This includes any pointers to resources like
GCMD, or ESA study section controlled lists, etc.

2) DATA-- Any datasets that you feel are particularly representative or
necessary for intended analyses to be accomplished within this SEEK
Project. (NOTE: the SEEK staff are interested in the tools and
approaches employed by the scientists, rather than the specific analyses
per se. So, we are happy to provide full confidentiality of the
ecologically relevant data and findings associated with this effort, as
long as we can report on the ways in which SEEK tools were used to
enhance the effectiveness of addressing those ecologically relevant
questions.)

3) METHODS and METADATA-- Please think about (and bring) whatever
special methods and other ancillary information are necessary to
interpret any of the above data.

*D. Background materials--*

Notes from our last BEAM meeting in San Diego (Sep. 21-23, 2004) are here:
http://seek.ecoinformatics.org/Wiki.jsp?page=BeamKnowledgeRepSept04

ONTOLOGIES--
A classic and highly readable introduction to building formal
ontologies can be found here:
http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html 


Another very short, classic introduction to ontologies by Thomas Gruber, 
but somewhat technically phrased:
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html

Rich Williams's set of ecological ontologies developed for SEEK (in OWL
format) can be found here:
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/seek/projects/kr-sms/OWLOntologies/ 

(hint-- click on file name, then "download" to actually get the OWL file)

SCIENTIFIC WORKFLOWS--

Quick paper about semantics and scientific workflows (SEEK paper
submitted for SSDBM):
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/kepler/docs/pubs/kepler-SSDBM2005.pdf?rev=1.1&content-type=application/pdf 


A draft of the Kepler User Guide, which explains a lot about Scientific 
Workflows--
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/kepler/docs/user/KeplerEndUserDoc.pdf?content-type=application/pdf 



*E. Tentative Agenda*

Monday
 8:30-9:30    Introductions, Overview of Goals, Status Update
              (Shawn, Deana, Mark)
 9:30-10:00   Linkages with Taxon (Aimee, Deana, Mark)
10:00-10:30   Update from Manu about Spatiotemporal ontology work
10:30-10:45   Break
10:45-12:30   Work on Domain Ontologies (evaluate Rich's ontologies,
              GrOWL, SPARROW)
12:30-1:30    Lunch
 1:30-3:30    Cont. Domain Ontologies
 3:30-4:00    Break
 4:30-5:30    Discussion and Preview for tomorrow (past analyses
              and data)

Tuesday
 8:30-10:30   Develop Scientific Workflows
10:30-10:45   Break
10:45-12:30   Cont. with Scientific Workflows
12:30-1:30    Lunch
 1:30-3:30    Develop Ontologies of Scientific Methods
 3:30-4:00    Break
 4:00-5:30    Discussion and Preview for tomorrow (new scientific
              challenge; new analysis and data needs)

Wednesday
 8:30-10:30   Cont. with Ontologies for Scientific Methods
              (including presentation by Nico Franz about Use Case
              involving Taxonomic Name Resolution)
10:30-10:45   Break
10:45-12:30   Data Integration Ontologies
12:30-1:30    Lunch
 1:30-3:30    Conceive of new BiodivProd challenge addressable
              through Kepler/KR/SMS
 3:30-4:00    Break
 4:00-5:30    Further define next step challenge

Thursday
 8:30-10:30   Discussion (ontologies, workflows, next steps,
              assignments)
10:30-10:45   Break
10:45-12:00   Continue wrap-up, discuss next meeting, and adjourn


======================================================================================= 


-- 
Mark P. Schildhauer, Ph.D. --  Director of Computing
NCEAS --  National Center for Ecological Analysis and Synthesis
735 State St., Suite 300       Santa Barbara, CA   93101-3351	
Email: schild at nceas.ucsb.edu   WEB: http://www.nceas.ucsb.edu
Phone: 805-892-2509            FAX: 805-892-2510
