[seek-dev] Re: GRID etc. questions from European LTER
Matt Jones
jones at nceas.ucsb.edu
Tue Aug 24 11:10:54 PDT 2004
Hi Katharina,
I had some great discussions with Mandy Lane while I was in Edinburgh,
and I am looking forward to building collaborations with our colleagues
in Europe. I was looking for the Alter-Net web site, but had difficulty
finding it -- there's not really any mention of it at the CEH web site.
Can you point me to the right web site?
I'll give some brief answers to your questions below, but I assume
you've seen both the SEEK web site (http://seek.ecoinformatics.org) and
the Kepler site (http://kepler-project.org). All of our code and tools
are openly available through those sites.
Schleidt Katharina wrote:
> Hi Matt,
>
> as already discussed with Mandy Lane (context: UK research-council
> Funding for sister projects), we of the European ALTER-Net Project are
> seeking cooperation with the American LTER and SEEK Communities. From
> what we have been able to discern from the various online publications we
> have gotten the impression that you are the EcoGRID expert for the SEEK
> project. When your name also came up in a recent meeting with
> representatives of the AUSTRIAN GRID Initiative who are going to support
> us in gaining an overview of available GRID Middleware components (Do
> you happen to know Mr. Volkert and Mr. Kranzlmüller from Kepler
> University in Linz, Austria?), we decided it was time to get in touch
> with you personally!
>
> As we are the ones in the ALTER-Net project who get to do the grunt work
> (looking for the technology to be used, finding an ontology based on the
> consensus of the European scientists, developing methods for semantic
> mediation), we would like to take the opportunity to ask a few concrete
> questions on subjects we are working on:
>
> * What is the current status of EcoGRID?
EcoGrid is an evolving framework for allowing diverse data systems to
interoperate. Right now it is a small set of grid services (OGSA
compliant) that allow a uniform syntax for expressing metadata queries,
providing responses, and downloading and uploading data. Ancillary
services such as authentication and access control are part of the
definitions as well. The data objects themselves are described in one
of several metadata languages, but we are mostly focusing on Ecological
Metadata Language (EML) because it is rich enough to allow machine
processing of heterogeneous data sources. However, we also use Darwin
Core and plan to support agency standards like the Bio Data Profile. We
are currently implementing the interfaces for 'writing' data to EcoGrid
nodes, which when complete will finish a large number of the basic data
access interfaces we intend to develop.
We are currently prototyping EcoGrid interfaces to work against four
very diverse data systems: the KNB Metacat system [1], the DiGIR
protocol [2] for access to collections data, the Storage Resource Broker
(SRB), and the Xanthoria system. We have implemented basic EcoGrid
interfaces for all of these services, and think that they are
representative of the types of systems widely used to store
environmental data.
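As a rough illustration of the idea (this is not SEEK code), the sketch
below shows how one uniform query interface might sit in front of
several different back ends. Every name in it (EcoGridNode,
InMemoryNode, federated_query, Record) is hypothetical, and the
in-memory node merely stands in for a real system such as Metacat or
DiGIR.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """One metadata hit returned by a node (hypothetical shape)."""
    identifier: str
    title: str
    source: str


class EcoGridNode(ABC):
    """Hypothetical uniform interface that each back-end adapter implements."""

    @abstractmethod
    def query(self, keyword: str) -> List[Record]:
        """Return all records whose metadata matches the keyword."""


class InMemoryNode(EcoGridNode):
    """Stand-in for a real back end; holds a dict of identifier -> title."""

    def __init__(self, name: str, titles: Dict[str, str]) -> None:
        self.name = name
        self.titles = titles

    def query(self, keyword: str) -> List[Record]:
        needle = keyword.lower()
        # Sort by identifier so results are deterministic.
        return [
            Record(identifier, title, self.name)
            for identifier, title in sorted(self.titles.items())
            if needle in title.lower()
        ]


def federated_query(nodes: List[EcoGridNode], keyword: str) -> List[Record]:
    """Send the same query to every node and concatenate the answers."""
    results: List[Record] = []
    for node in nodes:
        results.extend(node.query(keyword))
    return results
```

The point of the abstraction is that the caller never needs to know
which kind of system answered; a new back end only has to supply its
own query() implementation.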
At the current time we do not have mediators that can do standard query
translation among the various metadata languages, but we are hoping to
incorporate that in the next year -- if possible, I'm hoping we can
leverage our semantic mediation system for this, but we'll see.
> * Is an overview of the architecture of GRID Middleware components
> available? We are having difficulties determining which components
> could be relevant for our project requirements.
I'm not exactly sure what you're asking for here. We use the Globus
Toolkit (GT) as our middleware layer, and there are lots of overviews of
GT around (see http://globus.org). So far we have found GT to be
extremely difficult to use, and we're not using the built-in services
very much. Compared to using plain web services, the GT has been
extremely slow going.
If what you're looking for here is an overview of the EcoGrid, maybe a
recent presentation I gave would help (see [3]).
> * What are the further plans for EcoGRID?
We envision EcoGrid to be a tiered set of interface definitions that can
be implemented by any data provider. EcoGrid will provide a distributed
registry service that allows providers to register their existence and
describe which of the interfaces they support as well as to document
their coverage. EcoGrid will also provide aggregation and indexing
nodes that will collate content from multiple data providers to allow
for efficient searching of large numbers of nodes. In addition, EcoGrid
will define a number of standardized interfaces for registering
computational services that are available. We hope to provide a
scheduling and optimization service that can calculate the best set of
nodes on which to run analyses and models based on node availability,
data location, and other characteristics. Finally, we are working on
semantic annotations to both data and computational services that will
allow for more effective discovery and integration of those services.
These integration services will likely be manifested as part of a
workflow system called Kepler that we are building with partners, but it
may be incorporated directly into the EcoGrid as well.
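To make the registry and scheduling ideas above a little more
concrete, here is a minimal sketch under stated assumptions. It is not
the EcoGrid design: the names (NodeInfo, Registry, choose_node) and
the ranking rule (fewest missing data sets first, then lowest load)
are invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class NodeInfo:
    """What a provider might register about itself (hypothetical fields)."""
    name: str
    interfaces: Set[str]
    datasets: Set[str] = field(default_factory=set)
    load: float = 0.0          # 0.0 = idle, 1.0 = saturated
    available: bool = True


class Registry:
    """Toy registry: providers announce themselves, clients look them up."""

    def __init__(self) -> None:
        self._nodes: Dict[str, NodeInfo] = {}

    def register(self, node: NodeInfo) -> None:
        self._nodes[node.name] = node

    def find(self, interface: str) -> List[NodeInfo]:
        """Return every registered node that supports the given interface."""
        return [n for n in self._nodes.values() if interface in n.interfaces]


def choose_node(registry: Registry, interface: str,
                needed: Set[str]) -> Optional[NodeInfo]:
    """Pick an available node, preferring data locality and then low load."""
    candidates = [n for n in registry.find(interface) if n.available]
    if not candidates:
        return None
    # Assumed ranking: fewest data sets to transfer, then load, then name
    # (the name only breaks ties so the choice is deterministic).
    return min(candidates,
               key=lambda n: (len(needed - n.datasets), n.load, n.name))
```

A real scheduler would also have to weigh transfer cost, access
control, and queue times; the sketch only shows how registered
characteristics could feed a placement decision.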
> * Is EcoGRID based on the Globus Toolkit (which version?), and what
> is the general relationship between the Globus Alliance and SEEK?
Yes. We've used versions 2.1, 3.0.0, 3.0.1, 3.0.2, and 3.2. We may
move to the WSRF implementations as they stabilize, but that has yet to
be determined. We've had lots of difficulty using the GT, and for some
time we have considered dropping it in favor of plain web services.
The Grid Center project, a member of the Globus Alliance, has devoted
some support to our project, and we hope that this will allow us to more
effectively make use of the GT.
> * Does your GRID implementation currently support "semantic
> mediation", or what are the plans for the future in regards to this?
SEEK has developed a semantic mediation system called Sparrow that
includes both a human-readable syntax for logic statements that can
translate to and from OWL, and a reasoner based on some existing
reasoning tools. Our general approach is outlined in some papers,
notably [4], [5], and [6]. We have prototyped some semantic mediation
tasks in a case study using Sparrow (see [7]), but that work is not yet
far enough along for us to call the semantic mediation component complete.
Currently, none of our EcoGrid or Kepler releases incorporate any of
these semantic mediation tools, but we are hoping to integrate them in
the next year. We are hoping that our late fall 2004 release of Kepler
will have an ontology-driven data and model browsing facility included.
> * Does EcoGRID support the ontologies (in Edinburgh several
> statements made us suspect this)
What do you mean by support? We plan to have ontology-driven searching
allowing indirect references through the EcoGrid, but it currently does
not support this. The current EcoGrid system is a simple abstraction in
front of metadata catalogs that also supports data access and data
queries. We are hoping that this year it will support semantics much
more fully. We have been developing ontologies for describing the
semantics of heterogeneous data sources, and have several fairly
complete top-level ontologies that we've been testing. I think our two
major challenges to making this work will be: 1) developing fully
specified ontologies that cover much of biology and environmental
science, and 2) annotating large numbers of data sets with references
into these ontologies so that they can be used within the mediation system.
> The answers to these questions would be a great help to us, not only in
> the continuation of our work, but also in our goal of avoiding redundant
> work.
There are never enough people working on this stuff. We would be
excited if you would be interested in collaborating. We run all of our
efforts as open collaborations. The one that has garnered the most
interest is our Kepler Project (http://kepler-project.org) to build an
open framework for scientific workflows. If you were interested in
contributing to that environment or collaborating on other aspects we
would welcome it. Let's start a conversation to see how we might work
together to make best use of our resources.
Cheers,
Matt
References
------------
[1] http://knb.ecoinformatics.org/ and
http://knb.ecoinformatics.org/software/metacat
[2] http://digir.sourceforge.net/
[3]
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/seek/docs/presentations/jones-ecogrid-lterim-20040728.ppt?rev=1.2&content-type=application/vnd.ms-powerpoint
[4]
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/seek/projects/kr-sms/docs/ssdbm04_bowers.pdf?rev=1.1&content-type=application/pdf
[5]
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/seek/projects/kr-sms/docs/scisw03.pdf?rev=1.2&content-type=application/pdf
[6]
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/seek/projects/kr-sms/docs/DILS04/dils04-Bowers-Ludaescher.pdf?rev=1.1&content-type=application/pdf
[7]
http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/~checkout~/seek/projects/kr-sms/docs/swdb04.pdf?rev=1.2&content-type=application/pdf
>
>
>
> Thanks for your help!
>
>
>
> Herbert Schentz & Kathi Schleidt
>
>
>
> **__________________________________________________________________________________________________***
>
> *kathi schleidt
>
> IT-Entwicklung
> IT-Development
> T: +43-(0)1-313 04/5363
> F: +43-(0)1-313 04/3555
> katharina.schleidt at umweltbundesamt.at
> <mailto:katharina.schleidt at umweltbundesamt.at>
>
> *umwelt**bundesamt*
> Spittelauer Lände 5
> A-1090 Wien
> Österreich/Austria
> http://www.umweltbundesamt.at <http://www.umweltbundesamt.at/>
>
> Die Informationen in dieser Nachricht sind vertraulich und
> ausschließlich für die/den AdressatIn bestimmt. Sollten
> Sie diese Nachricht irrtümlich erhalten haben, benachrichtigen Sie bitte
> umgehend die/den SenderIn und löschen
> Sie das Original. Jede andere Verwendung dieses E-Mails ist untersagt.
>
> This message is for the designated recipient only and may contain
> privileged, proprietary, or otherwise private
> information. If you have received it in error, please notify the sender
> immediately and delete the original. Any other
> use of the email by you is prohibited.
>
>
>
--
-------------------------------------------------------------------
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------