[seek-kr-sms] SMS Partial Design -- Request for Comments

Matt Jones jones at nceas.ucsb.edu
Tue Apr 6 11:39:55 PDT 2004


Bertram,

The short answer to your question is: right now EcoGrid data sets do not 
have simple URL-based access because they use SOAP, but we plan on 
having it in the future once we work out the implementation details.

The longer answer is: Right now, the EcoGrid has a web/grid 
service-based approach to dataset access, rather than a web-based 
approach.  In particular, the EcoGrid "get" operation takes as input an 
identifier and sends back the object, and is parallel to the metacat 
'read' action.  The difference lies in how you communicate with the two 
systems.  In metacat, you make a simple GET or POST request over 
HTTP to a known endpoint.  In EcoGrid, you send a SOAP message over HTTP 
to the known endpoint.  We have discussed (extensively) the advantages 
of using a straight HTTP binding for the EcoGrid "get" operation 
instead of a SOAP binding.  But there are a number of technical 
difficulties in doing this using current web-services and grid-services 
toolkits because they do not allow you to mix SOAP and non-SOAP 
operations in one WSDL document.  During our last EcoGrid conference 
call, we agreed that 1) we would just deploy a SOAP "get" operation for 
EcoGrid initially (which is implemented already), and 2) we would come 
back to the issue of a straight HTTP binding soon, which probably will 
require us to add support to an underlying toolkit such as Apache Axis 
to support plain HTTP bindings mixed in with SOAP bindings.
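
To make the difference concrete, here is a rough Python sketch of the two 
request styles.  The metacat endpoint and parameters are the ones from the 
URLs in this thread; the SOAP side is only an illustration, and the 
"urn:example:ecogrid" namespace and the "get"/"identifier" element names 
are assumptions I made up, not the actual EcoGrid WSDL.  Nothing here 
touches the network, it only builds the request URL and the message body.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the metacat URLs quoted in this thread.
METACAT_ENDPOINT = "http://metacat.nceas.ucsb.edu/knb/servlet/metacat"

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for the EcoGrid operation (assumption for this sketch).
ECOGRID_NS = "urn:example:ecogrid"


def metacat_read_url(docid, qformat="xml"):
    """Build the plain-HTTP URL for a metacat 'read' action."""
    query = urlencode({"action": "read", "qformat": qformat, "docid": docid})
    return f"{METACAT_ENDPOINT}?{query}"


def ecogrid_get_envelope(identifier):
    """Build a SOAP 1.1 envelope for a hypothetical EcoGrid 'get' call."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    get = ET.SubElement(body, f"{{{ECOGRID_NS}}}get")
    # Element name 'identifier' is an assumption, not the real schema.
    ET.SubElement(get, f"{{{ECOGRID_NS}}}identifier").text = identifier
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # metacat: the whole request fits in a URL you can paste into a browser.
    print(metacat_read_url("knb-lter-gce.23.4"))
    # EcoGrid: the same identifier has to travel inside a SOAP body,
    # which would be POSTed to the service endpoint.
    print(ecogrid_get_envelope("knb-lter-gce.23.4"))
```

The point is just that the metacat request is fully described by its URL, 
while the SOAP request needs a client that can construct and POST the 
envelope, which is why there is no simple dataset URL for EcoGrid yet.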

Matt

Bertram Ludaescher wrote:
> The qformat={knb,xml} is quite nice.
> 
> Reminds me of what I heard about the LSID (life science id). 
> 
> Do we have in SEEK (EcoGrid?) a simple URI-based mechanism to identify 
> datasets, and URI-components that have different ways to "resolve" the 
> URI (similar to the above knb/xml trick)?
> 
> Bertram
> 
> 
>>>>>>"MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
> 
> MJ> 
> MJ> Hey Bertram,
> 
>>>Looking at the EML file that Matt pointed us to, I see a lot of
>>>structural information e.g., here
>>>
> 
> MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
> 
>>>and some temporal, spatial, and taxonomic coverage.
>>>
>>>Is that where we should start?
> 
> MJ> 
> MJ> Yeah, that's a good place to start. Note that the URL I sent is 
> MJ> delivering the EML in HTML format, but you can easily switch it to 
> MJ> deliver all of the metadata in XML format by changing "qformat=knb" 
> MJ> to "qformat=xml" in the URL.  The XML format is obviously much 
> MJ> easier to machine parse, and it includes all of the metadata in one 
> MJ> document, rather than breaking it into two or more as the HTML view does.
> MJ> 
> MJ> Matt
> MJ> 
> MJ> Bertram Ludaescher wrote:
> 
>>>Shawn: 
>>>
>>>First, I wish we all had such nice design documents! Great!
>>>
>>>Second, I agree with Rich and Matt that we need to make use of EML as
>>>much as we can. Also Rich's idea of "initializing" the semantic
>>>registration mapping from the EML info makes sense to me.
>>>
>>>So much for the good news ;-)
>>>
>>>Now I think we need to figure out a way to work with some actual EML
>>>examples and see how those and the "ad-hoc" examples that Shawn has
>>>come up with (or is coming up with) fit into a single framework.
>>>
>>>The nice thing about Shawn's examples (e.g., in the DILS paper) is
>>>that they are simple and show the principal approach. 
>>>
>>>Looking at the EML file that Matt pointed us to, I see a lot of
>>>structural information e.g., here
>>>http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
>>>
>>>and some temporal, spatial, and taxonomic coverage.
>>>
>>>Is that where we should start?
>>>
>>>Shawn, Rich, Matt?
>>>
>>>Bertram
>>>
>>>
>>>
>>>
>>>
>>>
>>>>>>>>"MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
>>>
> MJ> 
> MJ> I think it's a great approach, Rich.  EML does indeed have a lot of the 
> MJ> information that you would want in the semantic representation, and 
> MJ> grabbing information out of EML would reveal a lot about 
> MJ> additions/changes to EML that would be useful.  There are a lot of 
> MJ> partial EML documents filled out, but more recently a few pretty 
> MJ> extensively filled out documents.  In particular, there are 181 data 
> MJ> sets in the KNB from the GCE LTER site that have extensive metadata, 
> MJ> including taxonomic, spatial, and temporal coverage, and full metadata on 
> MJ> the data tables.  And they have data available.  Starting with one or 
> MJ> more of these might be useful as a mapping exercise.  For example, this 
> MJ> data set --
> MJ> 
> MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4
> MJ> 
> MJ> -- (and others like it) contains abundance data that could be used by 
> MJ> the Garp algorithm if only a researcher could determine that the 
> MJ> relationships between what Garp requires and what is present in the data 
> MJ> set are met.
> MJ> 
> MJ> Matt
> MJ> 
> MJ> Rich Williams wrote:
> MJ> 
> 
>>>>>I agree that metadata to semantics is a generic issue, not just an issue for
>>>>>EML.  For example, I expect that we'll find it useful to grab the basic
>>>>>structure (syntax) of an actor from the MoML when semantically describing
>>>>>it.  For now, I think EML is particularly important since it's in use and
>>>>>has significant semantic content.  The ontologies currently provide a basic
>>>>>framework that should be able to handle the mapping, though I'm sure that
>>>>>implementing the mapping will reveal plenty of holes in the details.  I'm
>>>>>ready to work on it if there's consensus that this is an important
>>>>>direction.
>>>>>
>>>>>Rich
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Shawn Bowers [mailto:bowers at sdsc.edu]
>>>>>>Sent: Monday, April 05, 2004 9:32 PM
>>>>>>To: Rich Williams
>>>>>>Cc: seek-kr-sms at ecoinformatics.org; Dave Thau; Ilkay Altintas; Joseph
>>>>>>Goguen; Jenny Guilian WANG
>>>>>>Subject: Re: [seek-kr-sms] SMS Partial Design -- Request for Comments
>>>>>>
>>>>>>
>>>>>>
>>>>>>This makes sense to me: do the metadata "harvesting" first to build the
>>>>>>initial "template" (or as you say, high-level mapping); then let the
>>>>>>data or service provider fill in the additional mapping as needed.
>>>>>>
>>>>>>Note also that there may be different types of metadata: EML for
>>>>>>datasets (there are possibly others for datasets, but we seem focused on
>>>>>>EML) and MoML or WSDL for services.  Not sure how much can be obtained
>>>>>>from the service ones.
>>>>>>
>>>>>>I also wonder if the ontologies are already close to being able to
>>>>>>handle the mapping from the high-level EML.  We should look into this.
>>>>>>
>>>>>>Thanks,
>>>>>>Shawn
>>>>>>
>>>>>>Rich Williams wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Good stuff Shawn!  Here are a few comments on the registration mapping part,
>>>>>>>mainly to do with EML.  I think it's important to leverage the work done on
>>>>>>>EML and integrate it with the semantics.  We need to establish a mapping
>>>>>>>between EML and the OWL ontologies and capture the semantics that are
>>>>>>>implicit in EML.
>>>>>>>
>>>>>>>I think that a lot of the semantic description of the dataset as a whole
>>>>>>>could be derived from the EML metadata, assuming it is reasonably complete.
>>>>>>>For example, information about the spatial and temporal extent of the
>>>>>>>dataset and about the observed taxa should be in the metadata.  Then rather
>>>>>>>than handing the user an essentially empty mapping, we will have initialized
>>>>>>>the mapping as far as possible from the EML metadata.
>>>>>>>
>>>>>>>Given a data set with EML metadata, I see a two-stage semantic registration:
>>>>>>>
>>>>>>>1)	Automatically create high-level (data set and data table level) RDF
>>>>>>>individuals for an EML-described data set.  They will be useful for allowing
>>>>>>>a high level search of a data set, which can be rejected if there's nothing
>>>>>>>of interest in the RDF individuals before the more detailed semantic
>>>>>>>registration is used.
>>>>>>>
>>>>>>>2)	Create a lower-level semantic registration of individual fields in a data
>>>>>>>table.  This will refer to the higher-level EML-based individuals for parts
>>>>>>>of the context that do not change from field to field.  When doing a
>>>>>>>semantic query, these individuals will only need to be instantiated and
>>>>>>>queried if there is a higher-level match (#1 above).
>>>>>>>
>>>>>>>Given this, in your document, I think it would make sense to re-order the
>>>>>>>sequence proposed, so that step 6 happens before steps 2-5.
>>>>>>>
>>>>>>>Rich
>>>>>
>>>>>
>>>>>_______________________________________________
>>>>>seek-kr-sms mailing list
>>>>>seek-kr-sms at ecoinformatics.org
>>>>>http://www.ecoinformatics.org/mailman/listinfo/seek-kr-sms
>>>
> MJ> 
> MJ> 

-- 
-------------------------------------------------------------------
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------


