[seek-kr-sms] SMS Partial Design -- Request for Comments

Bertram Ludaescher ludaesch at sdsc.edu
Tue Apr 6 10:48:04 PDT 2004


The qformat={knb,xml} is quite nice.

It reminds me of what I heard about LSIDs (Life Science Identifiers). 

Do we have in SEEK (EcoGrid?) a simple URI-based mechanism to identify 
datasets, and URI-components that have different ways to "resolve" the 
URI (similar to the above knb/xml trick)?

Bertram
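
A minimal sketch of the qformat switch, assuming the Metacat read URL
keeps its output format in a "qformat" query parameter as in the URL
quoted below. The helper name with_qformat is hypothetical, and the
sketch only rewrites the URL; it does not fetch anything.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_qformat(url: str, qformat: str) -> str:
    """Return `url` with its qformat query parameter set to `qformat`.

    Hypothetical helper: the parameter name comes from the thread, but
    treating it as a generic "resolver" switch is an assumption.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Replace an existing qformat value in place so parameter order is kept.
    rewritten = [(k, qformat if k == "qformat" else v) for k, v in pairs]
    if not any(k == "qformat" for k, _ in pairs):
        rewritten.append(("qformat", qformat))
    return urlunsplit(parts._replace(query=urlencode(rewritten)))


html_url = (
    "http://metacat.nceas.ucsb.edu/knb/servlet/metacat"
    "?action=read&qformat=knb&docid=knb-lter-gce.23.4"
)
xml_url = with_qformat(html_url, "xml")
```

Keeping the docid fixed and varying only the format parameter is what
makes the URL behave like one identifier with several resolutions.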

>>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
MJ> 
MJ> Hey Bertram,
>> Looking at the EML file that Matt pointed us to, I see a lot of
>> structural information e.g., here
>> 
MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
>> 
>> and some temporal, spatial, and taxonomic coverage.
>> 
>> Is that where we should start?
MJ> 
MJ> Yeah, that's a good place to start. Note that the URL I sent is 
MJ> delivering the EML in HTML format, but you can easily switch it to 
MJ> deliver all of the metadata in XML format by changing "qformat=knb" 
MJ> to "qformat=xml" in the URL.  The XML format is obviously much 
MJ> easier to machine parse, and it includes all of the metadata in one 
MJ> document, rather than breaking it into two or more as the HTML view does.
MJ> 
MJ> Matt
MJ> 
MJ> Bertram Ludaescher wrote:
>> Shawn: 
>> 
>> First, I wish we all had such nice design documents! Great!
>> 
>> Second, I agree with Rich and Matt that we need to make use of EML as
>> much as we can. Also Rich's idea of "initializing" the semantic
>> registration mapping from the EML info makes sense to me.
>> 
>> So much for the good news ;-)
>> 
>> Now I think we need to figure out a way to work with some actual EML
>> examples and see how those and the "ad-hoc" examples that Shawn has
>> come up with (or is coming up with) fit into a single framework.
>> 
>> The nice thing about Shawn's examples (e.g., in the DILS paper) is
>> that they are simple and show the approach in principle. 
>> 
>> Looking at the EML file that Matt pointed us to, I see a lot of
>> structural information e.g., here
>> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
>> 
>> and some temporal, spatial, and taxonomic coverage.
>> 
>> Is that where we should start?
>> 
>> Shawn, Rich, Matt?
>> 
>> Bertram
>> 
>> 
>> 
>> 
>> 
>>>>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
>> 
MJ> 
MJ> I think it's a great approach, Rich.  EML does indeed have a lot of the 
MJ> information that you would want in the semantic representation, and 
MJ> grabbing information out of EML would reveal a lot about 
MJ> additions/changes to EML that would be useful.  There are a lot of 
MJ> partial EML documents filled out, but more recently a few pretty 
MJ> extensively filled out documents.  In particular, there are 181 data 
MJ> sets in the KNB from the GCE LTER site that have extensive metadata, 
MJ> including taxonomic, spatial, and temporal coverage, and full metadata on 
MJ> the data tables.  And they have data available.  Starting with one or 
MJ> more of these might be useful as a mapping exercise.  For example, this 
MJ> data set --
MJ> 
MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4
MJ> 
MJ> -- (and others like it) contains abundance data that could be used by 
MJ> the Garp algorithm if only a researcher could determine that the 
MJ> relationships between what Garp requires and what is present in the data 
MJ> set are met.
MJ> 
MJ> Matt
MJ> 
MJ> Rich Williams wrote:
MJ> 
>> 
>>>> I agree that metadata to semantics is a generic issue, not just an issue for
>>>> EML.  For example, I expect that we'll find it useful to grab the basic
>>>> structure (syntax) of an actor from the MoML when semantically describing
>>>> it.  For now, I think EML is particularly important since it's in use and
>>>> has significant semantic content.  The ontologies currently provide a basic
>>>> framework that should be able to handle the mapping, though I'm sure that
>>>> implementing the mapping will reveal plenty of holes in the details.  I'm
>>>> ready to work on it if there's consensus that this is an important
>>>> direction.
>>>> 
>>>> Rich
>>>> 
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Shawn Bowers [mailto:bowers at sdsc.edu]
>>>>> Sent: Monday, April 05, 2004 9:32 PM
>>>>> To: Rich Williams
>>>>> Cc: seek-kr-sms at ecoinformatics.org; Dave Thau; Ilkay Altintas; Joseph
>>>>> Goguen; Jenny Guilian WANG
>>>>> Subject: Re: [seek-kr-sms] SMS Partial Design -- Request for Comments
>>>>> 
>>>>> 
>>>>> 
>>>>> This makes sense to me: do the metadata "harvesting" first to build the
>>>>> initial "template" (or as you say, high-level mapping); then let the
>>>>> data or service provider fill in the additional mapping as needed.
>>>>> 
>>>>> Note also that there may be different types of metadata: EML for
>>>>> datasets (there are possibly others for datasets, but we seem focused on
>>>>> EML) and MoML or WSDL for services.  Not sure how much can be obtained
>>>>> from the service ones.
>>>>> 
>>>>> I also wonder if the ontologies are already close to being able to
>>>>> handle the mapping from the high-level EML.  We should look into this.
>>>>> 
>>>>> Thanks,
>>>>> Shawn
>>>>> 
>>>>> Rich Williams wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> Good stuff Shawn!  Here are a few comments on the registration
>>>>>> mapping part, mainly to do with EML.  I think it's important to
>>>>>> leverage the work done on EML and integrate it with the semantics.
>>>>>> We need to establish a mapping between EML and the OWL ontologies
>>>>>> and capture the semantics that are implicit in EML.
>>>>>> 
>>>>>> I think that a lot of the semantic description of the dataset as a
>>>>>> whole could be derived from the EML metadata, assuming it is
>>>>>> reasonably complete.  For example, information about the spatial and
>>>>>> temporal extent of the dataset and about the observed taxa should be
>>>>>> in the metadata.  Then rather than handing the user an essentially
>>>>>> empty mapping, we will have initialized the mapping as far as
>>>>>> possible from the EML metadata.
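
One way to picture this "initialize the mapping from EML" step is the
sketch below. It assumes a simplified, un-namespaced EML-like fragment
(real EML documents carry namespaces and many more elements), and every
value in SAMPLE_EML is invented for illustration. The function name
initial_mapping and the shape of the returned dict are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EML-like fragment; values are made up.
SAMPLE_EML = """<eml>
  <dataset>
    <coverage>
      <geographicCoverage>
        <boundingCoordinates>
          <westBoundingCoordinate>-81.4</westBoundingCoordinate>
          <eastBoundingCoordinate>-81.2</eastBoundingCoordinate>
          <northBoundingCoordinate>31.5</northBoundingCoordinate>
          <southBoundingCoordinate>31.3</southBoundingCoordinate>
        </boundingCoordinates>
      </geographicCoverage>
      <temporalCoverage>
        <rangeOfDates>
          <beginDate><calendarDate>2000-10-01</calendarDate></beginDate>
          <endDate><calendarDate>2000-10-31</calendarDate></endDate>
        </rangeOfDates>
      </temporalCoverage>
      <taxonomicCoverage>
        <taxonomicClassification>
          <taxonRankName>Genus</taxonRankName>
          <taxonRankValue>Spartina</taxonRankValue>
        </taxonomicClassification>
      </taxonomicCoverage>
    </coverage>
  </dataset>
</eml>"""


def initial_mapping(eml_xml: str) -> dict:
    """Pre-fill a registration template from the coverage metadata."""
    coverage = ET.fromstring(eml_xml).find("./dataset/coverage")

    def text(path: str):
        # Missing elements become None so the provider can fill them in later.
        node = coverage.find(path) if coverage is not None else None
        return node.text.strip() if node is not None and node.text else None

    box = "geographicCoverage/boundingCoordinates/"
    dates = "temporalCoverage/rangeOfDates/"
    taxa = [] if coverage is None else [
        (t.text or "").strip() for t in coverage.iter("taxonRankValue")
    ]
    return {
        "spatial": {side: text(f"{box}{side}BoundingCoordinate")
                    for side in ("west", "east", "north", "south")},
        "temporal": {"begin": text(f"{dates}beginDate/calendarDate"),
                     "end": text(f"{dates}endDate/calendarDate")},
        "taxa": taxa,
    }


template = initial_mapping(SAMPLE_EML)
```

Anything the metadata does not supply stays None, so the data provider
is handed a partly filled template rather than an empty one.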
>>>>>> 
>>>>>> Given a data set with EML metadata, I see a two-stage semantic
>>>>>> registration:
>>>>>> 
>>>>>> 1)	Automatically create high-level (data set and data table level)
>>>>>> RDF individuals for an EML-described data set.  They will be useful
>>>>>> for allowing a high level search of a data set, which can be
>>>>>> rejected if there's nothing of interest in the RDF individuals
>>>>>> before the more detailed semantic registration is used.
>>>>>> 
>>>>>> 2)	Create a lower-level semantic registration of individual fields
>>>>>> in a data table.  This will refer to the higher-level EML-based
>>>>>> individuals for parts of the context that do not change from field
>>>>>> to field.  When doing a semantic query, these individuals will only
>>>>>> need to be instantiated and queried if there is a higher-level match
>>>>>> (#1 above).
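
The two-stage idea can be sketched roughly as follows. This is plain
Python standing in for RDF individuals and an ontology-backed query, so
the class names, the concept labels, and the docids are all
hypothetical; only the control flow (cheap data-set-level test first,
field-level lookup second) reflects the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRegistration:
    """Stage 2: semantic registration of one column (hypothetical)."""
    column: str
    concept: str  # stand-in for an ontology term


@dataclass
class DatasetRegistration:
    """Stage 1: high-level description harvested from the metadata."""
    docid: str
    taxa: set
    years: range
    fields: list = field(default_factory=list)


def stage1_matches(ds: DatasetRegistration, taxon: str, year: int) -> bool:
    # Cheap data-set-level test on the harvested coverage.
    return taxon in ds.taxa and year in ds.years


def search(datasets, taxon: str, year: int, concept: str):
    hits = []
    for ds in datasets:
        if not stage1_matches(ds, taxon, year):
            continue  # rejected early; field registrations are never touched
        hits.extend((ds.docid, f.column) for f in ds.fields
                    if f.concept == concept)
    return hits


# Invented example registrations.
datasets = [
    DatasetRegistration("example.1.1", {"Spartina"}, range(2000, 2002),
                        [FieldRegistration("count", "Abundance"),
                         FieldRegistration("plot", "SamplingPlot")]),
    DatasetRegistration("example.2.1", {"Littoraria"}, range(1995, 1999),
                        [FieldRegistration("density", "Abundance")]),
]
result = search(datasets, "Spartina", 2000, "Abundance")
```

In a real system the field-level registrations would be loaded lazily
after a stage 1 match; here they are already in memory for brevity.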
>>>>>> 
>>>>>> Given this, in your document, I think it would make sense to
>>>>>> re-order the sequence proposed, so that step 6 happens before
>>>>>> steps 2-5.
>>>>>> 
>>>>>> Rich
>>>> 
>>>> 
>>>> _______________________________________________
>>>> seek-kr-sms mailing list
>>>> seek-kr-sms at ecoinformatics.org
>>>> http://www.ecoinformatics.org/mailman/listinfo/seek-kr-sms
>> 
MJ> 
MJ> 
MJ> -- 
MJ> -------------------------------------------------------------------
MJ> Matt Jones                                     jones at nceas.ucsb.edu
MJ> http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
MJ> National Center for Ecological Analysis and Synthesis (NCEAS)
MJ> University of California Santa Barbara
MJ> Interested in ecological informatics? http://www.ecoinformatics.org
MJ> -------------------------------------------------------------------