[seek-kr-sms] SMS Partial Design -- Request for Comments
Bertram Ludaescher
ludaesch at sdsc.edu
Tue Apr 6 10:48:04 PDT 2004
The qformat={knb,xml} is quite nice.
Reminds me of what I heard about the LSID (life science id).
Do we have in SEEK (EcoGrid?) a simple URI-based mechanism to identify
datasets, and URI-components that have different ways to "resolve" the
URI (similar to the above knb/xml trick)?
Bertram
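
For reference, the knb/xml switch mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name with_qformat is made up here and is not part of Metacat or EcoGrid. It assumes nothing beyond the URL form that Matt posted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_qformat(url: str, qformat: str) -> str:
    """Return url with its qformat query parameter set to the given value.

    Hypothetical helper for illustration; not a Metacat API.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)

    new_params = []
    replaced = False
    for key, value in params:
        if key == "qformat":
            # Swap the output format, leave every other parameter untouched.
            new_params.append((key, qformat))
            replaced = True
        else:
            new_params.append((key, value))
    if not replaced:
        # The URL had no qformat at all, so add one.
        new_params.append(("qformat", qformat))

    return urlunsplit(parts._replace(query=urlencode(new_params)))


html_url = (
    "http://metacat.nceas.ucsb.edu/knb/servlet/metacat"
    "?action=read&qformat=knb&docid=knb-lter-gce.23.4"
)
xml_url = with_qformat(html_url, "xml")
print(xml_url)
```

The same docid then acts as the stable identifier, and qformat only selects how it is rendered, which is the sense in which it resembles a resolvable ID.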
>>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
MJ>
MJ> Hey Bertram,
>> Looking at the EML file that Matt pointed us to, I see a lot of
>> structural information e.g., here
>>
MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
>>
>> and some temporal, spatial, and taxonomic coverage.
>>
>> Is that where we should start?
MJ>
MJ> Yeah, that's a good place to start. Note that the URL I sent is
MJ> delivering the EML in HTML format, but you can easily switch it to
MJ> deliver all of the metadata in XML format by changing "qformat=knb" in
MJ> the URL to "qformat=xml" in the URL. The XML format is obviously much
MJ> easier to machine parse, and it includes all of the metadata in one
MJ> document, rather than breaking it into two or more as the HTML view does.
MJ>
MJ> Matt
MJ>
MJ> Bertram Ludaescher wrote:
>> Shawn:
>>
>> First, I wish we all had such nice design documents! Great!
>>
>> Second, I agree with Rich and Matt that we need to make use of EML as
>> much as we can. Also Rich's idea of "initializing" the semantic
>> registration mapping from the EML info makes sense to me.
>>
>> So much for the good news ;-)
>>
>> Now I think we need to figure out a way to work with some actual EML
>> examples and see how those and the "ad-hoc" examples that Shawn has
>> come up with (or is coming up with) fit into a single framework.
>>
>> The nice thing about Shawn's examples (e.g., in the DILS paper) is
>> that they are simple and show the approach in principle.
>>
>> Looking at the EML file that Matt pointed us to, I see a lot of
>> structural information e.g., here
>> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4&displaymodule=entity&entitytype=dataTable&entityindex=1
>>
>> and some temporal, spatial, and taxonomic coverage.
>>
>> Is that where we should start?
>>
>> Shawn, Rich, Matt?
>>
>> Bertram
>>
>>
>>
>>
>>
>>>>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
>>
MJ>
MJ> I think it's a great approach, Rich. EML does indeed have a lot of the
MJ> information that you would want in the semantic representation, and
MJ> grabbing information out of EML would reveal a lot about
MJ> additions/changes to EML that would be useful. There are a lot of
MJ> partial EML documents filled out, but more recently a few pretty
MJ> extensively filled out documents. In particular, there are 181 data
MJ> sets in the KNB from the GCE LTER site that have extensive metadata,
MJ> including taxonomic, spatial, and temporal coverage, and full metadata on
MJ> the data tables. And they have data available. Starting with one or
MJ> more of these might be useful as a mapping exercise. For example, this
MJ> data set --
MJ>
MJ> http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=knb-lter-gce.23.4
MJ>
MJ> -- (and others like it) contains abundance data that could be used by
MJ> the Garp algorithm if only a researcher could determine that the
MJ> relationships between what Garp requires and what is present in the data
MJ> set are met.
MJ>
MJ> Matt
MJ>
MJ> Rich Williams wrote:
MJ>
>>
>>>> I agree that metadata to semantics is a generic issue, not just an issue for
>>>> EML. For example, I expect that we'll find it useful to grab the basic
>>>> structure (syntax) of an actor from the MoML when semantically describing
>>>> it. For now, I think EML is particularly important since it's in use and
>>>> has significant semantic content. The ontologies currently provide a basic
>>>> framework that should be able to handle the mapping, though I'm sure that
>>>> implementing the mapping will reveal plenty of holes in the details. I'm
>>>> ready to work on it if there's consensus that this is an important
>>>> direction.
>>>>
>>>> Rich
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Shawn Bowers [mailto:bowers at sdsc.edu]
>>>>> Sent: Monday, April 05, 2004 9:32 PM
>>>>> To: Rich Williams
>>>>> Cc: seek-kr-sms at ecoinformatics.org; Dave Thau; Ilkay Altintas; Joseph
>>>>> Goguen; Jenny Guilian WANG
>>>>> Subject: Re: [seek-kr-sms] SMS Partial Design -- Request for Comments
>>>>>
>>>>>
>>>>>
>>>>> This makes sense to me: do the metadata "harvesting" first to build the
>>>>> initial "template" (or as you say, high-level mapping); then let the
>>>>> data or service provider fill in the additional mapping as needed.
>>>>>
>>>>> Note also that there may be different types of metadata: EML for
>>>>> datasets (there are possibly others for datasets, but we seem focused on
>>>>> EML) and MoML or WSDL for services. Not sure how much can be obtained
>>>>> from the service ones.
>>>>>
>>>>> I also wonder if the ontologies are already close to being able to
>>>>> handle the mapping from the high-level EML. We should look into this.
>>>>>
>>>>> Thanks,
>>>>> Shawn
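
A minimal sketch of the "harvest first, then let the provider fill in" idea, assuming the coverage elements sit in an EML-like document. The XML below is a hand-written, simplified fragment with invented values (it is not the GCE data set, and it omits namespaces and most of EML); the function name harvest_template and the shape of the returned template are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified EML-like fragment; all values are invented.
SAMPLE_EML = """\
<dataset>
  <title>Example abundance survey</title>
  <coverage>
    <geographicCoverage>
      <boundingCoordinates>
        <westBoundingCoordinate>-81.5</westBoundingCoordinate>
        <eastBoundingCoordinate>-81.0</eastBoundingCoordinate>
        <northBoundingCoordinate>31.5</northBoundingCoordinate>
        <southBoundingCoordinate>31.3</southBoundingCoordinate>
      </boundingCoordinates>
    </geographicCoverage>
    <temporalCoverage>
      <rangeOfDates>
        <beginDate><calendarDate>2001-06-01</calendarDate></beginDate>
        <endDate><calendarDate>2001-08-31</calendarDate></endDate>
      </rangeOfDates>
    </temporalCoverage>
    <taxonomicCoverage>
      <taxonomicClassification>
        <taxonRankName>Genus</taxonRankName>
        <taxonRankValue>Spartina</taxonRankValue>
      </taxonomicClassification>
    </taxonomicCoverage>
  </coverage>
</dataset>
"""


def harvest_template(eml_text):
    """Build an initial (high-level) mapping template from coverage metadata."""
    root = ET.fromstring(eml_text)

    def text(path):
        value = root.findtext(path)
        return value.strip() if value else None

    sides = ("west", "east", "north", "south")
    return {
        "title": text(".//title"),
        "spatial": {
            s: text(f".//boundingCoordinates/{s}BoundingCoordinate") for s in sides
        },
        "temporal": {
            "begin": text(".//beginDate/calendarDate"),
            "end": text(".//endDate/calendarDate"),
        },
        "taxa": [t.text.strip() for t in root.iter("taxonRankValue") if t.text],
        # Field-level mappings are left empty for the data provider to fill in.
        "fields": {},
    }


template = harvest_template(SAMPLE_EML)
print(template)
```

The empty "fields" entry is the part that harvesting cannot supply; everything else comes straight from the metadata.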
>>>>>
>>>>> Rich Williams wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Good stuff Shawn! Here are a few comments on the registration mapping
>>>>>> part, mainly to do with EML. I think it's important to leverage the work
>>>>>> done on EML and integrate it with the semantics. We need to establish a
>>>>>> mapping between EML and the OWL ontologies and capture the semantics that
>>>>>> are implicit in EML.
>>>>>>
>>>>>> I think that a lot of the semantic description of the dataset as a whole
>>>>>> could be derived from the EML metadata, assuming it is reasonably complete.
>>>>>> For example, information about the spatial and temporal extent of the
>>>>>> dataset and about the observed taxa should be in the metadata. Then rather
>>>>>> than handing the user an essentially empty mapping, we will have
>>>>>> initialized the mapping as far as possible from the EML metadata.
>>>>>>
>>>>>> Given a data set with EML metadata, I see a two-stage semantic
>>>>>> registration:
>>>>>>
>>>>>> 1) Automatically create high-level (data set and data table level) RDF
>>>>>> individuals for an EML-described data set. They will be useful for allowing
>>>>>> a high level search of a data set, which can be rejected if there's nothing
>>>>>> of interest in the RDF individuals before the more detailed semantic
>>>>>> registration is used.
>>>>>>
>>>>>> 2) Create a lower-level semantic registration of individual fields in a
>>>>>> data table. This will refer to the higher-level EML-based individuals for
>>>>>> parts of the context that do not change from field to field. When doing a
>>>>>> semantic query, these individuals will only need to be instantiated and
>>>>>> queried if there is a higher-level match (#1 above).
>>>>>>
>>>>>> Given this, in your document, I think it would make sense to re-order the
>>>>>> sequence proposed, so that step 6 happens before steps 2-5.
>>>>>>
>>>>>> Rich
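
The two-stage registration described above could be queried roughly as follows. This is a sketch under stated assumptions, not the SMS design: plain Python objects stand in for the RDF individuals, and the class names, docids, and concept labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRegistration:
    """Stage 2: semantic registration of one field in a data table."""
    column: str
    concept: str  # hypothetical ontology term


@dataclass
class DatasetSummary:
    """Stage 1: high-level individual derived from the EML metadata."""
    docid: str
    taxa: set
    years: range
    fields: list = field(default_factory=list)


def search(datasets, taxon, year, concept):
    """Return (docid, column) pairs matching the query."""
    hits = []
    for ds in datasets:
        # Stage 1: reject the data set on its high-level coverage alone.
        if taxon not in ds.taxa or year not in ds.years:
            continue
        # Stage 2: only now look at the field-level registrations.
        for reg in ds.fields:
            if reg.concept == concept:
                hits.append((ds.docid, reg.column))
    return hits


datasets = [
    DatasetSummary(
        docid="example.1",
        taxa={"Spartina"},
        years=range(2000, 2003),
        fields=[
            FieldRegistration("count", "Abundance"),
            FieldRegistration("plot", "Location"),
        ],
    ),
    DatasetSummary(
        docid="example.2",
        taxa={"Littoraria"},
        years=range(1998, 2000),
        fields=[FieldRegistration("n", "Abundance")],
    ),
]

hits = search(datasets, taxon="Spartina", year=2001, concept="Abundance")
print(hits)
```

The point of the ordering is that the field-level registrations of example.2 are never touched, since its high-level coverage already rules it out.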
>>>>
>>>>
>>>> _______________________________________________
>>>> seek-kr-sms mailing list
>>>> seek-kr-sms at ecoinformatics.org
>>>> http://www.ecoinformatics.org/mailman/listinfo/seek-kr-sms
>>
MJ>
MJ>
MJ>
MJ> --
MJ> -------------------------------------------------------------------
MJ> Matt Jones jones at nceas.ucsb.edu
MJ> http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
MJ> National Center for Ecological Analysis and Synthesis (NCEAS)
MJ> University of California Santa Barbara
MJ> Interested in ecological informatics? http://www.ecoinformatics.org
MJ> -------------------------------------------------------------------