[seek-dev] Re: [kepler-dev] When Searching

Rod Spears rods at ku.edu
Tue Oct 5 09:38:29 PDT 2004

The Ecogrid registry will have a certain amount of metadata describing 
what would be the equivalent of a portal. We are still in the middle of 
implementing the registry and determining exactly what will be in the 
metadata, but I would imagine it will describe the providers and say 
that it is a "herp" endpoint. I doubt it would describe anything much 
more specific than that. In other words, I doubt it would describe that 
it is an endpoint for just "frogs".

Now the SRB and Metacat have metadata associated with each data item. 
When you use the search form today, it uses a hardcoded Metacat endpoint 
and searches the metadata for the value you entered.

The interesting thing about DiGIR is that it doesn't have any metadata* 
(see below) about all the data in each provider. Unlike the other 
datasources, DiGIR providers use a specific schema, i.e. DarwinCore. So 
when a search is completed we will be creating the metadata "on the fly" 
to describe the providers that found the data in question.

For example, if a user searches all the Herp providers for a specific 
species of frog, they may get 200 data records about this frog. We will 
condense this data down to the one or more providers that the records 
came from, and these providers will be displayed in the list in the 
"Data" tab. Then the user can be more specific about which datasets they 
want to use in their workflow. (Although, after reading Dan's mail, I am 
not sure having a list of providers is really the thing we need.)
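
A minimal sketch of that condensing step, assuming each returned record 
is a plain dictionary that carries the name of the provider it came from. 
The "provider" key, the provider names, and the function name are 
hypothetical and are not part of DiGIR or the Ecogrid code:

```python
from collections import Counter


def summarize_by_provider(records):
    """Collapse matching records into one (provider, count) pair per provider."""
    counts = Counter(rec["provider"] for rec in records)
    # most_common() orders providers by descending record count.
    return counts.most_common()


# Hypothetical search results; real records would hold DarwinCore fields.
records = [
    {"provider": "Provider A", "ScientificName": "Rana pipiens"},
    {"provider": "Provider B", "ScientificName": "Rana pipiens"},
    {"provider": "Provider A", "ScientificName": "Rana pipiens"},
]

for provider, count in summarize_by_provider(records):
    print(f"{provider}: {count} records")
```

The per-provider counts are the "on the fly" metadata in this sketch; the 
list shown in the "Data" tab would hold one entry per provider rather 
than one per record.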

* DiGIR providers do have metadata that is available, and there is a 
keyword element in the returned XML, but its contents typically do not 
have the fidelity that we need. For example, the provider "MaNIS data 
provider for the Natural History Museum of Los Angeles County." has the 
keywords "museum, specimen, mammal" but it doesn't contain any metadata 
about what kind of mammals.
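
On the question quoted below of searching every DarwinCore field versus 
one specific field, the difference can be sketched as follows. This is 
an in-memory illustration only, not the DiGIR query protocol; the field 
names are the ones listed later in this thread, and the function names 
and sample records are hypothetical:

```python
# Field names taken from the list quoted later in this thread.
DARWIN_CORE_FIELDS = [
    "ScientificName", "Kingdom", "Phylum", "Class", "Order",
    "Family", "Genus", "Species", "Subspecies",
]


def matches(record, term, fields=DARWIN_CORE_FIELDS):
    """Return True if term occurs (case-insensitively) in any listed field."""
    term = term.lower()
    return any(term in str(record.get(f, "")).lower() for f in fields)


def search(records, term, fields=DARWIN_CORE_FIELDS):
    """Keep the records that match term in at least one of the fields."""
    return [r for r in records if matches(r, term, fields)]


# Hypothetical records for illustration.
sample = [
    {"ScientificName": "Rana pipiens", "Genus": "Rana", "Family": "Ranidae"},
    {"ScientificName": "Bufo bufo", "Genus": "Bufo", "Family": "Bufonidae"},
]

all_fields_hits = search(sample, "ranidae")               # checks every field
one_field_hits = search(sample, "ranidae", ["Genus"])     # checks Genus only
```

Searching all fields does up to nine comparisons per record instead of 
one, which is the extra cost in this sketch; what it costs against real 
providers is not known from this example.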


Dan Higgins wrote:

> Hi Rod,
>    I had some thoughts on a specific example of search and return 
> fields that I thought I would pass on.
>    As part of the GARP workflow, we have the need for an actor to 
> query DiGIR to return species occurrence information; this information is 
> to be used as an input to GARP calculations. I must admit that I 
> currently don't understand DiGIR and Darwin Core well enough to even 
> formulate such a query, but I think Ricardo Pereira there at KU may 
> have already been creating such queries for some GARP work. But we 
> need to have an actor which will return occurrence/abundance data for a 
> number (~1000) of species.
>    The current thinking is that we need to run numerous parallel 
> calculations (one for each species), so we need to create a list of 
> species occurrence data (DiGIR query?), remove a species from the list 
> when GARP has been run for that species, and save the GARP results for 
> later analyses.
>    So, in any case, it seems we need some query capability beyond the 
> Data Search Tab.
> Dan Higgins
> -----
> Rod Spears wrote:
>> I know the "Data" Search Tab and field is somewhat temporary... But 
>> given that is what we have today and will have for some time, should I 
>> use the text in the search field to search every field in the Darwin 
>> core schema (as Chad has suggested, which is obviously more costly 
>> than searching a specific field, but how costly I don't know) or should 
>> I enable it so they can search a specific field? Since these are 
>> scientists, would it make sense that they know which field they are 
>> searching?
>> Rod
>> Deana Pennington wrote:
>>> I think you should make all of them searchable.  Deana
>>> Chad Berkley wrote:
>>>> ahh, ok.  that wasn't clear from your first message.  I'm not an 
>>>> ecologist so I'm not completely sure, but it seems like all of them 
>>>> are important.  Is there a penalty for searching them all?  Is 
>>>> there a domain scientist that can chime in on this?
>>>> chad
>>>> Rod Spears wrote:
>>>>> I am in the middle of implementing all this and was wondering what 
>>>>> fields "should" be searched.
>>>>> I have it all working, I am just making it more generic.
>>>>> Rod
>>>>> Chad Berkley wrote:
>>>>>> Hi Rod,
>>>>>> If you're asking what DiGIR fields are currently being searched 
>>>>>> on the grid, the answer is none.  To my knowledge, DiGIR does not 
>>>>>> currently have an ecogrid interface so it is not searched.
>>>>>> chad
>>>>>> Rod Spears wrote:
>>>>>>> When a search is done in the "data" tab what are they searching 
>>>>>>> for in DiGIR?
>>>>>>> ScientificName
>>>>>>> Kingdom
>>>>>>> Phylum
>>>>>>> Class
>>>>>>> Order
>>>>>>> Family
>>>>>>> Genus
>>>>>>> Species
>>>>>>> Subspecies
>>>>>>> Rod
>>>>>>> _______________________________________________
>>>>>>> kepler-dev mailing list
>>>>>>> kepler-dev at ecoinformatics.org
>>>>>>> http://www.ecoinformatics.org/mailman/listinfo/kepler-dev
>>>>> -- 
>>>>> Rod Spears
>>>>> Biodiversity Research Center
>>>>> University of Kansas
>>>>> 1345 Jayhawk Boulevard
>>>>> Lawrence, KS 66045, USA
>>>>> Tel: 785 864-4082, Fax: 785 864-5335
>>>> _______________________________________________
>>>> seek-dev mailing list
>>>> seek-dev at ecoinformatics.org
>>>> http://www.ecoinformatics.org/mailman/listinfo/seek-dev
