[kepler-dev] kepler data search

Jing Tao tao at nceas.ucsb.edu
Thu Jul 21 11:04:49 PDT 2005


Hi, Dan:

I ran some tests and found that the difference depends on the query itself.

Here is the equivalent of the query that Kepler sends to Metacat:

<pathquery version="1.2">
  <querytitle>Untitled-Search-2</querytitle>
  <returndoctype>eml://ecoinformatics.org/eml-2.0.0</returndoctype>
  <returnfield>dataset/title</returnfield>
  <returnfield>individualName/surName</returnfield>
  <returnfield>entityName</returnfield>
  <querygroup operator="INTERSECT">
    <queryterm searchmode="contains" casesensitive="false">
      <value>Datos</value>
      <pathexpr>dataset/title</pathexpr>
    </queryterm>
    <querygroup operator="UNION">
      <queryterm searchmode="contains" casesensitive="false">
        <value>http://%</value>
        <pathexpr>dataset/dataTable/physical/distribution/online/url</pathexpr>
      </queryterm>
      <queryterm searchmode="contains" casesensitive="false">
        <value>ecogrid://%</value>
        <pathexpr>dataset/dataTable/physical/distribution/online/url</pathexpr>
      </queryterm>
      <queryterm searchmode="contains" casesensitive="false">
        <value>srb://%</value>
        <pathexpr>dataset/spatialRaster/physical/distribution/online/url</pathexpr>
      </queryterm>
    </querygroup>
  </querygroup>
</pathquery>

If I paste this query into the query page at
http://ecogrid.ecoinformatics.org/ogsa/style/skins/dev/querymetacat.html
the search takes 5 minutes and 30 seconds, which is almost the same as
the Kepler search time.

If I simplify the query (removing the http://, ecogrid://, and srb:// URL constraints) to:
<pathquery version="1.2">
<querytitle>Untitled-Search-2</querytitle>
  <returndoctype>eml://ecoinformatics.org/eml-2.0.0</returndoctype>
  <returnfield>dataset/title</returnfield>
  <returnfield>individualName/surName</returnfield>
  <returnfield>entityName</returnfield>
  <querygroup operator="UNION">
    <queryterm searchmode="contains" casesensitive="false">
      <value>Datos</value>
      <pathexpr>dataset/title</pathexpr>
    </queryterm>
  </querygroup>
</pathquery>

the search takes 1 minute and 30 seconds. The results above are for
PostgreSQL.
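
The two queries differ only in the nested UNION group of URL constraints and in the operator of the outer group. As a rough illustration (this is not Kepler's or Metacat's own code), the Python sketch below builds both variants with the standard library. The helper names add_term and build_pathquery are hypothetical; the element names, attributes, and values are copied from the queries above.

```python
import xml.etree.ElementTree as ET

RETURN_FIELDS = ["dataset/title", "individualName/surName", "entityName"]

# URL constraints copied from the complex query above: (value, pathexpr) pairs.
URL_CONSTRAINTS = [
    ("http://%", "dataset/dataTable/physical/distribution/online/url"),
    ("ecogrid://%", "dataset/dataTable/physical/distribution/online/url"),
    ("srb://%", "dataset/spatialRaster/physical/distribution/online/url"),
]


def add_term(parent, value, path):
    """Append a case-insensitive 'contains' queryterm to parent (hypothetical helper)."""
    term = ET.SubElement(parent, "queryterm",
                         searchmode="contains", casesensitive="false")
    ET.SubElement(term, "value").text = value
    ET.SubElement(term, "pathexpr").text = path
    return term


def build_pathquery(title_term, url_constraints=()):
    """Return the pathquery XML as a string (hypothetical helper).

    With URL constraints the outer group is INTERSECT and holds a nested
    UNION group; without them it is a single UNION group, matching the
    two queries in this message.
    """
    root = ET.Element("pathquery", version="1.2")
    ET.SubElement(root, "querytitle").text = "Untitled-Search-2"
    ET.SubElement(root, "returndoctype").text = "eml://ecoinformatics.org/eml-2.0.0"
    for field in RETURN_FIELDS:
        ET.SubElement(root, "returnfield").text = field

    operator = "INTERSECT" if url_constraints else "UNION"
    outer = ET.SubElement(root, "querygroup", operator=operator)
    add_term(outer, title_term, "dataset/title")

    if url_constraints:
        inner = ET.SubElement(outer, "querygroup", operator="UNION")
        for value, path in url_constraints:
            add_term(inner, value, path)
    return ET.tostring(root, encoding="unicode")


complex_query = build_pathquery("Datos", URL_CONSTRAINTS)  # 4 queryterms
simple_query = build_pathquery("Datos")                    # 1 queryterm
print(simple_query)
```

Either string can be pasted into the query page for a side-by-side timing.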

Here are some interesting results for Oracle. If I point to the KNB
Metacat, which runs on Oracle, via
http://knb.ecoinformatics.org/knb/style/skins/dev/querymetacat.html
the first query takes 1 minute and 30 seconds.
The second query takes 5 seconds!

Here is a table of the results:

                        complex query    simple query
ecogrid (PostgreSQL)    330 sec          90 sec
knb (Oracle)            90 sec           5 sec
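
To put these numbers in relative terms, here is a small Python sketch of the arithmetic. It is only an illustration, not part of the measurements. It takes the complex PostgreSQL time as 330 seconds, that is, the 5 minutes and 30 seconds measured above; the function name slowdown is hypothetical.

```python
# Timings in seconds, from the measurements in this message.
timings = {
    "ecogrid (PostgreSQL)": {"complex": 330, "simple": 90},
    "knb (Oracle)": {"complex": 90, "simple": 5},
}


def slowdown(slow, fast):
    """Ratio of two timings, rounded to one decimal place."""
    return round(slow / fast, 1)


# Cost of the URL constraints on each server.
for server, t in timings.items():
    ratio = slowdown(t["complex"], t["simple"])
    print(f"{server}: complex query is {ratio}x slower than simple")

# PostgreSQL versus Oracle for each query.
pg = timings["ecogrid (PostgreSQL)"]
ora = timings["knb (Oracle)"]
for kind in ("complex", "simple"):
    ratio = slowdown(pg[kind], ora[kind])
    print(f"{kind} query: PostgreSQL is {ratio}x slower than Oracle")
```

With these figures, the URL constraints cost roughly 3.7x on PostgreSQL and 18x on Oracle, while PostgreSQL is roughly 3.7x slower than Oracle on the complex query and 18x slower on the simple one.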


Thanks,

Jing



Dan Higgins wrote:

> I was pointing at KNB. If Morpho points to the ecogrid Metacat, the 
> search takes only 5 sec!!! So the difference with Kepler is even more 
> extreme!
>
> Dan
>
> Jing Tao wrote:
>
>> Hi, Dan:
>>
>> Where did Morpho point to? KNB or ecogrid.ecoinformatics.org? The 
>> two Metacats use different databases. Also, Kepler's query is more 
>> complex than Morpho's.
>>
>> Jing
>> Dan Higgins wrote:
>>
>>> Hi All,
>>>
>>>    Hey, something is very strange in Kepler data searches. A Kepler 
>>> search for 'datos' now takes about 6 minutes, while a Morpho search 
>>> for 'datos' returns in 20 seconds! (I know they are different 
>>> Metacats and there is an ecogrid overhead, but that difference is huge!)
>>>
>>> Dan
>>>
>
>


