[seek-dev] EcoGrid Query and problems with pubDate

Steve Tekell stekell at lternet.edu
Thu Sep 30 15:04:51 PDT 2004


There seem to be some potential problems with pubDate and maybe other
dates.

eml/dataset/pubDate 
can be a year (2002) or a date (2002-06-21)
(maybe year+month and no day is ok, too, but I'll ignore that case for now)
So, for EML, the datatype of the field is ambiguous.
(I haven't begun to look into the other schemas that I'll be searching yet).

If a user enters 2002-01-01 as the start date, it generates the condition
<condition operator="GREATER THAN OR EQUALS"
concept="/eml/dataset/pubDate">2002-01-01</condition>

which logically should return anything published in 2002 or later, but it
doesn't.  Items with pubDate=2002 won't be returned.  I assume if an item
had a pubDate of 2002-06-21 it would be returned, but I am only getting
items where the pubDate is a Year rather than a Date.

I am guessing that it's doing a String compare on pubDate, whereas maybe
collection date is actually comparing dates.  Collection date searches all
time out, like the geographic boundary searches, since it's storing
everything as strings and doing type conversions on the fly.
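
To illustrate the string-compare guess, here is a minimal Python sketch.
The stored values are made up for the example and this is not Metacat's
actual code; it only shows how a lexicographic comparison treats a bare
year against a full date.

```python
# Hypothetical pubDate values, stored as plain strings (assumption).
stored = ["2001", "2002", "2002-06-21", "2003"]
start = "2002-01-01"

# Lexicographic comparison: "2002" is a prefix of "2002-01-01", so it
# sorts before it and fails the GREATER THAN OR EQUALS test.
matches = [d for d in stored if d >= start]
print(matches)  # ['2002-06-21', '2003']
```

Under this assumption a year-only pubDate of 2002 is dropped even though
it logically belongs in the result, while later years still match.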

I guess one solution is for me to cripple the app to only allow pubDate to
be a Year instead of Date and treat it separately from collection dates.
However, this is an EML/Metacat-specific solution.  If other schemas store
pubDate as a Date, then using only Year could cause other problems (invalid
input).
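
Another possibility, sketched here only as an illustration, would be to
normalize both forms before comparing.  The function names below are
hypothetical and nothing like this exists in EcoGrid or Metacat as far
as this post shows; the year+month case is ignored, as above.

```python
from datetime import date

def pub_date_range(value):
    """Return (earliest, latest) day covered by 'YYYY' or 'YYYY-MM-DD'."""
    value = value.strip()
    if len(value) == 4 and value.isdigit():
        year = int(value)
        return date(year, 1, 1), date(year, 12, 31)
    d = date.fromisoformat(value)  # raises ValueError on other formats
    return d, d

def on_or_after(pub_date, start):
    """True if any part of the pubDate period is on or after start."""
    _, latest = pub_date_range(pub_date)
    return latest >= date.fromisoformat(start)

print(on_or_after("2002", "2002-01-01"))        # True
print(on_or_after("2002-06-21", "2002-01-01"))  # True
print(on_or_after("2001", "2002-01-01"))        # False
```

Treating a bare year as the whole-year interval keeps Date input valid
for schemas that store full dates, at the cost of doing the conversion
somewhere in the query path.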


I put up a build, a snapshot of my work in progress, on my dev server so
that you can see this problem as well as see the various performance
problems.  
http://lternet-163.lternet.edu:8080/ecogrid/query.jsp 
Remember this is just an early stage test app.
The results screen currently shows the execution time for the EcoGrid client
query as well as the generated Query XML.  So you can grab the XML of
queries that time out and do other tests.

Steve




