[seek-dev] RE: resultset question

Chad Berkley berkley at nceas.ucsb.edu
Tue Apr 20 09:04:38 PDT 2004


Hi,

Sorry for my late reply; we've been busy with a Morpho release.  Thanks
for getting me in gear, Rod.

In metacat, we only return leaf nodes (i.e. the text node child of a 
CDATA element like in response 4 below).  The returnfield functionality 
was originally meant as a convenient way to return enough information 
for a meaningful resultset to display, say, on a web page.  It was not 
meant to return whole document chunks for further processing.  I can see 
how this would be useful, but it would require returning a namespace 
defined chunk so that a parser would know what to do with it.  Metacat 
currently uses the returnfields to build the resultset table, then a 
request must be made for the whole document in order to do further 
processing.
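
To illustrate the kind of use returnfields were meant for, here is a
minimal Python sketch (not Metacat code) that turns a resultset into
rows for display. The sample XML is a simplified, hypothetical
resultset modeled on response 4 below, and resultset_rows is a made-up
helper name; the real Metacat resultset carries more structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified resultset modeled on "response 4" in this thread.
RESULTSET = """
<resultset>
    <returnfield xpath="dataset/creator/individualName/surname">mccartney</returnfield>
    <returnfield xpath="dataset/creator/individualName/surname">jones</returnfield>
</resultset>
"""

def resultset_rows(xml_text):
    """Return one (xpath, value) pair per returnfield leaf node."""
    root = ET.fromstring(xml_text)
    return [(rf.get("xpath"), (rf.text or "").strip())
            for rf in root.iter("returnfield")]

rows = resultset_rows(RESULTSET)
for xpath, value in rows:
    # Enough to fill a results table; the full document is fetched separately.
    print(f"{xpath:45} {value}")
```

Anything beyond displaying these rows would still need a second request
for the whole document.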

To me, responses 1-3 below all look invalid and potentially
problematic.  Without a namespace (and its schema) to parse those XML
chunks against, the parser can only check well-formedness, and any
query into these document chunks (e.g. an XPath query) may fail because
we don't know what structure to expect back before doing the
processing.
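
A small Python sketch of that problem, using only the standard library.
The chunk is a cleaned-up copy of response 1, and is_well_formed is a
hypothetical helper; the point is only that a chunk can pass a
well-formedness check while a query written against the real schema
silently finds nothing.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of "response 1": both surnames collapsed under one creator.
CHUNK = """
<eml>
  <dataset>
    <creator>
      <individualName>
        <surname>mccartney</surname>
        <surname>jones</surname>
      </individualName>
    </creator>
  </dataset>
</eml>
"""

def is_well_formed(xml_text):
    """Well-formedness is all a parser can check without a schema."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(CHUNK)

# A path-only query still works on the collapsed tree.
all_surnames = [e.text for e in
                root.findall("dataset/creator/individualName/surname")]

# A query that assumes one creator per person finds nothing for jones.
second_creator = root.find("dataset/creator[2]/individualName/surname")

print(is_well_formed(CHUNK), all_surnames, second_creator)
```

The chunk parses cleanly, yet code that expects the original document
structure (jones in the second creator) gets no match and no error.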

So I guess to make a short answer long, I agree with Peter's assessment 
of sticking with response 4 (which is basically what metacat has done 
all along).
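
For what it's worth, Peter's point that response 4 carries enough
information to rebuild format 1 can be sketched in a few lines. This is
an illustration in Python rather than the string tokenizer plus JDOM he
mentions, and build_minimal_tree is a hypothetical name, not anything in
Metacat or Xanthoria.

```python
import xml.etree.ElementTree as ET

def build_minimal_tree(root_tag, fields):
    """Rebuild a 'response 1' style tree from (xpath, value) pairs.

    Each xpath is split on '/' (the "string tokenizer" step); ancestors
    are created once and reused, and one leaf is appended per value.
    """
    root = ET.Element(root_tag)
    for xpath, value in fields:
        *ancestors, leaf = xpath.strip("/").split("/")
        node = root
        for tag in ancestors:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        ET.SubElement(node, leaf).text = value
    return root

fields = [
    ("dataset/creator/individualName/surname", "mccartney"),
    ("dataset/creator/individualName/surname", "jones"),
]
tree = build_minimal_tree("eml", fields)
xml_out = ET.tostring(tree, encoding="unicode")
print(xml_out)
```

Because the returnfield paths carry no index, both surnames land under
the same creator. That is why response 2 cannot be rebuilt this way
without extra positional information.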

chad


Rod Spears wrote:
> Is anyone better qualified than me, going to address Peter's questions?
> 
> Please someone respond, thanks.
> 
> Rod
> 
> 
> Peter McCartney wrote:
> 
>> It has to be well formed no matter what. So the question is really how
>> can we identify a namespace for the result set when the content we
>> stick in there has no hope of being valid? Further, how can we define
>> a set of rules for how the results are to be evaluated against that
>> namespace even though they will not be valid?
>> request 1: '*/creator/individualName/surname', '/eml/dataset
>>  
>> Rule 1: "content must appear in the minimal xml tree needed to
>> accommodate the information"
>>  
>> Rule 2: "content must appear in a potentially valid xml tree that
>> is invalid only because other required elements are missing."
>>  
>> Rule 3: "content must appear in a tree that places it in the correct
>> node ancestry for the declared namespace."
>>  
>>  
>> response 1: meets 1 and 3 and is well formed. Requires just knowledge 
>> of parent ancestry to build.
>> <eml>
>>     <dataset>
>>     <creator>
>>         <individualName>
>>                 <surname>mccartney</surname>
>>                 <surname>jones</surname>
>>         </individualName>
>>     </creator>
>> </dataset>
>> </eml>
>>  
>> response 2: meets 1, 2 and 3 and is well formed. Requires knowledge of 
>> ancestry and index (ie jones is in creator[2] of dataset[1] )
>> <eml>
>>     <dataset>
>>     <creator>
>>         <individualName>
>>                 <surname>mccartney</surname>
>>         </individualName>
>>     </creator>
>>     <creator>
>>         <individualName>
>>                 <surname>jones</surname>
>>         </individualName>
>>     </creator>
>>   </dataset>
>> </eml>
>>  
>>  
>> response 3: meets 3 and is not well formed. Requires knowledge of ancestry.
>>  
>> <eml>
>>     <dataset>
>>     <creator>
>>         <individualName>
>>                 <surname>mccartney</surname>
>>         </individualName>
>>     </creator>
>> </dataset>
>> </eml>
>> <eml>
>>     <dataset>
>>     <creator>
>>         <individualName>
>>                 <surname>jones</surname>
>>         </individualName>
>>     </creator>
>> </dataset>
>> </eml>
>>  
>> and just a reminder of where we originally started from (approximately) 
>>  
>> response 4: meets no rule and cannot be validated, but conveys all the
>> information needed to generate format 1 or 3 above using a string
>> tokenizer and JDOM, but not option 2.
>> <resultset namespace=eml......>
>>     <returnfield xpath="dataset/creator/individualname/surname">mccartney</returnfield>
>>     <returnfield xpath="dataset/creator/individualname/surname">jones</returnfield>
>> </resultset>
>>  
>> I think we should really ask whether we are making ourselves deal with
>> some very complicated rules for really no gain in functionality. None
>> of the results will be valid according to the namespace. All of them
>> are valid if I make up my own namespace for the result set.  Unless we
>> can hold ourselves to the standard where any code or xsl written for
>> the schema will successfully process the result set (#2 is the closest
>> to that, but depending on how loose the code is, all three could work
>> or none could work), why shouldn't we opt for the easiest rule to
>> comply with?
>>  
>>  
>> Peter McCartney (peter.mccartney at asu.edu)
>> Center for Environmental-Studies
>> Arizona State University
>>  
>>
>>     -----Original Message-----
>>     *From:* Saritha Bhandarkar
>>     *Sent:* Friday, April 09, 2004 10:28 AM
>>     *To:* 'seek-dev'
>>     *Cc:* Jing Tao; Peter McCartney; Saritha Bhandarkar
>>     *Subject:* resultset question
>>
>>     Hi,
>>
>>     I had a question about the resultset to be returned by Xanthoria.
>>
>>     The schema of the resultset specifies that a record is of type
>>     "AnyRecordType" and optionally it may have some element content
>>     from the record. Now, my question here is, if I am to return the
>>     elements specified in the <returnfields> of the query, for the
>>     matching records (that is, from the matching eml file), do I need
>>     to send it in eml format, with only relevant values for requested
>>     fields and no values for the fields which are not requested? Or is
>>     it enough to return only the requested fields with their values,
>>     as well-formed xml? Can someone please brief me on the contents of
>>     a record in resultsetType?
>>
>>     Thanks,
>>
>>     Saritha
>>
>>     Saritha Bhandarkar
>>
>>     Research Assistant
>>
>>     Center for Environmental Studies
>>
>>     ASU-Tempe AZ
>>
>>     saritha.bhandarkar at asu.edu
>>
> 
> -- 
> Rod Spears
> Biodiversity Research Center
> University of Kansas
> 1345 Jayhawk Boulevard
> Lawrence, KS 66045, USA
> Tel: 785 864-4082, Fax: 785 864-5335
> 


-- 
-----------------------
Chad Berkley
National Center for
Ecological Analysis
and Synthesis (NCEAS)
berkley at nceas.ucsb.edu
-----------------------



