[seek-dev] RE: resultset question
Chad Berkley
berkley at nceas.ucsb.edu
Tue Apr 20 09:04:38 PDT 2004
Hi,
Sorry for my late reply...we've been busy with a Morpho release. Thanks
for getting me in gear, Rod.
In Metacat, we only return leaf nodes (i.e. the text node child of a
CDATA element like in response 4 below). The returnfield functionality
was originally meant as a convenient way to return enough information
for a meaningful resultset to display, say, on a web page. It was not
meant to return whole document chunks for further processing. I can see
how this would be useful, but it would require returning a namespace
defined chunk so that a parser would know what to do with it. Metacat
currently uses the returnfields to build the resultset table, then a
request must be made for the whole document in order to do further
processing.
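The table-building step described above can be sketched roughly as follows. This is only an illustration in Python, not Metacat's actual code: the sample resultset is shaped like response 4 quoted below, and the function name `resultset_rows` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical resultset, shaped like "response 4" in the quoted message.
SAMPLE = """<resultset>
  <returnfield xpath="dataset/creator/individualName/surname">mccartney</returnfield>
  <returnfield xpath="dataset/creator/individualName/surname">jones</returnfield>
</resultset>"""

def resultset_rows(xml_text):
    """Return one (xpath, value) row per returnfield leaf value."""
    root = ET.fromstring(xml_text)
    return [(rf.get("xpath"), (rf.text or "").strip())
            for rf in root.iter("returnfield")]

rows = resultset_rows(SAMPLE)
for xpath, value in rows:
    # One table row per returned leaf; the full document is fetched separately.
    print(f"{xpath}\t{value}")
```

The rows carry only enough to display a result list; anything further still needs a second request for the whole document.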
Looking at responses 1-3 below, they all seem invalid and potentially
problematic to me. Without a namespace to parse those XML chunks
against, the parser is left to do only well-formedness checking, and any
query into these document chunks (e.g. an XPath query) may fail because
we don't know what to expect back before doing the processing.
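A small Python sketch of that concern, under stated assumptions: the chunk is a cut-down version of response 2 below, `is_well_formed` is a hypothetical helper, and `organizationName` is used only as an example of an element a caller might query for.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the text parses as a single XML document; no schema is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# A chunk shaped like response 2, cut down to one creator (illustrative).
chunk = ("<eml><dataset><creator><individualName>"
         "<surname>jones</surname>"
         "</individualName></creator></dataset></eml>")

root = ET.fromstring(chunk)
surnames = [e.text for e in root.findall("./dataset/creator/individualName/surname")]

# An empty result here does not say whether the element is absent from the
# source document or simply was not requested as a return field.
org_names = root.findall("./dataset/creator/organizationName")

print(is_well_formed(chunk), surnames, len(org_names))
# A chunk with two root elements is rejected by the parser.
print(is_well_formed("<eml/><eml/>"))
```

The parser accepts the chunk, yet the caller cannot distinguish "missing in the document" from "not returned", which is the ambiguity described above.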
So I guess to make a short answer long, I agree with Peter's assessment
of sticking with response 4 (which is basically what Metacat has done
all along).
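For what it's worth, Peter's point that response 4 carries enough information to regenerate format 1 can be sketched as below. This is a hedged illustration in Python rather than the string tokenizer plus JDOM he mentions; `build_minimal_tree` and the default `eml` root tag are assumptions, not part of any existing API.

```python
import xml.etree.ElementTree as ET

def build_minimal_tree(fields, root_tag="eml"):
    """Rebuild a minimal tree (like response 1) from (xpath, value) pairs."""
    root = ET.Element(root_tag)
    for xpath, value in fields:
        steps = xpath.strip("/").split("/")  # the "string tokenizer" step
        node = root
        # Reuse an existing ancestor if one is already there, else create it.
        for tag in steps[:-1]:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        # Always add a new leaf so repeated values are kept.
        leaf = ET.SubElement(node, steps[-1])
        leaf.text = value
    return root

fields = [
    ("dataset/creator/individualName/surname", "mccartney"),
    ("dataset/creator/individualName/surname", "jones"),
]
tree = build_minimal_tree(fields)
print(ET.tostring(tree, encoding="unicode"))
```

Both surnames end up under a single creator, as in response 1. The flat list has no index telling which creator each surname came from, so response 2 cannot be rebuilt this way.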
chad
Rod Spears wrote:
> Is anyone better qualified than me going to address Peter's questions?
>
> Please someone respond, thanks.
>
> Rod
>
>
> Peter McCartney wrote:
>
>> It has to be well formed no matter what. So the question is really how
>> can we identify a namespace for the result set when the content we
>> stick in there has no hope of being valid? Further, how can we define
>> a set of rules for how the results are to be evaluated against that
>> namespace yet not be valid?
>> request 1: '*/creator/individualName/surname', '/eml/dataset'
>>
>> Rule 1: "content must appear in the minimal xml tree needed to accommodate
>> the information"
>>
>> Rule 2: "content must appear in a potentially valid xml tree that
>> fails validation only because other required elements are missing"
>>
>> Rule 3: "content must appear in a tree that places it in the correct node
>> ancestry for the declared namespace"
>>
>>
>> response 1: meets 1 and 3 and is well formed. Requires just knowledge
>> of parent ancestry to build.
>> <eml>
>> <dataset>
>> <creator>
>> <individualName>
>> <surname>mccartney</surname>
>> <surname>jones</surname>
>> </individualName>
>> </creator>
>> </dataset>
>> </eml>
>>
>> response 2: meets 1, 2 and 3 and is well formed. Requires knowledge of
>> ancestry and index (i.e. jones is in creator[2] of dataset[1])
>> <eml>
>> <dataset>
>> <creator>
>> <individualName>
>> <surname>mccartney</surname>
>> </individualName>
>> </creator>
>> <creator>
>> <individualName>
>> <surname>jones</surname>
>> </individualName>
>> </creator>
>> </dataset>
>> </eml>
>>
>>
>> response 3: meets 3 and is not well formed. Requires knowledge of ancestry.
>>
>> <eml>
>> <dataset>
>> <creator>
>> <individualName>
>> <surname>mccartney</surname>
>> </individualname>
>> </creator>
>> </dataset>
>> <eml>
>> <dataset>
>> <creator>
>> <individualName>
>> <surname>jones</surname>
>> </individualname>
>> </creator>
>> </dataset>
>> </eml>
>>
>> and just a reminder of where we originally started from (approximately)
>>
>> response 4: meets no rule, cannot be validated, but conveys all the
>> information to generate format 1 or 3 above using a string tokenizer
>> and JDOM, but not option 2.
>> <resultset namespace=eml......>
>> <returnfield xpath="dataset/creator/individualname/surname">mccartney</returnfield>
>> <returnfield xpath="dataset/creator/individualname/surname">jones</returnfield>
>> </resultset>
>>
>> I think we should really ask whether we are making ourselves deal with
>> some very complicated rules for really no gain in functionality. None
>> of the results will be valid according to the namespace. All of them
>> are valid if I make up my own namespace for the result set. Unless we
>> can hold ourselves to the standard where any code or XSL written for
>> the schema will successfully process the result set (#2 is the closest
>> to that, but depending on how loose the code is, all three could work
>> or none could work), why shouldn't we opt for the easiest rule to
>> comply with?
>>
>>
>> Peter McCartney (peter.mccartney at asu.edu)
>> Center for Environmental Studies
>> Arizona State University
>>
>>
>> -----Original Message-----
>> *From:* Saritha Bhandarkar
>> *Sent:* Friday, April 09, 2004 10:28 AM
>> *To:* 'seek-dev'
>> *Cc:* Jing Tao; Peter McCartney; Saritha Bhandarkar
>> *Subject:* resultset question
>>
>> Hi,
>>
>> I had a question about the resultset to be returned by Xanthoria.
>>
>> The schema of the resultset specifies that a record is of type
>> "AnyRecordType" and optionally it may have some element content
>> from the record. Now, my question here is, if I am to return the
>> elements specified in the <returnfields> of the query, for the
>> matching records (that is from the matching eml file), do I need to
>> send it in eml format, with only relevant
>> values for requested fields and no values for the fields which are
>> not requested? Or is it enough to return only the requested fields
>> with their values, as well-formed xml? Can someone please brief me
>> on the contents of a record in resultsetType?
>>
>> Thanks,
>>
>> Saritha
>>
>> Saritha Bhandarkar
>>
>> Research Assistant
>>
>> Center for Environmental Studies
>>
>> ASU-Tempe AZ
>>
>> saritha.bhandarkar at asu.edu
>>
>>
>>
>>
>>
>
> --
> Rod Spears
> Biodiversity Research Center
> University of Kansas
> 1345 Jayhawk Boulevard
> Lawrence, KS 66045, USA
> Tel: 785 864-4082, Fax: 785 864-5335
>
--
-----------------------
Chad Berkley
National Center for
Ecological Analysis
and Synthesis (NCEAS)
berkley at nceas.ucsb.edu
-----------------------