[tcs-lc] Names as Objects

Bob Morris ram at cs.umb.edu
Mon Mar 7 18:33:43 PST 2005


Short, short ago, in a posting very near this topic, someone, probably
Rich, asked:

   "I do have one question for the XML-gurus:  How do you represent a
   "Subtype" in XML?  By "Subtype", I mean an unambiguously defined
   specific subset of a larger set of more generalized records.  I.e.,
   "Person" and "Organization" are each subtypes of "Agent".

   Stated another way, if TaxonConcepts can be one of, say, six different
   types -- how do you represent a set of elements in XML that says
   "these elements only apply to TaxonConcept instances of Type 1, but
   not to instances of Types 2-6"?


The answer(s) are perhaps more technical than is appropriate for this 
list, and the first paragraph is a little more general than the second. 
I've constructed an example for Rich which /might/ get at the point he 
asks about, and if it does and remains relevant, I'll post a pointer to it.

Here suffice it to say that there are two quirks of XML Schema which
could be considered a nuisance in an (admirable) search to guarantee,
as opposed to advise (plead?), that structural constraints be applied
differently to different subclasses of the things represented.
The first is that, in some not so uncommon places, for reasons that are
in my opinion silly, XML Schema requires differently described things to
have different names(*). One result could be that a solution to Rich's
question is an instance document that never mentions a TaxonConcept
but only, say, a TaxonConceptType1 and a TaxonConceptType2-6. This
particular quirk is sufficiently obscure that, despite its importance,
violations of it were not caught by XML Spy until Version 5.
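
A minimal sketch of what such a schema might look like, assuming the
rule in question is the one the XML Schema specification calls "Element
Declarations Consistent" (two elements with the same name in one content
model must have the same type). All type and element names other than
TaxonConceptType1 and TaxonConceptType2-6 are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical base type: content shared by every kind of concept -->
  <xs:complexType name="TaxonConceptBase">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical subtype: adds an element required only for Type 1 -->
  <xs:complexType name="Type1Concept">
    <xs:complexContent>
      <xs:extension base="TaxonConceptBase">
        <xs:sequence>
          <xs:element name="OnlyForType1" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Hypothetical container. Declaring both children with the single
       name "TaxonConcept" but different types would be rejected, so
       each differently described thing gets its own element name. -->
  <xs:element name="TaxonConcepts">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="TaxonConceptType1" type="Type1Concept"/>
        <xs:element name="TaxonConceptType2-6" type="TaxonConceptBase"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

An instance document written against this sketch never contains an
element called TaxonConcept, which is exactly the consequence described
above.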

A less important quirk is that adding a constraint (so-called "type
extension"), e.g. requiring another element and thereby reducing the
class of objects described, can in XML Schema only be done at the end
of the other constraints. So when you try to "refactor" a schema design
to support Rich's goal, you sometimes have to move elements around. For
database applications this is usually harmless, since column order is
indeterminate in RDBs, but it sometimes presents a social affront to
human readers of the schema or instance documents. On the other hand,
there should be no such human readers, only schema designers and
programmers debugging their XML generation code.
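
The ordering effect can be seen in a small sketch using Rich's Agent
and Person example. The type names AgentType and PersonType and the
elements Name, Address and BirthDate are assumptions made up for
illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical general type -->
  <xs:complexType name="AgentType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
      <xs:element name="Address" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical subtype built by extension. The added element is
       appended after everything inherited from AgentType; there is no
       way to place BirthDate between Name and Address. -->
  <xs:complexType name="PersonType">
    <xs:complexContent>
      <xs:extension base="AgentType">
        <xs:sequence>
          <xs:element name="BirthDate" type="xs:date"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="Person" type="PersonType"/>

</xs:schema>
```

A valid Person instance under this sketch must list Name, then Address,
then BirthDate. If a human reader would rather see BirthDate next to
Name, the base type itself has to be rearranged.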

Bob Morris



-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466



