[tcs-lc] Modularisation of standards - identification of names

Tue Mar 8 04:02:58 PST 2005

Sally Hinchcliffe wrote:
[..]

> Of course, to a computer the inclusion of the second set of 
> information within the name is redundant but we shouldn't 
> underestimate the amount of human eyeballing of XML data goes 
> on. 

Yes we should. Hah, hah, just serious.   :-)

-------------
> ...<something> was the one thing that made implementing TCS a 
> bit of a challenge for IPNI
> 

As Stan Blum often points out, part of the IETF standards Best Current 
Practices http://www.rfc-archive.org/getrfc.php?rfc=2026 recommends 
that something not become a Draft Standard (i.e. a Standard) until it 
has had two independent implementations exhibited. TDWG should do that 
too, but it comes with many social issues having to do with the time 
available to implementers. Industrial standards work often includes 
employees who are specifically tasked with support for standards work. 
Biodiversity informatics rarely has this "luxury".
-------------
> 
>>[Note that it is much harder reliably to assign and police meaningful
>>identifiers for name elements if they are fully embedded.  There would
>>certainly be no way to enforce a single consistent representation in all
>>occurrences for the same <Name> across different <TaxonConcept> elements.]
>>
> 

Actually, the unique particle attribution constraint I mentioned in a 
previous post does just this, but IMO is too heavy handed. Also, see my 
"Yes we should" above in which I advocate a serious, but probably 
hopeless, position. (But not more hopeless than the belief that 
everyone who wants name standards can be made to agree that they also 
need concept standards and will therefore happily use them).
> 
> One thing about XML that I've found, if you try and approach it with 
> an OO programmer hat on and make it enforce business rules, 
> then you very quickly get frustrated, or end up with very 
> complicated schemas. 

That's because XML is declarative, not procedural. Enforcement of rules 
in any system is often done by giving a set of procedures which, when 
followed, by definition assure compliance. The best one can hope for 
from a declarative language is usually processing that tells you when 
you violate the description and what is the evidence that such violation 
occurred.
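
To make that concrete, here is a minimal sketch in Python of what 
"declarative" checking looks like. It is not XML-Schema and not TCS; the 
RULES table, the violations() function and the element names are 
assumptions made up for illustration. The rules are plain data, and all 
the processing can do is report where a document departs from them and 
what the evidence is.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set, written as data rather than as procedures:
# element tag -> list of child tags that must be present.
RULES = {
    "TaxonConcept": ["Name"],
    "Name": [],
}

def violations(root, rules):
    """Return (path, message) pairs for every element that breaks a rule."""
    found = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        required = rules.get(elem.tag)
        if required is None:
            # The description says nothing about this element.
            found.append((here, "element not declared"))
        else:
            present = {child.tag for child in elem}
            for tag in required:
                if tag not in present:
                    found.append((here, f"missing required child <{tag}>"))
        for child in elem:
            walk(child, here)

    walk(root, "")
    return found

doc = ET.fromstring("<TaxonConcept><Rank>species</Rank></TaxonConcept>")
report = violations(doc, RULES)
for path, message in report:
    print(path, message)
```

Nothing here stops anyone from writing a bad document; the checker only 
says where the description was violated.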

Time to take off your OO programmer hat and put on your Lisp programmer 
hat. :-) But possibly more useful is to rely heavily on databinding 
frameworks (aka Schema Compilers) like Castor 
http://www-128.ibm.com/developerworks/webservices/library/ws-castor/
which are designed to turn the declarative types and constraints of 
XML-Schema into classes and methods in procedural OOP languages. In 
another ten years, eyeballing XML instance documents will be akin to 
eyeballing assembly language output from compilers. You do it when you 
suspect the compilation tools, not when you suspect the high-level 
specifications.
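
A rough sketch of the databinding idea, in Python rather than Java and 
deliberately not using Castor's API: the Name class below is hand-written 
to stand in for what a schema compiler might generate from a hypothetical 
<Name id="..."> declaration. The class name, its fields and the 
from_xml/to_xml methods are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Name:
    # Stand-in for a generated class; the "id" attribute and text content
    # are assumed features of a hypothetical <Name> declaration.
    id: str
    text: str

    @classmethod
    def from_xml(cls, elem):
        """Unmarshal an element, rejecting input that breaks the declaration."""
        if elem.tag != "Name":
            raise ValueError(f"expected <Name>, got <{elem.tag}>")
        ident = elem.get("id")
        if ident is None:
            raise ValueError("<Name> requires an id attribute")
        return cls(id=ident, text=(elem.text or "").strip())

    def to_xml(self):
        """Marshal the object back to an element."""
        elem = ET.Element("Name", {"id": self.id})
        elem.text = self.text
        return elem

name = Name.from_xml(ET.fromstring('<Name id="n1"> Quercus alba </Name>'))
print(name)
print(ET.tostring(name.to_xml(), encoding="unicode"))
```

Application code then works with Name objects and never needs to look at 
the angle brackets.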

-------------

Bob Morris

-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466