[tcs-lc] Modularisation of standards - identification of names

Tue Mar 8 04:15:57 PST 2005

> 
> > Of course, to a computer the inclusion of the second set of 
> > information within the name is redundant but we shouldn't 
> > underestimate the amount of human eyeballing of XML data that goes 
> > on. 
> 
> Yes we should. Hah, hah, just serious.   :-)

 ... true .. of course we still have botanists here who draw up their 
DELTA descriptive documents directly in Notepad because they've 
memorised all of the state and character numbers and it's quicker 
that way ... if they ever get their hands on XML there will be no 
stopping them.

> -------------
> > ...<something> was the one thing that made implementing TCS a 
> > bit of a challenge for IPNI
> > 
> 
> As Stan Blum often points out, part of the IETF standards Best Current 
> Practices http://www.rfc-archive.org/getrfc.php?rfc=2026 recommends 
> that something not become a Draft Standard (i.e. a Standard) until it 
> has had two independent implementations exhibited. TDWG should do that 
> too, but it comes with many social issues having to do with the time 
> available to implementers. Industrial standards work often includes 
> employees who are specifically tasked with support for standards work. 
> Biodiversity informatics rarely has this "luxury".

True (and I did have the luxury of some GBIF funding to do this, 
otherwise it would be languishing on my to do list along with 
everything else)

OTOH I do think that even if we can't implement the draft standards 
to check them, there's a lot to be said for producing example 
documents - not just single value examples but a biggish set of 
data encompassing all of the likely problem records. It's time 
consuming I know, but it's the only way to settle some of the more 
theoretical arguments about what would be 'easier', 'more elegant' 
etc. 

There have been some attempts at this on the LC wiki & I know 
that the TCS have taken up some of the examples from IPNI that I 
put up - this is something to keep working on.
[...]
> > One thing about XML that I've found, if you try and approach it with 
> > an OO programmer hat on and make it enforce business rules, 
> > then you very quickly get frustrated, or end up with very 
> > complicated schemas. 
> 
> That's because XML is declarative, not procedural. Enforcement of rules 
> in any system is often done by giving a set of procedures which, when 
> followed, by definition assure compliance. The best one can hope for 
> from a declarative language is usually processing that tells you when 
> you violate the description and what is the evidence that such violation 
> occurred.
> 
> Time to take off your OO programmer hat and put on your Lisp programmer 
> hat. :-) But possibly more useful is to rely heavily on databinding 
> frameworks (aka Schema Compilers) like Castor 
> http://www-128.ibm.com/developerworks/webservices/library/ws-castor/
> which are designed to turn the declarative types and constraints of 
> XML-Schema into classes and methods in procedural OOP languages. In 
> another ten years, eyeballing XML instance documents will be akin to 
> eyeballing assembly language output from compilers. You do it when you 
> suspect the compilation tools, not when you suspect the high-level 
> specifications.
> 
 ... you know I had a LISP programmer hat once   *<:-)

I think the point (we're both making) is, we can't make the schema 
do everything, and we will tie ourselves in knots if we try. Much 
better to design a schema which allows us to get it right, and have 
other ways of enforcing rules when we get it wrong.
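
To make that concrete, here is a rough sketch of the kind of check I 
mean, done in code after parsing rather than in the schema. The element 
and attribute names (names, name, rank, genus, epithet) are made up for 
illustration and are not taken from TCS, and the rule itself is only an 
assumed example of a business rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: element and attribute names are illustrative
# only and are NOT taken from TCS.
SAMPLE = """
<names>
  <name id="n1" rank="species">
    <genus>Poa</genus>
    <epithet>annua</epithet>
  </name>
  <name id="n2" rank="species">
    <genus>Poa</genus>
  </name>
</names>
"""


def check_rules(xml_text):
    """Return a list of (name id, message) pairs, one per rule violation."""
    problems = []
    root = ET.fromstring(xml_text)
    for name in root.findall("name"):
        # Assumed rule linking an attribute value to element presence:
        # a species-rank name needs an epithet, a genus-rank name must
        # not have one.
        has_epithet = name.find("epithet") is not None
        rank = name.get("rank")
        if rank == "species" and not has_epithet:
            problems.append((name.get("id"), "species name lacks an epithet"))
        if rank == "genus" and has_epithet:
            problems.append((name.get("id"), "genus name has an epithet"))
    return problems


if __name__ == "__main__":
    for name_id, message in check_rules(SAMPLE):
        print(name_id, message)
```

The schema here only needs to allow an optional epithet; the rule that 
ties it to rank lives in the checking code, which can report every 
offending record instead of stopping at the first.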

Sally

*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk