[Tcs-lc] Human Readable - the thread formally known as 'Name of NomenCode'

Roger Hyam roger at hyam.net
Tue Mar 29 04:28:55 PST 2005


I am working with Jessie and Robert on the schema at the moment and hope 
to get fancier pointers in there. They make sense to me.  I hope we can 
have something to throw open for discussion in the next couple of days.

I am just trying to generate an instance document for a couple of 
species by hand at the moment and I am getting very confused.

What is the relationship type between a species and its genus? Is it 
'included in' or 'child of', and would the user agent expect the 
reciprocal relationships to be marked up, i.e. 'includes' and 'is parent 
of'?
Would all software agents:

   1. expect only up pointing links.
   2. expect only down pointing links.
   3. expect both to be present for the link to be valid.
   4. allow the use of a mixture of the two, with relationships sometimes
      shown with 'includes' and sometimes with 'included in'.
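
To make option 4 concrete, here is a rough Python sketch of what a consuming agent might do if mixed directions were allowed: flip every down-pointing link so that only up-pointing ones remain. The relationship names are the ones quoted above; the (source, relationship, target) tuple layout, the function name and the taxon names are made up for illustration and are not part of the TCS schema.

```python
# Down-pointing relationship -> its up-pointing reciprocal.
# These pairings are an assumption for illustration only.
INVERSE = {
    "includes": "included in",
    "is parent of": "child of",
}
UPWARD = set(INVERSE.values())


def normalise(links):
    """Rewrite every (source, relationship, target) link as an upward link.

    Down-pointing links are flipped; a link stated in both directions
    collapses to a single upward link because the result is a set.
    """
    upward = set()
    for source, relationship, target in links:
        if relationship in UPWARD:
            upward.add((source, relationship, target))
        elif relationship in INVERSE:
            upward.add((target, INVERSE[relationship], source))
        else:
            raise ValueError(f"unknown relationship type: {relationship!r}")
    return upward


# Invented names: one link given both ways, one given downwards only.
mixed = [
    ("Aus bus", "included in", "Aus"),
    ("Aus", "includes", "Aus bus"),
    ("Aus", "includes", "Aus cus"),
]
for link in sorted(normalise(mixed)):
    print(link)
```

Even in this small form the agent has to know every reciprocal pair in advance, which is one argument for fixing a single direction in the schema.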

I define relationships between a species concept and its genus concept 
but what about the subgenus and sectional stuff?
I could:

   1. mark the species as belonging to the section and section to
      subgenus and subgenus to genus and not specify any other relationships
   2. join the species to section, subgenus and genus as well as
      joining up the section to subgenus and genus and subgenus to
      genus. i.e. all the includes relationships.
   3. do a mixture of the two. Species always in genus but other things not.
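
As a sketch of how options 1 and 2 relate: if only the nearest-parent links are stored (option 1), the full set of 'included in' pairs (option 2) can be derived by walking up the chain. The dictionary layout, function names and taxon names below are assumptions for illustration, not anything defined by the schema.

```python
def ancestors(parent_of, taxon):
    """Walk the child -> parent map upwards and list every enclosing taxon."""
    chain = []
    seen = {taxon}
    while taxon in parent_of:
        taxon = parent_of[taxon]
        if taxon in seen:
            raise ValueError(f"cycle in hierarchy at {taxon!r}")
        seen.add(taxon)
        chain.append(taxon)
    return chain


def expand(parent_of):
    """Turn the minimal encoding (option 1) into the full one (option 2)."""
    return {
        (child, ancestor)
        for child in parent_of
        for ancestor in ancestors(parent_of, child)
    }


# Invented names; each taxon points at its nearest parent only.
parent_of = {
    "Aus bus": "Aus sect. Cus",
    "Aus sect. Cus": "Aus subg. Dus",
    "Aus subg. Dus": "Aus",
}
full = expand(parent_of)
print(len(full))  # 3 + 2 + 1 = 6 pairs
```

Since option 2 can always be computed from option 1, storing only the minimal links loses nothing; option 3 is the awkward case, because the agent cannot tell whether a missing link is implied or simply absent.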

Confused? I am, and I am only doing this manually! I find the thought of 
writing software to consume this scarier than handling the links to 
the publications and specimens. Even if we define how these 
relationships should be encoded, the schema cannot validate them, so we 
will have to write checking code and try to handle graceful degradation 
etc. I think we need to nail this down a bit: there should really be 
only one way of encoding a basic taxonomic hierarchy, and the schema 
should enforce it. What do you all think?
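
For a sense of what the checking code might involve, here is a minimal Python sketch. It assumes the "nearest parent only" encoding, a fixed rank order, and simple id/rank inputs; all of these, along with the names used, are illustrative assumptions rather than anything the schema defines.

```python
# Assumed rank order, highest first. Every concept is assumed to carry
# one of these ranks; a real checker would take the list from the schema.
RANKS = ["genus", "subgenus", "section", "species"]


def check_hierarchy(concepts, links):
    """Return a list of problems; an empty list means the hierarchy passed.

    concepts: dict mapping concept id -> rank
    links:    iterable of (child id, parent id) pairs
    """
    problems = []
    parents = {}
    for child, parent in links:
        if child not in concepts or parent not in concepts:
            problems.append(f"dangling link: {child} -> {parent}")
            continue
        if child in parents:
            problems.append(f"{child} has more than one parent")
            continue
        # Requiring the parent to outrank the child also rules out cycles.
        if RANKS.index(concepts[parent]) >= RANKS.index(concepts[child]):
            problems.append(f"{parent} does not outrank {child}")
            continue
        parents[child] = parent
    return problems


concepts = {"g1": "genus", "s1": "section", "sp1": "species"}
links = [
    ("sp1", "s1"),   # fine
    ("s1", "g1"),    # fine
    ("sp1", "g1"),   # second parent for sp1
    ("g1", "sp1"),   # parent ranked below child
    ("sp2", "g1"),   # sp2 is not defined in this document
]
for problem in check_hierarchy(concepts, links):
    print(problem)
```

Every rule here is something a consumer would have to re-implement unless the schema makes the invalid cases impossible to express.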

Has anyone generated instances of recent versions of the TCS (0.9+) 
using real data? If so, could you send me one?

Roger

Sally Hinchcliffe wrote:

>Roger wrote:
>
>>Yes I agree that it should be easy to implement but there also needs to 
>>be a gradient of implementations. It should be easy to do simple things 
>>but it should be possible to do more complex things with a bit more effort.
>>
>>I think we are resigned to having some level of normalization in the 
>>schema but I imagine this will rarely be used in a single document 
>>instance. I am thinking that if we could have 'fancy pointers' of some 
>>kind then the work for this reduces greatly. If the pointer in the 
>>schema can contain a string summary of the publication, for example, as 
>>well as a reference to that publication then in a great many 
>>implementations the details of the publication can simply be retrieved 
>>with a second call if they are required.
>
> - yes with fancy pointers (I'm sure Gregor had a more technical 
>sounding phrase for these but I quite like it!), everything gets a 
>lot simpler all round. And I think this covers Gregor's point re SDD 
>as well. It also solves my problem that if the user asks for taxon 1 & 
>2 they also get taxon 3 & 4 just to make the links come out. 
>
>But can we do it with TCS/LC as it stands now? From my understanding 
>the referred to objects (publication references, other taxon 
>concepts) have to be included within the instance document & can't be 
>available via fancy pointers from somewhere else.
>Or have I misread the schema?
>
>Sally
>
>>Sally Hinchcliffe wrote:
>>
>>>Roger wrote:
>>>
>>>>I am all for readability and it is something I am just sitting down to
>>>>look at in the TCS today. This is partially inspired by trying to put
>>>>together some instance documents over the weekend. This matter does not
>>>>only include field names but also general structure.
>>>>
>>>>How important do people consider it is to be able to read/hand craft TCS
>>>>instances - at least simple ones?
>>>>
>>>
>>>I vote for readable, but not necessarily hand-writable. Another thing 
>>>to look out for is making sure it's easy for programs to generate the 
>>>stuff.
>>>
>>>Readability enhances acceptance - if the instance documents look 
>>>readable then people are more likely to use them, and they will feel 
>>>confident that they will be able to troubleshoot any problems.
>>>
>>>For writability, the main impact is on the wrappers producing the 
>>>XML. When we did this in IPNI, producing the data via templates, the 
>>>problem was keeping track of references within the document - for 
>>>instance references to publications. The way a website like IPNI 
>>>serves data up is as a stream of names with a header at the top and a 
>>>footer at the bottom. It's easiest if each name and its associated 
>>>data can be totally self contained with no need to keep track of a 
>>>second set of data that's being referred to internally within the 
>>>document. It's not impossible (we did handle references to 
>>>publications in the TCS data we served) but the more internal 
>>>references there are to keep track of the harder it is. Unfortunately 
>>>recent discussion seems to be sending us down the internal reference 
>>>route more and more.
>>>So from a generator's point of view this is easy (and I think also 
>>>more human readable):
>>>
>>>start stuff - headers etc.
>>>- taxonname 1
>>> - interesting facts about taxonname 1
>>> - publication information about taxonname 1
>>> - other names related to taxonname 1
>>>- taxonname 2
>>> - interesting facts about taxonname 2
>>> - publication information about taxonname 2
>>> - other names related to taxonname 2
>>>end stuff
>>>
>>>whereas this is hard (but not impossible):
>>>
>>>start stuff
>>>- taxonname 1
>>>  - interesting facts about taxonname 1
>>>  - taxonname 1 published in reference 1
>>>  - taxonname 1 related to taxonname 3
>>>- taxonname 2 
>>>  - interesting facts about taxonname 2
>>>  - taxonname 2 published in reference 2
>>>  - taxonname 2 related to taxonname 4
>>>- taxonname 3
>>>  - interesting facts about taxonname 3
>>>  - taxonname 3 published in reference 3
>>>- taxonname 4
>>>  - interesting facts about taxonname 4
>>>  - taxonname 4 published in reference 4
>>>- reference 1
>>>  - details for reference 1
>>>- reference 2
>>>  - details for reference 2
>>>- reference 3
>>>  - details for reference 3
>>>- reference 4
>>>  - details for reference 4
>>>end stuff
>>>
>>>
>>>Of course it may be we're not generating the data in the most 
>>>efficient way ...
>>>
>>>ps my vote would be for NomenclaturalCode. Does exactly what it says 
>>>on the tin...
>>>Sally
>>>
>>>*** Sally Hinchcliffe
>>>*** Computer section, Royal Botanic Gardens, Kew
>>>*** tel: +44 (0)20 8332 5708
>>>*** S.Hinchcliffe at rbgkew.org.uk
>>>
>>>_______________________________________________
>>>Tcs-lc mailing list
>>>Tcs-lc at ecoinformatics.org
>>>http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/tcs-lc
>>>
>>-- 
>>
>>==============================================
>> Roger Hyam
>>----------------------------------------------
>> Biodiversity Informatics
>> Independent Web Development 
>>----------------------------------------------
>> http://www.hyam.net  roger at hyam.net
>>----------------------------------------------
>> 2 Janefield Rise, Lauder, TD2 6SP, UK.
>> T: +44 (0)1578 722782 M: +44 (0)7890 341847
>>==============================================


