[seek-kr-sms] Re: [SEEK-Taxon] Question about EML

Joseph Goguen goguen at cs.ucsd.edu
Tue Mar 9 10:59:06 PST 2004


This is fascinating, and a good example of some rather deep social science
research on the nature of hierarchies in social contexts, and more generally,
of standards and infrastructure of all kinds, of which the best current
exposition is *Sorting Things Out* by Bowker and Star (MIT Press); both of
them are UCSD faculty.

  == joseph

******************************************************************************
>From: "Robert A. Morris" <ram at cs.umb.edu>
>Reply-To: ram at cs.umb.edu
>Organization: UMASS-Boston
>Cc: Shawn Bowers <bowers at sdsc.edu>, seek-taxon at ecoinformatics.org,
>        seek-kr-sms at ecoinformatics.org
>Date: Sat, 06 Mar 2004 00:11:00 -0500
>
>
>Nice exposition Matt!
>
>There is a slight further complication in this particular dataset, which 
>I take to be SA003 from the Andrews LTER, described at
>http://www.fsl.orst.edu/lter/data/abstract.cfm?dbcode=SA003&topnav=97
>I can only guess at this without the EML document (and maybe with it), 
>but it seems like a good guess since the paragraph quoted by Shawn is in 
>the "Field Methods" metadata of that URL, and the data snippets match 
>SA003. If I'm wrong, press delete now :-)
>
>The problem begins with the statement in the "Design Methods" metadata 
>for SA003 that the list "represents the collective observations of many 
>people over 20- plus years, but should not be viewed as either current 
>or complete."
>
>Problem part 2 (executive summary): in the data there are no dates 
>assigned to the observations and in the metadata no meaning assigned to 
>"collective observations".
>
>Problem part 2 (elaboration): SA003 was compiled in 1995, according to 
>its metadata. Alas, records in SA003 do not carry any date(s) at which 
>the observation(s) represented by the record was (were) made. Thus, for 
>example, suppose the first record happens to represent a claim of a wood 
>duck observed in 1977. If, in 1977 there were another authority for 
>deriving scientific names from common names, and if that authority did 
>not assign Aix sponsa as in SA003, then it might or might not be that 
>the wood duck represents the same concept as had an observer contributed 
>the datum in 1995. Also, even were there such an authority and it gave 
>Aix sponsa, it is possible that a taxonomic revision caused the concept 
>of Aix sponsa to change between 1977 and 1991. Without a date on the 
>origin of the primary key (here the common name) it is quite difficult 
>to compare this claim of the occurrence of a wood duck with another such 
>claim in another data set. Probably the only hope is the SEEK 
>probabilistic approach, but I wonder how SEEK would in this case 
>represent the complete lack of knowledge of when the observation was 
>made. For example, I doubt that it is a good idea to assume that within 
>the "20-plus years", all time intervals of the same size are 
>equiprobable. [But wait, Bayes' rule might actually save the duck fat 
>here, as it usually does. Maybe this part of the question is outside the 
>scope of this list and somebody from SEEK could just point me at the 
>probabilistic model? ]
>
>This is a generic problem with checklists which are in this way---and I 
>suppose many other ways---different from specimen records.
>
>
>BTW, Aix sponsa (L.) seems to have no synonyms according to ITIS,
>so Aix sponsa's concept seems not to have changed between whenever the 
>critter was observed and 1991.  But what about the mapping between 
>common and scientific name? I'm told that only for birds are there 
>widely accepted authorities for assigning common names to scientific 
>names. For other groups, this mapping is the problem that dare not speak 
>its name.
>
>BTW.2. Systematists might say: If it walks like a wood duck, and quacks 
>like a wood duck, then it's a wood duck.
>
>
>-- Bob Morris
>
>Matt Jones wrote:
>
>>Hi Shawn,
>>
>>[Matt's excellent exposition omitted] 
>>Shawn Bowers wrote:
>>
>>>
>>>Hi,
>>>
>>>I recently found this statement in the methods section of an EML 
>>>document:
>>>
>>>      "Nomenclature for common names follow the 1987 edition of the
>>>    National Geographic Society's field guide, 'Bird's of North
>>>    America'. Species codes used are those of the American
>>>    Ornithologist's Union. The USFWS Checklist OF Vertebrates, 1991,
>>>    was used to quantify scientific names from the common names."
>>>
>>>Can anyone help me interpret what these two sentences mean, and how I 
>>>might use the Taxon-group work to "understand/resolve" the actual 
>>>species references in a dataset based on the above sentence? Here is a 
>>>snippet from the corresponding dataset with only those columns 
>>>that refer to something "taxonomic". (Note that there are actually 23 
>>>columns in the dataset and about 165 rows.)
>>>
>>>
>>>class  tax_order      family    sci_name        aoucode  commonname
>>>-----  ---------      ------    --------        -------  ----------
>>>aves   anseriformes   anatidae  aix sponsa      wodu     wood duck
>>>aves   apodiformes    apodidae  chaetura vauxi  vasw     vauxs swift
>>>aves   ciconiiformes  ardeidae  ardea alba      greg     great egret
>>>...
>>>
>>>I am particularly interested in understanding the relationship between 
>>>the concept XML schema and its use for "registering" or "mapping" 
>>>this data set to information captured in the concept work.  For 
>>>example, if I want to search for datasets based on taxonomic concepts.
>>>
>>>(You have to be patient with me because I am clueless about these 
>>>issues), but it seems like the common name and aoucode represent 
>>>redundant information: the aoucode is some kind of convention for 
>>>representing the common name, and the class/tax_order/family/sci_name 
>>>uniquely identifies the common name? How would one align this dataset 
>>>with an instantiated taxon concept schema -- in particular, what 
>>>information would need to be available in the instantiated concept 
>>>schema?
>>>
>>>Any help is greatly appreciated,
>>>
>>>Shawn
>>>
>>>
>>>
>>>
>>>
>>>_______________________________________________
>>>seek-taxon mailing list
>>>seek-taxon at ecoinformatics.org
>>>http://www.ecoinformatics.org/mailman/listinfo/seek-taxon
>>
>>
>>
>>_______________________________________________
>>seek-kr-sms mailing list
>>seek-kr-sms at ecoinformatics.org
>>http://www.ecoinformatics.org/mailman/listinfo/seek-kr-sms
>
>-- 
>Robert A. Morris
>Professor of Computer Science
>UMASS-Boston
>http://www.cs.umb.edu/~ram
>phone (+1)617 287 6466

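Bob's doubt about treating all years in the "20-plus years" as equiprobable can be made concrete with a small sketch. Everything numeric here is an assumption for illustration only: the 1975-1995 window, the 1985 "revision year", and the linearly increasing prior are hypothetical, and this is not the SEEK probabilistic model.

```python
from fractions import Fraction

def normalize(weights):
    """Turn non-negative integer weights per year into a probability distribution."""
    total = sum(weights.values())
    return {year: Fraction(w, total) for year, w in weights.items()}

def prob_before(prior, revision_year):
    """Probability that an undated observation predates a given revision year."""
    return sum(p for year, p in prior.items() if year < revision_year)

# Assumed observation window: 21 years ending with the 1995 compilation.
YEARS = range(1975, 1996)

# Prior 1: every year equally likely (the assumption Bob doubts).
uniform = normalize({y: 1 for y in YEARS})

# Prior 2 (hypothetical): observer effort grows linearly over the window.
increasing = normalize({y: y - 1974 for y in YEARS})

# Hypothetical year in which a name-to-concept mapping changed.
REVISION_YEAR = 1985

print(prob_before(uniform, REVISION_YEAR))     # 10/21
print(prob_before(increasing, REVISION_YEAR))  # 5/21
```

Under these made-up numbers the chance that a record predates the revision drops from 10/21 to 5/21 just by changing the prior, which is why the choice of prior matters when records carry no dates.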

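On Shawn's question of how to align the dataset with a concept schema, one rough way to picture it is to attach to each name the authority stated in the methods text, so that a name is always compared together with its "according to" source. The NameUsage class, the field names, and the parsing below are hypothetical illustrations, not part of EML or the SEEK taxon schema; only the two sample rows and the authority names come from the thread.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class NameUsage:
    """A name qualified by the authority it follows (hypothetical structure)."""
    name: str
    according_to: str

# Authorities as named in the quoted methods statement.
AUTHORITIES = {
    "commonname": "National Geographic Society field guide, 1987 edition",
    "aoucode": "American Ornithologists' Union species codes",
    "sci_name": "USFWS Checklist of Vertebrates, 1991",
}

COLUMNS = ["class", "tax_order", "family", "sci_name", "aoucode", "commonname"]

# Two rows copied from the snippet in the thread.
SNIPPET = """\
aves   anseriformes   anatidae  aix sponsa      wodu     wood duck
aves   ciconiiformes  ardeidae  ardea alba      greg     great egret
"""

def parse(text):
    """Split each row on runs of two or more spaces and qualify the name columns."""
    records = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        row = dict(zip(COLUMNS, fields))
        for column, authority in AUTHORITIES.items():
            row[column] = NameUsage(row[column], authority)
        records.append(row)
    return records

records = parse(SNIPPET)
print(records[0]["sci_name"])
```

With this shape, "aix sponsa" according to the 1991 checklist and "aix sponsa" according to some other authority are distinct keys, so a match across datasets has to be asserted explicitly rather than assumed from the name string.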
