[SEEK-Taxon] guids

Nico M. Franz franz at nceas.ucsb.edu
Mon May 24 14:35:38 PDT 2004


Rich:

    I do think there are some misunderstandings, though they can be 
resolved. First, perhaps unlike Taxonomer, SEEK is really a database of 
databases. We need to store - right next to the high-quality 
revision-derived concepts - versions of much coarser summaries across large 
groups (e.g. as maintained by ITIS). Only if we set a low threshold for 
what counts as a concept can we meet the challenge of connecting 
uses of names by entities like ITIS to the meanings of those names as 
specialists think about them. So there's the inflation-of-concepts step. 
Now the secondary inflation-reduction step is partly the responsibility of 
the GUIDs, and partly that of connecting concepts. But much of what we will 
have to do is driven by the availability of data. For example, if we don't 
allow ITIS to have GUIDs for their largely repetitive information, 
then we'll have an empty database. Demanding that "only real differences in 
taxonomic opinion" stand in the database and be referenced won't work. 
One COULD choose to mark some identifiers of rather ambiguous, "shallow", 
or "pointing" concepts as unavailable to the public for referencing. 
Higher-quality concepts could have published GUIDs.
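
    A minimal sketch of that low-threshold approach, in Python. The names 
ConceptRecord, register, public_guids and the citable flag are hypothetical 
and not part of any SEEK schema; random UUIDs stand in for whatever GUID 
scheme is eventually chosen, and "Aus bus" is a placeholder name.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ConceptRecord:
    """One use of a name by one source (hypothetical structure)."""
    name: str
    source: str
    citable: bool  # False for shallow or "pointing" concepts
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


# Assumed in-memory registry keyed by GUID.
registry = {}


def register(name, source, citable):
    """Every referenced name gets a GUID, however shallow the concept."""
    record = ConceptRecord(name, source, citable)
    registry[record.guid] = record
    return record


def public_guids():
    """Only the higher-quality concepts expose their GUIDs for referencing."""
    return [guid for guid, record in registry.items() if record.citable]


shallow = register("Aus bus", "ITIS", citable=False)
deep = register("Aus bus", "a taxonomic revision", citable=True)
```

    Both records are stored and identified, but only the revision-derived 
one would be offered to outside users as a reference target.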

    In short, I think that we (SEEK and you) share the same intuitions 
about errors in metadata. But, realistically, we're not in a position to 
assign GUIDs to referenced names ONLY when we have reasons to believe that 
there's different taxonomic carving-up of the world implied. If we did 
that, we'd alienate some of our main providers of information. I personally 
think that the taxonomist's intuitions about what's really out there will 
be expressed mostly through concept relations, not through GUIDs. GUIDs 
will be neat for users, yet it's the connections among them that 
taxonomists deal with. And solid relations can only exist between 
well-circumscribed (deeper) concepts.
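
    One way to picture this is sketched below. It assumes relations are 
stored as (GUID, relation, GUID) assertions and are refused when either 
concept is shallow. The Concept class, the relate function, and the 
relation vocabulary are all hypothetical; only "congruent" comes up in this 
thread.

```python
from dataclasses import dataclass

# Assumed vocabulary; only "congruent" is mentioned in the discussion itself.
RELATIONS = {"congruent", "includes", "included_in", "overlaps", "excludes"}


@dataclass(frozen=True)
class Concept:
    guid: str
    well_circumscribed: bool  # True for the "deeper", revision-derived concepts


def relate(a, relation, b):
    """Return an (a, relation, b) assertion, refusing shallow concepts."""
    if relation not in RELATIONS:
        raise ValueError(f"unknown relation: {relation}")
    if not (a.well_circumscribed and b.well_circumscribed):
        raise ValueError("both concepts must be well circumscribed")
    return (a.guid, relation, b.guid)


revision_a = Concept("guid-1", well_circumscribed=True)
revision_b = Concept("guid-2", well_circumscribed=True)
checklist = Concept("guid-3", well_circumscribed=False)

assertion = relate(revision_a, "congruent", revision_b)
```

    Calling relate(checklist, "congruent", revision_a) raises a ValueError 
in this sketch. A real system might instead accept such a relation and mark 
it as low-confidence.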

Cheers,

Nico

At 10:36 AM 5/24/2004 -1000, Richard Pyle wrote:

>Hi Nico,
>
>Many thanks for the thoughtful response!
>
> >     I'll jump in. You seem to be mixing up things when you say "new" and
> > "different." In a shallow sense, the subsequently invoked concept is new
> > just by virtue of having a different time stamp. In a deep sense, the
> > added-on information may or may not represent something that the author
> > "came up with" and that wasn't there before.
>
>I guess what I meant by "new" and "different", was the implied scope of
>organisms that would be included within a given concept.  A simple datestamp
>would not affect the implied scope of organisms included within the concept.
>If the added-on information you allude to changes the scope of organisms
>that would be included within the concept, then it seems to me that a new
>concept should be defined.  If the added-on information only clarifies the
>boundaries of what is the same scope of organism, then it seems to me to
>simply be a reference back to the original concept (not a new "version" of
>the concept).
>
> >     But what if he or she entered a reference incorrectly as part of the
> > original concept package? Surely there's a sense of "newness" here, i.e.
> > the new recognition of an earlier typo.
>
>Yes, but does the alteration/correction of what really amounts to metadata
>for the concept really require that a new ID be issued (in this case, a new
>ID assigned to a different version of the same concept)?  It seems to me
>that the whole purpose of creating a GUID for concepts is to avoid the need
>to track such trivial changes in metadata.  For example, if my dataset
>pointed to an LSID to indicate a particular concept, it wouldn't matter
>whether the author of a species epithet used to represent that concept was
>spelled "Lacepède", "Lacépède", or "Lacepede" (unless that name was embedded
>as part of the GUID -- which is an entirely different topic of discussion).
>
>Maybe I'm misunderstanding the purpose of the GUID as used for taxonomic
>concepts?
>
> > Still I'd say that's unrelated to
> > taxonomy proper and would call for versioning of one and the same
> > concept, even and particularly in the shallow sense.
>
>I guess my question is:  are the sorts of metadata details at risk of
>needing correction (spelling errors, typos, etc.) the kinds of thing that
>need to be tracked via the GUID itself?  In my mind, the GUID would
>represent a conceptual scope of organisms (i.e., circumscription); and
>therefore if the scope of organisms does not change, then no additional GUID
>is needed to represent a new "version" of the same concept (which is
>different from the situation where two separately-defined concepts may be
>deemed to be congruent).
>
> > If on the other hand we have a
> > statement in a different publication, at a different time, with
> > the same or different circumscription content, there'd be a new (at
> > least shallowly speaking) core entry in the database. That entry could
> > be versioned too if it was transferred with unfortunate mistakes or
> > incompleteness.
>
>O.K., if "version" simply means the correction of inadvertent,
>objectively-discernable errors in metadata, then my feeling is that there is
>really no need to track such metadata corrections within the body of the
>GUID itself.  In the case you mention of a "potentially" different concept,
>then clearly this is a case of a separate concept GUID, which may or may not
>be secondarily mapped as "congruent" with the original GUID.
>
> >     The key distinction here is whether the "later recognition" to change
> > something addresses taxonomic or more mechanical, string-transporting
> > issues. In the latter case, I believe we're tending towards versioning; in
> > the former, separate concepts that should then be somehow related to each
> > other.
>
>In summary, I guess my point is that, in my mind at least, the whole purpose
>of the GUID is to get away from having to track the mechanical,
>string-transporting issues and allow focus specifically on the
>circumscription (taxonomic) issues. Let the metadata be tied to the GUID at
>the central registry, and corrected as needed.  There's no harm in
>preserving a log of all changes to metadata, but I don't see why such
>changes would cause the need for the generation of new GUIDs (in the form of
>a new "version" of the same concept).
>
>Again -- it's very possible that I'm missing something here.  But it just
>seems to me that if you're going to go to all the trouble to establish a
>GUID system, then the advantages of doing so (which, in my view, include
>the alleviation of the need for everyone to independently keep track of
>various versions of metadata tied to each concept) should be maximized.
>
>I'm a little concerned that I may not be making my point clear here, so let
>me know if anything doesn't make sense.
>
>Aloha,
>Rich
>
>=======================================================
>Richard L. Pyle, PhD
>Natural Sciences Database Coordinator, Bishop Museum
>1525 Bernice St., Honolulu, HI 96817
>Ph: (808)848-4115, Fax: (808)847-8252
>email: deepreef at bishopmuseum.org
>http://www.bishopmuseum.org/bishop/HBS/pylerichard.html
>
>
>_______________________________________________
>seek-taxon mailing list
>seek-taxon at ecoinformatics.org
>http://www.ecoinformatics.org/mailman/listinfo/seek-taxon
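
Rich's suggestion in the quoted message is that metadata stay tied to the 
GUID at a central registry and be corrected in place, with a log of the 
changes. A rough Python sketch of that idea follows. The RegisteredConcept 
class, its correct method, the GUID string, and the misspelling "Lacepde" 
are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RegisteredConcept:
    """A concept as held by a hypothetical central registry."""
    guid: str
    metadata: dict
    change_log: list = field(default_factory=list)

    def correct(self, key, new_value):
        """Fix a metadata error in place; the GUID never changes."""
        old_value = self.metadata.get(key)
        self.change_log.append((key, old_value, new_value))
        self.metadata[key] = new_value


# Illustrative record with a deliberately misspelled author name.
concept = RegisteredConcept("guid-0001", {"author": "Lacepde"})
concept.correct("author", "Lacepede")
```

A dataset that points at guid-0001 is unaffected by the correction, while 
the log still shows what was changed.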



