[tcs-lc] RE: LC/TCS - How many schemas?

Wed Mar 2 15:53:57 PST 2005

It seemed appropriate to migrate this discussion over to the new list that
Matt created. I hope all the interested parties from the previous
discussion are getting the posts via this list. Some may have only been
getting the conversation via the various lists that were included in the
CC list, so perhaps the announcement for this new list should be forwarded
to those other lists, so that interested individuals not automatically
subscribed to this new list will know how to sign up.

Jessie said:
> you are correct in that we could get 400000 concepts and transfer them
> with TCS - but I don't think we should if we are sensible about what we
> do - I would argue that the labels on specimens are identifications, not
> concept definitions.

Just as a semantic point of clarification (in the context of my last couple
of posts) -- I don't think that labels on specimens represent concept
"definitions" either.  But I do believe that the identifier had in mind a
concept (circle), which presumably included the "dot" that is the primary
type specimen of the name that was applied (unless it was a true
misidentification), and also included the "dot" of the specimen being
identified by the label.  That's why I call it a "potential" concept (with
apologies to Walter), as a broader superset containing the subset of
"defined" concepts (which matches your view of a "concept").

The statement:

"Smith identified Specimen 'X' as Taxon 'A'; following the treatment of 'A
SEC. Jones' (according to Smith)"

is, to me, informationally identical to the statement:

"Smith included Specimen 'X' within 'A SEC. Smith'; which is congruent to 'A
SEC. Jones' (according to Smith)"

I just feel that the latter approach yields a more elegant data management
solution.
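
To make the comparison concrete, here is a minimal Python sketch of the two
statements as records. All class and field names (Concept, Identification,
Inclusion, and so on) are hypothetical and made up for illustration; none of
them are taken from the TCS schema. The sketch only shows that, under these
assumptions, one form can be converted to the other and back without losing
anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A name plus the party it is 'according to' (hypothetical structure)."""
    name: str  # taxon name, e.g. "A"
    sec: str   # the "SEC." author, e.g. "Jones"

    def label(self) -> str:
        return f"{self.name} SEC. {self.sec}"


@dataclass(frozen=True)
class Identification:
    """First form: Smith identified specimen X as taxon A, following A SEC. Jones."""
    identifier: str
    specimen: str
    name: str
    followed: Concept


@dataclass(frozen=True)
class Inclusion:
    """Second form: Smith included X within A SEC. Smith, congruent to A SEC. Jones."""
    asserted_by: str
    specimen: str
    potential: Concept     # the identifier's own "potential" concept
    congruent_to: Concept  # the concept it maps to, according to the identifier


def to_inclusion(ident: Identification) -> Inclusion:
    # The potential concept is just the applied name, SEC. the identifier.
    potential = Concept(ident.name, ident.identifier)
    return Inclusion(ident.identifier, ident.specimen, potential, ident.followed)


def to_identification(inc: Inclusion) -> Identification:
    return Identification(inc.asserted_by, inc.specimen,
                          inc.potential.name, inc.congruent_to)


ident = Identification("Smith", "X", "A", Concept("A", "Jones"))
inc = to_inclusion(ident)
print(inc.potential.label())            # A SEC. Smith
print(inc.congruent_to.label())         # A SEC. Jones
print(to_identification(inc) == ident)  # True
```

In this sketch the second form stores every assertion as a relationship
between concepts, which is the sense in which I find it the tidier of the two
to manage.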

> How can you name a specimen with a name if you don't know what the name
> meant? i.e. the concept associated with that name must've already been
> defined. Now if you choose not to tell me what definition you meant by
> the name - that's a different matter and I'm dealing with unknown data -
> but it's still an identification.

The identifier knew what it meant in his/her own mind, but in many (most?)
cases did not specifically map it to (or derive it from) a previous
definition.  When I pick up a fish specimen, I look at it and say, "Ah Ha!
I know what that is -- it's Aus bus."  If I sat down and reviewed 20
different published definitions of "Aus bus", I could probably pick the ones
that were more or less congruent with my own view (depending on how well
each of the 20 different published definitions was qualified).  But none of
them specifically entered my mind when I assigned the name "Aus bus" to the
fish I just picked up.  So the "informational reality" is that Richard Pyle
had a concept of a circle that circumscribed both the primary type specimen
of "Aus bus", and the particular specimen he held in his hand and
identified.  Secondarily, Richard Pyle's mental concept could be mapped to a
formal published concept definition (either his own, or someone else's),
which would allow other users to put Richard Pyle's identification into a
broader concept(ual) context.

> We can decide to try and have useful and usable concepts - to me the most
> useful to start are what we refer to as original concepts, i.e. the first
> description of the name and its associated type specimen and following
> this the "best" current view on particular taxa. Any intermediate concepts
> that have been described would only be created if someone was interested
> in a particular taxon and needed to resolve the different meanings of the
> names in identifications over time - say as defined in different floras
> over a geographical or temporal range related to their field of study.
>
> Regards resolving concepts then - it depends on the information we have,
> how precise we want to be and how general any algorithm developed to do
> the resolution might be. These are all variable and can be given user
> control - so if you want to match on names then they'll all look the same
> to you but if someone else knows they only want to match on data where
> the concepts are the same then that's what they will get.

On the above two paragraphs our respective goals/desires/perspectives seem
virtually identical.  I think our main differences are in the details of
implementation (details of fundamental structural importance, though they may
be...)
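
As a rough illustration of the user-controlled matching Jessie describes,
here is a small Python sketch. The record layout, the match() function and
the sample data are all hypothetical, and "the concepts are the same" is
simplified to an exact comparison of concept labels; a real resolution
algorithm would presumably be more forgiving than that.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LabelRecord:
    """One specimen identification (hypothetical layout)."""
    specimen: str
    name: str
    concept: Optional[str]  # None when the identifier never stated a definition


def match(records: List[LabelRecord], name: str,
          concept: Optional[str] = None) -> List[str]:
    """Return specimen ids matching on name, or on name and concept.

    With concept=None every record carrying the name matches. With a
    concept given, only records explicitly tied to that concept match;
    records with an unknown concept are left out (an assumption of this
    sketch, not a rule from the discussion).
    """
    by_name = [r for r in records if r.name == name]
    if concept is None:
        return [r.specimen for r in by_name]
    return [r.specimen for r in by_name if r.concept == concept]


records = [
    LabelRecord("X", "Aus bus", "Aus bus SEC. Jones"),
    LabelRecord("Y", "Aus bus", "Aus bus SEC. Brown"),
    LabelRecord("Z", "Aus bus", None),
]

print(match(records, "Aus bus"))                        # ['X', 'Y', 'Z']
print(match(records, "Aus bus", "Aus bus SEC. Jones"))  # ['X']
```

Matching on name alone treats all three specimens as the same thing; matching
on concept returns only the one tied to the requested definition.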

Rich