[seek-taxon] question on search taxonic info
Robert K. Peet
peet at unc.edu
Tue Aug 23 20:31:19 PDT 2005
> I'm designing the advanced search screen for Kepler and one of the areas to
> search is taxonomic information. I've looked at several search interfaces
> and of course taxon name is always there but sometimes rank is also
> included. In general what are the types of available "fields" you would
> expect to see in a search on taxonomic information? Right now I'm thinking
> of including name and rank but wondered if people really use rank or if I'm
> missing anything critical.
Given that this is for Kepler, I am assuming we are talking about
ecologists looking for ecological data. If this assumption is wrong, let
me know.
Ecologists will not search for rank, though they might search by rank.
To illustrate, I might want to obtain all vertebrate families that
start with "C" for use as a picklist of possible search criteria in a
data-directed search. As another example, a user might want to specify rank
and phylum when asking for an appropriate picklist (e.g., among vertebrates,
select a family). In this example the user constrains by rank twice when
selecting taxa: once by specifying a taxon at one rank, and again by
looking for all taxa at a different rank that meet set criteria. In
short, rank can be important in selecting taxa to search for, but not in
the actual search itself.
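The picklist idea above can be sketched roughly as follows. This is only an illustration: the Taxon record, the toy TAXA list, and the picklist() helper are all hypothetical names, not part of Kepler, EML, or any SEEK Taxon service, and a real implementation would query a taxonomic name server rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taxon:
    name: str
    rank: str
    parent: Optional[str] = None  # name of the parent taxon, if known


# Hypothetical toy records, for illustration only.
TAXA = [
    Taxon("Vertebrata", "subphylum"),
    Taxon("Mammalia", "class", "Vertebrata"),
    Taxon("Aves", "class", "Vertebrata"),
    Taxon("Canidae", "family", "Mammalia"),
    Taxon("Cervidae", "family", "Mammalia"),
    Taxon("Felidae", "family", "Mammalia"),
    Taxon("Corvidae", "family", "Aves"),
    Taxon("Insecta", "class"),
    Taxon("Carabidae", "family", "Insecta"),
]


def descends_from(taxon, ancestor, by_name):
    """Walk up the parent links and report whether `ancestor` is reached."""
    current = taxon
    while current.parent is not None:
        if current.parent == ancestor:
            return True
        current = by_name[current.parent]
    return False


def picklist(taxa, rank, within, prefix=""):
    """Names at `rank` that fall under the taxon `within` and start with `prefix`."""
    by_name = {t.name: t for t in taxa}
    return sorted(
        t.name
        for t in taxa
        if t.rank == rank
        and t.name.startswith(prefix)
        and descends_from(t, within, by_name)
    )


# Rank is used twice: once for the enclosing taxon, once for the picklist level.
families = picklist(TAXA, rank="family", within="Vertebrata", prefix="C")
print(families)
```

Note that rank only shapes the picklist here; the names it returns would then be passed to the actual data search, which never mentions rank.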
Some of us in SEEK Taxon have given a lot of thought to how ecologists
will want to search for and analyze data with taxonomic names. It would
be worth discussing this for a day at some future meeting. One example is
that EML should be modified to include levels of certainty of
determinations, and joint assignments of multiple taxa in a determination
with varied levels of certainty (e.g., absolutely belongs in the genus
Potentilla and could equally well be either P. simplex or P. canadensis).
Given this, we might want to search for only those cases where the
determination is made with high certainty (or where certainty is not specified).
We also need to be able to specify whether we want only occurrences where
the taxon concept is an exact fit, or whether to include looser and more
ambiguous concept mappings. We might also wish to consider whether to
discover low-quality taxa that meet the requirements (when searching for
Potentilla, will we accept Potentilla sp. #1, or only Potentilla species
with well-formed names?).
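To make these search options concrete, here is a minimal sketch of how such filters might combine. Everything in it is an assumption for illustration: EML does not currently carry certainty or concept-fit fields, so the Determination record, its field names, and the search() helper are hypothetical, and joint assignments of several taxa to one determination are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Determination:
    occurrence_id: str
    taxon_name: str
    certainty: Optional[str] = None  # "high", "low", or None when unspecified
    concept_fit: str = "exact"       # "exact" or "loose" concept mapping
    well_formed: bool = True         # False for names like "Potentilla sp. #1"


# Hypothetical records, for illustration only.
RECORDS = [
    Determination("occ1", "Potentilla simplex", "high"),
    Determination("occ2", "Potentilla canadensis", "low"),
    Determination("occ3", "Potentilla sp. #1", None, "exact", False),
    Determination("occ4", "Potentilla simplex", None, "loose"),
    Determination("occ5", "Rosa carolina", "high"),
]


def search(records, genus, require_exact_fit=True, allow_informal_names=False):
    """Return occurrence ids for `genus`, applying the three optional filters."""
    hits = []
    for r in records:
        if r.taxon_name.split()[0] != genus:
            continue
        # Keep high-certainty determinations and those with no stated certainty.
        if r.certainty not in ("high", None):
            continue
        if require_exact_fit and r.concept_fit != "exact":
            continue
        if not allow_informal_names and not r.well_formed:
            continue
        hits.append(r.occurrence_id)
    return hits


strict = search(RECORDS, "Potentilla")
relaxed = search(RECORDS, "Potentilla", require_exact_fit=False,
                 allow_informal_names=True)
print(strict, relaxed)
```

The strict call keeps only occ1, while the relaxed call also admits the loosely mapped record and the informal "sp. #1" name; the low-certainty record is excluded in both.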
Finally, we might wish to specify a standard taxonomic perspective so we
do not need to add a reference to each occurrence; for plants, for example,
this might be USDA or Flora of North America.
> Users will have the choice of searching the metadata, the data, or
> both the metadata and data for taxonomic information.
Seems like a decision needs to be made on how to package the data being
registered and ultimately searched. In many of our SEEK Taxon discussions
there has been a general assumption that the metadata for each dataset
will include a set of internal identifiers and the concepts they map to,
and if this is the case then you need only search the metadata for most
types of searches. If this is not the case, we ought to have a few
discussions within SEEK about how taxon determinations will be handled.
That said, don't forget that the whole topic of taxon determinations
within EML remains unresolved.
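If the metadata does carry such a mapping, a metadata-only search could look something like the sketch below. The DATASETS structure, the dataset names, the internal codes, and the concept labels are all hypothetical; this is not an existing EML element or Kepler API.

```python
# Hypothetical metadata: each dataset maps its internal identifiers to the
# taxon concepts they stand for. No data table is opened during the search.
DATASETS = {
    "dataset_A": {
        "sp01": "Potentilla simplex sec. USDA",
        "sp02": "Rosa carolina sec. USDA",
    },
    "dataset_B": {
        "QUAL": "Quercus alba sec. Flora of North America",
    },
    "dataset_C": {
        "POTSIM": "Potentilla simplex sec. USDA",
    },
}


def find_in_metadata(datasets, concept):
    """Map each matching dataset to the internal codes that point at `concept`."""
    return {
        name: sorted(code for code, mapped in mapping.items() if mapped == concept)
        for name, mapping in datasets.items()
        if concept in mapping.values()
    }


matches = find_in_metadata(DATASETS, "Potentilla simplex sec. USDA")
print(matches)
```

The internal codes returned for each dataset are what a later step would use to pull the matching rows out of the data itself.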
> FYI in EML the current metadata fields dealing with taxonomic
> information are:
> However, there will be other kinds of metadata associated with data files
> that will be available in Kepler, such as DIGIR records so I'm thinking that
> I don't want to get too specific with fields because we can't guarantee that
> every metadata structure we include will have the same fields etc.
Good point, but we should design for well-formed EML data, and then be
flexible enough to handle (albeit less well) other types of data.
Robert K. Peet, Professor & Chair Phone: 919-962-6942
Curriculum in Ecology, CB#3275 Fax: 919-962-6930
University of North Carolina Cell: 919-368-4971
Chapel Hill, NC 27599-3275 USA Email: peet at unc.edu