Test implementation of eml-based species lists for LTER
Wade Sheldon
sheldon at uga.edu
Thu Apr 15 07:01:00 PDT 2004
Matt,
This morning I thought of a good reason to keep both eml species list
implementations (i.e. complete hierarchy per record + compact tree form). For the
purpose of species list exchange within LTER it would be somewhat easier to
program against the complete record schema for populating a master list or
centralized database; otherwise, any parser would have to walk each tree and
regenerate all the common ranks that were removed before doing anything
else. That isn't terribly difficult (I do that all the time with RDP trees
in my bioinformatics work), but it's another significant step.
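As a rough illustration of that extra step, here is a minimal Python sketch that walks a nested tree and regenerates the complete rank list for every terminal taxon. It is only a sketch under stated assumptions: the element names taxonRankName and taxonRankValue are taken from the EML taxonomicCoverage module, while the sample document, the placeholder taxon values, and the flatten helper are hypothetical and not part of any existing parser.

```python
import xml.etree.ElementTree as ET

# Placeholder document: element names follow EML's taxonomicCoverage module,
# the taxon values are invented for illustration.
SAMPLE = """
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>class</taxonRankName>
    <taxonRankValue>classA</taxonRankValue>
    <taxonomicClassification>
      <taxonRankName>genus</taxonRankName>
      <taxonRankValue>genus1</taxonRankValue>
      <taxonomicClassification>
        <taxonRankName>species</taxonRankName>
        <taxonRankValue>species1</taxonRankValue>
      </taxonomicClassification>
    </taxonomicClassification>
    <taxonomicClassification>
      <taxonRankName>genus</taxonRankName>
      <taxonRankValue>genus2</taxonRankValue>
      <taxonomicClassification>
        <taxonRankName>species</taxonRankName>
        <taxonRankValue>species2</taxonRankValue>
      </taxonomicClassification>
    </taxonomicClassification>
  </taxonomicClassification>
</taxonomicCoverage>
"""

def flatten(node, inherited=()):
    """Yield one complete list of (rank, value) pairs per terminal taxon."""
    path = inherited + ((node.findtext("taxonRankName"),
                         node.findtext("taxonRankValue")),)
    children = node.findall("taxonomicClassification")  # direct children only
    if not children:
        yield list(path)
    for child in children:
        yield from flatten(child, path)

root = ET.fromstring(SAMPLE)
records = [rec for top in root.findall("taxonomicClassification")
           for rec in flatten(top)]
```

Each entry in records then carries the shared ranks again, which is the form a master list or centralized database loader would want.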
I also think that the first model would be easier for sites struggling with
xml generation to accomplish. Nesting taxonomicClassification tags is enough
of a programming headache without throwing tree generation into the mix. I
agree that the tree form is superior for display of taxonomic coverage,
though, and I will implement that for our data set eml as well now that I
have developed a working algorithm and corresponding SQL views.
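For anyone weighing the tree form, the following Python sketch shows one way the tree generation could look. It is not the algorithm or SQL views mentioned above; it is a hypothetical reconstruction under stated assumptions: records arrive as rank/value dictionaries with None for missing ranks, the RANKS list is an invented and incomplete rank order, and sibling branches are sorted so that higher ranks come first (matching the ordering described in the quoted message below). All taxon values are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical rank order, highest first; a real list would carry more ranks.
RANKS = ["kingdom", "phylum", "class", "order",
         "superfamily", "family", "genus", "species"]

def build_tree(records):
    """Merge records (dicts of rank -> value, None for gaps) into nested dicts."""
    root = {}
    for rec in records:
        node = root
        for rank in RANKS:
            value = rec.get(rank)
            if value is None:  # skip ranks this record leaves empty
                continue
            node = node.setdefault((rank, value), {})
    return root

def to_eml(tree, parent):
    """Append nested taxonomicClassification elements under parent."""
    # Stable sort by rank position: branches starting at a higher rank
    # (e.g. order) are written before those starting lower (e.g. superfamily).
    ordered = sorted(tree.items(), key=lambda item: RANKS.index(item[0][0]))
    for (rank, value), children in ordered:
        element = ET.SubElement(parent, "taxonomicClassification")
        ET.SubElement(element, "taxonRankName").text = rank
        ET.SubElement(element, "taxonRankValue").text = value
        to_eml(children, element)

# Placeholder records: the first has no order entry, the others do.
records = [
    {"class": "classA", "superfamily": "superfamilyA", "family": "familyA",
     "genus": "genus7", "species": "species7"},
    {"class": "classA", "order": "orderA", "genus": "genus1", "species": "species1"},
    {"class": "classA", "order": "orderA", "genus": "genus2", "species": "species2"},
]

coverage = ET.Element("taxonomicCoverage")
to_eml(build_tree(records), coverage)
xml_text = ET.tostring(coverage, encoding="unicode")
```

Even though the record without an order entry is listed first, its superfamily branch is written after the order branch under the shared class.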
--Wade
----- Original Message -----
From: "Wade Sheldon" <sheldon at uga.edu>
To: "Matt Jones" <jones at nceas.ucsb.edu>
Cc: "James Brunt" <jbrunt at lternet.edu>; <eml-dev at ecoinformatics.org>
Sent: Wednesday, April 14, 2004 6:30 PM
Subject: Re: Test implementation of eml-based species lists for LTER
> Matt,
>
> It was a lot harder than I anticipated, but I did manage to write an
> algorithm to generate phylogenetic trees in eml based on taxonomic rank
> name/value pairs. It was a headache sorting out the element nesting while
> dealing with varying taxa depth and nulls (i.e. varying levels of taxonomic
> completeness for species records in the database), but it works.
>
> I added an "EML (tree)" option and relabeled the original implementation as
> "EML (list)", so both are available unless we eventually conclude that only
> one is useful. Here again is the url:
> http://gce-lter.marsci.uga.edu/lter/asp/db/all_species_lists.asp
>
> I made the decision to group records with less detailed higher level taxa
> below records with more detail (i.e. last node on the highest common
> branch), but that was somewhat arbitrary. For example, we have 7 bivalve
> mollusc records, and 6 contain entries for kingdom, phylum, class, order,
> genus, species but one contains entries for kingdom, phylum, class,
> superfamily, family, genus, species. In my eml doc these would be arranged
> in the tree as:
>
> kingdom
>   phylum
>     class
>       orderA
>         genus1
>           species1
>         genus2
>           species2
>         genus3
>           species3
>       orderB
>         genus4
>           species4
>         genus5
>           species5
>         genus6
>           species6
>       superfamily
>         family
>           genus7
>             species7
> ..
>
> Otherwise in the case above the green mussel record without an order entry
> would appear as the first branch below class, which seemed odd to me (i.e.
> having superfamily branches above order branches in the tree).
>
> Anyway, thanks for the feedback.
>
> --Wade Sheldon, GCE-LTER
>
>
> ----- Original Message -----
> From: "Matt Jones" <jones at nceas.ucsb.edu>
> To: "Wade Sheldon" <sheldon at uga.edu>
> Cc: "James Brunt" <jbrunt at lternet.edu>; <eml-dev at ecoinformatics.org>
> Sent: Tuesday, April 13, 2004 3:05 PM
> Subject: Re: Test implementation of eml-based species lists for LTER
>
>
> > Wade,
> >
> > Looks great. You're on the leading edge yet again :) I think using EML
> > to exchange this sort of information is a great idea. It would also be
> > good to do the literature citation information this way because once
> > people get used to exchanging EML it should be easy to implement.
> >
> > A couple of comments about the taxonomic EML you generated:
> >
> > 1) When you list several taxa that share parent taxa (e.g., they are in
> > the same Class), we had intended that you could nest two subtrees
> > underneath the last identical rank, so that the EML representation is
> > more compact. This means, for example, that many species would be
> > clustered in each genus, many genera in each family, etc up the tree.
> > The only reason that I can see to separate them is if different
> > taxonomicClassification systems apply (e.g., a different authority was
> > used for identifications, circumscriptions, or names), and then they
> > should be in different taxonomicCoverage elements. But if they were all
> > identified using the same system, then creating one tree instead of many
> > I think is better. Are there reasons you did it as you did?
> >
> > 2) You didn't include a taxonomicSystem element. Although optional, I
> > think this is a very important element. It lets the user understand how
> > you did identifications, and provides citations for field guides,
> > taxonomic monographs, etc. that were used in identifying and classifying
> > the organisms. The more people do synthetic work, the more important
> > this information is. Bob Morris wrote an interesting email about the
> > pitfalls of species lists to the seek-taxon list that you might find
> > useful and interesting. It was part of a larger thread where we were
> > discussing species lists in general and their utility over the long
> > term, especially with regard to species names versus species concepts.
> > Here are the links that I think are most relevant:
> >
> > Species lists:
> >
> > http://www.ecoinformatics.org/pipermail/seek-taxon/2004-March/000160.html
> >
> > and the background for that email:
> >
> > http://www.ecoinformatics.org/pipermail/seek-taxon/2004-March/000159.html
> >
> > Thanks again, Wade.
> >
> > Matt
> >
> >
> > Wade Sheldon wrote:
> > > Matt and James,
> > >
> > > This week I have been making some changes to how we display taxonomic
> > > records on the web. As part of this process, I decided to spend a little
> > > time following through on a proposal I made during our last IMExec
> > > meeting to test eml as a format for dissemination of species lists
> > > within LTER. As a candidate document style I "trumped up" a species list
> > > data set containing the species list in taxonomicCoverage format, with
> > > appropriate title, abstract, keywords, project descriptors, etc.
> > > relevant to the purpose. I also included temporalCoverage information
> > > indicating the date the list was generated.
> > >
> > > If you are interested in looking over my test implementation, you can
> > > generate various species lists in eml format using the web forms at:
> > > http://gce-lter.marsci.uga.edu/lter/asp/db/all_species_lists.asp
> > >
> > > I don't currently plan to maintain versioned copies of the species list
> > > (or sub-lists) in our site data catalog, so the packageID information is
> > > basically notional; however, we could certainly do so at some point if
> > > there were interest so I left it in. I am primarily throwing this out as
> > > a candidate schema for cross-site exchange of taxonomic information. We
> > > will be working on an "eml best practices" document for LTER in the next
> > > 1-2 months, and I thought it would be good to consider recommendations
> > > for documentation of non-tabular data as well as conventional tabular
> > > data sets. Exchange/submission of bibliographic citations and personnel
> > > lists are other potential uses of eml we discussed.
> > >
> > > Comments or recommendations would be appreciated.
> > >
> > > --Wade Sheldon, GCE
> >
> > --
> > -------------------------------------------------------------------
> > Matt Jones jones at nceas.ucsb.edu
> > http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
> > National Center for Ecological Analysis and Synthesis (NCEAS)
> > University of California Santa Barbara
> > Interested in ecological informatics? http://www.ecoinformatics.org
> > -------------------------------------------------------------------
> >
>
> _______________________________________________
> eml-dev mailing list
> eml-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/eml-dev
>