[seek-taxon] update on KU work

Aimee Stewart astewart at ku.edu
Fri Sep 30 08:48:11 PDT 2005


Hi all,

Here are our notes for the call today, just in case you don't catch it 
all as we speed-read it.

Aimee


- Explored mechanism for automatically generating byte-compiled
  Java classes from a WSDL.
    * Found that it could be done using Apache's BCEL library
      and Javassist.
    * Determined that it would not be useful to have the Web
      Services actor be able to support complex objects because
      any downstream actors that were to use the Web Services
      actor would then require an understanding of those complex
      objects

- After discussing it with Matt, it was decided the best route
  to go would be to add the API methods for the ENM use case
  in a manner that used simple Java types or arrays of them
    * The WSDL was changed to add the necessary API functions
    * The old SOAP interface layer was tossed out and redesigned
      and redeveloped to completely isolate it from the business
      logic
        - Involved creating adapter classes for the concept
          objects to ensure that Java collection objects are
          serialized as arrays
        - Adding an additional class to manage the interface
          between the business logic layer, which has fewer
          API methods all of which use complex objects
          and the SOAP layer
        - Modification of the client side code
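
A minimal sketch of the adapter idea described above, assuming
hypothetical class names (Concept, ConceptAdapter); the actual TOS
classes are not shown in these notes. The business object keeps its
Java collection, while the SOAP-facing adapter exposes only simple
types and arrays:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical business-layer object; it uses a Java collection internally.
final class Concept {
    private final String name;
    private final List<String> synonyms = new ArrayList<>();

    Concept(String name) { this.name = name; }

    void addSynonym(String synonym) { synonyms.add(synonym); }
    String getName() { return name; }
    List<String> getSynonyms() { return synonyms; }
}

// Hypothetical SOAP-facing adapter; it exposes only simple types and arrays.
final class ConceptAdapter {
    private final Concept concept;

    ConceptAdapter(Concept concept) { this.concept = concept; }

    public String getName() { return concept.getName(); }

    // The collection is copied into an array so the SOAP layer never sees a List.
    public String[] getSynonyms() {
        return concept.getSynonyms().toArray(new String[0]);
    }
}
```

Wrapping rather than changing the business object is one way to keep
the SOAP layer isolated from the business logic.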

- A couple of the new API methods needed graph-like methods
  (getAuthoritativeList, getHigherTaxon)
    * Node/Edge/Graph classes were created that add all edges
      necessary for rapid tree/graph traversal at the database
      level
    * Hibernate mappings to those classes were created
    * Added back to the system the idea of an authoritative list
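
One possible reading of "all edges necessary for rapid traversal" is
that an edge is stored from each node to every one of its ancestors,
so a lookup such as getHigherTaxon needs no recursive walk. The
in-memory sketch below assumes that reading; the class and field
names are hypothetical and the Hibernate mappings are left out:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TaxonGraph {
    // Hypothetical node: a concept name plus its rank.
    static final class Node {
        final String name;
        final String rank;
        Node(String name, String rank) { this.name = name; this.rank = rank; }
    }

    // Hypothetical edge: links a node to one of its ancestors (direct or indirect).
    static final class Edge {
        final Node descendant;
        final Node ancestor;
        Edge(Node descendant, Node ancestor) {
            this.descendant = descendant;
            this.ancestor = ancestor;
        }
    }

    private final Map<Node, Node> parentOf = new HashMap<>();
    private final Map<Node, List<Edge>> edgesFrom = new HashMap<>();

    // Adds a child under a parent and stores one edge per ancestor.
    // Assumes the tree is built top-down, so the parent is already linked.
    void addChild(Node parent, Node child) {
        parentOf.put(child, parent);
        List<Edge> edges = new ArrayList<>();
        for (Node a = parent; a != null; a = parentOf.get(a)) {
            edges.add(new Edge(child, a));
        }
        edgesFrom.put(child, edges);
    }

    // Returns the ancestor of the node at the given rank, or null if there is none.
    Node getHigherTaxon(Node node, String rank) {
        for (Edge e : edgesFrom.getOrDefault(node, Collections.emptyList())) {
            if (e.ancestor.rank.equals(rank)) {
                return e.ancestor;
            }
        }
        return null;
    }
}
```

In a database, the same precomputed edges would let the lookup run as
a single query on the edge table instead of one query per tree level.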

- Modified the object model at the same time to more simply handle
  reference objects from TCS 0.95.3 and improve performance
  * modified business code, database mappings and code

- Added an n-gram matching algorithm to implement a dictionary for
  matching arbitrary strings to database concepts.  This is currently
  being used in getBestConcept, but will be adopted for other
  search algorithms
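
A rough sketch of n-gram matching against a dictionary of names. The
notes do not say which similarity measure the TOS uses; the Dice
coefficient over character n-gram sets is an assumption here, and
NGramMatcher and its methods are hypothetical names:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class NGramMatcher {
    // Splits a lower-cased, space-padded string into its set of character n-grams.
    static Set<String> ngrams(String s, int n) {
        String padded = " " + s.toLowerCase(Locale.ROOT).trim() + " ";
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= padded.length(); i++) {
            grams.add(padded.substring(i, i + n));
        }
        return grams;
    }

    // Dice coefficient: twice the number of shared n-grams divided by
    // the total number of n-grams in both sets.
    static double similarity(String a, String b, int n) {
        Set<String> ga = ngrams(a, n);
        Set<String> gb = ngrams(b, n);
        int total = ga.size() + gb.size();
        if (total == 0) {
            return 0.0; // both strings are too short to yield any n-gram
        }
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return 2.0 * shared.size() / total;
    }

    // Returns the dictionary entry most similar to the query,
    // or null if the dictionary is empty.
    static String bestMatch(String query, List<String> dictionary, int n) {
        String best = null;
        double bestScore = -1.0;
        for (String candidate : dictionary) {
            double score = similarity(query, candidate, n);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}
```

With trigrams (n = 3), a misspelled query such as "Puma concolour"
still shares most of its n-grams with "Puma concolor", which is what
makes the approach tolerant of arbitrary input strings.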
 

- Added implementations of the following API methods:
    * getAuthoritativeList: returns all concepts at a particular
                            rank from a particular subtree
    * getSynonymousNames:   returns the list of all synonymous
                            name strings for a concept
    * getHigherTaxon:       returns the higher taxon at a
                            specified rank for a concept
                            below it
    * getBestConcept:       returns a list of concepts (hopefully only 1)
                            best matching a particular name within
                            a particular authority

- Modifications to the build process
    * War file generation of the SOAP service for easier
      deployment
    * Automatic generation of Java class files containing the
      SOAP <-> Java binding rules from the WSDL

- Wrote a harvester for ITIS data that takes in a root TSN and
  generates a TCS 0.95.3 instance document from that root to the
  leaf concepts
    * Parses an ITIS XML instance using XPath to generate concepts
      which are then marshalled using the already present JAXB
      0.95.3 marshaller.
    * Incorporated tests for this
    * Found many ITIS XML instances do not conform to the XML
      specifications due to characters not being represented
      by their appropriate entity code.  For those, the entity
      codes are fixed on the fly prior to parsing.
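
The on-the-fly repair could look roughly like the sketch below. It
assumes the offending character is a bare ampersand, which is only
one of the cases the harvester may handle; the class name, the sample
element names and the XPath expression are hypothetical and do not
reflect the real ITIS layout:

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ItisXmlRepair {
    // Matches an ampersand that does not start a named or numeric entity reference.
    private static final Pattern BARE_AMP =
        Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Replaces bare ampersands with &amp; and leaves existing references untouched.
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    // Repairs the raw text first, then parses it and evaluates the XPath expression.
    static String evaluate(String rawXml, String xpath) {
        try {
            String repaired = escapeBareAmpersands(rawXml);
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(repaired)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse or query the XML", e);
        }
    }
}
```

Repairing the text before it reaches the parser keeps the XPath and
marshalling code unaware that the input was ever malformed.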

- Wrote a generic import tool that takes in any instance of the TCS
  (in 0.88b or 0.95.3) and populates the database
    * Change in the way that the unmarshalling works so it uses a
      chain of responsibility rather than factory methods
    * Numerous bugfixes to the 0.88b and 0.95.3
      marshallers/unmarshallers
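
A small sketch of the chain-of-responsibility idea: each unmarshaller
is asked in turn whether it recognizes the document, and the first one
that does handles it. All names are hypothetical, and detecting the
version from a version attribute is an assumption made only to keep
the example short:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical handler interface: each handler decides whether it
// understands the document before it is asked to unmarshal it.
interface TcsUnmarshaller {
    boolean canHandle(String document);
    String unmarshal(String document);
}

// Hypothetical handler for one schema version.
final class VersionUnmarshaller implements TcsUnmarshaller {
    private final String version;

    VersionUnmarshaller(String version) { this.version = version; }

    // Assumption: the version is recognizable from a version attribute.
    public boolean canHandle(String document) {
        return document.contains("version=\"" + version + "\"");
    }

    // Stand-in for the real unmarshalling work.
    public String unmarshal(String document) {
        return "parsed as TCS " + version;
    }
}

final class UnmarshallerChain {
    private final List<TcsUnmarshaller> handlers;

    UnmarshallerChain(TcsUnmarshaller... handlers) {
        this.handlers = Arrays.asList(handlers);
    }

    // Passes the document down the chain until one handler accepts it.
    String unmarshal(String document) {
        for (TcsUnmarshaller handler : handlers) {
            if (handler.canHandle(document)) {
                return handler.unmarshal(document);
            }
        }
        throw new IllegalArgumentException("no unmarshaller accepts this document");
    }
}
```

Compared with a factory method, the caller does not need to know the
schema version up front, and supporting another version only means
adding one more handler to the chain.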

- Modified the TOS to use the Spring framework to configure the
  application by composition of Java bean objects via dependency
  injection

- Met with CIPRES project in San Diego and agreed on several
  opportunities for collaboration
  1. Data
     * Both projects are creating XML schemas, CIPRes for the
        creation and storage of phylogenetic trees.
        * CIPRes could adopt taxonomic model requirements from SEEK
          and inform SEEK of its own requirements to help it fit
          CIPRes needs (there need not be a 1-to-1 mapping b/w our
          object model and TCS)
        * SEEK could accept CIPRes recommendations for incorporating
          sequence analysis in our data model
     * CIPRes must create an archival db, SEEK need not
        * CIPRes could develop and implement archival Treebase repository
          informed by the needs of SEEK TOS
     * Saves resources and expands user base on both projects
     * Jenny Wang (SEEK) and Rami Rifaieh (SDSC ontology) could meet to
       work on merging ontologies
     * Formal gathering to merge requirements
  2. CIPRes may be able to improve and adapt SEEK tools for CIPRes and
     other projects
     * adopt SCIA (schema matching tool)
     * create workflow functionalities in Kepler
  3. Interface
     * SEEK - Usability
     * CIPRes - dynamic interface generation and user configuration based
       on PISE strategy
  4. Visualization 
     * CIPRes involved in tree visualization, but no dedicated people on it
       now, can they help SEEK?

- Hibernate training to tune TOS performance         








