[seek-dev] enm pipeline
Matt Jones
jones at nceas.ucsb.edu
Tue Sep 14 11:59:42 PDT 2004
Hi Deana,
We had the conference call on the ENM pipeline in Kepler this morning.
It's amazing how much stuff still remains to be done. Our notes from the
call are on the SEEK web site, including a list of action items to get
the ENM pipeline done:
http://seek.ecoinformatics.org/Wiki.jsp?page=ENMPipelineConferenceCall14Sep2004
One of the items was tentatively assigned to you -- if you're willing.
We need someone to coordinate getting the environmental data layers into
an EcoGrid node and documented with EML metadata. Then they will be
pulled into the pipelines as needed.
Interestingly, because the ENM pipeline will have so many runs
(500,000), it will be important to be able to distribute the load -- so
it looks like we might be doing some of this stuff on multiple machines.
It's going to be a tough challenge, especially because data transfer for
the environmental data layers will be a significant bottleneck. Ricardo
estimates that one species (500 GARP runs) will take between 6 and 24
hours, depending on which env layers are used. At 500 runs per species
that is 1000 species, so unless we distribute it, we're looking at 250
to 1000 days to run this thing, which is obviously unacceptable. So
the tradeoff between distributing the computation and moving the data is
not an easy one to make. But we'll try to work it out.
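
To make the arithmetic behind that estimate explicit, here is a rough
back-of-the-envelope sketch in Python. The run counts and the 6-24 hour
range are the figures quoted above. The function name, the `machines`
parameter, and the `transfer_hours` parameter are hypothetical additions
for illustration, and the model assumes the work divides evenly across
machines, which is optimistic given the data transfer bottleneck.

```python
# Figures quoted in the message above.
TOTAL_RUNS = 500_000        # total GARP runs in the ENM pipeline
RUNS_PER_SPECIES = 500      # GARP runs per species
HOURS_PER_SPECIES = (6, 24) # Ricardo's estimated range per species


def estimated_days(compute_hours, machines=1, transfer_hours=0.0):
    """Estimate wall-clock days for the whole pipeline.

    compute_hours  -- hours of computation per species
    machines       -- hypothetical number of machines sharing the load
    transfer_hours -- hypothetical extra hours per species spent moving
                      environmental layers (0 means data is already local)

    Assumes perfectly even distribution of species across machines.
    """
    species = TOTAL_RUNS // RUNS_PER_SPECIES
    total_hours = species * (compute_hours + transfer_hours)
    return total_hours / 24 / machines


if __name__ == "__main__":
    low, high = HOURS_PER_SPECIES
    print("species:", TOTAL_RUNS // RUNS_PER_SPECIES)
    print("one machine, best case (days):", estimated_days(low))
    print("one machine, worst case (days):", estimated_days(high))
    print("ten machines, worst case (days):", estimated_days(high, machines=10))
```

Raising `transfer_hours` while raising `machines` gives a crude feel for
the tradeoff: extra machines only help if the time spent shipping layers
to each of them stays small relative to the compute time.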
Could you look over the notes and let me know what you think? Thanks,
Matt
--
-------------------------------------------------------------------
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------