[seek-dev] Re: enm pipeline

Deana Pennington dpennington at lternet.edu
Tue Sep 14 12:50:14 PDT 2004


Attached is a doc that describes the procedure. I'm pretty sure I 
posted it and some sample data to seek-dev quite a while ago.  I'll have 
to see if I can hunt down the sample data.  I'm going to go ahead and 
load all of the data and existing documentation into the SRB.  It's just 
the EML that will take a while.

Deana


Matt Jones wrote:

> Hi Deana,
>
> Thanks.  We're going to need some sample data to test with well before 
> Oct 15th (the scheduled deadline for Kepler release), but by no means 
> all of the data.  Once the Kepler pipeline is able to retrieve the 
> data layers from the EcoGrid we should then be able to add arbitrary 
> env data layers to the EcoGrid and have them be accessible.   So you 
> and others can keep adding layers until the Dec workshop.
>
> Thanks for the reminder about the IPCC conversion to GIS layers.  
> Could you describe what needs to happen there more fully -- I didn't 
> see that in the current ENM pipeline, so we probably need to develop a 
> pipeline for it.  It may be another thing we could ask Jianting to 
> work on -- he already agreed to work on finishing up the GIS actors 
> for Kepler.  Thanks.
>
> Matt
>
> Deana Pennington wrote:
>
>> I can work on the data, but not until Sep 27.  If I can figure out 
>> how to do templates in EML, it should go pretty quickly.
>>
>> I think you have forgotten the pipeline that converts the IPCC 
>> climate data to GIS layers.
>>
>> Deana
>>
>>
>> Matt Jones wrote:
>>
>>> Hi Deana,
>>>
>>> We had the conference call on the ENM pipeline in Kepler this 
>>> morning. It's amazing how much stuff still remains to be done.  Our 
>>> notes from the call are on the SEEK web site, including a list of 
>>> action items to get the ENM pipeline done:
>>>
>>> http://seek.ecoinformatics.org/Wiki.jsp?page=ENMPipelineConferenceCall14Sep2004 
>>>
>>>
>>> One of the items was tentatively assigned to you -- if you're 
>>> willing. We need someone to coordinate getting the environmental 
>>> data layers into an EcoGrid node and documented with EML metadata.  
>>> Then they will be pulled into the pipelines as needed.
>>>
>>> Interestingly, because the ENM pipeline will have so many runs 
>>> (500,000) it will be important to be able to distribute the load -- 
>>> so it looks like we might be doing some of this stuff on multiple 
>>> machines.  It's gonna be a tough challenge, especially because data 
>>> transfer for the environmental data layers will be a significant 
>>> bottleneck.  Ricardo estimates that one species (500 GARP runs) will 
>>> take between 6 and 24 hours, depending on which env layers are used.  
>>> So, unless we distribute it, we're looking at up to 1000 days to run 
>>> this thing, obviously unacceptable. So the tradeoff between 
>>> distributing the computation and moving the data is not an easy one 
>>> to make. But we'll try to work it out.
>>>
>>> Could you look over the notes and let me know what you think?  Thanks,
>>>
>>> Matt
>>
>>
>>
>>
>
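
As a rough check on the run-time estimate quoted above (500,000 runs, 500 
GARP runs per species, 6 to 24 hours per species), the arithmetic can be 
sketched as below.  The function names are hypothetical, and the sketch 
assumes species are processed one after another on a single machine and 
that adding machines scales perfectly, which ignores the data-transfer 
bottleneck mentioned in the thread.

```python
# Back-of-the-envelope estimate using only the figures quoted in the thread.
TOTAL_RUNS = 500_000          # total GARP runs in the ENM pipeline
RUNS_PER_SPECIES = 500        # GARP runs per species
HOURS_PER_SPECIES = (6, 24)   # estimated low/high hours per species


def serial_days(total_runs, runs_per_species, hours_per_species):
    """Days needed if every species is run sequentially on one machine."""
    species = total_runs // runs_per_species
    return species * hours_per_species / 24


def distributed_days(days, machines):
    """Idealized wall-clock days with perfect scaling (assumption:
    no data-transfer or scheduling overhead)."""
    return days / machines


if __name__ == "__main__":
    low, high = (serial_days(TOTAL_RUNS, RUNS_PER_SPECIES, h)
                 for h in HOURS_PER_SPECIES)
    print(f"single machine: {low:.0f} to {high:.0f} days")
    for machines in (10, 50, 100):
        print(f"{machines} machines: {distributed_days(low, machines):.1f} "
              f"to {distributed_days(high, machines):.1f} days (ideal)")
```

The 1000-day figure corresponds to the 24-hour upper bound; any real 
distributed run would take longer than the idealized numbers because the 
environmental layers have to be moved to each machine.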

-- 
********

Deana D. Pennington, PhD
Long-term Ecological Research Network Office

UNM Biology Department
MSC03  2020
1 University of New Mexico
Albuquerque, NM  87131-0001

505-277-2595 (office)
505-249-2604 (cell)
505-277-2541 (fax)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ENM Input Data Workflows.doc
Type: application/msword
Size: 45056 bytes
Desc: not available
Url : http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/seek-dev/attachments/20040914/c770c795/ENMInputDataWorkflows.doc

