[kepler-dev] Workflows non-terminating in nightly build

Matt Jones jones at nceas.ucsb.edu
Tue Feb 7 13:04:53 PST 2006


Maybe you should use the test actors to test for valid results rather 
than just running a larger workflow like PIW.  There's really no way to 
tell what went wrong if PIW etc. fails in its current configuration. 
Using the test actors generates specific exceptions when the test fails 
and so it lets you pinpoint the critical stuff and test if the right 
values are being produced for known inputs. I think that would probably 
fix many of these issues.
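
As a rough illustration of that idea (a sketch only, not the actual Kepler test actor API; `WorkflowTestFailure` and `check_outputs` are hypothetical names), a check against known values might look like this:

```python
class WorkflowTestFailure(AssertionError):
    """Raised when a workflow output does not match its known-good value."""


def check_outputs(actual, expected, tolerance=1e-9):
    """Compare produced values against known-good values, position by position.

    Raises WorkflowTestFailure naming the first mismatch, so a failure
    points at a specific value instead of just "the workflow failed".
    """
    if len(actual) != len(expected):
        raise WorkflowTestFailure(
            f"expected {len(expected)} values, got {len(actual)}"
        )
    for i, (got, want) in enumerate(zip(actual, expected)):
        if isinstance(want, float):
            # Floating-point results are compared within a tolerance.
            ok = isinstance(got, (int, float)) and abs(got - want) <= tolerance
        else:
            ok = got == want
        if not ok:
            raise WorkflowTestFailure(
                f"value {i}: expected {want!r}, got {got!r}"
            )
```

A small workflow fed with known inputs would pass its outputs to a check like this one, and the exception message then says which value was wrong.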

Matt

Dan Higgins wrote:
> Hi Efrat,
>    It looks like your workflow 'workflows/srb/srbPhysLoc.xml' is 
> 'hanging' in the nightly build (and when I try to run it locally). I 
> would guess that it should run in a few seconds, but it is still 
> 'executing' after ~1 hr. Can you take a look at it? In the meantime I 
> am going to remove it from the nightly test workflow list (?).
> 
> To Kepler-dev:
>    Since we apparently have had several problems with workflows that do 
> not terminate, we may have to consider including timeouts in our 
> workflows?  (To handle the cases where actors are not executing 
> locally.) Someone may not always be watching and able to close an 
> executing workflow.
> 
> Dan
> 
> ---
> 
> Chad Berkley wrote:
> 
>> We had that problem before.  Sometimes the PIW WF doesn't finish.  I  
>> can't remember what I did to fix it the last time though.
>>
>> chad
>>
>> On Feb 7, 2006, at 10:22 AM, Dan Higgins wrote:
>>
>>> Hi Ilkay,
>>>    I have been investigating why our nightly build has not been  
>>> working correctly and discovered that we had a bunch of 'hung'  
>>> kepler processes. Apparently this is occurring when the nightly  
>>> build script is trying to execute the SPA PIW workflow. It looks  
>>> like that workflow may never terminate! Could you check it, please?  
>>> In the meantime, I am going to try removing it from the list of  
>>> workflows to test (for the time being, anyway).
>>>
>>>    Incidentally, in case you are wondering why this problem just came  
>>> up, I think it is because we previously did not have X11 set up on  
>>> the server. It was just added, and now the server actually tried to  
>>> run the test workflows! (for the first time in a long while).
>>>
>>> Dan
>>>
>>> -- 
>>> *******************************************************************
>>> Dan Higgins                                  higgins at nceas.ucsb.edu
>>> http://www.nceas.ucsb.edu/    Ph: 805-893-5127
>>> National Center for Ecological Analysis and Synthesis (NCEAS)  Marine 
>>> Science Building - Room 3405
>>> Santa Barbara, CA 93195
>>> *******************************************************************
>>>
>>>
>>>
>>
> 
> 
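
On the timeout question quoted above, one option is to enforce the limit in the nightly build script rather than inside each workflow. The following is a minimal sketch under that assumption; `run_with_timeout` is a hypothetical helper, and the real command used to launch a workflow is not shown in this thread, so the demo runs a stand-in command.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run one command; return "ok", "failed", or "timeout".

    subprocess.run kills the child process when the timeout expires,
    so a hung run does not linger after the build script moves on.
    """
    try:
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a workflow run that never terminates.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(hung, timeout_seconds=2))
```

Only the process started directly is killed; if the launcher is a wrapper script that spawns its own child processes, those may need separate cleanup.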

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matt Jones
jones at nceas.ucsb.edu                         Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
UC Santa Barbara     http://www.nceas.ucsb.edu/ecoinformatics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

