[kepler-users] Problems running workflows with newer revisions

Daniel Crawl danielcrawl at gmail.com
Thu Feb 14 11:17:29 PST 2013


Hi Jonathan,

It might be helpful to see the threads' stack traces once the workflow
hangs. You can get these by running jstack, or by pressing control-break on
Windows or control-\ on Mac in the console where Kepler was started. If
there is a deadlock, this could help determine the cause.
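
As a rough sketch of how to skim such a dump (assuming it was saved with
something like "jstack -l <pid> > dump.txt", where <pid> is the process id
of the Kepler JVM), the hypothetical helper below reports whether the JVM
flagged a Java-level deadlock and counts threads per state. The helper name
and the file name are made up for illustration. A hang with no deadlock
reported is still possible, for example when every thread is WAITING.

```shell
#!/bin/sh
# summarize_dump FILE
# FILE is a thread dump saved from jstack (file name is an assumption).
# Prints whether a Java-level deadlock was reported, then one count per
# thread state.
summarize_dump() {
    dump=$1

    # HotSpot adds a "Java-level deadlock" section when it finds a lock cycle.
    if grep -qF "Java-level deadlock" "$dump"; then
        echo "deadlock: yes"
    else
        echo "deadlock: none reported"
    fi

    # Each thread entry carries a "java.lang.Thread.State: <STATE>" line.
    for state in RUNNABLE BLOCKED WAITING TIMED_WAITING; do
        # grep -c exits non-zero on a zero count, so ignore its status.
        count=$(grep -cF "java.lang.Thread.State: $state" "$dump" || true)
        echo "$state: $count"
    done
}
```

Usage would be along the lines of "summarize_dump dump.txt". Threads that
are BLOCKED or WAITING inside the PN director classes are the ones worth
reading in full.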

   --dan

On 2/14/13 8:05 AM, Christopher Brooks wrote:
> Jonathan,
> As Edward wrote, a small example would help.
>
> One thing to try would be the brute force method of determining what
> change or changes caused the problem.
> The way to do this is to do a binary search of the tree by checking out
> different versions and testing them.  It can take a while, but the amount
> of actual effort is low.
>
> If you are using only actors that are in Ptolemy II and you are not
> using Kepler-only actors, then you could export your model as MoML,
> verify that the model works in r64636 of Ptolemy II and fails in r65658
> and then check out an intermediate version of Ptolemy II.
> (65658-64636)/2+64636 = 65147
> I would do this with
>    svn co -r 65147 https://source.eecs.berkeley.edu/svn/chess/ptII/trunk ptII.65147
>    cd ptII.65147
>    export PTII=`pwd`
>    ./configure
>    ant
> Then rerun the model and try either an earlier or later version and
> repeat the above.
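
The search described above can be sketched as a small shell function. This
is only an illustration: midpoint and bisect are hypothetical helper names,
and the test command passed in is assumed to check out, build, and run the
model at a given revision, exiting 0 when the model runs to completion. It
also assumes that every revision after the first failing one fails too.

```shell
#!/bin/sh
# Midpoint between a known-good and a known-bad revision (integer division).
midpoint() {
    echo $(( ($2 - $1) / 2 + $1 ))
}

# bisect GOOD BAD TEST_CMD
# GOOD is a revision where the model works, BAD one where it hangs.
# TEST_CMD (hypothetical) receives a revision number and exits 0 if the
# model works at that revision. Prints the first failing revision.
bisect() {
    good=$1
    bad=$2
    test_cmd=$3
    # Narrow the range until the good and bad revisions are adjacent.
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(midpoint "$good" "$bad")
        if "$test_cmd" "$mid"; then
            good=$mid
        else
            bad=$mid
        fi
    done
    echo "$bad"
}
```

Here "midpoint 64636 65658" prints 65147, matching the arithmetic above.
The range spans 1022 revisions, so the loop needs about 10 checkouts.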
>
>
> You could also look at the ChangeLog for the ptII tree at
> http://chess.eecs.berkeley.edu/ptexternal/nightly/ChangeLog.txt
>
> You could also try diffing ptolemy/directors/pn between the version that
> worked and the version that failed.
>
> If your tree uses Kepler actors, then you will need to use the Kepler
> build system to check out different versions of Kepler. Offhand, I'm not
> sure how to do that.
>
> _Christopher
>
> On 2/13/13 8:20 PM, Edward A. Lee wrote:
>>
>> This sounds like a threading bug.
>> If you would like to know how I feel about threads, read this:
>>
>> http://ptolemy.eecs.berkeley.edu/publications/papers/06/problemwithThreads/
>>
>> If you are using only built-in actors, then it really would be great
>> to have a reproducible example. And I would really like to fix it. If
>> you have custom actors, then we probably can't help...
>>
>> Without a reproducible example, threading bugs are impossible to fix.
>> (Sometimes even with a reproducible example they are impossible to fix.)
>> Threads are a _really bad_ concurrency model. Sadly, they dominate
>> concurrency today...
>>
>> Edward
>>
>> On 2/13/13 2:33 PM, Jonathan Boright wrote:
>>> Dear Kepler users,
>>>
>>> We have been using Kepler for a while now and have developed a number
>>> of fairly large and complex workflows. We recently updated Kepler and
>>> have found that our old workflows 'hang' after only a short time
>>> (less than a minute). As far as I can tell they don't always hang in
>>> the same spot. We have been trying to create a model to post on this
>>> forum that demonstrates this behavior, but as soon as we pare the
>>> model down to a size small enough to post and strip out our
>>> customized code... they tend to work fine. So in lieu of an example
>>> model I'll attempt to describe the structure of our models, the
>>> symptoms, and some avenues that we have used to try and narrow down
>>> the issues.
>>>
>>> Our old models run in the following svn revision(s):
>>> svn info details:
>>> Working Copy Root Path: /cygdrive/c/Kepler/svn/[build-area etc.]
>>> Revision: 30654
>>> Working Copy Root Path: /cygdrive/c/Kepler/svn/ptolemy/src
>>> Revision: 64636
>>>
>>> They 'hang' when run in the following revision(s)
>>> svn info details:
>>> Working Copy Root Path: /cygdrive/c/Kepler/svn/build-area
>>> Revision: 31428 - 31431 (current)
>>> Working Copy Root Path: /cygdrive/c/Kepler/svn/ptolemy/src
>>> Revision: *65654 - *65658 (current)
>>>
>>> I'll describe one particular model as an example:
>>> The workflow is a hydrologic model which has the following attributes:
>>> - a PN director.
>>> - the most common token is a double matrix token ([double]) of size
>>> 360x270.
>>> - many composite actors, some with SDF directors (opaque) and some
>>> without directors (transparent).
>>> - an opaque composite actor made into a class object with many (~30?)
>>> instances of this class.
>>> - multiple Nondeterministic Merge actors (to re-use tokens)
>>>
>>> When this model is run in newer revisions, it runs for a bit (usually
>>> less than 1 minute) and then just stops going forwards... no error
>>> messages... just hung. When we then attempt to stop the model it
>>> gives the message "wrapping up" but hangs there. Sometimes I'm able
>>> to close the window, sometimes I need to kill Kepler through the
>>> task-manager...
>>>
>>> We have noticed some changes in Kepler/Ptolemy... for example, it is
>>> now possible to make transparent composite actors into classes (which
>>> is useful... thank you). We hypothesized that perhaps something
>>> changed in the way that the PN director handles the opaque composite
>>> actors, and that this change is causing the log-jam (or an
>>> un-satisfied 'race condition'?) ... so we removed all of the SDF
>>> directors, making all of the composite actors transparent. This made
>>> some difference in some of our smaller workflows but still the larger
>>> ones hang...
>>>
>>> I'll end this note here and see if I can come up with a concrete
>>> example. In the meantime, any new perspectives or thoughts would be
>>> helpful.
>>>
>>> Thanks in advance.
>>>
>>> Jon Boright
>>>
>>> ---------------------------------
>>> Jonathan Boright
>>> Research Scientist
>>> ISciences, LLC
>>> 61 Main Street, Suite 200
>>> Burlington, VT 0540
>>>
>>>
>>> _______________________________________________
>>> Kepler-users mailing list
>>> Kepler-users at kepler-project.org
>>> http://lists.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users
>>
>
> --
> Christopher Brooks, PMP                       University of California
> CHESS Executive Director                      US Mail: 337 Cory Hall
> Programmer/Analyst CHESS/Ptolemy/Trust        Berkeley, CA 94720-1774
> ph: 510.643.9841                                (Office: 545Q Cory)
> home: (F-Tu) 707.665.0131 cell: 707.332.0670
>


