[kepler-dev] [Ptolemy] Re: [Bug 3693] - MultiInstanceComposite actor deadlocks sometimes

Edward A. Lee eal at eecs.berkeley.edu
Sun Dec 21 10:04:50 PST 2008


I have checked in a fix to the problem that MultiInstanceComposite
with PN would sometimes deadlock.

Interestingly, the problem was broader, and the deadlock could have
occurred pretty much any time we had more than one thread obtaining
write permission on the workspace.  This was quite rare, since
in most applications it would only be the UI.  However, conceivably
we could have gotten deadlock during preinitialize if the user tried
to make editing changes at the same time that the preinitialize method
of some higher-order actor was trying to modify the model.

In the case of MultiInstanceComposite, it modifies the model in its
wrapup method, and when you have more than one instance of
MultiInstanceComposite, the deadlock was quite likely to happen.
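
The idea behind the fix, as described in the quoted diagnosis below,
can be sketched roughly as follows.  This is only an illustration
under stated assumptions, not the code that was checked in:
WorkspaceSketch, waitOn, readDepth, and the timeout parameter are
hypothetical names, and java.util.concurrent's ReentrantReadWriteLock
stands in for the workspace's own read/write permission bookkeeping.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a workspace; NOT the Ptolemy II Workspace class. */
class WorkspaceSketch {
    // Fair mode: a new reader blocks while a writer is waiting,
    // which mirrors the behavior described in the diagnosis.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    void getReadAccess() { lock.readLock().lock(); }
    void doneReading() { lock.readLock().unlock(); }
    void getWriteAccess() { lock.writeLock().lock(); }
    void doneWriting() { lock.writeLock().unlock(); }

    /** Number of read permissions held by the calling thread. */
    int readDepth() { return lock.getReadHoldCount(); }

    /**
     * Release the caller's read permissions, wait on obj, and reacquire
     * the permissions only after the monitor on obj has been released.
     * The caller is assumed NOT to hold the monitor on obj.  Pass a
     * positive timeout; it is an assumption of this sketch so that the
     * call returns even if no notification arrives.
     */
    void waitOn(Object obj, long timeoutMillis) {
        int depth = lock.getReadHoldCount();
        for (int i = 0; i < depth; i++) {
            lock.readLock().unlock();
        }
        try {
            // The monitor is held only for the duration of obj.wait().
            synchronized (obj) {
                obj.wait(timeoutMillis);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        } finally {
            // We are outside synchronized(obj) here, so blocking on the
            // read lock cannot keep other threads from locking obj.
            for (int i = 0; i < depth; i++) {
                lock.readLock().lock();
            }
        }
    }

    public static void main(String[] args) {
        WorkspaceSketch ws = new WorkspaceSketch();
        ws.getReadAccess();
        ws.waitOn(new Object(), 10);
        System.out.println("read permissions held: " + ws.readDepth());
        ws.doneReading();
    }
}
```

The point of the sketch is the ordering: the read permissions are
reacquired in the finally clause, after the synchronized block has
exited.  That only helps if the caller does not itself wrap the call
in synchronized(obj).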

Edward


Edward A. Lee wrote:
> 
> I have a diagnosis of this problem, and I believe I have
> a fix, but as usual with threads, I'm not fully confident
> in the solution.  I guess if this sounds reasonable, then
> it would increase my confidence.
> 
> The MultiInstanceComposite apparently triggers a bug
> because its wrapup() method acquires write access to the
> workspace. In your model, multiple threads will be simultaneously
> trying to acquire write access, something that is fairly rare
> in uses of Ptolemy.  This is why we see the bug only with
> uses of MultiInstanceComposite.
> 
> The problem is in the use of the Workspace wait(Object obj) method.
> What this method does is release any read permissions that
> the calling thread has on the workspace, call obj.wait(),
> reacquire the read permissions, and return.
> 
> The problem is that almost everywhere in the tree where
> this is called, the call is inside a synchronized block,
> something like this:
> 
>   synchronized(obj) {
>      ...
>      _workspace.wait(obj);
>      ...
>   }
> 
> The problem occurs when the wait(Object obj) method tries
> to reacquire read permissions.  At that point, it holds
> a lock on obj, and blocks until the workspace grants
> read permission.
> 
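> If there is a thread waiting for write permission, the read
> permission is not granted.  Deadlock occurs when another
> thread tries to get a lock on obj while holding read or write
> permission on the workspace: that thread blocks on obj, so it
> never releases its permission, and the first thread, still
> holding the lock on obj, never gets its read permission back.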
> 
> 
> 
> I think that the fix is that a thread that calls
> wait(Object obj) should not hold a lock on obj when it makes
> that call... This is counterintuitive to Java programmers,
> because generally you _have to_ hold the lock to call wait().
> Indeed, inside wait(Object obj), it acquires the lock, but
> the key is that it releases that lock before it tries to
> reacquire read permissions, thus preventing the deadlock
> if the calling thread does not already hold the lock.
> 
> I believe this is correct because wait(Object obj) will
> release any lock on obj anyway for an indeterminate amount
> of time while obj.wait() is called.  Thus, no calling method
> can really assume the lock is held across the call
> to wait(Object obj).
> 
> 
> Edward
> 
> Christopher Brooks wrote:
>> Hi Edward,
>> Here's a MultiInstanceComposite model that hangs for me.
>> I've attached a Ptolemy version.
>>
>> The model has PN on the outside with SDF inside the
>> MultiInstanceComposite.  The MultiInstanceComposite has
>> no actors, just a link between the ports, which is rather odd.
>>
>> _Christopher
>>
>> bugzilla-daemon at ecoinformatics.org wrote:
>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3693
>>>
>>> ------- Comment #4 from crawl at sdsc.edu  2008-12-05 11:12 -------
>>> I was able to reproduce the deadlock in Jianwu's workflow on:
>>>
>>> Windows XP, java 1.6.0_11, Kepler 1.0.0
>>> Mac, java 1.5.0_16, both Kepler 1.0.0 and head
>>