[kepler-dev] Thread safety with Ptolemy’s TypedIOPort

Edward A. Lee eal at eecs.berkeley.edu
Fri Oct 9 05:35:55 PDT 2009


Interesting question...

I suspect the problem with your threading director is that it isn't
respecting the semantics of stopFire().  The contract is that the
Manager executes change requests only while every actor in the model
is stopped. Specifically, it does so between iterations of the top-level
model. So when the top-level director returns from postfire(), the
Manager assumes it is safe to execute change requests.

The problem is that if you have a thread running independently of the
top-level director, how does the top-level director know when it is safe
to return from postfire()?

The key is that stopFire() is called on every actor in the model
when a change request is registered. In PN, the PN threads respond
to stopFire() by suspending at the next opportunity (typically a
read from or a write to a port).  Only after all threads have stopped
does the director's postfire() method return, allowing the Manager
to execute the change request (or maybe it's the fire() method that
returns, I forget).
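
As a rough illustration of that protocol, here is a minimal sketch in
plain Java. It is not Ptolemy's code: the class PausingDirectorSketch
and its methods requestChange(), safePoint() and
applyChangesWhenQuiescent() are hypothetical names chosen for this
example. Registering a change raises a stop flag, each worker thread
parks at its next safe point, and the changes run only once every
worker is parked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a threaded director; none of this is Ptolemy's API. */
class PausingDirectorSketch {
    private final Object lock = new Object();
    private final int workerCount;
    private int pausedWorkers = 0;
    private boolean stopRequested = false;
    private final List<Runnable> pendingChanges = new ArrayList<>();

    PausingDirectorSketch(int workerCount) {
        this.workerCount = workerCount;
    }

    /** Analogue of registering a change request: queue it and ask workers to stop. */
    void requestChange(Runnable change) {
        synchronized (lock) {
            pendingChanges.add(change);
            stopRequested = true;
        }
    }

    /** Workers call this at every safe point, e.g. before a port read or write. */
    void safePoint() throws InterruptedException {
        synchronized (lock) {
            if (!stopRequested) {
                return;
            }
            pausedWorkers++;
            lock.notifyAll();           // the director may now see every worker paused
            while (stopRequested) {
                lock.wait();
            }
            pausedWorkers--;
        }
    }

    /**
     * Director side: block until every worker is paused, apply the queued
     * changes, then resume the workers. Returns the number of changes applied.
     * Assumes every worker keeps reaching safePoint(); a worker that exits
     * without pausing would leave this method waiting forever.
     */
    int applyChangesWhenQuiescent() throws InterruptedException {
        synchronized (lock) {
            while (stopRequested && pausedWorkers < workerCount) {
                lock.wait();
            }
            int applied = pendingChanges.size();
            for (Runnable change : pendingChanges) {
                change.run();           // runs while no worker is mid-fire
            }
            pendingChanges.clear();
            stopRequested = false;
            lock.notifyAll();           // wake the paused workers
            return applied;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PausingDirectorSketch director = new PausingDirectorSketch(2);
        AtomicBoolean done = new AtomicBoolean(false);
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (!done.get()) {
                        director.safePoint();   // pause here if a change is pending
                        Thread.yield();         // placeholder for an actor's work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        director.requestChange(() -> System.out.println("change applied"));
        int applied = director.applyChangesWhenQuiescent();
        done.set(true);
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("applied " + applied + " change(s)");
    }
}
```

The point of the sketch is that the change runs under the same lock
the workers park on, so no worker can be between safe points while the
model is being modified. A director whose threads never check for the
stop request gives no such guarantee.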

Hope this helps...

Edward


Colin Enticott wrote:
> Hi,
> 
> Looking into this further (sorry for the delay, I’ve been busy), it 
> looks like all actions on port types obtain a read lock on the 
> workspace. When the manager resolves types, it also obtains a read lock, 
> but makes changes to the port types. Shouldn't it obtain a write lock?
> 
> Thanks,
> Colin
> 
> Colin Enticott wrote:
>>
>> Hi,
>>
>>
>> First of all, I didn’t think that multiple threads would use the same 
>> TypedIOPort object, but here’s the story:
>>
>>  
>>
>> I’ve been developing a new "threading director" for Ptolemy, the 
>> Nimrod/k TDA director[1], and one in every 100 executions of my 
>> rigorous director threading test workflows throws an exception. The 
>> exception occurs when an actor sends a token out of a TypedIOPort:
>>
>> (Sorry. The exception is from kepler-1.0.0, so the line numbers differ 
>> from the current version, but the functions in question are identical)
>>
>>  
>>
>> ptolemy.kernel.util.IllegalActionException: Run-time type checking failed. Token 3 with type int is incompatible with port type: int
>>   in .Composite.Ramp.output
>>         at ptolemy.actor.TypedIOPort._checkType(TypedIOPort.java:750)
>>         at ptolemy.actor.TypedIOPort.send(TypedIOPort.java:472)
>>         at ptolemy.actor.lib.Ramp.fire(Ramp.java:138)
>>         at org.monash.nimrod.NimrodDirector.NimrodProcessThread.run(NimrodProcessThread.java:347)
>>
>>  
>>
>> The confusing error message is “Token 3 with type int is incompatible 
>> with port type: int”. Looking into this deeper I believe the error 
>> message is being generated after the port state has changed.
>>
>>  
>>
>> The function in question is:
>>
>>  
>>
>> protected void _checkType(Token token) throws IllegalActionException {
>>     int compare = TypeLattice.compare(token.getType(), _resolvedType);
>>
>>     if ((compare == CPO.HIGHER) || (compare == CPO.INCOMPARABLE)) {
>>         throw new IllegalActionException(this,
>>                 "Run-time type checking failed. Token " + token
>>                         + " with type " + token.getType()
>>                         + " is incompatible with port type: "
>>                         + getType().toString());
>>     }
>> }
>>
>>  
>>
>> I suspect the “compare” method is called before the type is set (or 
>> changed) and the error message is generated after the type is set. 
>> Looking for another thread that could be accessing the TypedIOPort, I 
>> discovered the type checking functionality. Listening to the manager, 
>> it looks like the manager is "resolving types" in response to a 
>> “ChangeRequest”.
>>
>>  
>>
>> Two questions:
>>
>>  
>>
>> Should I use “change requests” in this way? None of the changes will 
>> cause any problems with types and so I could directly modify the 
>> workflow. (I initially used change requests as the main thread holds a 
>> read lock on the workspace when the "manager" fires the “director”)
>>
>>  
>>
>> And, it looks like TypedIOPort is not thread safe. Does this need to 
>> be fixed? Or should type resolving be completely blocked in a threaded 
>> environment? The documentation suggests it should be left to the 
>> director to decide “when it is safe to perform change requests”, but I 
>> cannot see a way of preventing an actor from doing a type check when 
>> sending tokens in a threaded environment.
>>
>>
>> Also, I don't think this is specific to my director. It looks like the 
>> same issue will arise under PN if a workflow modifies itself while 
>> running.
>>
>>  
>>
>> Thanks,
>>
>> Colin
>>
>>
>> [1] Abramson D., Enticott C., Altintas I., "Nimrod/K: Towards Massively 
>> Parallel Dynamic Grid Workflows", IEEE SuperComputing 2008, Austin, 
>> Texas, November 2008.
>>
>>
>> -- 
>> Colin Enticott, Research Scientist, Ph: +61 03 9903 2215
>> Room H7.26, Level 7, Building H, Monash University Caulfield 3145, Australia 
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Kepler-dev mailing list
>> Kepler-dev at kepler-project.org
>> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-dev
>>   