[kepler-dev] Parallel execution of a task with MultiInstanceComposite

Thu Feb 8 15:17:11 PST 2007

Hi,

My recent Kepler workflows (PN models) need to run a slow task (within 
a pipeline) in several parallel instances to speed up the whole workflow. 
To illustrate this, I have created an example Ptolemy model that 
demonstrates my problem with the MultiInstanceComposite.

So the basic workflow is:

   Ramp -- T1 -- T2

T1 and T2 are the same composite; each prints its id, the incoming 
input and the wall-clock time to stdout. They are sequential tasks (SDF 
composites). T2 runs 10 times slower than T1.

T2 can be parallelized with the Distributor / MultiInstanceComposite / 
Commutator combo, so that the parallelization factor can be given as a 
parameter to the workflow (see par2.xml):

    Ramp -- T1 -- Distributor -- MultiInstanceComposite -- Commutator

    where MultiInstanceComposite contains T2 (only).

I would expect that, with N=5, the T2 instances start processing the 
first 5 outputs of T1 as soon as they become available. However, the 
timing of the actions looks like this:

 	task1   item=1 time=0.113
 	task1   item=2 time=0.217
 	task2_1 item=1 time=1.146
 	task2_2 item=2 time=1.246
 	task1   item=3 time=1.33
 	task1   item=4 time=1.434
 	task1   item=5 time=1.538
 	task1   item=6 time=1.642
 	task1   item=7 time=1.746
 	task1   item=8 time=1.851
 	task1   item=9 time=1.955
 	task1   item=10 time=2.059
 	task2_3 item=3 time=2.359
 	task2_4 item=4 time=2.452
 	task2_5 item=5 time=2.568
 	task2_1 item=6 time=2.671
 	task2_2 item=7 time=2.775
 	task2_3 item=8 time=3.368
 	task2_4 item=9 time=3.461
 	task2_5 item=10 time=3.573

That is, T2_[3,4,5] start only after T2_2 finishes with its first token.
(A second run is even worse; try it yourself with par2.xml.)
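
For comparison, the timing one would expect can be sketched in a few lines 
of Python. This is only a back-of-the-envelope model, not Ptolemy code: the 
function name ideal_schedule and the durations t1=0.1 s and t2=1.0 s are 
assumptions read off the log above, the queues are treated as unbounded, 
and the Distributor is assumed to hand out tokens round-robin.

```python
def ideal_schedule(num_items, n_workers, t1=0.1, t2=1.0):
    """Return (item, worker, finish_time) tuples for an idealized pipeline.

    Assumptions (not taken from Ptolemy): T1 needs t1 seconds per item,
    each T2 instance needs t2 seconds per item, queues are unbounded and
    tokens are distributed round-robin.
    """
    worker_free = [0.0] * n_workers      # time at which each T2 instance is idle
    schedule = []
    for i in range(num_items):
        produced = (i + 1) * t1          # T1 emits item i+1 at this time
        w = i % n_workers                # round-robin choice of T2 instance
        start = max(produced, worker_free[w])
        finish = start + t2
        worker_free[w] = finish
        schedule.append((i + 1, w + 1, round(finish, 3)))
    return schedule


schedule = ideal_schedule(10, 5)
for item, worker, finish in schedule:
    print(f"task2_{worker} item={item} time={finish}")
```

Under these assumptions the five T2 instances finish their first tokens 
between t=1.1 and t=1.5, and all ten items are done by about t=2.5, 
compared with 3.573 in the run above.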

If I restrict the PN queue length to a maximum of 1, it becomes even worse:
  T2_2 starts only after T2_1 finishes with its first token, and T2_[3,4,5] 
start only after T2_2 finishes with its first token.

 	task1   item=1 time=0.108
 	task2_1 item=1 time=1.12
 	task1   item=2 time=1.246
 	task2_2 item=2 time=2.252
 	task1   item=3 time=2.359
 	task1   item=4 time=2.463
 	task1   item=5 time=2.567
 	task1   item=6 time=2.671
 	task1   item=7 time=2.775
 	task1   item=8 time=2.879
 	task1   item=9 time=2.983
 	task1   item=10 time=3.087
 	task2_3 item=3 time=3.365
 	task2_4 item=4 time=3.47
 	task2_5 item=5 time=3.574
 	task2_1 item=6 time=3.676
 	task2_2 item=7 time=3.781
 	task2_3 item=8 time=4.369
 	task2_4 item=9 time=4.474
 	task2_5 item=10 time=4.579

Note: I have also seen another pattern, but it was not the one I expect 
either.
Note: I need to restrict the PN queue to 1 to hold back T1 as much as 
possible, because it is a very expensive operation in terms of disk space 
(42GB/item).
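
To make the expectation for the bounded case concrete, here is a small 
sketch using plain Python threads and queue.Queue(maxsize=1). It is not a 
model of the PN director or of how MultiInstanceComposite is implemented; 
the function run_pipeline, the round-robin distributor and the scaled-down 
durations (t1=0.01 s, t2=0.3 s) are all assumptions made for illustration. 
It only shows that blocking writes on capacity-1 queues do not by 
themselves prevent five consumers from starting almost at once.

```python
import queue
import threading
import time


def run_pipeline(num_items=10, n_workers=5, t1=0.01, t2=0.3, capacity=1):
    """Run T1 -> distributor -> n_workers x T2 over bounded queues.

    Returns a sorted list of (item, worker, start_time) records, one per
    T2 firing. All names and durations are illustrative assumptions.
    """
    t0 = time.monotonic()
    t1_out = queue.Queue(maxsize=capacity)
    worker_in = [queue.Queue(maxsize=capacity) for _ in range(n_workers)]
    starts = []
    lock = threading.Lock()

    def task1():
        for item in range(1, num_items + 1):
            time.sleep(t1)                 # stand-in for the T1 work
            t1_out.put(item)               # blocks while the queue is full
        t1_out.put(None)                   # end-of-stream marker

    def distributor():
        i = 0
        while True:
            item = t1_out.get()
            if item is None:
                for q in worker_in:        # tell every T2 instance to stop
                    q.put(None)
                return
            worker_in[i % n_workers].put(item)   # round-robin, blocking write
            i += 1

    def task2(worker):
        while True:
            item = worker_in[worker].get()
            if item is None:
                return
            with lock:
                starts.append((item, worker + 1, time.monotonic() - t0))
            time.sleep(t2)                 # stand-in for the slow T2 work

    threads = [threading.Thread(target=task1),
               threading.Thread(target=distributor)]
    threads += [threading.Thread(target=task2, args=(w,))
                for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(starts)


records = run_pipeline()
for item, worker, start in records:
    print(f"task2_{worker} item={item} start={start:.3f}")
```

In this sketch each T2 instance picks up its first token roughly as soon 
as T1 has produced it, while T1 is still held back once the queues 
downstream are full.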

So the question is: how can I make the above model start processing 5 
items (nearly) at once?

I have included par1.xml, a parallelization with a fixed number of 
instances using Distributor/Commutator, which behaves as one would expect. 
This suggests that it is the MultiInstanceComposite that starts up 
strangely.

Thanks in advance for ideas,
Norbert

      Norbert Podhorszki
    ------------------------------------
      University of California, Davis
      Department of Computer Science
      1 Shields Ave, 2236 Kemper Hall
      Davis, CA 95616
      (530) 752-5076
      pnorbert at cs.ucdavis.edu
      ----------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: par1.xml
Type: text/xml
Size: 27019 bytes
Desc: 
URL: <http://mercury.nceas.ucsb.edu/kepler/pipermail/kepler-dev/attachments/20070208/64ef0184/attachment.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: par2.xml
Type: text/xml
Size: 30797 bytes
Desc: 
URL: <http://mercury.nceas.ucsb.edu/kepler/pipermail/kepler-dev/attachments/20070208/64ef0184/attachment-0001.xml>