[kepler-dev] Re: Building a specialized composite actor (Ilkay Altintas)

Wed Oct 13 13:03:09 PDT 2004

The problem we (and other projects doing similar work) are grappling 
with involves long-running, highly parallel scientific jobs.

While we'd like to let scientific end-users control, monitor, and modify 
these jobs using a Kepler GUI, the jobs really need to be run through a 
specialized grid meta-clustering system (currently Nimrod or APST, which 
may call any of a number of sub-systems on specific clusters, such as 
NQS, PBS, SGE, Condor, &c).

Traditional Kepler actors are fine for submitting, controlling, and 
monitoring the jobs.

The problem is that the jobs themselves often comprise a workflow in 
and of themselves, often under the control (as in our case) of legacy 
software that re-arranges the tasks into a sub-workflow.

An end-user scientist can't control the order of execution of these 
tasks (due to the legacy software setup) but still can control whether a 
particular stage (or parallel execution path) executes or not.

This looks like a workflow and should be represented as such in the 
Kepler GUI as some sort of special composite actor to be submitted to 
the grid submission actor. (The actor might be more of an operator than 
a function). Scientists won't be allowed to rearrange the order of the 
pipes, or delete or add actors, but they could turn actors on or off.

This composite actor workflow then gets converted into an XML 
representation (somehow) that is understood by the grid submission 
engine actor, which then submits the XML workflow to APST or Nimrod.

Our simple immediate solution (between EOL and the Resurgence project) 
was to have Kepler actors that have an "on or off" parameter. If the 
actor is "on", it will copy its incoming XML and add an additional line to 
that XML to represent its own task. If the actor is "off", it will 
simply copy its input(s) to its output.
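
The "on or off" pass-through described above can be sketched in plain Java, 
without any Kepler or Ptolemy II classes. This is only an illustration under 
stated assumptions: the class names ToggleStage and ToggleChain, the stage 
names, and the <workflow>/<task> element names are all hypothetical, and the 
real XML format expected by APST or Nimrod is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for one "on or off" actor.
 * Not a Kepler or Ptolemy II class; the names are assumptions for illustration.
 */
final class ToggleStage {
    private final String taskName; // assumed to be XML-safe (no escaping is done)
    private boolean enabled;

    ToggleStage(String taskName, boolean enabled) {
        this.taskName = taskName;
        this.enabled = enabled;
    }

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    /**
     * Copies the incoming task lines. If the stage is "on", one extra line
     * describing this stage's own task is appended; if "off", the input is
     * passed through unchanged.
     */
    List<String> fire(List<String> incoming) {
        List<String> out = new ArrayList<>(incoming);
        if (enabled) {
            // The <task> element is an assumed format, not the APST/Nimrod schema.
            out.add("  <task name=\"" + taskName + "\"/>");
        }
        return out;
    }
}

/** Runs the stages in their fixed order and wraps the result in a root element. */
final class ToggleChain {
    static String render(List<ToggleStage> stages) {
        List<String> lines = new ArrayList<>();
        for (ToggleStage stage : stages) {
            lines = stage.fire(lines);
        }
        StringBuilder xml = new StringBuilder("<workflow>\n");
        for (String line : lines) {
            xml.append(line).append('\n');
        }
        return xml.append("</workflow>").toString();
    }
}
```

With three stages where only the second is "off", ToggleChain.render produces 
a description that lists the first and third tasks only; the order of the 
stages is fixed by the list and is not something the user changes.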

In this way, the output of the composite actor would be an XML 
representation describing the selected workflow to the grid submission 
actor, which will then submit it to the grid submission engine. 
Traditional Kepler actors (which might be added or reconfigured by the 
user) will then allow monitoring, controlling, re-starting, &c, of this job.

Since we want to present this to scientists (who would be allowed to 
reconfigure other parts of the workflow, but not the composite actor) we 
need to make this composite actor "write-protected." The scientist would 
be allowed to change parameters of actors in this special composite 
actor, but not delete any of the sub-actors, or change the way they are 
connected (since the underlying legacy software that will actually be 
executing this workflow does not support these other configurations or 
the XML files they might produce, which would break the workflow). Unlocking the 
actor would bring up a warning dialog that this is for experts and that 
any changes to the composite actor might break legacy software elsewhere 
in the system. This is a fairly user-friendly solution, in that it allows 
scientists to use the full power of Kepler for the monitoring parts of 
the system, but prevents them from (accidentally) modifying the special 
composite actor representing the workflow implemented in legacy software.
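
The write-protection rule described above (parameters may change, structure 
may not, unless an expert explicitly unlocks the actor) can be sketched as 
follows. This is a minimal model in plain Java and not a Kepler or Ptolemy II 
API: the class LockedComposite, its methods, and the expertConfirmed flag 
(standing in for the warning dialog) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical model of a write-protected composite actor.
 * Parameter changes are always allowed; structural changes need an unlock.
 */
final class LockedComposite {
    private final List<String> actorNames = new ArrayList<>();
    private final Map<String, Boolean> enabled = new LinkedHashMap<>();
    private boolean locked = true; // write-protected by default

    LockedComposite(List<String> names) {
        for (String name : names) {
            actorNames.add(name);
            enabled.put(name, true);
        }
    }

    /** Changing a sub-actor's "on or off" parameter is allowed even when locked. */
    void setEnabled(String name, boolean on) {
        if (!enabled.containsKey(name)) {
            throw new IllegalArgumentException("unknown actor: " + name);
        }
        enabled.put(name, on);
    }

    boolean isEnabled(String name) {
        return Boolean.TRUE.equals(enabled.get(name));
    }

    /**
     * Stands in for the expert warning dialog: the composite is unlocked
     * only if the caller reports that the user confirmed the warning.
     */
    void unlock(boolean expertConfirmed) {
        if (expertConfirmed) {
            locked = false;
        }
    }

    /** Deleting a sub-actor is a structural change and is refused while locked. */
    void removeActor(String name) {
        if (locked) {
            throw new IllegalStateException("composite is write-protected");
        }
        actorNames.remove(name);
        enabled.remove(name);
    }

    /** Read-only view, so callers cannot reorder or delete through the list. */
    List<String> actorNames() {
        return Collections.unmodifiableList(actorNames);
    }
}
```

In this sketch a call to removeActor on a freshly built composite throws, 
while setEnabled succeeds; after unlock(true) the same removeActor call goes 
through. A real implementation would hook the equivalent checks into the 
editor's delete and connect operations rather than into a standalone class.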

In the longer term, the semantics of the legacy software might be 
described to Kepler, so that Kepler could figure out which actors were 
write-protected and which weren't. Some sub-actors in this special 
composite actor might be deletable (i.e., their absence supported by 
legacy software) and others not.

For now, however, simply adding a dialog to write-protect aspects of the 
composite actor's workflow should be good enough.

We were wondering if anyone had suggestions on how best to do this.
--
Werner G. Krebs, Ph.D.
Technical Lead, Encyclopedia of Life Project (http://eol.sdsc.edu)
San Diego Supercomputer Center Dept 0505
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093-0505, USA
+1 858 822 3620

>From: "Ilkay Altintas" <altintas at sdsc.edu>
>To: "Kepler-Dev" <kepler-dev at ecoinformatics.org>,
>	<ptolemy-hackers at eecs.berkeley.edu>
>Date: Wed, 13 Oct 2004 09:41:35 -0700
>Subject: [kepler-dev] Building a specialized composite actor
>
>Hi,
>
>For our EOL (Encyclopedia of Life) project, Werner and I 
>have been thinking of building a specific type of composite 
>actor that will have some dummy actors. These dummy actors
>will each have a set of parameters and the values of these
>parameters will be reflected in the sub-workflow description.
>
>The (sub-)workflow in this special composite actor would be
>used in two ways:
>
>1. To convert it into a special workflow description and 
>send out the MoML-like description to be submitted to another
>workflow engine.
>
>2. To directly convert this sub-workflow into an XML description
>of a workflow and then submit it to another workflow engine.
>
>I was wondering if you have any suggestions on how this actor 
>can be built for each scenario? It will probably be similar 
>to RunCompositeActor in Ptolemy II. 
>
>Thanks in advance,
>Ilkay