[kepler-dev] Kepler Remote Execution Engine

Paul Allen pea1 at cornell.edu
Thu Jun 19 05:53:07 PDT 2008


I second the notion that getting Tristan's draft onto a wiki that we can
all edit is critical. I recall Matt saying that we should all be able to
edit the Kepler-Project.org pages, but I don't recall the details. I've
tried logging in with my CVS username and password, but that doesn't seem
to get me very far (logged in but not authorized --
_uid=allen,o=unaffiliated_).

I'm going to hold back on comments until we can collaborate via wiki.

-Paul

Christopher Brooks wrote:
> Hi Tristan,
>
> This looks pretty interesting.  Various Ptolemy sponsors have an
> interest in an execution engine as well.
>
> One issue is that it seems like this sort of thing has come up
> before.  I have no experience with grid computing, but it
> seems like an analysis should include comparisons with existing grid
> computing approaches and identify what a Kepler Remote Execution Engine
> (KREE) would need beyond them.
>
> Also, things like BOINC (http://boinc.berkeley.edu/) have
> always seemed of interest.  I'd love to see an interface to BOINC.
>
>
> A project needs a name and I like:
>   Kepler Remote Execution Engine (KREE)
> or
>   Ptolemy Remote Execution Engine (PREE)
>
>
> As a joke:
>   Kepler Remote Execution Engine for Ptolemy (KREEP)
>
> I'm fine with KREE.
>
> For the record, I've included a copy of your page.  Getting this
> on a more accessible wiki somewhere would be good.
>
> _Christopher
>
>
>
> INTRODUCTION:
> One of the high-priority things I want for Hydrant is to separate the
> execution of workflows from Hydrant itself. This has a number of advantages:
> * Web site performance doesn't suffer when computation-heavy workflows are
>   executed.
> * An execution server could be used by other applications.
> * Each institution could deploy its own execution server, allowing users to run
>   workflows on their own institution's servers even when starting the job from
>   Hydrant.
> Below is a list of requirements for the different concepts I see in such a
> system, followed by a list of work already done.
>
> BACKEND-FRONTEND INTERACTION
> * Frontend sends a URI for the workflow it wants to run; the Backend returns the
>   ID of the newly created job, or throws an error if there is a problem.
> * Frontend can poll the server to get various information.
>   * Job status
>   * Results
>   * Messages (any miscellaneous info)
> * Frontend can implement an interface that will allow it to be sent updates
>   from the server as opposed to polling for them.
> * Frontends can tell the server to delete a job. This should be done once the
>   frontend has retrieved all the results.
> * Backend only allows trusted frontends to connect.
>   * Uses certificates.
> ? How to handle file inputs?
>   ? dictionary of inputs and URIs to access those files.
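>
> A minimal sketch of what such a frontend-facing API could look like in Java.
> All names here (BackendService, JobStatus, JobUpdateListener, BackendException)
> are illustrative only, not existing Kepler or Hydrant classes:
>
>     import java.net.URI;
>     import java.util.List;
>     import java.util.Map;
>
>     /** Job states; see the JOBS section below. */
>     enum JobStatus { NEW, QUEUED, RUNNING, WAITING, FAILED, FINISHED }
>
>     public interface BackendService {
>         /** Submit a workflow by URI; returns the ID of the newly created job. */
>         long queueNewJob(URI workflowUri) throws BackendException;
>
>         /** Override an actor property by path while the job is NEW (see JOBS). */
>         void setProperty(long jobId, String propertyPath, String value)
>                 throws BackendException;
>
>         /** Move a NEW job into the queue. */
>         void queue(long jobId) throws BackendException;
>
>         /** Polling-style accessors. */
>         JobStatus getStatus(long jobId) throws BackendException;
>         Map<String, URI> getResults(long jobId) throws BackendException;
>         List<String> getMessages(long jobId) throws BackendException;
>
>         /** Push-style alternative to polling. */
>         void addListener(long jobId, JobUpdateListener listener)
>                 throws BackendException;
>
>         /** Delete a job once the frontend has retrieved all of its results. */
>         void deleteJob(long jobId) throws BackendException;
>     }
>
>     interface JobUpdateListener {
>         void statusChanged(long jobId, JobStatus newStatus);
>         void messagePosted(long jobId, String message);
>     }
>
>     class BackendException extends Exception {
>         BackendException(String message) { super(message); }
>     }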
>
> JOBS
> * Each job is linked to the frontend which started it. 
> * Only the frontend which started the job can access it.
> * Job lifecycle:
>   * Starts off in the 'NEW' state.
>   * Frontend then posts a number of variables to change for the job.
>     * key:'.workflow.path.to.actor.property', value:'value for property'
>     * File inputs need to be passed in as URIs to a remote resource that the
>       backend can access.
>   * Goes to 'QUEUED' when the frontend tells it to.
>   * When it can be started, goes into 'RUNNING' state.
>   * If it requires user input, goes into 'WAITING' state.
>   * If an error occurs during execution, goes into 'FAILED' state.
>   * When complete, goes into 'FINISHED' state.
> * A list of messages is kept for miscellaneous notifications.
> * Job status and results are stored in a database.
> * Jobs can be stopped and deleted at the client's request.
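>
> A hypothetical usage sketch of the lifecycle above, written against the
> BackendService interface sketched earlier (the property path, file URI and
> method names are illustrative):
>
>     import java.net.URI;
>
>     class SubmitExample {
>         static void submitAndQueue(BackendService backend) throws Exception {
>             // Job starts in the NEW state.
>             long jobId = backend.queueNewJob(
>                     new URI("http://www.workflowrepo.org/workflow.xml"));
>             // Frontend posts property overrides while the job is NEW.
>             backend.setProperty(jobId, ".workflow.path.to.actor.property",
>                     "value for property");
>             // File inputs are passed as URIs to resources the backend can access.
>             backend.setProperty(jobId, ".workflow.someReader.fileOrURL",
>                     "http://example.org/input.csv");
>             // NEW -> QUEUED; the engine later moves the job through RUNNING
>             // (and possibly WAITING) to FINISHED or FAILED.
>             backend.queue(jobId);
>         }
>     }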
>
> WORKFLOWS
> * Backend stores a list of restricted Actors, and removes or replaces them in a
>   workflow before it is run.
>   * Changes like this should be listed in the Job's messages.
> * Backend stores a list of the third-party software that it supports (e.g. R,
>   Matlab).
> * Workflows are cached on the backend. If the same URI is passed to the backend
>   again, the backend will use the protocol's equivalent of HTTP's If-Modified-Since
>   header to detect whether the workflow has changed since it was last downloaded;
>   if not, it reuses the last downloaded version. If the protocol doesn't support
>   such a check, it will always download a new copy of the workflow.
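>
> For http/https URIs, the conditional download could be done roughly like this
> (a sketch only; error handling and non-HTTP protocols are omitted, and the
> class and method names are illustrative):
>
>     import java.io.InputStream;
>     import java.net.HttpURLConnection;
>     import java.net.URL;
>
>     class WorkflowFetcher {
>         /** Returns a fresh copy of the workflow, or null if the copy
>          *  downloaded at lastDownloadedMillis is still current. */
>         InputStream fetchIfModified(URL workflowUrl, long lastDownloadedMillis)
>                 throws java.io.IOException {
>             HttpURLConnection conn =
>                     (HttpURLConnection) workflowUrl.openConnection();
>             conn.setIfModifiedSince(lastDownloadedMillis);
>             if (conn.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
>                 return null;  // reuse the previously downloaded workflow
>             }
>             return conn.getInputStream();  // changed, or protocol ignores the header
>         }
>     }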
>
> KORE/CORE CHANGES REQUIRED:
> * Easier way to manage output from Actors.
>
> WORK ALREADY DONE:
> Christopher Tuot's group has built the beginnings of a Backend which loads a MoML
> passed to it in a POST. It supports multiple 'Frontends' which poll the server
> to get results. Jobs and Workflows are currently only stored in memory and are
> lost when the server restarts.
>
> Jianwu Wang has written a Backend which can load a workflow from a URL.
>
> OTHER DISCUSSION TOPICS:
> * Is it worth exploring an OSGi architecture?
>
>
>
>
>
>
>
> ============== WHITEBOARD ==============
>
> ** Social Networking **
> Current:
>  Hydrant
>  MyExperiment
>
> Features:
> * Visualisation of Workflows
>  - Simple view, with no Flash or other requirements beyond a JavaScript-enabled
>    web browser.
> * Sharing Workflows
> * Start Jobs 
>  - hooks to Execution server
> * Edit Workflows 
>  - hooks to building server
> * Discussion of Workflows/Jobs
> * Tags/Ratings on Workflows/Jobs
>
> Requirements:
> ...
>
> Possible Technologies:
> * Google Friend Connect
> * some form of CMS ?
>
> ** Building **
> Current:
>  KFlex
>
> ** Execution **
> Current:
>  Hydrant
>
> Description: A server for executing workflows, modelled as separate Frontend and
> Backend components.
>
> Features:
> * Multiple frontends.
> * Administration page
>  - Handle loading of jars
>  - Handle setup of frontends
>
>
> Requirements:
> * A standardised API for the backend.
> * Frontends must implement a frontend API.
> * Frontend-Backend Security
>  - Backend should only accept requests from known frontends.
>    Possible solutions: IP security (not ideal)
>                        Certificate based security.
>                        (SecurityPlugin)
> * User segregation
>  - If a user starts a job, only that user should be able to see/control it.
>    Possible solution: only allow servers with trusted certificates to
>                       access the execution engine, and let them handle
>                       user access control.
> * Execution Plugin
>  - An interface that handles execution of workflows. This should be built
>    so that a simple one-server execution model can be used to start with,
>    but a distributed execution model could easily be implemented later on.
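>
>    A sketch of what such a plugin interface might look like (names are
>    illustrative; a single-server implementation would run Kepler locally,
>    while a distributed one could hand the job off to a cluster or grid):
>
>        public interface ExecutionPlugin {
>            /** Start executing the workflow associated with the given job. */
>            void execute(long jobId, java.net.URI workflowUri) throws Exception;
>
>            /** Stop a running job at the client's request. */
>            void stop(long jobId) throws Exception;
>        }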
>
> Possible Technologies:
> * Google Web Toolkit for Admin page
>
>
>
> Backend Communication <--a--> Frontend Communication
>           ^
>           |
>           b
>           |
>           v
>     Backend Engine --c--> Database
>                    --d--> Kepler
>
> == Database ==
> Job: 
>      auto id;
>      string workflow_file;
>      string status;
> MonitorRegister:
>         auto id;
>         job job;
>         string host;
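>
> One possible mapping of this schema, assuming a JPA-style ORM (field and class
> names simply mirror the sketch above; this is illustrative, not an existing
> implementation):
>
>     import javax.persistence.*;
>
>     @Entity
>     class Job {
>         @Id @GeneratedValue Long id;
>         String workflowFile;   // workflow_file
>         String status;
>     }
>
>     @Entity
>     class MonitorRegister {
>         @Id @GeneratedValue Long id;
>         @ManyToOne Job job;
>         String host;
>     }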
>
>
> == Technical Use cases ==
>
> Queue new Job:
>       backend/communication/Backend.java:queueNewJob("http://www.workflowrepo.org/workflow.xml")
>       backend/communication/Backend.java:queueNewJob("http://www.workflowrepo.org/workflow.xml", false)
>       backend/engine/Engine.java:queueNewJob("http://www.workflowrepo.org/workflow.xml", false)
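>
> Read as a call chain, that use case suggests the frontend-facing Backend class
> delegating to the Engine roughly as follows (a sketch; the meaning of the
> boolean flag is not specified above, so it is just passed through here):
>
>     // backend/communication/Backend.java (sketch)
>     public class Backend {
>         private final Engine engine = new Engine();
>
>         public long queueNewJob(String workflowUri) {
>             return queueNewJob(workflowUri, false);       // default for the flag
>         }
>
>         public long queueNewJob(String workflowUri, boolean flag) {
>             return engine.queueNewJob(workflowUri, flag); // backend/engine/Engine.java
>         }
>     }
>
>     // backend/engine/Engine.java (stub)
>     class Engine {
>         long queueNewJob(String workflowUri, boolean flag) {
>             // create the job, fetch the workflow, put it in the queue ...
>             return -1L;  // would return the new job's ID
>         }
>     }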
>
> --end--
>
> --------
>
>     
>     Hi all,
>     
>     I've been pondering over the requirements for removing the execution side
>     of things from the rest of Hydrant. I was thinking of putting this on the
>     kepler-project wiki, but I couldn't create any new pages (I logged in with
>     the username I use for CVS and the unaffiliated institution). Can anyone
>     help me out here? For now I've just put it in my git repository, which can
>     be accessed here:
>     http://www.hpc.jcu.edu.au/git/?p=jc124742/documents.git;a=blob_plain;f=hydrant-requirements.txt;hb=da7c5bd888faa4591e84509c53ade2291df11db5
>     I know there are a lot of other people wanting similar things to this, so
>     let's start a discussion and get something that we can put into the Kepler
>     Core. If you have any ideas, amendments, criticisms or anything else we can
>     discuss, fire away!
>     
>     Cheers,
>     -Tristan
>     
>     -- 
>     Tristan King
>     Research Officer,
>     eResearch Centre
>     James Cook University, Townsville Qld 4811
>     Australia
>     
>     Phone: +61747816902
>     E-mail: tristan.king at jcu.edu.au www: http://eresearch.jcu.edu.au
> _______________________________________________
> Kepler-dev mailing list
> Kepler-dev at ecoinformatics.org
> http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/kepler-dev
>
>   