[kepler-dev] a few questions

Tristan King tristan.king at jcu.edu.au
Mon Jan 30 22:56:32 PST 2006


I don't think I defined my requirements precisely enough.
Let's look at the use case I'm working on at the moment:

* Check a directory on the local filesystem every second
* When new files are detected:
	* Build a list of new and modified files
	* Print that list
* Go back to the start
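
The steps above can be sketched in plain Java, outside Kepler. This is only an illustration under stated assumptions: the names DirectoryScanner, PollLoopSketch, scan() and poll() are hypothetical, files are compared by last-modified time only, and deleted files are not tracked. Note that the first scan reports every existing file as new.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: remembers last-modified times between scans. */
class DirectoryScanner {
    private final File directory;
    // File name -> last-modified time recorded by the previous scan.
    private final Map<String, Long> lastSeen = new HashMap<>();

    DirectoryScanner(File directory) {
        this.directory = directory;
    }

    /** Returns the files that are new or modified since the previous call. */
    List<File> scan() {
        List<File> changed = new ArrayList<>();
        File[] entries = directory.listFiles();
        if (entries == null) {
            return changed; // not a directory, or it could not be read
        }
        for (File entry : entries) {
            if (!entry.isFile()) {
                continue; // ignore subdirectories in this sketch
            }
            long modified = entry.lastModified();
            Long previous = lastSeen.put(entry.getName(), modified);
            if (previous == null || previous != modified) {
                changed.add(entry);
            }
        }
        return changed;
    }
}

/** Hypothetical driver for the "check, print, go back to start" loop. */
class PollLoopSketch {
    /** Polls the given number of times, sleeping intervalMs between scans. */
    static void poll(DirectoryScanner scanner, int rounds, long intervalMs)
            throws InterruptedException {
        for (int i = 0; i < rounds; i++) {
            List<File> changed = scanner.scan();
            if (!changed.isEmpty()) {
                System.out.println(changed); // print the list
            }
            Thread.sleep(intervalMs); // wait, then go back to the start
        }
    }
}
```

Calling PollLoopSketch.poll(scanner, rounds, 1000) gives the one-second interval from the list; the loop is bounded here only so that the sketch terminates.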

I've built some test libraries to do this outside of Kepler, but
unfortunately they don't fit into Kepler (I describe how they work
below for those who are interested).

Following Ilkay's suggestion to use the PN Director to handle the loops,
I've put together the following test workflow:
[PN Director]
[DirectoryPoller]>------->[Display]

And I've figured out how I want the fire() method for my DirectoryPoller
actor to flow (pseudocode, of course):

fire() {
	if (directory.hasNewFiles()) {
		output.send(list_of_new_files);
	}
	sleep(1000); // sleep for 1000 ms
}

This all looks good, but I've got a few questions before I dive into
the code:

Does the PN director only fire off a new thread of the workflow once the
Display actor's postfire() method finishes?

Ilkay's reply to my last email said:
> You can put a sleep parameter to the actor if you don't want it to
> fire too soon.
How is this done? (If you can just point me towards an actor that already
uses this sleep parameter, I should be able to pull it apart and figure
out how it works.)

I also see a problem with this.
The directory class holds a list of all the files and their last-modified
times in memory. Since (please correct me if I'm wrong) the PN
director calls the constructors of the actors every time a new thread is
started, this list will be empty on every iteration, and thus every file
in the directory will be seen as a new file.

I could fix this by writing the data to a file and reading it in at the
start of the actor's fire() method, which would be the best solution if
fault tolerance becomes an issue, but for the current use case I'd
prefer to just keep the list in memory.
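
Whether the in-memory list survives depends on whether the same actor object is reused between iterations. The sketch below is plain Java, not the Kepler/Ptolemy API: PollingActorSketch and IterationDriverSketch are hypothetical stand-ins, and the driver assumes the actor is constructed once and then iterated repeatedly (an assumption about the director that should be checked against the PN documentation). Under that assumption, an instance field is enough to remember earlier files.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an actor; not the Kepler/Ptolemy API. */
class PollingActorSketch {
    // Instance state: survives across fire() calls while the same object is reused.
    private final Set<String> known = new HashSet<>();
    // Stands in for the output port: records every list that would be sent.
    final List<List<String>> sent = new ArrayList<>();

    /** One iteration: emit only the names not seen in earlier iterations. */
    void fire(List<String> currentListing) {
        List<String> fresh = new ArrayList<>();
        for (String name : currentListing) {
            if (known.add(name)) { // add() returns true only for unseen names
                fresh.add(name);
            }
        }
        if (!fresh.isEmpty()) {
            sent.add(fresh); // stands in for output.send(...)
        }
    }

    /** Returning true asks the driver for another iteration. */
    boolean postfire() {
        return true;
    }
}

/** Hypothetical driver: constructs the actor once, then iterates it. */
class IterationDriverSketch {
    static PollingActorSketch run(List<List<String>> listings) {
        PollingActorSketch actor = new PollingActorSketch(); // constructed once
        for (List<String> listing : listings) {
            actor.fire(listing);
            if (!actor.postfire()) {
                break; // the actor asked to stop
            }
        }
        return actor;
    }
}
```

If the constructor really were called again for every iteration, the known set would start empty each time and this approach would not help; persisting the list to a file would then be the safer route.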

To keep the list in memory, it would make more sense for the fire()
method to be something like:

fire() {
	while(true) {
		if (directory.hasNewFiles()) {
			output.send(list_of_new_files);
		}
		sleep(1000); // sleep for 1000 ms
	}
}

But there are (well, I think there are) problems with this as well.
Since the fire() method will never complete, the Display actor's
postfire() method will never run (I assume, from trial and error).

So... I'm kinda stuck at the drawing board at the moment. Does anyone
have any thoughts on how I can solve these issues?

For now, I think I'll work on storing the directory information in a file
(which will still involve figuring out how to make the fire() method
sleep).

Thanks for any help you can give :)
--Tristan

P.S. Here's how my other poller worked, and where I failed in getting it
to work in Kepler:

It comprises a few classes:

PollerManager: manages creation, starting and stopping of
DirectoryPoller instances and adding and removing listeners of
DirectoryPoller events.

DirectoryPoller: extends TimerTask. Its run() method simply checks for new
files in the specified directory and, when new files are discovered, sends
events (containing a list of all the newly discovered files) to all the
listeners.

Event listeners are just classes that implement my PollerEventListener
interface (i.e. a method called eventReceived()).
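
A rough sketch of how the poller and listener classes described above might fit together is given here. The class and method names DirectoryPoller, PollerEventListener and eventReceived() come from the description; everything else (the eventReceived() signature, addListener(), tracking files by name only) is an assumption made for illustration, and PollerManager is left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TimerTask;
import java.util.concurrent.CopyOnWriteArrayList;

/** Assumed shape of the listener interface; the real signature is not given. */
interface PollerEventListener {
    void eventReceived(List<File> newFiles);
}

/** Timer-driven poller that notifies listeners about newly seen files. */
class DirectoryPoller extends TimerTask {
    private final File directory;
    // Names already reported, so each file triggers only one event.
    private final Set<String> known = new HashSet<>();
    private final List<PollerEventListener> listeners = new CopyOnWriteArrayList<>();

    DirectoryPoller(File directory) {
        this.directory = directory;
    }

    void addListener(PollerEventListener listener) {
        listeners.add(listener);
    }

    @Override
    public void run() {
        File[] entries = directory.listFiles();
        if (entries == null) {
            return; // directory missing or unreadable
        }
        List<File> fresh = new ArrayList<>();
        for (File entry : entries) {
            if (known.add(entry.getName())) { // true only the first time a name is seen
                fresh.add(entry);
            }
        }
        if (!fresh.isEmpty()) {
            for (PollerEventListener listener : listeners) {
                listener.eventReceived(fresh);
            }
        }
    }
}
```

Scheduling the task with java.util.Timer, for example new Timer(true).schedule(poller, 0, 1000), would run the check once per second on the timer's own thread.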

So what I tried to do was this:
* Make an actor that implements PollerEventListener.
* Put all the PollerManager stuff in the fire() method.
* Send data to the output port from the eventReceived() method.

Of course, this didn't work because of how the actors' fire() and
postfire() methods are executed, which I only discovered after trying to
get this working, but that's how you learn :).
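
One possible way around this mismatch, offered only as a sketch, is to have the listener callback do nothing but enqueue the event, and let fire() take events from the queue on the actor's own thread. QueueBridgeSketch is a hypothetical plain-Java class, not a Kepler actor, and whether blocking inside fire() is acceptable under a given director is an assumption that would need checking.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical hand-off between a poller thread and an actor-style fire(). */
class QueueBridgeSketch {
    // Events wait here until the actor's thread is ready to send them.
    private final BlockingQueue<List<String>> pending = new LinkedBlockingQueue<>();

    /** Called on the poller's thread: only records the event. */
    void eventReceived(List<String> newFiles) {
        pending.add(newFiles);
    }

    /** Called on the actor's thread: blocks until an event is available. */
    List<String> fire() throws InterruptedException {
        return pending.take(); // the caller would pass this to output.send(...)
    }
}
```

With this split, the output port is only ever touched from fire(), and each call to fire() returns after handing over one list of files.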

If you have any thoughts or questions about anything I did, feel free to
let me know :)

On Sun, 2006-01-29 at 20:53 -0800, Ilkay Altintas wrote:
> Hi Tristan,
> 
> On Jan 29, 2006, at 8:41 PM, Tristan King wrote:
> 
> > Hi everyone,
> >
> > Just a few questions.
> >
> > I'm looking to do the following:
> > 	
> > 	for ever {
> > 		Poll a Directory for new files
> > 		when new file is located {
> > 			Run workflow with new files as input
> > 		}
> > 	}
> >
> > What i need to know is:
> >
> > 1. Is this best performed using a director to perform the loops? Or is
> 
> PN would be the best director to perform this. You can have the  
> directory listing actor to perform infinitely (or until a stop  
> condition occurs), and that actor will keep on producing inputs for the  
> rest of the workflow.
> 
> > it possible to use some sort of flow control actors to direct the flow
> > back to the directory poller? If a director should be used, which one
> > and how should it be configured?
> 
> You don't need a backwards loop. The directory listing actor's output
> will queue until the next actor consumes it. You can put a sleep
> parameter to the actor if you don't want it to fire too soon.
> 
> You don't have to do any configuration in the PN actor. Just check that  
> the postfire of the directory listing actor returns true all the time  
> for running it forever (or  until a stop condition to stop it after a  
> while).
> 
> 
> > note that (as implied in my pseudocode) the loop may want to continue
> > forever.
> >
> > 2. Is it possible to run particular sections in separate threads? i.e.
> > new file is found -> make thread to process new file -> main thread
> > goes back to polling for new files. Are there any actors that do this
> > already, or do you think it's possible to write one?
> 
> PN director will automatically create different threads for each actor.  
> You don't have to do it.
> 
> > 3. Can workflows be run as a non-gui service? i.e. like a web server.
> 
> Yes.  There is a command line version.
> 
> Also, Efrat has created services to run the GEON workflows as web  
> services using a wrapper similar to the command line version.
> 
> Cheers,
> -ilkay
> 
> > This could be a simple dirty solution to problem 2. i.e. just have the
> > directory polling workflow run a command line version of kepler
> > starting up a workflow to process the new files.
> >
> > any other thoughts and ideas you might have for my situation could be
> > helpful too :)
> >
> > thanks
> > --Tristan
> >
> > --  
> > Tristan King                            | Ph: (07) 4781 6911
> > Information Technology and Resources    | Email: Tristan.King at jcu.edu.au
> > James Cook University                   |
> > Townsville QLD 4814                     |
> > Australia                               |
> >
> 
-- 
Tristan King                            | Ph: (07) 4781 6911
DART project team                       | Email: Tristan.King at jcu.edu.au
James Cook University                   | Web: http://dart.edu.au
Townsville QLD 4814                     | http://plone.jcu.edu.au/dart/
Australia                               |
