[kepler-dev] Writing Files inside workflows

Dan Higgins higgins at nceas.ucsb.edu
Thu May 26 10:18:18 PDT 2005


Ilkay,
    Interestingly, the FileWrite actor extends 'LineWriter', which 
extends 'Sink'. A 'sink' is usually the final actor in a workflow, so it 
might be argued that it should not be given an output port (as it was in 
FileWrite). The fact that it is a 'sink' is why it is OK to write the 
file in the postfire and wrapup methods rather than in the fire method.
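
The timing difference described in this thread can be sketched in plain
Java, without Kepler or Ptolemy on the classpath. Everything below is a
hypothetical stand-in: the class names DeferredCloseWriter and
WriteInFireWriter, the run() helper, and the sample coordinate are made up
for illustration, and only the method names fire, postfire and wrapup come
from the discussion. It is a rough sketch of why a reader placed right
after a writer can see an incomplete file, not the actual actor code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteTimingSketch {

    /** Hypothetical sink-style writer: writes in postfire(), closes only in wrapup(). */
    static class DeferredCloseWriter {
        private final BufferedWriter out;

        DeferredCloseWriter(Path file) throws IOException {
            out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        }

        void postfire(String token) throws IOException {
            out.write(token); // a short string stays in the writer's buffer
        }

        void wrapup() throws IOException {
            out.close(); // closing flushes the buffer to the file
        }
    }

    /** Hypothetical mid-workflow writer: opens, writes and closes inside fire(). */
    static class WriteInFireWriter {
        private final Path file;

        WriteInFireWriter(Path file) {
            this.file = file;
        }

        void fire(String token) throws IOException {
            try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                out.write(token);
            } // try-with-resources closes the file before fire() returns
        }
    }

    /** Reads the whole file, playing the role of the next actor in the workflow. */
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    /** Returns what a downstream reader sees: before wrapup, after wrapup, after fire. */
    static String[] run() throws IOException {
        String data = "-119.7,34.4\n"; // made-up longitude,latitude pair
        Path deferredFile = Files.createTempFile("deferred", ".txt");
        Path fireFile = Files.createTempFile("infire", ".txt");
        try {
            DeferredCloseWriter deferred = new DeferredCloseWriter(deferredFile);
            deferred.postfire(data);
            String beforeWrapup = read(deferredFile); // file is still open here
            deferred.wrapup();
            String afterWrapup = read(deferredFile);

            new WriteInFireWriter(fireFile).fire(data);
            String afterFire = read(fireFile);

            return new String[] {beforeWrapup, afterWrapup, afterFire};
        } finally {
            Files.deleteIfExists(deferredFile);
            Files.deleteIfExists(fireFile);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] seen = run();
        System.out.println("before wrapup: " + seen[0].length() + " chars");
        System.out.println("after wrapup:  " + seen[1].length() + " chars");
        System.out.println("after fire:    " + seen[2].length() + " chars");
    }
}
```

With a short string like this one, the read taken before wrapup() comes
back empty because the data is still sitting in the writer's buffer. With
a longer string the reader would more likely see a truncated file, which
seems consistent with the problems appearing only once the location list
grew.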

Dan

Ilkay Altintas wrote:

> This is interesting. It is again related to the usability of the actors.
> Both actors write a string to a file, but there are places where
> FileWriter is useful and places where TextFileWriter is useful.
>
> We need to take these actors and try to parameterize where they are  
> different and why. Then we can have one file writer actor that does  
> both of what these actors do partially and we can know (and control)  
> what the actor does by configuring it from the user interface.
>
> This would be the "Merge" step in the Kepler development principle,
> which can be summarized as:
>     1. Define your requirements;
>     2. Reuse existing development if possible;
>     3. Extend features if needed;
>     4. Add new components if they don’t exist;
>     5. Merge features if they can be generalized.
>
> This can be an evolving (spiral) method to develop more functional 
> and  generic actors.
>
> Just a thought...
> -ilkay
>
>
> On May 25, 2005, at 12:55 PM, Dan Higgins wrote:
>
>> Hi All,
>>     There are currently several actors in Kepler for writing text files,
>> e.g. 'org.geon.FileWrite' and 'org.resurgence.actor.TextFileWriter'.
>> These actors apparently do the same thing (write a string to a file), but
>> the details of how they do it can apparently make a big difference! Let
>> me explain.
>>
>>     The GARP workflow "garpModel_ASC_withData.xml" is an example of
>> getting species locations from Digir and then running GARP. The species
>> locations (longitude, latitude pairs) are concatenated into a string and
>> then written to a file that is an input to the GarpPresampleLayers
>> actor. Originally, the workflow used a FileWriter actor to write the
>> input long,lat file. This seemed to work fine. Recently, however, I
>> tested this workflow again and had all sorts of problems (Kepler
>> crashes, strange error messages, etc.). In trying to figure out the
>> problem, I discovered that the Digir sources have apparently been updated
>> and are now returning more locations than before. The location file is
>> thus longer (and takes more time to write). To summarize a whole bunch
>> of debugging, the problem was apparently the FileWriter! FileWriter
>> writes the file in the postfire method and then doesn't close the file
>> until the wrapup method (in its parent LineWriter class). But I was
>> using it in the middle of a workflow; I really want it to completely
>> write and close the file before the next actor starts reading the file.
>> It thus looks like there are some threading/timing issues with using
>> FileWriter.
>>
>>     Well, it turns out that TextFileWriter does all of its work writing
>> a file INSIDE the 'fire' method. If I replace the FileWriter with the
>> resurgence TextFileWriter, all the problems I was having disappear!
>>
>>     So the lesson appears to be that internal details can be very
>> important!
>>
>> Dan
>>
>> --  *******************************************************************
>> Dan Higgins                                  higgins at nceas.ucsb.edu
>> http://www.nceas.ucsb.edu/    Ph: 805-893-5127
>> National Center for Ecological Analysis and Synthesis (NCEAS)
>> Marine Science Building - Room 3405
>> Santa Barbara, CA 93195
>> *******************************************************************
>>
>> _______________________________________________
>> Kepler-dev mailing list
>> Kepler-dev at ecoinformatics.org
>> http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/kepler-dev
>


