[kepler-dev] Writing Files inside workflows
Dan Higgins
higgins at nceas.ucsb.edu
Thu May 26 10:18:18 PDT 2005
Ilkay,
Interestingly, the FileWrite actor extends 'LineWriter', which
extends 'Sink'. A sink is usually the final actor in a workflow, so it
could be argued that it should not have been given an output port (as it
was in FileWrite). Because nothing normally runs downstream of a sink, it
is OK for it to write the file in postfire and close it in wrapup rather
than doing all of the work in the fire method.
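
A rough sketch of why the timing matters when such a writer sits in the middle of a workflow. The classes below (SketchActor, DeferredCloseWriter, WriteInFireWriter) are hypothetical stand-ins written only for illustration; they are not the real FileWrite or TextFileWriter code, and the one-iteration lifecycle is a simplifying assumption. Buffering is only one possible mechanism behind the problems described in the quoted message.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-ins for the two writing styles; these are NOT the
// real org.geon.FileWrite or org.resurgence.actor.TextFileWriter classes.
class WriterLifecycleSketch {

    // Simplified actor lifecycle (assumption: one iteration = fire, then postfire).
    interface SketchActor {
        void fire() throws IOException;
        void postfire() throws IOException;
        void wrapup() throws IOException;
    }

    // Sink style: write in postfire(), close only in wrapup().
    static class DeferredCloseWriter implements SketchActor {
        private final Path file;
        private final String text;
        private BufferedWriter out;

        DeferredCloseWriter(Path file, String text) {
            this.file = file;
            this.text = text;
        }

        public void fire() { /* nothing happens in fire */ }

        public void postfire() throws IOException {
            if (out == null) {
                out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
            }
            out.write(text); // may stay in the buffer until close()
        }

        public void wrapup() throws IOException {
            if (out != null) {
                out.close(); // only now is the file guaranteed to be complete
                out = null;
            }
        }
    }

    // Mid-workflow style: open, write and close inside fire().
    static class WriteInFireWriter implements SketchActor {
        private final Path file;
        private final String text;

        WriteInFireWriter(Path file, String text) {
            this.file = file;
            this.text = text;
        }

        public void fire() throws IOException {
            try (BufferedWriter out =
                     Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                out.write(text);
            } // closed here, before any downstream actor can fire
        }

        public void postfire() { /* nothing left to do */ }

        public void wrapup() { /* nothing left to do */ }
    }

    // Runs one iteration and reports how many bytes a downstream reader
    // would find on disk before wrapup() has been called.
    static long bytesVisibleBeforeWrapup(SketchActor writer, Path file)
            throws IOException {
        writer.fire();
        writer.postfire();
        long visible = Files.size(file);
        writer.wrapup();
        return visible;
    }

    public static void main(String[] args) throws IOException {
        String text = "-119.84,34.41\n-120.66,35.28\n"; // made-up long,lat pairs
        Path deferred = Files.createTempFile("deferred", ".txt");
        Path inFire = Files.createTempFile("infire", ".txt");
        System.out.println("full length:    " + text.length());
        System.out.println("deferred close: "
            + bytesVisibleBeforeWrapup(new DeferredCloseWriter(deferred, text), deferred));
        System.out.println("write in fire:  "
            + bytesVisibleBeforeWrapup(new WriteInFireWriter(inFire, text), inFire));
        Files.delete(deferred);
        Files.delete(inFire);
    }
}
```

With the deferred style, a reader that runs before wrapup can find an incomplete file; with the write-in-fire style the file is already complete and closed when fire returns.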
Dan
Ilkay Altintas wrote:
> This is interesting. It is again related to the usability of the actors.
> Both actors write a string to file but there are places where
> FileWriter is useful and places where TextFileWriter is useful.
>
> We need to take these actors and try to parameterize where they are
> different and why. Then we can have one file writer actor that does
> both of what these actors do partially and we can know (and control)
> what the actor does by configuring it from the user interface.
>
> This would be the "Merge" step in the Kepler development principle,
> which can be summarized as:
> 1. Define your requirements;
> 2. Reuse existing development if possible;
> 3. Extend features if needed;
> 4. Add new components if they don’t exist;
> 5. Merge features if they can be generalized.
>
> This can be an evolving (spiral) method to develop more functional
> and generic actors.
>
> Just a thought...
> -ilkay
>
>
> On May 25, 2005, at 12:55 PM, Dan Higgins wrote:
>
>> Hi All,
>> There are currently several actors in Kepler for writing text
>> files:
>> e.g. 'org.geon.FileWrite' and 'org.resurgence.actor.TextFileWriter'.
>> These actors apparently do the same thing (write a string to a file),
>> but the details of how they do it can apparently make a big difference!
>> Let me explain.
>>
>> The GARP workflow "garpModel_ASC_withData.xml" is an example of
>> getting species locations from Digir and then running GARP. The species
>> locations (longitude, latitude pairs) are concatenated into a string
>> and then written to a file that is an input to the GarpPresampleLayers
>> actor. Originally, the workflow used a FileWriter actor to write the
>> input long,lat file. This seemed to work fine. Recently, however, I
>> tested this workflow again and had all sorts of problems (Kepler
>> crashes, strange error messages, etc.). In trying to figure out the
>> problem, I discovered that the Digir sources have apparently been updated
>> and are now returning more locations than before. The location file is
>> thus longer (and takes more time to write). To summarize a whole bunch
>> of debugging, the problem was apparently the FileWriter! FileWriter
>> writes the file in the postfire method and then doesn't close the file
>> until the wrapup method (in its parent LineWriter class). But I was
>> using it in the middle of a workflow; I really want it to completely
>> write and close the file before the next actor starts reading the file.
>> It thus looks like there are some threading/timing issues with using
>> FileWriter.
>>
>> Well, it turns out that TextFileWriter does all of its work writing
>> a file INSIDE the 'fire' method. If I replace the FileWriter with the
>> resurgence TextFileWriter, all the problems I was having disappear!
>>
>> So the lesson appears to be that internal details can be very
>> important!
>>
>> Dan
>>
>> -- *******************************************************************
>> Dan Higgins higgins at nceas.ucsb.edu
>> http://www.nceas.ucsb.edu/ Ph: 805-893-5127
>> National Center for Ecological Analysis and Synthesis (NCEAS)
>> Marine Science Building - Room 3405
>> Santa Barbara, CA 93195
>> *******************************************************************