[kepler-users] Help understanding Kepler paradigm for batch processing images

Jeremy Douglass jeremydouglass at gmail.com
Mon Jul 13 10:49:50 PDT 2009


Ustun and Bertram --

Thanks for your help. I think I've gotten simple batch image
processing working, and I'm attaching the workflow for reference. One
way to send a list of file names to an actor like Image Converter
is:

(DDF Director)
 Directory Listing -->
     \--> Ramp ------> Array Element --> Image Converter

This works as long as I manually set the Ramp firing limit to the
number of expected files ahead of time. To make it dynamic, I suspect
I should be able to take the array length of the Directory Listing
output and pipe that to the Ramp, but I haven't figured out how to add
the input correctly yet -- simply adding an input port named
firingCountLimit had no effect, so for now it only works on a
directory of known size.
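
As a point of comparison, here is a minimal Python sketch of what this
wiring amounts to once the firing limit comes from the array length
instead of being typed in by hand. It is not Kepler code: the names
process_listing and convert are made up for illustration, and convert
stands in for any single-file actor such as Image Converter.

```python
from typing import Callable, List


def process_listing(listing: List[str], convert: Callable[[str], str]) -> List[str]:
    """Mimic Directory Listing -> Ramp -> Array Element -> single-file actor."""
    # The "array length" step: the limit is computed, not entered by hand.
    firing_count_limit = len(listing)
    results = []
    for index in range(firing_count_limit):  # Ramp emits 0, 1, 2, ...
        path = listing[index]                # Array Element picks one entry
        results.append(convert(path))        # the actor fires once per path
    return results
```

An empty directory simply produces zero firings here, which is the
behavior I would want from the dynamic version of the workflow.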

To do a similar batch process with an image operation such as Image
Rotate, I needed some extra processing -- actors that address images
as external file strings work differently from actors that load them
into memory as internal objects:

(DDF Director)
 Directory Listing -->
     \--> Ramp ------> Array Element --> String Replace -->
          Convert URL to Image --> Image Rotate

The String Replace actor after Array Element is there to replace ^
(the start of the string) with file://. This is because Directory
Listing produces an array of unmarked path strings, but actors like
Convert URL to Image require a prefix to distinguish files from URLs.
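
For illustration, the same prefixing step can be sketched in Python.
The function names below are hypothetical, and this only shows the
string transformation, not anything inside Kepler. The first function
does what the String Replace step does; the second is a standard
library alternative that also percent-encodes spaces. Whether Convert
URL to Image accepts the percent-encoded form is an assumption I have
not tested.

```python
import re
from pathlib import PurePosixPath


def prefix_like_string_replace(path: str) -> str:
    """Replace the start-of-string anchor ^ with file://."""
    return re.sub(r"^", "file://", path, count=1)


def to_file_uri(path: str) -> str:
    """Build a file URI with the standard library (path must be absolute)."""
    return PurePosixPath(path).as_uri()
```

For "/Users/test/Picture 1.PNG" the first function keeps the space
as-is after the file:// prefix, while the second encodes it as %20.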

Reflecting on what I learned from this and how my first days with
Kepler differed from my expectations: I think I'd hoped that the stock
set of Kepler actors would hide some of the interface complexity
through standardization of inputs and outputs (e.g. all file listing
actors producing and consuming strings with file:// prepended by
default, or e.g. all image actors having a converturl port built in so
that they can all be expected to accept the same kinds of input). I
expected concepts like "some files" or "thing that processes images"
to be internally consistent. Instead, it appears that the path from
any Kepler actor to any other actor is a unique processing problem
that requires careful inspection of low-level representations -- at
any given step, is a file represented as {"/my/file.txt"},
{"name=/my/file.txt"}, /my/file.txt, file:///my/file.txt, a binary
object, etc., and how do you connect two actors by transforming any
one form into any other form using a variety of helper actors?

My tentative conclusion is that Kepler is not currently the place to
be reimplementing our many low-level directory-of-images processing
scripts. Instead, it is a place to wire execution of those
preexisting scripts together as a high-level set of black boxes with
well defined inputs and outputs. This might explain why none of the
examples that ship with Kepler demonstrate how to process more than 1
image internally (compare OS X Automator, in which batch file
processing is the fundamental use case). Does this match up with other
people's use cases for Kepler, or am I off base?

best always,
Jeremy


On Sun, Jul 12, 2009 at 10:48 PM, Ustun Yildiz<yildiz at cs.ucdavis.edu> wrote:
> Just a quick question, are you using PN director ?
>
>
>
> On Sun, Jul 12, 2009 at 10:41 PM, Jeremy
> Douglass<jeremydouglass at gmail.com> wrote:
>> Ustun,
>>
>> Thanks. Because I'm just trying to batch process the files, I need to
>> produce a series of whatever Image Converter will consume -- which I
>> assume (?) based on the 03 example that ships with Kepler is just a
>> plain file path e.g. "/User/test/file1.PNG". Essentially, I want Image
>> Converter to receive 5 file path strings and run 5 times.
>>
>> What OS are you running Kepler on? If the Directory Listing returns a
>> different data structure depending on the OS, perhaps I should file
>> this as a bug?
>>
>> Right now I'm trying to add a "name" column to the array before
>> feeding it to Record Disassembler by using the Concatenate Arrays
>> actor, but no luck so far....
>>
>> -- Jeremy
>>
>>
>> On Sun, Jul 12, 2009 at 10:07 PM, Ustun Yildiz<yildiz at cs.ucdavis.edu> wrote:
>>> Hi:
>>>
>>> I am sorry, but I guess I gave you wrong advice in my previous message.
>>> I have the impression that Directory Listing-->Array to Sequence gives
>>> you the elements as separate tokens. Your guess number 1) below is
>>> correct.
>>>
>>> I tried putting a Display after Array to Sequence, and I have a name
>>> tag to filter with Record Disassembler. I guess different computers
>>> generate different listings. Looking at your output, you just need the
>>> names of the files, right? And not complete directory/file_name elements?
>>>
>>> -Ustun.
>>>
>>> On Sun, Jul 12, 2009 at 9:52 PM, Jeremy
>>> Douglass<jeremydouglass at gmail.com> wrote:
>>>> Thank you Ustun -- I'm trying a Record Disassembler with an added
>>>> "name" output port now, but it isn't working.
>>>>
>>>> When I add just a Record Disassembler where you suggested (Directory
>>>> Listing ---> Array to Sequence ---> Record Disassembler), I'm getting
>>>> "Exception: ptolemy.data.stringToken." Once I use the GUI to add an
>>>> output port named "name" to the Disassembler I get the more verbose
>>>> error "Type resolution failed because of an error during type
>>>> inference in .test, because: Invalid type for input port in
>>>> .test.Record Disassembler." Adding the Disassembler directly to the
>>>> Directory Listing gives me similar errors, only the exception is
>>>> arrayToken instead of stringToken.
>>>>
>>>> Right now I'm wildly guessing that the problem is either:
>>>> 1) perhaps my directory listing results don't actually come with a
>>>> "name" element column, and so there is no name element for the
>>>> Disassembler to filter on -- as this apparently works for you, perhaps
>>>> returning a single column array is OS X specific behavior of Directory
>>>> Listing, and perhaps there is a way to add this column myself through
>>>> an array operation? ...or
>>>> 2) Perhaps you didn't mean I should use Array to Sequence, but rather
>>>> Array to Elements in some fashion?
>>>>
>>>> Any advice greatly appreciated. For your ref, here is what I'm seeing
>>>> if I run Directory Listing -> Display:
>>>>
>>>> {"/Users/test/Picture 1.PNG", "/Users/test/Picture 2.PNG",
>>>> "/Users/test/Picture 3.PNG", "/Users/test/Picture 4.PNG",
>>>> "/Users/test/Picture 5.PNG"}
>>>>
>>>> ...and if I run Directory Listing -> Array to Sequence -> Display:
>>>>
>>>> /Users/test/Picture 1.PNG
>>>> /Users/test/Picture 2.PNG
>>>> /Users/test/Picture 3.PNG
>>>> /Users/test/Picture 4.PNG
>>>> /Users/test/Picture 5.PNG
>>>>
>>>> -- Jeremy
>>>>
>>>>
>>>> On Sat, Jul 11, 2009 at 1:06 PM, Ustun Yildiz<yildiz at cs.ucdavis.edu> wrote:
>>>>> I have the impression that you try to apply an operation (actor) to
>>>>> the elements in a directory. In order to capture the names of such
>>>>> elements and pipeline them into subsequent actors, I use
>>>>>
>>>>> Directory Listing ---> Array to Sequence ---> Record Disassembler
>>>>>
>>>>> The first takes a directory name and the last outputs single names (as
>>>>> many as the elements). The purpose of the Record Disassembler is to
>>>>> capture the "name" element in the output of the Array to Sequence
>>>>> actor. You have to change the output port of Record Disassembler to "name".
>>>>>
>>>>> Ustun.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 11, 2009 at 12:51 PM, Jeremy
>>>>> Douglass<jeremydouglass at gmail.com> wrote:
>>>>>> I'm new to Kepler and struggling to apply a single actor (e.g. Image
>>>>>> Converter) to a batch of objects (e.g. a directory of images, as
>>>>>> represented by the array generated by Directory Listing).
>>>>>>
>>>>>> I've read the documentation and searched the forums, and tried
>>>>>> modifying examples, but I'm not understanding something paradigmatic,
>>>>>> and would appreciate a tip: how do you pipe an array of files through
>>>>>> single-file actors?
>>>>>>
>>>>>> EXAMPLE:
>>>>>>
>>>>>> In Kepler 1.0.0, 03-ImageDisplay.xml uses an SDF director, and specifies:
>>>>>>   Constant Image Filename --> Image Converter --> ImageJ
>>>>>>
>>>>>> In an attempt to convert a *directory* of images, I tried adding a
>>>>>> Directory Listing actor:
>>>>>>   Constant Dirname --> Directory Listing --> Image Converter --> ImageJ
>>>>>>
>>>>>> ...however this generates an error, as Image Converter will only act
>>>>>> on a string, not an array. So I tried
>>>>>>   Constant Dirname --> Directory Listing --> Array to Sequence -->
>>>>>> Image Converter --> ImageJ
>>>>>>
>>>>>> ...which almost works (if I hand-specify the number of Array elements
>>>>>> in the directory ahead of time), but still only processes the first
>>>>>> file listed in the string (Picture 1.png) -- I'm assuming it just
>>>>>> throws the rest of the string away.
>>>>>>
>>>>>> I'm not understanding something about how one processes a list in a
>>>>>> Kepler workflow -- I've looked through Array Operation and Iterative
>>>>>> Operation actors, and none of them seem to be analogous to an
>>>>>> imperative programming loop. Am I using the wrong kind of director?
>>>>>> Something else?
>>>>>>
>>>>>> Thanks for your help.
>>>>>>
>>>>>> BACKGROUND: My larger goal is to adapt a frame based image processing
>>>>>> workflow into Kepler (it is currently a combination of python,
>>>>>> javascript, and bash scripts). To illustrate, one part of the workflow
>>>>>> takes a heterogeneous collection of images (all file types and sizes)
>>>>>> and copies them to a new directory as a collection of standardized
>>>>>> jpegs. Another part of the workflow dumps a movie to a
>>>>>> frame directory of jpegs, then runs multiple image measurement and
>>>>>> image processing operations (e.g. ImageJ, MATLAB) on each frame and
>>>>>> saves the results to a text file.
>>>>>> _______________________________________________
>>>>>> Kepler-users mailing list
>>>>>> Kepler-users at kepler-project.org
>>>>>> http://mercury.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users
>>>>>>
>>>>>
>>>>
>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 03-ImageDisplay-revised-b.xml
Type: text/xml
Size: 68008 bytes
Desc: not available
URL: <http://lists.nceas.ucsb.edu/kepler/pipermail/kepler-users/attachments/20090713/5f3f4502/attachment.xml>

