[kepler-users] Exec Component Binary Data

Brown, David M JR david.brown at pnnl.gov
Mon Apr 22 09:29:25 PDT 2013


The programs I'm using rely on C-style fwrite and fread calls on stdout and stdin. Think of a program that first writes a magic header, 0xDEADBEEF, then an integer giving the length of the next block, then the block of floats. The reading program mirrors this: it reads the magic header and verifies its content, reads the integer, and then reads in the block of floats.

==== WRITER ====

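#include <stdio.h>

int main(int argc, char **argv)
{
    /* unsigned: 0xDEADBEEF does not fit in a signed 32-bit int */
    unsigned int header = 0xDEADBEEF;
    int num_floats = 32;
    float floats[32];
    int i;

    /* Fill the block so that defined values are written. */
    for (i = 0; i < num_floats; i++)
        floats[i] = (float)i;

    fwrite(&header, sizeof(header), 1, stdout);
    fwrite(&num_floats, sizeof(int), 1, stdout);
    fwrite(floats, sizeof(float), num_floats, stdout);
    return 0;
}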

==== READER ====

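#include <stdio.h>
#include <stdlib.h>   /* malloc, free */

int main(int argc, char **argv)
{
    unsigned int header;   /* unsigned so the comparison with 0xDEADBEEF is exact */
    int num_floats;
    size_t check;
    float *floats;

    /* Read and verify the magic header. */
    check = fread(&header, sizeof(header), 1, stdin);
    if(check != 1 || header != 0xDEADBEEF)
        return -1;
    /* Read the number of floats in the block that follows. */
    check = fread(&num_floats, sizeof(int), 1, stdin);
    if(check != 1 || num_floats < 0)
        return -1;
    floats = malloc(sizeof(float)*num_floats);
    if(floats == NULL && num_floats > 0)
        return -1;
    /* Read the block itself and make sure all of it arrived. */
    check = fread(floats, sizeof(float), num_floats, stdin);
    if(check != (size_t)num_floats) {
        free(floats);
        return -1;
    }
    printf("read %d floats\n", num_floats);
    free(floats);
    return 0;
}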

These are simple readers and writers that do much the same thing as the code I'm trying to use in Kepler, although the number of floats may need to grow to something like 256 or larger. I would think the Exec component could be extended by letting the type of the output port tell the Exec module what kind of data to expect from the program's output. I've seen other components do this sort of return-type manipulation; I believe the StringTo component is one example.

Thanks,
- David Brown

From: Christopher Brooks [mailto:cxh at eecs.berkeley.edu] 
Sent: Monday, April 22, 2013 8:51 AM
To: Jianwu Wang
Cc: Brown, David M JR; kepler-users at kepler-project.org
Subject: Re: [kepler-users] Exec Component Binary Data

I have not thought that hard about how to pass binary data. Probably the thing to do would be to enhance the Exec actor to optionally pass an ObjectToken that would contain the binary data.

Offhand, I don't know of anything in the StringToken that should prevent binary data from being used, but there could be issues.  Some simple test cases would help here.

One thing is that reading data from files can result in transformation of the data.  ptolemy/util/FileUtilities.java has a method that safely reads a file and outputs it using ByteArrayOutputStream.

BTW - I'm used to Unix passing data as files that have newline characters and using tools and scripts on textual data.  A few line-based tools tend to fail if the file does not have at least one newline at the end of the file.  Interestingly, the find command can generate an output that has file names separated by null characters and xargs can process that data with the -0 option.  For example, these commands find file names with spaces in them and then call xargs to run a command.  

find . -name "* *" -print0 | xargs -0 ls -l

(Of course, one could just do find . -name "* *" -ls, but the point is to show how to pass filenames with spaces to xargs.)

On 4/19/13 6:06 PM, Jianwu Wang wrote:
Hi David,

    External execution assumes the data on its output port is a string and processes it accordingly, so I don't think it can correctly send binary data from its output port.

     One workaround is to redirect the binary output into a file and have the next 'External execution' actor read the data from that file. If that fits your needs, I can send a demo workflow for it.

Best wishes

Sincerely yours

Jianwu Wang, Ph.D.
jianwu at sdsc.edu
http://users.sdsc.edu/~jianwu/

Assistant Project Scientist
Scientific Workflow Automation Technologies (SWAT) Laboratory
San Diego Supercomputer Center
University of California, San Diego
San Diego, CA, U.S.A.

On 4/18/13 11:02 AM, Brown, David M JR wrote:
To whom it may concern,
 
I've got a set of binary tools that use standard Unix-style pipes to communicate and process data. To set this up in Kepler, I chose the external execution component and ran each command separately, connecting one component's standard output to the next component's standard input. However, the communication isn't text; it's binary. I'm having issues where the first part of the process doesn't recognize the header sent to the program as valid.
 
Are there any issues with sending binary data from an external execution component's output port?
 
Thanks,
- David Brown
 
 
 



_______________________________________________
Kepler-users mailing list
Kepler-users at kepler-project.org
http://lists.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users






-- 
Christopher Brooks, PMP                       University of California
Academic Program Manager & Software Engineer  US Mail: 337 Cory Hall
CHESS/iCyPhy/Ptolemy/TerraSwarm               Berkeley, CA 94720-1774
cxh at eecs.berkeley.edu, 707.332.0670           (Office: 545Q Cory)

