<div>Hi Claas,</div><div><br></div>I looked at your workflow and ran it after configuring the EML paths for my system. The workflow ran as expected and produced the path to the data frame on output Display2. I then extended the workflow with another RExpression actor, which I modified to have an input port called '<span style="font-size:13px;font-family:arial,sans-serif">Input_Variables</span>' and a script that just calls 'summary(<span style="font-size:13px;font-family:arial,sans-serif">Input_Variables</span>)'. When run, this new downstream R actor received your data frame without problems, ran the summary function, and displayed the output. So, from what I can tell, everything is working in Kepler 2.3.<div>
<br></div><div>The issues you may have encountered include:</div><div> -- maybe there was a typo in your <span style="font-size:13px;font-family:arial,sans-serif">Input_Variables name, either for the port or in the script</span></div>
<div><font face="arial, sans-serif"> -- maybe you were expecting the actual data to flow over the port, which is not how the R actor works for data frames (which are just passed by reference after having been written to disk) -- so you have to convert data frames to a more low-level type if you want to pass them directly to actors other than the R actor. But given that you want to be using the R actor, I expect this isn't a problem for you.</font></div>
<div><font face="arial, sans-serif"> -- there may be a data-oriented error, where specific data values cause the script to abort; I would try splitting the data in half, running the script on each half, and seeing whether the error persists on one half but not the other. I think this is your most likely issue. </font></div>
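<div><font face="arial, sans-serif">The halving approach from the last point could be sketched in R roughly as follows. The helper name run_on_halves, the check function, and the demo data frame are all hypothetical, made up for illustration; substitute your own script and data frame.</font></div>

```r
# Hypothetical helper: run a function on each half of a data frame
# and report which half (if any) triggers an error.
run_on_halves <- function(df, f) {
  n <- nrow(df)
  mid <- floor(n / 2)
  halves <- list(
    first  = df[seq_len(mid), , drop = FALSE],
    second = df[seq(mid + 1, length.out = n - mid), , drop = FALSE]
  )
  lapply(halves, function(h) {
    result <- try(f(h), silent = TRUE)
    if (inherits(result, "try-error")) "failed" else "ok"
  })
}

# Made-up stand-in for the real script: it aborts on a negative value.
check <- function(d) {
  if (any(d$x < 0)) stop("negative value found")
  summary(d)
}

# Made-up data with one bad value in the last row.
demo <- data.frame(x = c(1:9, -1))

status <- run_on_halves(demo, check)
print(status)
```

<div><font face="arial, sans-serif">Repeating the split on whichever half fails narrows the problem down to a few rows fairly quickly.</font></div>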
<div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">The other thing you can do to make debugging easier is to wrap your R script in a try() call to trap any errors that are generated. Normally, an error in R causes the Kepler workflow to fail, which makes it hard to find the error. If you surround your script with a 'try()' call, you can then use 'geterrmessage()' to display the error message, which will then show up in your output display window. You should be able to track down the error with that information. For example:</font></div>
<div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif"> try( </font></div><div><font face="arial, sans-serif"> myfunction(baddataframe) </font></div><div><font face="arial, sans-serif"> ) </font></div>
<div><font face="arial, sans-serif"> geterrmessage()</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">That code will show the error that occurs when "myfunction" is passed the "baddataframe" data frame.</font></div>
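<div><font face="arial, sans-serif">A slightly fuller sketch of the same idea follows. Here "myfunction" and "baddataframe" are placeholders with made-up definitions so that the snippet runs on its own. One caveat: geterrmessage() returns the most recent error message in the R session, so if the try() call succeeds it may report a stale error from earlier. Checking whether the result has class "try-error" avoids that, and tryCatch() is an alternative that hands you the message directly.</font></div>

```r
# Placeholder function and data, invented for this example only.
myfunction <- function(df) {
  if (any(is.na(df$value))) stop("missing values in column 'value'")
  summary(df)
}
baddataframe <- data.frame(value = c(1, NA, 3))

# Trap the error instead of letting it abort the script.
result <- try(myfunction(baddataframe), silent = TRUE)

if (inherits(result, "try-error")) {
  # Only consult geterrmessage() when try() actually caught an error.
  cat("Script failed:", geterrmessage())
} else {
  print(result)
}

# Alternative: tryCatch() gives access to the error message directly.
msg <- tryCatch(
  {
    myfunction(baddataframe)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```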
<div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">My modified workflow is attached for reference.</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">Matt<br>
</font><br><div class="gmail_quote">On Sat, May 19, 2012 at 12:38 AM, sabsirro <span dir="ltr"><<a href="mailto:sabsirro@arcor.de" target="_blank">sabsirro@arcor.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Friday, 18 May 2012, 08:01:00 you wrote:<br>
> Dear Claas,<br>
Dear Matt<br>
<br>
Thanks for that kind offer to have a look into my workflow.<br>
<div><br>
><br>
> Can you confirm that you are getting the data you want out of the EML actor<br>
> and into the first R actor? I often configure the EML actor to output data<br>
> "As Column Vector" when passing to R, as that makes it easy to manipulate<br>
> in R.<br>
</div>So do I, and yes I can confirm that the data I want comes out into the first<br>
actor.<br>
<div>> As we can't see your workflow, it's hard to determine exactly where<br>
> the problem lies.<br>
> If you can confirm that you are successfully making your<br>
> data frame in the first R actor, you should see it being produced on the<br>
> output port (assuming you named the port the same as your R variable<br>
> holding the data frame (e.g., "Output"). In any case, if you send along<br>
> the workflow kar file it will be a lot easier to help.<br>
<br>
</div>The data I am dealing with is not published yet and it is not mine. So in<br>
order to hand you over a copy of all the files you need to have a look into<br>
what's wrong in my workflow, I reduced the dataset to 20 rows. Then I tested it<br>
again with the reduced dataset and it worked fine.<br>
<br>
But it still doesn't work with my full data set where each of the 10 columns<br>
has 1700 entries. This sounds like some memory problem to me. Or it might be<br>
some restriction in size of an object that can be handed down from one actor<br>
to another? Any ideas about that? I created a dummy dataset with 1700 entries<br>
so you can test it. See the enclosed "rar" file for "eml" "kar" and "csv" file<br>
of my workflow.<br>
<br>
I noticed that I also get some warnings all the time. I don't know what that<br>
means:<br>
<br>
------------------O<br>
[null] MC 10:15:04,490: [WARN]: In catch block of _read: String index out of<br>
range: -1 [org.ecoinformatics.seek.R.RExpression]<br>
-------------------O<br>
<br>
<br>
Best regards Claas<br>
<div><div><br>
<br>
<br>
><br>
> Matt<br>
><br>
> On Fri, May 18, 2012 at 2:33 AM, sabsirro <<a href="mailto:sabsirro@arcor.de" target="_blank">sabsirro@arcor.de</a>> wrote:<br>
> > Dear Kepler Users<br>
> ><br>
> ><br>
> > I want to use a R actor to collect several columns of data which come out<br>
> > of<br>
> > an EMLtoDataset actor. I created 10 input and 1 output port for that R<br>
> > actor.<br>
> > Within the R actor I create a data frame like:<br>
> ><br>
> > Output=as.data.frame(cbind(input1, input2, input3, ... , input10))<br>
> ><br>
> > I can access this data frame from within the R actor which tells me that<br>
> > it<br>
> > gets properly created. Now I want to hand the data frame through the<br>
> > created<br>
> > output port to a following R actor for further processing of the data<br>
> > frame.<br>
> ><br>
> > I connected the Output Port to another R actor where the input port is<br>
> > named<br>
> > "Input_Variables". Within that actor I just call "Input_Variables" to<br>
> > control<br>
> > if the transfer was successful. This should show up the data farame but I<br>
> > get<br>
> > an error which tells me that "Input_Variables" does not exist.<br>
> ><br>
> > If I connect a display to the port of the data frame on the first actor<br>
> > and execute the workflow the display does not open. I think this means<br>
> > that<br>
> > nothing comes out of this port.<br>
> ><br>
> > It would be very nice if someone could help me with that problem.<br>
> ><br>
> > Best regards Claas<br>
> > _______________________________________________<br>
> > Kepler-users mailing list<br>
> > <a href="mailto:Kepler-users@kepler-project.org" target="_blank">Kepler-users@kepler-project.org</a><br>
> > <a href="http://lists.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users" target="_blank">http://lists.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users</a></div></div></blockquote></div><br></div>