[kepler-users] R actor problem

Matt Jones jones at nceas.ucsb.edu
Mon May 21 14:52:11 PDT 2012


Hi Claas,

I looked at your workflow and executed it after configuring the EML paths
for my system. The workflow ran as expected and produced the path to the
data frame on Display2. I then extended the workflow with another
RExpression actor, which I modified to have an input port called
'Input_Variables' and a script that just called 'summary(Input_Variables)'.
When run, this new downstream R actor received your data frame fine, and
was able to run the summary function and display the output.  So, from what
I can tell, everything is working in Kepler 2.3.
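
As a rough sketch, the script in the added downstream actor amounts to the
following. In Kepler the variable 'Input_Variables' is created from the port
of that name; the stand-in data frame below is hypothetical and is only
there so the snippet also runs in a plain R session.

```r
# Stand-in for what Kepler supplies via the 'Input_Variables' port.
# Hypothetical data: in the workflow this comes from the upstream R actor.
if (!exists("Input_Variables")) {
  Input_Variables <- data.frame(
    input1 = 1:5,
    input2 = c(2.5, 3.1, 4.0, 1.2, 0.7)
  )
}

# The downstream script itself: summarize every column of the data frame.
summary(Input_Variables)
```

If the port name and the variable name in the script differ by even one
character, R reports that the object does not exist.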

The issues you may have encountered include:
  -- maybe there was a typo in your Input_Variables name, either for the
port or in the script
  -- maybe you were expecting the actual data to flow over the port, which
is not how the R actor works for data frames (which are just passed by
reference after having been written to disk) -- so you have to convert data
frames to a more low-level type if you want to pass them directly to actors
other than the R actor.  But given that you want to be using the R actor, I
expect this isn't a problem for you.
  -- there may be a data-oriented error, where specific data values are
causing the script to abort; I would try splitting the data in half,
running the script on each half, and seeing whether the error persists on
one half but not the other.  I think this is your most likely issue.
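
The halving approach from the last item could look roughly like this. It is
only a sketch: 'myfunction' and the dummy 'Output' data frame are
hypothetical stand-ins for your processing step and your data, with one bad
value planted in the second half.

```r
# Hypothetical processing step that aborts on a negative value.
myfunction <- function(df) {
  if (any(df$input1 < 0)) stop("negative value in input1")
  colMeans(df)
}

# Hypothetical dummy data: 1700 rows, one bad value in row 1500.
Output <- data.frame(input1 = c(1:1499, -1, 1501:1700), input2 = 1:1700)

# Split the rows into two halves.
n <- nrow(Output)
mid <- n %/% 2
halves <- list(
  first  = Output[1:mid, , drop = FALSE],
  second = Output[(mid + 1):n, , drop = FALSE]
)

# Run the step on each half and record whether it raised an error.
failed <- sapply(halves, function(half) {
  inherits(try(myfunction(half), silent = TRUE), "try-error")
})
print(failed)
```

Repeating the split on whichever half fails narrows the problem down to the
offending rows in a few steps.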

The other thing you can do to make debugging easier is to wrap your R
script in a try() function to trap any errors that are generated.
Normally, an error in R causes the Kepler workflow to fail, which makes it
hard to find the error.  If you surround your script with a 'try()' call,
you can then use 'geterrmessage()' to display the error message, which will
then show up in your output display window.  You should be able to track
down the error with that information.  For example:

   try(
      myfunction(baddataframe)
   )
   geterrmessage()

That code will show the error that occurs when "myfunction" is passed the
"baddataframe" data frame.
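
A slightly more defensive variant, sketched under the assumption that you
only want a message printed when the call really failed: geterrmessage()
returns the most recent error of the R session, so checking the value
returned by try() avoids showing a stale message. Here 'myfunction' and
'baddataframe' are hypothetical stand-ins.

```r
# Hypothetical stand-ins for the function and data in the example above.
myfunction <- function(df) stop("column 'volume' not found")
baddataframe <- data.frame(x = 1:3)

# silent = TRUE keeps try() from printing the error itself;
# on failure the returned object has class "try-error".
result <- try(myfunction(baddataframe), silent = TRUE)

if (inherits(result, "try-error")) {
  # Print the message so it shows up in the actor's output.
  cat("R error:", conditionMessage(attr(result, "condition")), "\n")
}
```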

My modified workflow is attached for reference.

Matt

On Sat, May 19, 2012 at 12:38 AM, sabsirro <sabsirro at arcor.de> wrote:

> On Friday, 18 May 2012, 08:01:00 you wrote:
> > Dear Claas,
> Dear Matt
>
> Thanks for that kind offer to have a look into my workflow.
>
> >
> > Can you confirm that you are getting the data you want out of the EML
> > actor and into the first R actor?  I often configure the EML actor to
> > output data "As Column Vector" when passing to R, as that makes it easy
> > to manipulate in R.
> So do I, and yes, I can confirm that the data I want comes out into the
> first actor.
> > As we can't see your workflow, it's hard to determine exactly where
> > the problem lies.
> > If you can confirm that you are successfully making your
> > data frame in the first R actor, you should see it being produced on the
> > output port (assuming you named the port the same as your R variable
> > holding the data frame, e.g., "Output").  In any case, if you send along
> > the workflow kar file it will be a lot easier to help.
>
> The data I am dealing with is not published yet and it is not mine. So, in
> order to hand you a copy of all the files you need to see what's wrong in
> my workflow, I reduced the dataset to 20 rows. Then I tested it again with
> the reduced dataset and it worked fine.
>
> But it still doesn't work with my full data set, where each of the 10
> columns has 1700 entries. This sounds like a memory problem to me. Or might
> there be some restriction on the size of an object that can be handed from
> one actor to another? Any ideas about that? I created a dummy dataset with
> 1700 entries so you can test it. See the enclosed "rar" file for the "eml",
> "kar" and "csv" files of my workflow.
>
> I noticed that I also get some warnings all the time. I don't know what
> they mean:
>
> ------------------O
> [null] MC 10:15:04,490: [WARN]: In catch block of _read: String index out of range: -1 [org.ecoinformatics.seek.R.RExpression]
> -------------------O
>
>
> Best regards Claas
>
>
>
> >
> > Matt
> >
> > On Fri, May 18, 2012 at 2:33 AM, sabsirro <sabsirro at arcor.de> wrote:
> > > Dear Kepler Users
> > >
> > >
> > > I want to use an R actor to collect several columns of data which come
> > > out of an EMLtoDataset actor. I created 10 input ports and 1 output
> > > port for that R actor. Within the R actor I create a data frame like:
> > >
> > > Output=as.data.frame(cbind(input1, input2, input3, ... , input10))
> > >
> > > I can access this data frame from within the R actor, which tells me
> > > that it gets created properly. Now I want to hand the data frame
> > > through the output port to a following R actor for further processing.
> > >
> > > I connected the output port to another R actor where the input port is
> > > named "Input_Variables". Within that actor I just call
> > > "Input_Variables" to check whether the transfer was successful. This
> > > should show the data frame, but I get an error which tells me that
> > > "Input_Variables" does not exist.
> > >
> > > If I connect a display to the port of the data frame on the first actor
> > > and execute the workflow, the display does not open. I think this means
> > > that nothing comes out of this port.
> > >
> > > It would be very nice if someone could help me with that problem.
> > >
> > > Best regards Claas
> > > _______________________________________________
> > > Kepler-users mailing list
> > > Kepler-users at kepler-project.org
> > > http://lists.nceas.ucsb.edu/kepler/mailman/listinfo/kepler-users
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 204VolumeCalculationNew.kar
Type: audio/midi
Size: 11487 bytes
Desc: not available
URL: <http://lists.nceas.ucsb.edu/kepler/pipermail/kepler-users/attachments/20120521/b33e6e55/attachment-0001.kar>

