[rkward-devel] performance

Tue Jun 8 14:15:43 UTC 2010

Hi,

meik michalke wrote:
> as one result of our ongoing workshop, a participant argued that while rkward 
> provides a really comfortable interface for R development, it does consume 
> valuable processing resources to an extent that when it comes to actual 
> huge/complex calculations, he wouldn't consider doing it with rkward.
> 
> i guess it's the implementation of the R console and the way it communicates 
> with the backend that produces the noticeable overhead. is there something we 
> could do about this? perhaps making it possible to clone the workspace as a 
> new "native" R session if you need to boost performance? or the possibility to 
> switch off rkward's R console wrapper?

ATM, I am aware of only one use case where performance in RKWard is
*seriously* degraded compared to plain R: running a long loop in
.GlobalEnv, like this:

   i <- 1
   for (i in 1:100000) { i+i }

Note that when you now do

   rm (i)
   for (i in 1:100000) { i+i }

the loop should finish much faster, and performance should be roughly
back on par with plain R. The reason is that RKWard uses "active
bindings" to keep track of changes in R objects. In the first example,
"i" is replaced with an active binding, which slows down each access to
"i" considerably. In the second example, the "i" used in the loop is
just a regular R object.
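
To illustrate the mechanism in plain R, here is a minimal sketch using
base R's makeActiveBinding(). It is not RKWard's actual tracking code;
the environments "env" and "store" and the read counter are assumptions
made up for the illustration. It only shows that every access to an
active binding runs an R function, while a regular object is read
directly.

```r
# Illustration only: a hand-made active binding, NOT RKWard's implementation.
env <- new.env()     # hypothetical environment holding the variables
store <- new.env()   # hypothetical backing storage for the tracked value
store$value <- 1
store$reads <- 0

# "x" is an active binding: the function runs on every read and write.
makeActiveBinding("x", function(v) {
  if (missing(v)) {
    store$reads <- store$reads + 1   # bookkeeping on each read
    store$value
  } else {
    store$value <- v                 # a write replaces the stored value
  }
}, env)

env$y <- 1   # "y" is a regular object, for comparison

n <- 100000
t_active  <- system.time(for (k in seq_len(n)) env$x + env$x)
t_regular <- system.time(for (k in seq_len(n)) env$y + env$y)

print(t_active)
print(t_regular)
```

On most machines the first timing should come out clearly larger than
the second, since each read of "x" costs a function call, but the exact
numbers will vary.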

I assume this is the performance problem that you are seeing.

In fact, that is pretty embarrassing, and needs to be fixed sooner or
later. Here's the reminder ticket for the issue:
https://sourceforge.net/tracker/?func=detail&aid=1810061&group_id=50231&atid=459010
By now I have a new idea on how to approach this, which should result
in virtually no performance loss at this point (without using active
bindings).

It is possible to work around this, though, by using local variables. E.g.
   i <- 1
   local({
     for (i in 1:100000) { i+i }
   })
Here, a local object "i" will be used, which RKWard will not try to 
track (and the global object "i" will remain untouched). Or - perhaps 
more realistically - place all your code into functions. After all, 
that's generally a good idea in the first place.
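
As a sketch of the function-based variant: the function name
"sum_doubles" and the running total are made up for this example, they
are not from any RKWard API. The loop variable lives in the function's
own environment, so the global "i" is never touched by the loop.

```r
# Hypothetical helper: the loop variable "i" is local to the function call.
sum_doubles <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + (i + i)   # same i+i work as above, accumulated
  }
  total
}

i <- 1                          # global "i", which RKWard may track
result <- sum_doubles(100000)   # loop runs on the function-local "i"
print(result)
print(i)                        # the global "i" is unchanged
```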

Ok, there are a few other cases where there is overhead. For instance,
loading libraries/packages is typically a good bit slower in RKWard, but
then, that is mostly a one-time thing per session. Also, RKWard is
considerably slower when there is a huge amount of output. And there's
an unavoidable memory overhead, of course, simply for having RKWard
loaded. But I guess none of these is what you are talking about, right?

Regards
Thomas