[rkward-devel] performance
Thomas Friedrichsmeier
thomas.friedrichsmeier at ruhr-uni-bochum.de
Tue Jun 8 14:15:43 UTC 2010
Hi,
meik michalke wrote:
> as one result of our ongoing workshop, a participant argued that while rkward
> provides a really comfortable interface for R development, it does consume
> valuable processing resources to an extent that when it comes to actual
> huge/complex calculations, he wouldn't consider doing it with rkward.
>
> i guess it's the implementation of the R console and the way it communicates
> with the backend that produces the noticeable overhead. is there something we
> could do about this? perhaps making it possible to clone the workspace as a
> new "native" R session if you need to boost performance? or the possibility to
> switch off rkward's R console wrapper?
At the moment, I am aware of only one use case where performance in RKWard
is *seriously* degraded compared to plain R: running a long loop in
.GlobalEnv when the loop variable already exists there, like this:
i <- 1
for (i in 1:100000) { i+i }
Note that if you now do
rm (i)
for (i in 1:100000) { i+i }
the loop should finish much faster, and performance should be roughly
back on par with plain R. The reason is that RKWard uses "active
bindings" to keep track of changes in R objects. In the first example, i
has been replaced with an active binding, which slows down each access to
i considerably. In the second example, the "i" used in the loop is just a
regular R object.
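To illustrate what an active binding is, here is a minimal sketch in plain R using base R's makeActiveBinding(). It is not RKWard's actual tracking code; the environments "env" and "store" and the read counter are made up for the example, and the timings will vary from machine to machine.

```r
# Sketch only: illustrates active bindings in base R, not RKWard's internals.
env   <- new.env()   # hypothetical environment standing in for .GlobalEnv
store <- new.env()   # hypothetical backing store for the tracked value
store$value <- 1
store$reads <- 0L

# An active binding runs a function on every read and every write of "i".
makeActiveBinding("i", function(v) {
  if (missing(v)) {
    store$reads <- store$reads + 1L   # a read of i
    store$value
  } else {
    store$value <- v                  # a write to i
    invisible(v)
  }
}, env)

# A regular object for comparison.
assign("j", 1, envir = env)

n <- 10000L
t_active  <- system.time(for (k in seq_len(n)) evalq(i + i, env))
t_regular <- system.time(for (k in seq_len(n)) evalq(j + j, env))

print(bindingIsActive("i", env))   # is "i" an active binding?
print(bindingIsActive("j", env))   # is "j" an active binding?
print(store$reads)                 # how often the binding function was called for reads
print(c(active  = t_active[["elapsed"]],
        regular = t_regular[["elapsed"]]))
```

Every evaluation of "i" goes through a function call, while "j" is a plain lookup; that extra call per access is the kind of overhead described above.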
I assume this is the performance problem that you are seeing.
In fact, that is pretty embarrassing, and needs to be fixed sooner or
later. Here's the ticket that tracks the issue:
https://sourceforge.net/tracker/?func=detail&aid=1810061&group_id=50231&atid=459010
By now I have a new idea on how to approach this, which should result
in virtually no performance loss at this point (without using active
bindings).
It is possible to work around this, though, by using local variables, e.g.:
i <- 1
local({
for (i in 1:100000) { i+i }
})
Here, a local object "i" will be used, which RKWard will not try to
track (and the global object "i" will remain untouched). Or - perhaps
more realistically - place all your code into functions. After all,
that's generally a good idea in the first place.
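As a rough sketch of the function-based variant (the function name "sum_doubles" and the accumulator "total" are only illustrative, not anything RKWard provides):

```r
# Hypothetical helper: the loop variable i lives in the function's own
# environment, so a global "i" is neither read nor modified.
sum_doubles <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + (i + i)
  }
  total
}

i <- 1                        # global object, as in the examples above
result <- sum_doubles(100000)
print(result)
print(i)                      # the global i is left as it was
```

As with local(), the "i" inside the function is a local object, so the loop never touches the global "i".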
Ok, there are a few other cases where there is overhead. For instance,
loading libraries/packages is typically a good bit slower in RKWard, but
then, that is mostly a one-time cost per session. Also, RKWard is
considerably slower when there is a huge amount of output. And there's
an unavoidable memory overhead, of course, simply for having RKWard
loaded. But I guess none of these is what you are talking about, right?
Regards
Thomas