[RkWard-devel] length() and na.rm

Stefan Rödiger stefan_roediger at gmx.de
Thu Aug 17 14:19:35 UTC 2006


On Thursday, 17 August 2006 11:15, Thomas Friedrichsmeier wrote:
> On Thursday 17 August 2006 13:49, Stefan Rödiger wrote:
> > Moreover I wanted to add skewness and kurtosis, neither of which is in
> > stats or base; both are in the moments package. The way to go would be
> > via require ... . So my problem is:
> > 1) is it reasonable to include it as a part of descriptive statistics?
>
> I don't know for sure. However I tend to think we should rather produce
> smaller plugins, each doing only a limited set of functions.
> Of course some will prefer larger, combined interfaces. But wherever
> possible those should be created by embedding several small plugins in a
> large one. We'll have to see whether this is practical in all instances,
> but for now I'd say rather make those functions a separate plugin.
>

I totally agree. Moreover, it makes things more manageable, and reusing code
should also be easier.
I was reading the package description of "moments", and I would say it really
makes more sense to create a separate plug-in for this.
Anyway, I find this plug-in a very important one, since it is what I would
use first to look at my raw data. Hence the idea behind the thumbnail images
and so on. It would really describe a lot. But "keep it simple" is truly
better. I can use "embed" later on as an option anyway.

> > 2) if yes, should I use the package or use a formula from a textbook? {if
> > I use the package, how can it be handled so that the user doesn't need to
> > bother where to get it, and will it affect the performance of RKWard
> > in the long term (load <-> unload, memory usage, ..., stability)}
>
> I'm not too much concerned about memory usage, performance, etc. However,
> having to install a package is not nice, but as long as it is limited to
> the cases, where the functionality is actually needed, I think it's ok
> (something like):
>
> if (options$skew || options$kurtosis) require (moments)
>
> It's unfortunate those functions are not part of the "standard" R packages.

Yes, and this makes things inconvenient. I think one possibility for the
future would be to create something like a "first-run wizard" for this,
meaning the required packages are downloaded at the first run. If they are
not available, the corresponding plug-ins should be disabled; otherwise you
would get tons of bug reports. But I don't know how to handle package updates
and the server issue. Do you think my idea makes things too complicated?
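
To make the idea concrete, here is a minimal sketch of the two checks such a
wizard (or a single plug-in) would need: which required packages are missing,
and loading "moments" only when skewness or kurtosis was actually requested.
The function names and the `options` list are assumptions for illustration,
not existing RKWard API.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
    installed <- rownames(installed.packages())
    pkgs[!(pkgs %in% installed)]
}

# Hypothetical helper: load "moments" only if it is really needed.
# `options` is an assumed list of plug-in settings (skew, kurtosis).
# Returns TRUE if nothing had to be loaded or loading succeeded,
# FALSE if the package could not be loaded.
load_moments_if_needed <- function(options) {
    if (isTRUE(options$skew) || isTRUE(options$kurtosis)) {
        return(require("moments", character.only = TRUE, quietly = TRUE))
    }
    TRUE
}
```

A plug-in could then be disabled whenever missing_packages() returns a
non-empty vector for the packages it depends on.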

> However, I think recoding them from textbook formulas is generally not the
> way to go. After all, loading extensions from libraries is one of the key
> concepts of R.
>

True, I was just looking for a way to avoid loading extensions MANUALLY. I
know that this is a key concept of R, as you may remember from the
pro-linux.de article (customize RKWard :=) )

> > 3) moreover I would like to include "mode" (see
> > http://en.wikipedia.org/wiki/Mode_(statistics)) but there is no such
> > function from R.
>
> Seems strange there is no such function in R. Here's a formula:

Strange but true. I was wondering too.

>
> names(table(x))[which(table(x)==max(table(x)))]
>

Yep.

> BTW, I found the formula on this page (German):
> http://www.wiwi.uni-bielefeld.de/~wolf/learning-net/webserver/rechendienst.operationen.php?operation=Modus&daten=
>

Nice, I like it.

> Could possibly be optimized to
>
> rk.temp.freqtable <- table (x)
> names (rk.temp.freqtable)[which (rk.temp.freqtable ==
>     max (rk.temp.freqtable))]
>
> which looks even more convoluted, but avoids calculating "table (x)" over
> and over again.
>

If you say so ;). 
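
Wrapped in a function, the same idea might look like the sketch below. The
name stat_mode is made up for illustration (R's own mode() does something
unrelated). It returns every value that ties for the highest frequency, as a
character vector, because the values come from the names of the table.

```r
# Hypothetical helper: statistical mode(s) of x.
# Builds the frequency table once and returns all values that share the
# maximum count. The result is a character vector (table names).
stat_mode <- function(x) {
    freq <- table(x)
    names(freq)[freq == max(freq)]
}

stat_mode(c(1, 2, 2, 3, 3))  # returns "2" "3"
```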

> > 4) last but not least I would like to include the possibility to include
> > several plots, but as a thumbnail view. We could reuse the code from the
> > histogram with par(mfrow = c(n,m)) and a boxplot too. I think this gives
> > a good overview of the data. (Too bad that we don't have (yet ;) ) SVG
> > support; this would be great if people want to use these images, so that
> > they can resize them and so on).
>
> If you look at rk.graph.on (), it's really just
>
> function ()
> {
>     filename <- rk.get.tempfile.name(prefix = "graph", extension = ".png")
>     png(file.path(filename))
>     cat(paste("<img src=\"", filename, "\"><br>", sep = ""),
>         file = rk.get.output.html.file(), append = TRUE)
> }
>

Fine, looks good and easy to understand.

> png (...) would take additional arguments width, and height, so we could
> extend rk.graph.on (), to include those parameters (or simply to accept
> a '...'-parameter and pass it on to png ()).
> I'll add something to CVS later, and then write back.

Why not JUST do it via the GUI as a parameter instead of extending
rk.graph.on ()? That should be sufficient too.
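
For comparison, the '...' variant described above might look roughly like
this. It is only a sketch: graph_on and its html_file argument are made-up
names, and tempfile () stands in for rk.get.tempfile.name () so that the
example does not depend on RKWard.

```r
# Hypothetical variant of rk.graph.on (): extra arguments are passed
# straight on to png (), so a caller can request a thumbnail size.
# `html_file` is an assumed argument replacing rk.get.output.html.file ().
graph_on <- function(html_file, ...) {
    filename <- tempfile(pattern = "graph", fileext = ".png")
    png(filename, ...)  # e.g. width = 200, height = 200
    cat(paste("<img src=\"", filename, "\"><br>", sep = ""),
        file = html_file, append = TRUE)
    invisible(filename)
}
```

A GUI option for width and height would still need some way to reach png (),
so both approaches could be combined.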


> Of course SVG-support would be much better. I hope to find the time to look
> into this soon.
>

Yes, but as you pointed out before, it's not simple, and the question is
what people can do with it right now. OOo doesn't support it, and KWord only
supports the basics. Editing is possible via Inkscape and so on, but in the
end you have to export the images to PNG or similar again. To be honest, it's
enough to keep it in mind as a long-term project, since some "geek" knowledge
is needed to work with SVG at the moment.

> > BTW, I postponed my work on the distribution plug-ins. I struggle a lot
> > with the graphics part, and in some places I see problems due to a lack
> > of theoretical knowledge on my side. Anyway. Kolmogorov-Smirnov,
> > Anderson-Darling and some others are on their way.
>
> No problem. Do whatever you feel comfortable with. We should remember to
> revisit some of the existing distribution plug-ins, though, before the
> next release. As far as I recall, there were some remaining problems (at
> least this one:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1476070&group_id=50231&atid=459007
> , but possibly there were some more small ones? I don't quite remember).
> Did you keep track of what still needs to be improved better than I did?

Very good point, and for sure I remember. The clumsiness from last time
should not happen again. To avoid it, I'll test all my stuff intensively
beforehand and only then send my work.

Maybe I should finally learn how to use CVS :( .


>
> Regards
> Thomas


Regards,
Stefan.

Thanks for the comments.



