[RkWard-devel] Desc stats skewness and kurtosis

Thomas Friedrichsmeier thomas.friedrichsmeier at ruhr-uni-bochum.de
Tue Feb 27 12:59:21 UTC 2007


On Tuesday 27 February 2007 10:08, I. Soumpasis wrote:
> 1. Why is this difference? My data does not contain NAs as you can see.

Let's see. moments::kurtosis computes kurtosis as:
	n <- length(x)
	n * sum((x - mean(x))^4) / (sum((x - mean(x))^2)^2)
Whereas in stat.desc and in e1071 it is:
	sum((x - mean(x))^4) / (length(x) * var(x)^2) - 3
Now the "-3" is easily explained, as this is the difference between plain 
"kurtosis" and "excess kurtosis". To avoid confusion here, I suppose we could 
show an additional field "excess kurtosis" in the results, which is simply 
kurtosis - 3.
There's another subtle difference, though:
	var(x)
is really
	sum((x - mean(x))^2) / (length(x) - 1)
Note the "-1". This produces a further difference for small samples.
Frankly, I don't know which one is more correct.
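
To make the two sources of difference concrete, here is a rough sketch in R. 
The function names kurt_moments and kurt_excess are made up for this 
illustration, the formulas are simply the two quoted above, and the sample 
data is invented. Assuming those formulas, the "-1" in var() amounts to a 
factor of ((n - 1) / n)^2, so the two results should be related as checked 
in the last line:

```r
# Hypothetical helpers wrapping the two formulas quoted in this message.
kurt_moments <- function(x) {
  n <- length(x)
  n * sum((x - mean(x))^4) / (sum((x - mean(x))^2)^2)
}

kurt_excess <- function(x) {
  # var(x) divides by (length(x) - 1), not by length(x)
  sum((x - mean(x))^4) / (length(x) * var(x)^2) - 3
}

x <- c(2, 4, 4, 5, 7, 9, 10, 12)  # made-up sample without NAs
n <- length(x)

k_plain  <- kurt_moments(x)
k_excess <- kurt_excess(x)

# Difference 1: the "- 3" (plain vs. excess kurtosis).
# Difference 2: the factor ((n - 1) / n)^2 introduced by var().
all.equal(k_excess, k_plain * ((n - 1) / n)^2 - 3)
```

The factor ((n - 1) / n)^2 approaches 1 as n grows, which is why the second 
difference only shows up noticeably for small samples.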

> 2. Although a descriptives plugin exists, how do you see the idea of using
> the stat.desc additionally? Seems to me more informative, but the user
> loses the option to choose only one thing, e.g. mean or median.

Actually, we already have two plugins for descriptives (the other is 
"Basic Statistics"). Before adding a third, I think we should take a look 
at:
1) Which features do both plugins already provide?
2) Which features are available in one but not the other?
3) What are the differences in the GUI of these plugins and their output, 
which seem to be good ideas, which seem to be not so good ones (default 
settings, grouping, etc.)?
4) Which features would be nice to have in this context, but available in 
neither?
Then:
5) Create a list of features that a descriptive statistics plugin should have.
6) Create a mockup (i.e. don't worry about any details or R code, yet, just 
create a raw .xml GUI description) of what such a plugin would look like.
7) Does the mockup look like it would be usable? Is it overly complex? Is it 
easily possible to find the "most commonly needed" options among all the 
others (also, how would the output be formatted; will the results be easy 
to "see" in the output, or will it be too much at once; should the output be 
grouped into several tables, ...)? Is there a "natural" way to split it into 
two (non-overlapping) plugins?
Finally:
8) Where can the required functionality be found in R, preferably without 
requiring too many packages, or requiring them only if truly needed?

That's quite a bit of work, with some difficult questions (which is why I 
never started on this), but I guess it's a good idea to consider these 
points before creating yet another descriptive statistics plugin.

Regards
Thomas