[RkWard-devel] Desc stats skweness and kustosis
Thomas Friedrichsmeier
thomas.friedrichsmeier at ruhr-uni-bochum.de
Tue Feb 27 12:59:21 UTC 2007
On Tuesday 27 February 2007 10:08, I. Soumpasis wrote:
> 1. Why is there this difference? My data does not contain NAs, as you can see.
Let's see. moments::kurtosis computes kurtosis as:
n <- length(x)
n * sum((x - mean(x))^4)/(sum((x - mean(x))^2)^2)
Whereas in stat.desc and in e1071 it is:
sum((x - mean(x))^4)/(length(x) * var(x)^2) - 3
Now the "-3" is easily explained: it is the difference between
plain "kurtosis" and "excess kurtosis". To avoid confusion, I suppose
we could show an additional field "excess kurtosis" in the results, which is
simply kurtosis - 3.
There's another subtle difference, though:
var(x)
is really
sum((x - mean(x))^2) / (length(x) - 1)
Note the "-1". This produces a further difference for small samples.
Frankly, I don't know which one is more correct.
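To make the size of the two differences concrete, here is a minimal sketch in base R that evaluates both formulas exactly as quoted above. The sample vector x and the variable names are made up for illustration, and the code only mirrors the quoted expressions; it does not call either package. Under those assumptions, the two results should differ by the "-3" and by a factor of ((n - 1) / n)^2:

```r
# Hypothetical sample; any numeric vector without NAs will do.
x <- c(2, 4, 4, 5, 7, 9, 12)
n <- length(x)
d <- x - mean(x)

# Formula quoted above for moments::kurtosis (divides by n throughout).
k_moments <- n * sum(d^4) / sum(d^2)^2

# Formula quoted above for stat.desc / e1071:
# var() divides by n - 1, and 3 is subtracted (excess kurtosis).
k_excess <- sum(d^4) / (n * var(x)^2) - 3

# Rescale the first result by ((n - 1) / n)^2 and subtract 3.
k_rescaled <- k_moments * ((n - 1) / n)^2 - 3

# TRUE if the two formulas agree after the correction.
print(all.equal(k_excess, k_rescaled))
```

Since ((n - 1) / n)^2 approaches 1 as n grows, the second difference only matters for small samples.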
> 2. Although a descriptives plugin exists, how do you see the idea of using
> the stat.desc additionally? Seems to me more informative, but the user
> loses the option to choose only one thing, e.g. mean or median.
Actually, we already have two plugins for descriptives (the other
is "Basic Statistics"). Before adding a third, I think we should take a look
at:
1) Which features do both plugins already provide?
2) Which features are available in one but not the other?
3) What are the differences in the GUI of these plugins and their output,
which seem to be good ideas, which seem to be not so good ones (default
settings, grouping, etc.)?
4) Which features would be nice to have in this context, but are available in
neither?
Then:
5) Create a list of features that a descriptive statistics plugin should have.
6) Create a mockup (i.e. don't worry about any details or R code, yet, just
create a raw .xml GUI description) of what such a plugin would look like.
7) Does the mockup look like it would be usable? Is it overly complex? Is it
easily possible to find the "most commonly needed" options among all the
others (also, how would the output be formatted; will the results be easy
to "see" in the output, or will it be too much at once; should the output be
grouped into several tables, ...)? Is there a "natural" way to split it into
two (non-overlapping) plugins?
Finally:
8) Where can the required functionality be found in R, preferably without
requiring too many packages, or requiring them only if truly needed?
That's quite a bit of work, and some difficult questions (which is why I never
started on this), but I guess it's probably a good idea to consider these
points before creating yet another descriptive statistics plugin.
Regards
Thomas