[RkWard-devel] Desc stats skewness and kurtosis

I. Soumpasis nono.231 at gmail.com
Tue Feb 27 16:49:03 UTC 2007


2007/2/27, Thomas Friedrichsmeier <thomas.friedrichsmeier at ruhr-uni-bochum.de>:
>
> On Tuesday 27 February 2007 10:08, I. Soumpasis wrote:
> > 1. Why is this difference? My data does not contain NAs as you can see.
>
> Let's see. moments::kurtosis has kurtosis as:
>         n <- length(x)
>         n * sum((x - mean(x))^4)/(sum((x - mean(x))^2)^2)
> Whereas in stat.desc and in e1071 it is:
>         sum((x - mean(x))^4)/(length(x) * var(x)^2) - 3
> Now the "-3" is easily explained away, as this is the difference between
> plain "kurtosis" and "excess kurtosis". I suppose, to avoid confusion here,
> we could show an additional field "excess kurtosis" in the results, which
> is simply kurtosis - 3.
> There's another subtle difference, though:
>         var (x)
> is really
>         sum ((x - mean(x))^2) / (length(x) - 1)
> Note the "-1". This produces a further difference for small samples.
> Frankly, I don't know which one is more correct.
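
For reference, the two kurtosis formulas quoted above can be compared directly
in base R. This is only a sketch: the function names and the sample vector are
made up for illustration, and the formulas are copied from the quote rather
than taken from the packages themselves.

```r
# Hypothetical helper names; the formulas are the ones quoted in this thread.

# Plain kurtosis, dividing by n inside the variance (as quoted for moments).
kurt_moments <- function(x) {
  n <- length(x)
  n * sum((x - mean(x))^4) / (sum((x - mean(x))^2)^2)
}

# Excess kurtosis using var(x), which divides by n - 1
# (as quoted for stat.desc and e1071).
kurt_statdesc <- function(x) {
  sum((x - mean(x))^4) / (length(x) * var(x)^2) - 3
}

x <- c(2, 4, 4, 5, 7, 9, 10, 12)  # made-up sample, no NAs
n <- length(x)

k_m <- kurt_moments(x)
k_s <- kurt_statdesc(x)

# Once the "-3" is undone, the two values differ only by ((n - 1) / n)^2.
print(c(moments = k_m, statdesc_plus_3 = k_s + 3,
        rescaled_moments = k_m * ((n - 1) / n)^2))
```

The last two printed values should agree, so apart from the "-3" the whole
difference is the factor ((n - 1) / n)^2, which approaches 1 as the sample
grows.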


I had noticed both differences, but did not know which is more correct.
Differences also appear in skewness:

moments:
        (sum((x - mean(x))^3)/n)/(sum((x - mean(x))^2)/n)^(3/2)
e1071:
        sum((x - mean(x))^3)/(length(x) * sd(x)^3)
stat.desc:
        Skew <- sum((x - mean(x))^3)/(length(x) * sqrt(var(x))^3)

The e1071 and stat.desc versions are the same formula, since sd(x) is
sqrt(var(x)); they differ from moments only in dividing by n - 1 instead of n
inside the variance. It seems that both are valid ways to calculate skewness.
The first is
http://upload.wikimedia.org/math/4/6/7/4674bef620c81d954614d510c0bfba13.png
and the other two are
http://upload.wikimedia.org/math/1/8/1/181e50d50e7a11d858745c975f03445c.png
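
A minimal base R sketch of the three skewness formulas, assuming I have
copied them correctly. The function names and the sample data are
hypothetical, and none of the packages is actually loaded here.

```r
# Hypothetical helper names; the formulas are the ones listed in this mail.

# Divides by n in both the third moment and the variance (moments).
skew_moments <- function(x) {
  n <- length(x)
  (sum((x - mean(x))^3) / n) / (sum((x - mean(x))^2) / n)^(3 / 2)
}

# Uses sd(x), which divides by n - 1 (e1071).
skew_e1071 <- function(x) {
  sum((x - mean(x))^3) / (length(x) * sd(x)^3)
}

# Uses sqrt(var(x)), which is the same thing as sd(x) (stat.desc).
skew_statdesc <- function(x) {
  sum((x - mean(x))^3) / (length(x) * sqrt(var(x))^3)
}

x <- c(1, 2, 2, 3, 3, 3, 8, 15)  # made-up, right-skewed sample
n <- length(x)

s_m <- skew_moments(x)
s_e <- skew_e1071(x)
s_d <- skew_statdesc(x)

# e1071 and stat.desc agree; both equal moments times ((n - 1) / n)^(3/2).
print(c(moments = s_m, e1071 = s_e, stat.desc = s_d,
        rescaled_moments = s_m * ((n - 1) / n)^(3 / 2)))
```

So the n versus n - 1 choice only rescales the result by ((n - 1) / n)^(3/2),
which matters for small samples and vanishes for large ones.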


> > 2. Although a descriptives plugin exists, what do you think of the idea
> > of using stat.desc in addition? It seems more informative to me, but the
> > user loses the option of choosing only one thing, e.g. mean or median.
>
> Actually, we already have two plugins for descriptives (the other
> is "Basic Statistics"). Before adding a third, I think we should take a
> look at:
> 1) Which features do both plugins already provide?
> 2) Which features are available in one but not the other?
> 3) What are the differences in the GUI of these plugins and their output,
> which seem to be good ideas, which seem to be not so good ones (default
> settings, grouping, etc.)?
> 4) Which features would be nice to have in this context, but available in
> neither?
> Then:
> 5) Create a list of features that a descriptive statistics plugin should
> have.
> 6) Create a mockup (i.e. don't worry about any details or R code yet, just
> create a raw .xml GUI description) of what such a plugin would look like.
> 7) Does the mockup look like it would be usable? Is it overly complex? Is
> it easily possible to find the "most commonly needed" options among all the
> others (also, how would the output be formatted; will the results be easy
> to "see" in the output, or will it be too much at once; should the output
> be grouped into several tables, ...)? Is there a "natural" way to split it
> into two (non-overlapping) plugins?
> Finally:
> 8) Where can the required functionality be found in R, preferably without
> requiring too many packages, or only requiring them if truly needed?
>
> That's quite a bit of work, and some difficult questions (which is why I
> never started on this), but I guess it's probably a good idea to consider
> these points before creating yet another descriptive statistics plugin.
>
I cannot offer an authoritative opinion on this. What I personally miss in R
(not RKWard) is a descriptives function that gives what Excel and Gnumeric
give. I am attaching an example made with Gnumeric to show you what I mean. A
statistical application should probably offer more, and I like what both
plugins already provide.

I am sorry that I overlooked the Basic Statistics plugin; I had not imagined
how much it covers. I feel that I am not the right person to get involved
with these plugins or to hold a strong opinion on them; I believe someone
else would know more about this than I do. I am just giving my opinion as an
average user.

For now I have to take a look at the quality charts and functions to finish
what I started with the Pareto chart.

Regards,
Ilias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: job_1637-untitled_document.pdf
Type: application/pdf
Size: 23028 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/rkward-devel/attachments/20070227/fd865bf4/attachment.pdf>

