<br><br><div><span class="gmail_quote">2007/2/27, Thomas Friedrichsmeier <<a href="mailto:thomas.friedrichsmeier@ruhr-uni-bochum.de" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">thomas.friedrichsmeier@ruhr-uni-bochum.de

</a>>:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

On Tuesday 27 February 2007 10:08, I. Soumpasis wrote:<br>> 1. Why is this difference? My data does not contain NAs as you can see.<br><br>Let's see. moments::kurtosis has kurtosis as:<br>        n <- length(x)

<br>

        n * sum((x - mean(x))^4)/(sum((x - mean(x))^2)^2)<br>Whereas in stat.desc and in e1071 it is:<br>        sum((x - mean(x))^4)/(length(x) * var(x)^2) - 3<br>Now the "-3" is easily explained away, as this is the difference between

<br>plain "kurtosis" and "excess kurtosis". I suppose, to avoid confusion, here,<br>we could show an additional field "excess kurtosis" in the results which is<br>simply kurtosis - 3.<br>There's another subtle difference, though:

var(x) is really sum((x - mean(x))^2) / (length(x) - 1); note the "-1". This produces a further difference for small samples. Frankly, I don't know which one is more correct.

</blockquote><div><br>I had noticed both differences but did not know which is more correct. Differences also appear in the skewness:<br>moments<br>(sum((x - mean(x))^3)/n)/(sum((x - mean(x))^2)/n)^(3/2)<br>e1071<br>sum((x - mean(x))^3)/(length(x) * sd(x)^3)

<br>stat.desc<br>Skew <- sum((x - mean(x))^3)/(length(x) * sqrt(var(x))^3)<br>Both seem to be valid ways to calculate skewness (the e1071 and stat.desc formulas are identical, since sd(x) equals sqrt(var(x))). The first is<br><a href="http://upload.wikimedia.org/math/4/6/7/4674bef620c81d954614d510c0bfba13.png">

http://upload.wikimedia.org/math/4/6/7/4674bef620c81d954614d510c0bfba13.png</a><br>and the other two<br><a href="http://upload.wikimedia.org/math/1/8/1/181e50d50e7a11d858745c975f03445c.png">http://upload.wikimedia.org/math/1/8/1/181e50d50e7a11d858745c975f03445c.png

</a><br><br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

> 2. Although a descriptives plugin exists, how do you see the idea of using<br>> the stat.desc additionally? Seems to me more informative, but the user<br>> loses the choice to choose only one thing, e.g. mean or median.<br><br>

Actually, we already have two plugins for descriptives (the other is "Basic Statistics"). Before adding a third, I think we should take a look at:<br>1) Which features do both plugins already provide?<br>

2) Which features are available in one but not the other?<br>3) What are the differences in the GUI of these plugins and their output? Which seem to be good ideas, and which not so good ones (default settings, grouping, etc.)?

<br>4) Which features would be nice to have in this context, but are available in neither?<br>Then:<br>5) Create a list of features that a descriptive statistics plugin should have.<br>6) Create a mockup (i.e. don't worry about any details or R code yet, just create a raw .xml GUI description) of what such a plugin would look like.

<br>7) Does the mockup look like it would be usable? Is it overly complex? Is it easily possible to find the "most commonly needed" options among all the others (also, how would the output be formatted; will the results be easy to "see" in the output, or will it be too much at once; should the output be grouped into several tables, ...)? Is there a "natural" way to split it into two (non-overlapping) plugins?

<br>Finally:<br>8) Where can the required functionality be found in R, preferably without requiring too many packages, or only requiring them if truly needed?<br><br>That's quite a bit of work, and some difficult questions (which is why I never

started on this), but I guess it's probably a good idea to consider these points before creating yet another descriptive statistics plugin. </blockquote></div>I cannot offer an authoritative opinion on this. What I personally miss in R (not RKWard) is a descriptives function that gives what Excel and Gnumeric give. I am attaching an example made with Gnumeric to show what I mean. A statistical application should probably offer more, and I like what both plugins already provide.
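<br><br>To make the wish a bit more concrete, here is a rough sketch of the kind of one-call summary I mean. The function name describe_basic and the choice of fields are only my assumptions for illustration (they are not taken from Excel, Gnumeric, or any existing plugin), and it uses base R only:<br>

```r
# Hypothetical helper: the name and the selection of fields are assumptions
# made for illustration, not an existing R or RKWard function.
describe_basic <- function(x) {
  x <- x[!is.na(x)]            # drop missing values before summarising
  n <- length(x)
  c(
    mean     = mean(x),
    se.mean  = sd(x) / sqrt(n),  # standard error of the mean
    median   = median(x),
    sd       = sd(x),            # uses the (n - 1) denominator
    variance = var(x),           # uses the (n - 1) denominator
    range    = max(x) - min(x),
    min      = min(x),
    max      = max(x),
    sum      = sum(x),
    count    = n
  )
}

# Example call on a made-up sample
round(describe_basic(c(2, 4, 4, 4, 5, 5, 7, 9)), 3)
```

Skewness and kurtosis are left out on purpose, since which formula to use is exactly the open question above.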

<br><br>I am sorry that I overlooked the Basic Statistics plugin; I had not imagined how much it offers. I feel I am not the right person to get involved with these plugins or to hold a strong opinion on them; I believe someone else would know more about this than I do. I am only giving my opinion as an average user.

<br><br>For now I have to take a look at the quality charts and functions to finish what I started with the Pareto chart.<br><br>Regards,<br>Ilias<br>
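<br>P.S. A rough sketch in R of the two differences discussed above, in case it helps. It uses the formulas exactly as they are quoted in this thread (which is not necessarily what current versions of the packages compute); the sample vector x and the variable names are made up for illustration.<br>

```r
# Small made-up sample, so that the n versus (n - 1) difference is visible.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)
d <- x - mean(x)               # deviations from the mean

# Kurtosis as quoted for moments::kurtosis: divides by n throughout.
kurt_moments <- n * sum(d^4) / sum(d^2)^2

# Excess kurtosis as quoted for e1071 / stat.desc: var() divides by (n - 1).
kurt_e1071 <- sum(d^4) / (n * var(x)^2) - 3

# The two differ by the "-3" and by a factor of ((n - 1) / n)^2.
stopifnot(isTRUE(all.equal(kurt_e1071,
                           kurt_moments * ((n - 1) / n)^2 - 3)))

# Skewness as quoted for moments: divides by n throughout.
skew_moments <- (sum(d^3) / n) / (sum(d^2) / n)^(3/2)

# Skewness as quoted for e1071 / stat.desc: sd() divides by (n - 1).
skew_e1071 <- sum(d^3) / (n * sd(x)^3)

# Here the factor is ((n - 1) / n)^(3/2).
stopifnot(isTRUE(all.equal(skew_e1071,
                           skew_moments * ((n - 1) / n)^(3/2))))

print(c(kurt_moments = kurt_moments, kurt_e1071 = kurt_e1071,
        skew_moments = skew_moments, skew_e1071 = skew_e1071))
```

So apart from the "-3", the variants differ only by a factor that depends on n and goes to 1 as the sample grows, which is why the gap shows up mainly in small samples.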