[RkWard-devel] wilcoxon & ansari_bradley tests

Sun Feb 4 19:24:16 UTC 2007

On Friday, 2 February 2007 16:45, Thomas Friedrichsmeier wrote:
> On Friday 02 February 2007 01:51, SJR wrote:
> > wilcoxon & ansari_bradley tests are ready. From my point of view they are
> > complete but need intensive testing.
>
> Great!
>
> Some things I noted:
>
> - As far as I can see, the variable "rk.temp.print.conf.level" is no
> longer actually needed. So it should be removed.
>
Fixed.

> - "rk.temp.length.x" (in the two "inexact" tests) is only needed in one
> place. Therefore I suggest dropping the variable and changing the test to
> if (length (eval (rk.temp.x)) < 50) instead of if (rk.temp.length.x < 50).
> Overall that should be slightly easier to read.

Okay. My reason for keeping it was to have a common structure. But of course 
it's not needed. Changed.

>
> - I'd further suggest to move that note out of the rk.header (), and
> instead place it at the very end (after rk.results()) as:
> if (length (eval (rk.temp.x)) < 50) rk.print ("Note: There are less
> than...") After all, that's not a parameter, but an info.

As you may remember, I previously tried to include information about the 
parameters used (ugly HTML at that time), which finally resulted in your 
clean solution "rk.header ()".
This time I wanted to give the user more guidance on how best to use the tests 
by giving hints or warnings. Right now I'm no longer convinced that 
this is a good idea, so I removed this "feature" for the moment. 
Reason: this is not built-in functionality of the test (I mean watching out 
for a certain "problematic" parameter and informing the user in 
case of (possibly) wrong usage). It's actually better for users to read 
the R help than to rely on information given by a plug-in (not to mention 
the plug-in author). Thomas stated that it's not good practice to state 
the default values in the RKWard help, and I think the same applies here. The 
internals of packages can change, and a recommendation might then end up 
wrong. It's not possible to keep track of all internal changes, whereas the 
parameters of a test are unlikely to change much and therefore will not 
need attention.
Maybe it would be better to have just an option that reports:
------------------
You used
	* RKWard: V X.Y.Z
	* Package: anything V ?.?.?
	* Test: test{anything}
	* Length = ??, Missing Values= ??
	* ... 
------------------
Maybe one could add such information there. However, if users think the 
result is wrong, it should be quite easy for them to look at the documentation 
and find the mistake. As you stated for RKWard: "Practical statistics is 
not just about calculating, after all, but also about documenting and 
ultimately publishing the results."
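
A minimal sketch of how such a summary could be assembled in plain R. The 
function name describe_run() and its arguments are hypothetical and not part 
of RKWard; the RKWard version line is left out because it cannot be queried 
outside of RKWard.

```r
# Hypothetical helper (not an RKWard function): builds the lines of the
# "You used" summary from information available in plain R.
describe_run <- function(x, package = "stats", test = "wilcox.test") {
  c(
    # Version of the package that provides the test
    sprintf("Package: %s V %s", package,
            as.character(utils::packageVersion(package))),
    # Name of the test, using the test{package} notation from above
    sprintf("Test: %s{%s}", test, package),
    # Number of values and how many of them are missing
    sprintf("Length = %d, Missing Values = %d", length(x), sum(is.na(x)))
  )
}

# Example: print the summary for a small vector with one missing value
writeLines(describe_run(c(1.2, NA, 3.4)))
```

Only facts about the run are reported here, no recommendations, so nothing in 
it can become wrong when package internals change.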

Maybe we should talk about the "Suggestions by RKWard to the user"-feature on 
a separate thread on the list if you see a need for this.

How do other statistical tools handle this?
Do they give complex recommendations/warnings to the user?
Should it be a function of a sophisticated statistical tool?

>
> - Does this note recommend turning on the "compute exact p-value" option, or
> does it recommend using the exact version of the test? I don't have any
> background knowledge on the difference between these two.

By default an asymptotic p-value is calculated, but it's recommended to use the 
exact option (within this test) if there are fewer than 50 values and there 
are no ties (German: "Bindungen"). If n < 50 and there are ties, one should use 
the wilcox.exact() test (exactRankTests package) for correct results. Hope this 
answers your question.
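
The rule above can be sketched as a small decision helper for the two-sample 
case. The function name choose_wilcoxon_variant() and its return values are 
hypothetical, made up for illustration; the threshold of 50 and the tie check 
follow the description above, and ties are checked on the pooled values only.

```r
# Hypothetical helper (not part of RKWard or any package): decides which
# variant of the two-sample Wilcoxon test the rule above would suggest.
choose_wilcoxon_variant <- function(x, y, threshold = 50) {
  # Only finite values count, as in the rule's "fewer than 50 values"
  x <- x[is.finite(x)]
  y <- y[is.finite(y)]
  small <- length(x) < threshold && length(y) < threshold
  # anyDuplicated() returns 0 if there are no repeated values
  has_ties <- anyDuplicated(c(x, y)) > 0
  if (!small) {
    "asymptotic"    # large samples: normal approximation
  } else if (has_ties) {
    "wilcox.exact"  # small samples with ties: exactRankTests::wilcox.exact()
  } else {
    "exact"         # small samples, no ties: wilcox.test(..., exact = TRUE)
  }
}

# Example: small samples without ties
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(0.5, 2.9, 5.1)
if (choose_wilcoxon_variant(x, y) == "exact") {
  print(wilcox.test(x, y, exact = TRUE))
}
```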

>
> - I'm not quite happy with all that if and paste for the H1. It's too much
> noise for a simple description. Maybe we will need a helper function
> rk.describe.alternative() for that in the rkward library. For now, it's ok.
> Definitely something to keep in mind for after 0.4.6, though.
>
> - However, this is wrong for the Ansari-Bradley tests! I was not familiar
> with this test before, but it does not test for difference in sample means,
> but for a difference in variance. So writing "x is greater than y" is not
> correct. For an easy way out, just write "greater", "less" or "two.sided"
> for now.

Removed
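
For later, a rough sketch of what such a helper might look like. This is only 
an assumption about a possible design: the name describe_alternative() and the 
"parameter" argument are hypothetical, and no rk.describe.alternative() exists 
in the rkward library yet. The point is that the wording names what is being 
compared (location for Wilcoxon, scale for Ansari-Bradley) instead of claiming 
"x is greater than y".

```r
# Hypothetical sketch of a helper that describes the alternative hypothesis.
# "parameter" says what the test compares: "location" (e.g. Wilcoxon) or
# "scale" (e.g. Ansari-Bradley).
describe_alternative <- function(alternative = c("two.sided", "greater", "less"),
                                 parameter = c("location", "scale")) {
  alternative <- match.arg(alternative)
  parameter <- match.arg(parameter)
  relation <- switch(alternative,
    two.sided = "differs from",
    greater   = "is greater than",
    less      = "is less than"
  )
  sprintf("the %s of x %s the %s of y", parameter, relation, parameter)
}

# Example: one-sided alternative for an Ansari-Bradley test
describe_alternative("greater", "scale")
```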

>
> Regards
> Thomas