[RkWard-devel] S/R functions for RKWard

Daniele Medri daniele.medri at libero.it
Sun Oct 10 18:45:37 UTC 2004


Dear all,

to help RKWard development, I would like to invite you to suggest S/R
functions to include in the rkward module. Many of these are probably already
part of your daily work, so everyone is welcome to expand the list:

Data source function:
 - argument for the data.frame to use
 - argument for the name of the new data.frame to create
 - argument for variables to keep (or drop)
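
A minimal sketch of such a data source function in plain R. The name
rk_data_source and its arguments are only placeholders for discussion, not
existing RKWard code. The name of the new data.frame is simply chosen by
assignment, which seems the most natural way in R:

```r
# Hypothetical helper: name and arguments are illustrative only.
rk_data_source <- function(data, keep = NULL, drop = NULL) {
  stopifnot(is.data.frame(data))
  if (!is.null(keep) && !is.null(drop)) {
    stop("specify either 'keep' or 'drop', not both")
  }
  vars <- names(data)
  if (!is.null(keep)) vars <- vars[vars %in% keep]     # variables to keep
  if (!is.null(drop)) vars <- vars[!(vars %in% drop)]  # variables to drop
  data[, vars, drop = FALSE]
}

# Usage: the left-hand side of the assignment names the new data.frame
cars_small <- rk_data_source(mtcars, keep = c("mpg", "cyl", "wt"))
cars_nodisp <- rk_data_source(mtcars, drop = "disp")
```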

Variables function:
 - argument for the data.frame to use
 - argument for the new data.frame to create (default: the same one)
 - argument to handle missing values:
  1. remove variables with more than ..% missing values
  2. remove records with missing values (uh! ..this is very hard)
  3. apply a method to impute values for the missing fields
 - standardization..
 - normalization..
 - ...anything else that could be useful.
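
A rough sketch of the variables function, again with made-up names
(rk_prepare_vars, max_missing, impute, scaling). I assume here that
"standardization" means centering and dividing by the standard deviation, and
that "normalization" means rescaling to the [0, 1] range; other definitions
are of course possible:

```r
# Hypothetical helper: names, arguments and defaults are illustrative only.
rk_prepare_vars <- function(data, max_missing = 0.5, drop_incomplete = FALSE,
                            impute = c("none", "mean", "median"),
                            scaling = c("none", "standardize", "normalize")) {
  stopifnot(is.data.frame(data))
  impute <- match.arg(impute)
  scaling <- match.arg(scaling)

  # remove variables whose share of missing values exceeds max_missing
  miss_share <- colMeans(is.na(data))
  data <- data[, miss_share <= max_missing, drop = FALSE]

  # remove records with at least one missing value
  if (drop_incomplete) data <- data[complete.cases(data), , drop = FALSE]

  num <- names(data)[vapply(data, is.numeric, logical(1))]

  # replace missing values in numeric variables by the mean or the median
  if (impute != "none") {
    fill <- if (impute == "mean") mean else median
    for (v in num) {
      x <- data[[v]]
      x[is.na(x)] <- fill(x, na.rm = TRUE)
      data[[v]] <- x
    }
  }

  # rescale numeric variables (constant variables would give NaN)
  for (v in num) {
    x <- data[[v]]
    if (scaling == "standardize") {
      data[[v]] <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
    } else if (scaling == "normalize") {
      rng <- range(x, na.rm = TRUE)
      data[[v]] <- (x - rng[1]) / (rng[2] - rng[1])
    }
  }
  data
}

# Usage
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), g = c("x", "y", "z"))
clean <- rk_prepare_vars(df, max_missing = 0.5, impute = "mean")
```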
  

Data Partition function:
 - argument for the data.frame to use
 - argument for the method between:
  1. simple random
  2. stratified
  3. user defined, through the selection of a partition variable
 - arguments for the partition sizes (default could be 60% training, 30%
  validation, 10% test)
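
The three methods could be sketched like this. rk_partition, by and shares
are names I invented for the example, and I assume that the variable given in
by has no missing values:

```r
# Hypothetical helper: names, arguments and defaults are illustrative only.
rk_partition <- function(data, method = c("random", "stratified", "user"),
                         by = NULL,
                         shares = c(train = 0.6, validation = 0.3, test = 0.1)) {
  stopifnot(is.data.frame(data), abs(sum(shares) - 1) < 1e-8)
  method <- match.arg(method)
  if (method != "random") stopifnot(!is.null(by), by %in% names(data))
  n <- nrow(data)

  # user defined: the partition variable already holds the labels
  if (method == "user") return(split(data, data[[by]]))

  # give k rows random labels, with group sizes as close to 'shares' as possible
  assign_labels <- function(k) {
    sizes <- diff(c(0, round(cumsum(shares) * k)))
    labels <- rep(names(shares), times = sizes)
    out <- character(k)
    out[sample.int(k)] <- labels
    out
  }

  part <- character(n)
  if (method == "random") {
    part <- assign_labels(n)
  } else {
    # stratified: apply the shares separately inside each stratum
    for (idx in split(seq_len(n), data[[by]])) {
      part[idx] <- assign_labels(length(idx))
    }
  }
  split(data, factor(part, levels = names(shares)))
}

# Usage: returns a list of data.frames named train, validation and test
set.seed(1)
parts <- rk_partition(iris, method = "stratified", by = "Species")
sizes <- vapply(parts, nrow, integer(1))
```

Returning a list keeps the partitions together; a label column added to the
original data.frame would be another option.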

Sampling function:
 - argument for the data.frame to use
 - argument for the method of sampling between:
  1. simple random
  2. Nth
  3. stratified
  4. first N
  5. cluster
 - argument for the sample size expressed as:
  1. percentage (with default 10%)
  2. number
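
A possible sketch of the sampling function. The name rk_sample and the
convention that a size below 1 is a fraction while a size of 1 or more is a
number of rows are my own assumptions. For the stratified method the size is
applied inside each stratum, and for the cluster method it applies to the
number of clusters drawn:

```r
# Hypothetical helper: names, arguments and defaults are illustrative only.
rk_sample <- function(data,
                      method = c("random", "nth", "stratified", "first", "cluster"),
                      size = 0.10, by = NULL) {
  stopifnot(is.data.frame(data), size > 0)
  method <- match.arg(method)
  if (method %in% c("stratified", "cluster")) {
    stopifnot(!is.null(by), by %in% names(data))
  }
  n <- nrow(data)

  # number of units to draw out of k (size < 1 is read as a fraction)
  target <- function(k) min(k, if (size < 1) max(1, round(size * k)) else size)

  rows <- switch(method,
    random = sample.int(n, target(n)),
    nth = {
      step <- max(1, floor(n / target(n)))
      seq(1, n, by = step)[seq_len(target(n))]
    },
    first = seq_len(target(n)),
    stratified = unlist(lapply(split(seq_len(n), data[[by]]), function(idx) {
      idx[sample.int(length(idx), target(length(idx)))]
    }), use.names = FALSE),
    cluster = {
      ids <- unique(data[[by]])
      chosen <- ids[sample.int(length(ids), target(length(ids)))]
      which(data[[by]] %in% chosen)   # keep every row of the drawn clusters
    }
  )
  data[sort(rows), , drop = FALSE]
}

# Usage
set.seed(42)
s_strat <- rk_sample(iris, method = "stratified", by = "Species")
s_first <- rk_sample(iris, method = "first", size = 7)
s_nth <- rk_sample(iris, method = "nth")
```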
 
Clustering function:
 - argument for the data.frame to use
 - argument for the new data.frame to create
 - argument for the name of the variable that identifies the cluster
 - argument to exclude the incomplete observations
 - argument for the cluster method, e.g.:
  1. Ward (this name sounds familiar!)
  2. Centroid
  3. Average
 - argument for the CCC (cubic clustering criterion) cutoff, with default 3
 - argument for the minimum number of clusters (default 2.. obviously!)
 - argument for the maximum number of clusters (default the number of obs)
 - ...
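
A sketch of the clustering function built on hclust() and cutree() from the
stats package. rk_cluster and its arguments are invented names. The
CCC-based choice of the number of clusters is not attempted here, because as
far as I know base R has no function for it, so the number of clusters k is
passed directly and only checked against the minimum and maximum. Scaling the
variables before computing distances is also my own assumption:

```r
# Hypothetical helper: names, arguments and defaults are illustrative only.
rk_cluster <- function(data, k, method = c("ward.D2", "centroid", "average"),
                       cluster_var = "cluster", complete_only = TRUE,
                       min_k = 2, max_k = NULL) {
  stopifnot(is.data.frame(data))
  method <- match.arg(method)

  # only numeric variables enter the distance computation
  num <- data[, vapply(data, is.numeric, logical(1)), drop = FALSE]
  ok <- complete.cases(num)
  if (!complete_only && !all(ok)) {
    stop("incomplete observations found; set complete_only = TRUE to drop them")
  }
  data <- data[ok, , drop = FALSE]
  num <- num[ok, , drop = FALSE]

  if (is.null(max_k)) max_k <- nrow(num)
  stopifnot(k >= min_k, k <= max_k)

  d <- dist(scale(num))
  # centroid linkage is usually applied to squared Euclidean distances
  if (method == "centroid") d <- d^2
  tree <- hclust(d, method = method)

  # store the cluster membership in the new variable
  data[[cluster_var]] <- unname(cutree(tree, k = k))
  data
}

# Usage
iris_cl <- rk_cluster(iris, k = 3, method = "ward.D2", cluster_var = "group")
```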

Many more could be useful so, please, post your ideas or code for review.
When developing these functions, keep in mind that a GUI will help users make
their own choices.

bye
-- 
Daniele Medri - http://www.medri.org



