[rkward-users] Pareto Analysis using RKward

Vikas Garud information4vikas at gmail.com
Sun Jun 5 09:26:27 UTC 2011

Hi Thomas,

Thanks for the reply and suggestions.  I tried those and got almost
exactly the results I wanted.  Some questions, though:

> Well, if the problem is reading the legends, then you'd probably want to cut
> by number of categories, rather than a limit by cumulative total, no?

You are right.  I should cut by limiting the number of categories.
The code suggested by you:
   cutoff <- 10
   if (length (cumulated.costs) > cutoff) {
       transformed <- sort (cumulated.costs, decreasing=TRUE)
       transformed <- c (transformed[1:cutoff], "Others"=sum

worked well for the sample data I had included earlier.  However, how
do I do the same for another data set, where I want to limit the list
of defects (the categories for the Pareto chart) by the number of
categories?  Following is sample data from another source (the full
set contains 10099 records and 37 defect categories).  I would like
the Pareto chart to consist of, say, 10 actual defect categories, with
the remaining 27 combined into "Others".

Date	Line	Defect
14/8/2005	1	Battery Failure
15/8/2005	3	Fit & Finish - Roof
5/8/2005	3	Fit & Finish - Bumper
19/8/2005	4	Fit & Finish - Roof
26/8/2005	2	Fit & Finish - Driver Side Door
3/8/2005	4	Emissions failure
16/8/2005	1	Brake Failure
28/8/2005	4	Emissions failure
29/8/2005	4	Hood Release Failure
15/8/2005	4	Seat Warmer Failure
28/8/2005	2	Headlamp failure
30/8/2005	4	Fit & Finish - Bumper
21/8/2005	2	Heater Failure
10/8/2005	3	Power Antenna Failure
20/8/2005	3	Evaporator Coil Failure
26/8/2005	3	Emissions failure
17/8/2005	4	Tail Light Failure
15/8/2005	3	Fit & Finish - Roof
6/8/2005	4	Power Window Failure
9/8/2005	1	Fuel Injection Failure
20/8/2005	2	Fit & Finish - Roof
15/8/2005	4	Air Conditioning Failure
1/8/2005	4	Fuel Injection Failure
12/8/2005	3	Fit & Finish - Driver Side Door
26/8/2005	2	Fit & Finish - Driver Side Door
2/8/2005	3	Seat Warmer Failure
1/8/2005	2	Sunroof Leaking
14/8/2005	3	Dome Light Failure
21/8/2005	2	Brake Failure
3/8/2005	4	Hood Release Failure
3/8/2005	4	Wiper Failure
24/8/2005	1	Throttle Failure
21/8/2005	4	Brake Failure
19/8/2005	2	Brake Failure
1/8/2005	4	Tail Light Failure
11/8/2005	4	Heater Failure
26/8/2005	3	Fit & Finish - Roof
6/8/2005	4	Sunroof Leaking
23/8/2005	2	Dome Light Failure
13/8/2005	2	Evaporator Coil Failure

I tried to generate a list for cumulated defects by:

cumulated.defects <- by (my.csv.data$Defect, my.csv.data$Defect, FUN=count)
(Probably shows my ignorance about using R)

And got error:
Error in FUN(X[[1L]], ...) : object 'count' not found
Calls: by ... by.default -> eval -> eval -> tapply -> lapply -> FUN
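
The error itself seems to come from base R having no function named
count.  One possible way around it, sketched here under assumptions, is
to let table() do the counting and then apply the same cutoff idea as
in the code quoted above.  The small data frame below only stands in
for my.csv.data (it holds the Defect values of the first ten sample
rows), and the cutoff of 3 is an assumed value chosen to suit such a
tiny sample; for the full data it would be 10.

```r
# Stand-in for my.csv.data: Defect values of the first ten sample rows.
my.csv.data <- data.frame(
  Defect = c("Battery Failure", "Fit & Finish - Roof",
             "Fit & Finish - Bumper", "Fit & Finish - Roof",
             "Fit & Finish - Driver Side Door", "Emissions failure",
             "Brake Failure", "Emissions failure",
             "Hood Release Failure", "Seat Warmer Failure"),
  stringsAsFactors = FALSE
)

# table() counts the records per category; turn it into a plain named vector.
counts <- table(my.csv.data$Defect)
cumulated.defects <- setNames(as.vector(counts), names(counts))

cutoff <- 3  # assumption for this small sample; 10 for the full data
transformed <- sort(cumulated.defects, decreasing = TRUE)
if (length(transformed) > cutoff) {
  # Keep the largest categories and lump everything else into "Others".
  rest <- sum(transformed[(cutoff + 1):length(transformed)])
  transformed <- c(transformed[1:cutoff], "Others" = rest)
}
print(transformed)
```

The resulting vector keeps the defect names, so it could presumably be
used in place of cumulated.costs from the earlier example.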

These would be good additions to the GUI and would help increase the
use of these tools in quality improvement activities in industry (as
against research institutes, where people would not run away from
using scripts).

One more observation:  In a traditional Pareto chart, the "Others"
category is at the end.  In the Pareto chart drawn using this script,
the "Others" category appears somewhere in the middle, in sorted
order.  Can it be moved to the end?
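
To make the request concrete, here is a minimal sketch of the ordering
I have in mind: sort only the real categories and append "Others"
afterwards.  The counts are illustrative values, not results from the
full data, and I do not know whether the plot code sorts the vector
again on its own.

```r
# Illustrative counts only; "Others" deliberately starts out of place.
transformed <- c("Others" = 28, "Brake Failure" = 4,
                 "Fit & Finish - Roof" = 5, "Emissions failure" = 3)

# Sort the real categories by size, then put "Others" at the very end.
is.others <- names(transformed) == "Others"
ordered <- c(sort(transformed[!is.others], decreasing = TRUE),
             transformed[is.others])

# Cumulative percentage, i.e. the line usually drawn over the bars.
cumulative.percent <- cumsum(ordered) / sum(ordered) * 100
print(ordered)
print(round(cumulative.percent, 1))
```

A plain barplot(ordered) draws the bars in the order of the vector, so
with base graphics at least "Others" would stay last.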

Is there a way to get the computed vectors (the cost totals) into the
data?  How do I read the total costs for the various defects?  The
tabulation in the output is in terms of the serial number of the
category and is cumbersome to relate to the actual data.
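
What I am after is something like the following sketch, which puts the
names and totals side by side in a data frame.  The vector name
transformed follows the code quoted earlier; the values and the column
names Category, Total and Percent are assumptions for illustration.

```r
# Illustrative totals per category (not taken from the real output).
transformed <- c("Fit & Finish - Roof" = 5, "Brake Failure" = 4,
                 "Emissions failure" = 3, "Others" = 28)

# One row per category, with the name next to its total.
totals <- data.frame(Category = names(transformed),
                     Total = as.vector(transformed),
                     stringsAsFactors = FALSE)
totals$Percent <- round(totals$Total / sum(totals$Total) * 100, 1)
print(totals)

# A single total can also be read directly by its category name.
brake.total <- transformed[["Brake Failure"]]
print(brake.total)
```

A data frame like totals should also be visible in the workspace, so
it could be opened and inspected like any other data.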

And yes, I'll certainly try the latest version as soon as it is
available in SuSE repositories.  At present it is showing some build

Thanks and regards
Vikas Garud

PS:  I did not receive your replies at my email address; I read your
responses in the archives and therefore could not quote your mail
properly.  I need to check the mailing list settings for mail delivery.
