[kde-linux] Using a kubuntu LiveCD to stress test a troublesome Windows system

Duncan 1i5t5.duncan at cox.net
Sun Jul 29 00:01:22 UTC 2012


Doug posted on Sat, 28 Jul 2012 15:48:59 -0400 as excerpted:

> On 07/28/2012 02:57 PM, Bruce Miller wrote:
>> Please forgive the elementary nature of these questions. A
>> recently-widowed friend has asked me to help troubleshoot her late
>> husband's large and powerful Windows 7 system. I am myself somewhat
>> preoccupied with other matters and am sure that I am missing one (or
>> more) obvious solution(s).
>>
>> The major constraint is that her late husband was a professional
>> photographer and he has image files everywhere on his system. Even I
>> can recognize that he was a truly gifted photographer, but that his
>> computer skills were not to match.
>>
>> There is exactly one USB backup drive and so far I have no idea whether
>> the backup is complete (I doubt it) or restorable. I am being careful
>> therefore to limit anything I do to non-destructive testing

>> Since I almost always use Kubuntu, my primary resource (at the outset)
>> is a CD of Kubuntu amd64 12.04 LTS.
>>
>> The obvious symptom on the Windows 7 machine is a BSOD (Blue Screen of
>> Death) almost every 20 minutes.

>> The first exception code on the BSOD is a memory location of <multiple
>> zeros>1. Her tech support person (owner of a well-known local computer
>> store and fellow photographic expert) says that the single digit in the
>> exception code suggests a hardware failure, probably on the
>> motherboard.


>> I have been running Kubuntu from the liveCD for over eighteen hours
>> which does not support the defective motherboard hypothesis. But my
>> mind has gone blank on how to stress-test the hardware without writing
>> to any hard disks. Google has not been a friend; on this problem, I
>> have found it surprisingly unhelpful.


I don't do MS any longer, and you mentioned Google already, but if you
haven't yet, try counting the zeros and googling the EXACT error.  I
would have tried myself, but couldn't without knowing the exact number of
zeros.

>> So far, I have been running just a browser, a konsole session and
>> glxgears. The latter will put some load on the CPU and more on the
>> graphics subsystem. Google did lead me to a sourceforge utility called
>> systester which at least tests the CPU by attempting to calculate pi to
>> many millions of significant digits. But I cannot get it to run.

The first thing I thought of here was memtest86 (or memtest86+, a fork
of the original; sometimes one will run on a given system while the other
won't, so if the one you try fails to start, see if you can find the
other).  Some distros ship it on their liveCD/DVDs and/or install
media (I'm not sure whether Ubuntu does or not).  It would be an option
in the boot menu, as it runs in place of the normal Linux kernel.

Typical usage is to run all tests several times thru, say overnight.  If
after ~8 hours of testing it's not showing any problems, you can probably
rule out the memory itself as a problem, tho it's still possible you'll
get memory bus errors under real-world load.  (I once had a system that
tested fine on memtest86+, but that would occasionally have memory bus
errors under heavy real-world usage: Gentoo, so building packages from
source, etc.  Eventually a BIOS update came out that let me underclock the
memory from the rated speed, and a notch down from rated speed, I could
actually tighten up various individual memory latencies and it was still
solid as a rock.  It just couldn't quite handle the rated speed on that
board, probably because the board tolerances skewed one way while the
memory tolerances skewed the other, just enough to cause occasional
problems!)

So a clean result doesn't necessarily say absolutely no problems, but if 
you get errors, definitely consider the memory unreliable.


Another alternative, not as thorough, is the kernel's built-in memory
tester.  It's a kernel build option, so it's quite easy to ship a kernel
with it turned on, which means it's sometimes available when memtest86*
isn't.  If it's available, booting the kernel with memtest=<number> will
run that many test patterns: 0 of course disables it (the default), 1 is
one test pattern, 4 is four...  Note that I've never actually run this
myself, only seen the kernel option, so I'm not sure how long the patterns
take, or whether it tests and then boots normally if it passes, just gives
you the results and lets you reboot, or what.  But you may have the option
built into the kernel on the live media, so it's worth a try if you don't
see memtest as an option.
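
If you do boot that way, a quick sanity check is to confirm the parameter
actually reached the kernel.  Here's a minimal Python sketch, assuming the
live system exposes /proc/cmdline and has python3 on it; the function name
is mine, purely for illustration:

```python
from pathlib import Path


def memtest_passes(cmdline):
    """Return the pattern count given as memtest=<number>, or 0 if absent.

    If the parameter appears more than once, this sketch keeps the last
    parseable value (a choice made for the example, nothing more).
    """
    passes = 0
    for token in cmdline.split():
        if token.startswith("memtest="):
            value = token.split("=", 1)[1]
            try:
                passes = int(value)
            except ValueError:
                # Not a plain decimal number; ignore it in this sketch.
                continue
    return passes


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("memtest patterns requested:",
              memtest_passes(proc_cmdline.read_text()))
    else:
        print("/proc/cmdline not available on this system")
```

A result of 0 only means no memtest=<number> was on the command line; it
says nothing about whether the option is compiled into that kernel.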

>> So my questions are the following:
>> 1. Does anyone have suggestions on safe non-destructive hardware
>> stress-testing applications? I have no objection to downloading and
>> burning to CD a specialized distro.

The distro that immediately came to mind here is SystemRescueCD.  Lots of
info at the link below, but among other things it has memtest86; various
recovery and hard-drive testing tools (SMART, etc), including PhotoRec,
which is specifically designed to help with photo/video recovery (looks
right up your alley ATM); the usual Linux filesystems, PLUS NTFS-3G for
MS/NTFS filesystems and vfat of course; and the chntpw Windows password
reset utility (doesn't appear to be needed here, but invaluable if you
have friends that forget their password and call you...).  All kinds of
stuff.

http://www.sysresccd.org

>> 2. Is anyone familiar with systester?

I can't help you there.

>> The system is an Intel i7 with 8GB of RAM. It is a 2008-vintage Asus
>> P6T motherboard; the catch is that it takes an LGA1366 CPU, which is no
>> longer available. Replacing both the motherboard and CPU will cost
>> north of $500 and my reluctance to recommend that my friend spend that
>> sort of money is heightened by the last 18 hours of evidence that the
>> problem is not with the MB.

> Instead of spending $500, and not even knowing if that would solve the
> problem, why not spend less than $100 and buy a portable hard drive.
> Then, having booted up some Linux system from CD, copy all the picture
> files off the system to the portable hard drive.  Or even copy the whole
> hard drive for cleanup later, if you're short of time.

I'll second that.  You can memtest86 easily enough since that doesn't 
touch the hard drive (which you can even unplug entirely during that 
test, if you like).  Then consider getting a drive image ASAP.  You can 
even take the old drive out of that computer and plug it into another to 
do the imaging, if you're unsure of the reliability of the existing 
hardware.

But do get that second image (or even a third) and work with that.  At
this point, it sounds like the photos and other data on the disk are
all-important, an irreplaceable connection to someone no longer around.  So
taking a copy of that (or two or three) is likely to be the best
investment possible at this point, STILL well under the $500 mentioned
for new hardware, and you can then freely work with one copy, without
having to worry that you're risking the only copy of the data available.
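
To make the idea concrete, here's a minimal Python sketch of a read-only
copy with a checksum on top.  It's an illustration only, under stated
assumptions: the source and destination are whatever paths you pass in,
the function names are mine, and it simply stops at the first read error,
which a purpose-built imaging tool handles far better on a failing disk.
It never opens the source for writing.

```python
import hashlib
import sys

CHUNK = 4 * 1024 * 1024  # read 4 MiB at a time


def copy_with_checksum(src_path, dst_path):
    """Copy src to dst; return the SHA-256 hex digest of the bytes read.

    The source is opened in "rb" mode only, so nothing is written to it.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def file_checksum(path):
    """Re-read a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__" and len(sys.argv) == 3:
    source, image = sys.argv[1], sys.argv[2]
    read_digest = copy_with_checksum(source, image)
    image_digest = file_checksum(image)
    print("read from source:", read_digest)
    print("read from image :", image_digest)
    if read_digest == image_digest:
        print("image matches what was read")
    else:
        print("MISMATCH, do not trust this image")
```

You'd run it with the disk (or a partition) as the first argument and a
file on the NEW drive as the second, for instance /dev/sdX and
/mnt/backup/disk.img, both of which are placeholders here.  Reading a raw
device normally needs root.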

SystemRescueCD is one tool that can help with that, and is all around a 
good live-image to keep handy, for all the other stuff it makes possible 
as well.

There's also Clonezilla, a distro purpose-built for disk image cloning
and recovery.  (FWIW, it also has memtest.)  You'd want the live version
for single machine use; SE (server edition) is for massive simultaneous
deployment (40+ systems...).

http://clonezilla.org/


Of course you could remove the disk to another machine for cloning, then 
do memtest86 or other direct-from-CD/DVD tests on the now diskless 
original barebones system, if you wanted to try doing both at once and 
have another system available to stick the hard drive in for cloning...

Meanwhile, for testing the CPU, the product that comes to mind is cpuburn
and/or cpuburnin.  I've never actually used these and just googled them,
but they should test the CPU and the system's thermal solution quite
well.  I'd suggest using a live-boot with the hard drive disconnected,
and monitoring CPU and system temps for a few minutes before leaving it to
run, just in case the system does NOT have the best cooling, etc.
(Anything even half modern has hardware thermal shutdown on the CPU to
keep it from actually burning up, which should in theory save you from
totally ruining it, but monitoring is still probably wise, just in
case.)
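
For the monitoring part, here's a minimal Python sketch that reads the
kernel's thermal zones from sysfs.  Assumptions: the live kernel populates
/sys/class/thermal/thermal_zone*/temp on this board (it may not, in which
case you get an empty result), the values are in millidegrees Celsius, and
the 85 C limit is a number I made up for the example, not a spec for this
CPU.

```python
from pathlib import Path

THERMAL_ROOT = Path("/sys/class/thermal")


def read_temps(root=THERMAL_ROOT):
    """Return {zone_name: degrees_C} for every readable thermal_zone*/temp."""
    temps = {}
    for temp_file in sorted(Path(root).glob("thermal_zone*/temp")):
        try:
            millidegrees = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric zone; skip it.
            continue
        temps[temp_file.parent.name] = millidegrees / 1000.0
    return temps


def too_hot(temps, limit_c):
    """True if any zone is at or above limit_c."""
    return any(t >= limit_c for t in temps.values())


if __name__ == "__main__":
    LIMIT_C = 85.0  # hypothetical limit, pick your own
    current = read_temps()
    if not current:
        print("no thermal zones found")
    for zone, degrees in current.items():
        print(f"{zone}: {degrees:.1f} C")
    if too_hot(current, LIMIT_C):
        print("over the limit, stop the burn-in")
```

It prints one reading per run, so rerun it every few seconds (under
watch, say) while the load is going.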

FWIW, Google says cpuburn is available for Ubuntu.  I've not actually
looked to see what live-distros carry it (tho tomsrtbt is mentioned), but
I would recommend a live-distro for running it, as I said.  But since I'd
have to do the same googling you'd have to do for that, I might as well
let you do it.  Here's where I started, tho.

http://www.google.com/search?q=cpuburn+linux
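
If none of those is handy on the live media, a crude stand-in is easy
enough to write.  The Python sketch below is NOT cpuburn and won't heat
the chip the way hand-tuned assembly does; it just keeps one busy process
per core for a fixed time, without touching any disk.  The function names
and the arithmetic in the loop are mine, chosen only for illustration.

```python
import math
import multiprocessing
import os
import sys
import time


def burn(seconds):
    """Spin on floating-point math for about `seconds`; return loop count."""
    deadline = time.monotonic() + seconds
    x = 0.0001
    count = 0
    while time.monotonic() < deadline:
        for _ in range(10000):
            x = math.sqrt(x * x + 1.0)
        count += 1
    return count


def run_load(seconds, workers=None):
    """Start one burn() process per worker and wait for all to finish."""
    workers = workers or os.cpu_count() or 1
    procs = [multiprocessing.Process(target=burn, args=(seconds,))
             for _ in range(workers)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return workers


if __name__ == "__main__" and len(sys.argv) == 2:
    duration = float(sys.argv[1])  # seconds, given on the command line
    used = run_load(duration)
    print(f"ran {used} busy processes for about {duration:.0f} s")
```

Give it the run time in seconds as its only argument, start short (a
minute or two), and keep an eye on temperatures before trusting it with a
longer run.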

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



