Question for the old hands, about disks
Duncan
1i5t5.duncan at cox.net
Fri Apr 15 11:55:11 BST 2011
gene heskett posted on Thu, 14 Apr 2011 23:54:39 -0400 as excerpted:
> Well, I hitched a ride over to seacrates site and picked up the latest
> firmware updating iso, this about 3 days ago, maybe 4, time has been a
> blur here since.
Seeing the below, I can certainly understand why.
> Following their instructions, I pulled the cables on the other drives
> and booted the cd I burnt. It updated the firmware on that drive, to
> cc49, from some version in the late 20's.
>
> So now the drive is stable, but the write speeds are running about 2
> megs/sec. I replaced the SATA cable with another and doubled the from
> platter read speeds, but they still don't match the other nearly
> identical drive. But its working with no errors now.
Sounds like they made it retry more times before resetting, and possibly
now use a more complex ECC mechanism that can recover data it would have
returned an error on before. You get stable, as data that's difficult to
read correctly gets read many more times before the drive gives up, but
slow, as the retries and/or ECC recovery take time. (It may be that data
written with the new firmware would have better ECC and could be
recovered faster, but it's dealing with data written with the old
firmware, so...)
But the 2 MB/sec rate sounds like a failing drive too. Unfortunately...
FWIW, my experience in this area goes back to the event I believe I
mentioned earlier: the system left running when the AC failed in the heat
of a Phoenix summer, overheating the drive and resulting in a head-crash.
In the aftermath of that, I used (IIRC) dd-rescue on it. On the parts of
the disk that weren't damaged, it comparatively flew. But when it got to
the damaged parts, it tried and tried to read them. dd-rescue reads until
it can't read in one direction, then tries from the other end until it
can't read, then tries spots in the middle to see whether the whole area
is damaged or just some of it. If it can read anything, it expands what it
can read in both directions from there until it can't read again. So it
spends a **LOT** of time trying to read damaged sectors...
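A toy model of the read order described above may make the cost clearer. This is only a sketch of the strategy as recalled in the paragraph, not the actual algorithm of any rescue tool; the function name and the cost figures (1 unit per good read, 200 per failed read) are arbitrary assumptions chosen for illustration.

```python
def toy_rescue(n_sectors, bad, good_cost=1, bad_cost=200):
    """Toy model: returns (recovered sectors, failed reads, total cost)."""
    bad = frozenset(bad)
    recovered, failed = set(), 0

    def read(sector):
        nonlocal failed
        if sector in bad:
            failed += 1
            return False
        recovered.add(sector)
        return True

    lo = 0  # pass 1: forward from the start until a read fails
    while lo < n_sectors and read(lo):
        lo += 1
    hi = n_sectors - 1  # pass 2: backward from the end until a read fails
    while hi > lo and read(hi):
        hi -= 1
    pending = [(lo + 1, hi - 1)]  # pass 3: probe what is left, middle first
    while pending:
        a, b = pending.pop()
        if a > b:
            continue
        mid = (a + b) // 2
        if not read(mid):
            pending += [(a, mid - 1), (mid + 1, b)]
            continue
        left, right = mid - 1, mid + 1
        while left >= a and read(left):  # expand downward from the good spot
            left -= 1
        while right <= b and read(right):  # expand upward from the good spot
            right += 1
        pending += [(a, left - 1), (right + 1, b)]
    # Assumed cost model: failed reads dominate because of the retries.
    cost = len(recovered) * good_cost + failed * bad_cost
    return recovered, failed, cost
```

With these assumed costs, ten bad sectors out of a thousand account for about two thirds of the total time, which is the paint-drying effect in miniature.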
It was as you say below, like watching paint dry!
So I know the feeling! Unfortunately!
> What pulled the plug and flushed the whole system was when I plugged in
> the cable to /dev/sda and booted the cd, it updated the firmware on it
> too, and somehow managed to also do the MBR on the drive so that even
> selecting it as the boot device in the F8 bios menu, it simply hung with
> a blank screen deadlock. It is still that way despite fdisk showing it
> as bootable, and I have mounted the individual partitions and have the
> majority of the data recovered now. But, that drives partition labels
> are meaningless & scrambled so that the partition bearing the sea-slash
> label, is actually, from its contents, the old /opt partition. Other
> labels are similarly miss-placed. Entirely possible since all the
> distros have fallen in love with the UUID numbers because they are
> supposedly more unique than a human generated label.
Sounds like the firmware update on it redid the way it handles the data
areas that store critical partition data, thus scrambling some of it. The
vast majority of the drive is fine, and the partition layout map itself is
apparently fine (tho I'd be careful with it, as it's possible the
partitions overlap now and writing to one might damage another; inspect
the partition data with fdisk or the like to be sure...). But the data
about those partitions, the UUIDs, labels, etc... not so fine.
FWIW, this is one reason I switched here to GPT based partitions, from the
legacy MBR based system. GPT stores two copies of the partition data, one
at the beginning of the drive and one at the end, and unlike MBR,
checksums the data. If one copy gets corrupted, its checksum is bad and
the other table can be read to see if it's good. Even if that can't be
done directly (the onboard logic must be small and simple), there's a good
chance that gptfdisk (formerly gdisk, one GPT form of fdisk/sfdisk, etc,
tho there are other tools available) or other GPT disk partitioning tools
can recover the partitions. Ever since that bad-AC-triggered head-crash
event, I've been rather more paranoid about data integrity than I used to
be, and this seemed to cover one of the gaps left by my 4-way md/RAID-1
setup rather well.
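As a rough illustration of the checksum-and-fallback idea, here is a minimal sketch that validates a GPT header's CRC32 and falls back to the backup copy. The 92-byte field layout and the "CRC over the header with its own CRC field zeroed" rule follow the UEFI spec; the function names and the synthetic header built by make_header are assumptions for demonstration, not what gptfdisk or any other tool actually runs.

```python
import struct
import zlib

# GPT header layout per the UEFI spec: 92 bytes, little-endian.
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")
CRC_OFFSET = 16  # byte offset of the header CRC32 field


def header_crc_ok(raw: bytes) -> bool:
    """Return True if raw starts with a GPT header whose CRC32 matches."""
    if len(raw) < GPT_HEADER.size:
        return False
    fields = GPT_HEADER.unpack_from(raw)
    signature, header_size, stored_crc = fields[0], fields[2], fields[3]
    if signature != b"EFI PART" or not GPT_HEADER.size <= header_size <= len(raw):
        return False
    # The CRC is computed over the header with its own CRC field zeroed.
    zeroed = raw[:CRC_OFFSET] + b"\x00" * 4 + raw[CRC_OFFSET + 4:header_size]
    return zlib.crc32(zeroed) & 0xFFFFFFFF == stored_crc


def pick_header(primary: bytes, backup: bytes):
    """Prefer the primary header, fall back to the backup, else None."""
    for name, raw in (("primary", primary), ("backup", backup)):
        if header_crc_ok(raw):
            return name
    return None


def make_header(current_lba: int, backup_lba: int) -> bytes:
    """Build a synthetic header with a valid CRC (demonstration only)."""
    body = GPT_HEADER.pack(b"EFI PART", 0x00010000, GPT_HEADER.size, 0, 0,
                           current_lba, backup_lba, 34, 2014,
                           bytes(16), 2, 128, 128, 0)
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body[:CRC_OFFSET] + struct.pack("<I", crc) + body[CRC_OFFSET + 4:]
```

Flipping any byte of the synthetic primary header makes header_crc_ok reject it, and pick_header then reports "backup". A real tool would additionally check the partition-entry array's own CRC, which this sketch leaves out.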
The other big advance of GPT is that it does away with the primary/
extended/logical partition distinction. All partitions are treated the
same, with the minimal reserved area specified by the spec allowing 128
such partitions, and expansion from that if anyone were to actually find a
reason 128 partitions isn't enough. (Here, I find that simply tracking
them starts getting difficult after about two dozen or so, even with the
help of the GPT partition-level labels: *NOT* filesystem level, actual
partition level, readable/settable in GPT partitioning tools even before
the filesystem is made.)
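To make the partition-level label concrete, here is a small sketch that reads the name out of a single GPT partition entry. The 128-byte entry layout with a 72-byte UTF-16LE name field follows the UEFI spec; the helper names and the sample label are made up for illustration.

```python
import struct

# One GPT partition entry per the UEFI spec: 128 bytes, little-endian.
# type GUID, unique GUID, first LBA, last LBA, attributes, name.
GPT_ENTRY = struct.Struct("<16s16sQQQ72s")

# 128 entries of 128 bytes each: the minimal reserved table area.
MIN_TABLE_BYTES = 128 * GPT_ENTRY.size


def entry_label(raw: bytes) -> str:
    """Return the partition-level name stored in a 128-byte GPT entry."""
    _type, _uniq, _first, _last, _attrs, name = GPT_ENTRY.unpack(raw)
    # The name is UTF-16LE, NUL-padded to 72 bytes (36 code units).
    return name.decode("utf-16-le").split("\x00", 1)[0]


def make_entry(label: str, first_lba: int, last_lba: int) -> bytes:
    """Build a synthetic entry with zeroed GUIDs (demonstration only).

    A real tool would treat an all-zero type GUID as an unused slot.
    """
    name = label.encode("utf-16-le")  # struct pads the field with NULs
    return GPT_ENTRY.pack(bytes(16), bytes(16), first_lba, last_lba, 0, name)
```

The label lives in the partition table itself, which is why it can be set before any filesystem exists. The minimal table works out to 16384 bytes, or 32 sectors of 512 bytes.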
(I'm interested in btrfs for much the same reason. It has built-in
checksumming, so it will know immediately if the data it gets from the
disk is bad, and can for instance request it from another disk using its
built-in RAID-1, if it was configured for that. But as I've explained in
previous btrfs discussion, that filesystem is still experimental at
present -- they're only now developing an fsck for it, for instance! I'm
eagerly anticipating it, but the day I decide it's ready for me is likely
at least another couple kernel releases away: almost certainly sometime
next year, if not the second half of this year. We'll see.)
GPT is part of the new EFI transition from legacy BIOS-based systems, but
at least on Linux, doesn't require EFI hardware, since the kernel has an
option that can be enabled for it, grub can understand it (grub-2
directly, apparently, and there's patches for grub-0.97 or whatever, which
is what I'm still using, the patches being built-in for gentoo's grub) and
I believe current lilo does too (tho I'm not sure of that), and a number
of the partitioning tools do as well (I already mentioned gptfdisk, and
libparted based partitioners do as well). That's the bootloader, the
kernel, and the tools all three, so Linux is good to go, even on BIOS-
based hardware systems. (FWIW, Apple's already EFI, and for MS, it
depends on the version, but at least some require EFI based hardware to
handle GPT on the boot device, but handle it on data devices fine with
just BIOS.)
> Anyway, I installed a freshly burnt version of pclos, and have been
> watching paint dry at 2megs/sec while mc gets my data back.
"Watching paint dry". Yeah, that's about accurate when the disk is having
to reread the data several times to get it correct, due to damage or
whatever. As I said, I've been there.
> And yes, I run amanda, so it ought to be easy, but my next outgoing msg
> will be to the amanda-users list because I can't get it to build,
> missing includes bailouts. With zero clues as to what the package name
> that would fix it. And since the rpm packages are a totally different
> setup, I'll wait till the local build problems get sorted, its one of
> the few programs that I have been building and using the svn versions of
> since back in the late 90's.
Well, at least the rpm metadata should tell you what the direct
dependencies are, giving you /some/ clue as to what packages it needs.
And the same for the srpm, which should give you the build-time
dependencies as well.
Of course, when you're building it on your own, you may not enable all the
options the rpm versions do, so you might not actually need /all/ those
packages, but that's the list I'd start checking against if I ran into
build-time header errors on a normally rpm based system. (It's what I
used to do back in the early '00s, on Mandrake, anyway.)
FWIW, rpmfind is a great resource for this sort of thing. Try the
following link, click on the html-page link for a version that looks close
to what you're trying to build (rpm and srpm, the rpm for run-time deps,
the srpm for build-time deps), and take a look at the dependencies it
lists.
Or download the srpm from pclinuxos or rpmfind, and use rpm's "install
dependencies only, dryrun" functionality, to see what it thinks it needs
that you don't already have installed. You can then do it without the
dryrun if you think it looks right, and that should set you up for
building. (Unfortunately, it's been years since I worked with rpm, and
the switches for the above, even if I did remember them, may well have
changed by now. But the functionality is there, with the manpage
explaining if you need it, I should think.)
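For the listing half of that (not the install half), a minimal sketch: rpm's query mode can print the requirements recorded in a package file, and for a source rpm those should be the build-time ones. The -q, -p and --requires switches are standard rpm query options; the wrapper function names and any file name passed in are hypothetical.

```python
import subprocess


def requires_cmd(package_path: str) -> list:
    """Build the rpm query that lists a package file's requirements."""
    # -q: query mode, -p: operate on a package file rather than the
    # installed database, --requires: list what the package depends on.
    return ["rpm", "-qp", "--requires", package_path]


def list_requires(package_path: str) -> list:
    """Run the query and return the distinct requirements, sorted."""
    out = subprocess.run(requires_cmd(package_path), check=True,
                         capture_output=True, text=True).stdout
    return sorted({line.strip() for line in out.splitlines() if line.strip()})
```

Running list_requires on a downloaded srpm and comparing the result against what is installed gives a starting list of candidate -devel packages; whether all of them are needed depends on the build options chosen.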
Or maybe you're already doing that, having taken for granted that you
didn't have to actually spell it out, and it's still throwing missing
headers errors.
Meanwhile, as they say, /test/ your recovery plan, because those backups
are only as good as your ability to recover them. =:^)
IOW, if you're going to use self-built backup software, be sure to keep
copies of it off-disk somewhere, that you can quick-copy back if
necessary, for recovery. And test that those copies work on a clean-
install.
Either that, or don't use backup software, but instead, simple copies, as
I do. I copy whole partitions at a time. For root, I bind-mount it
elsewhere, without the overmounts, so I can copy the root partition
itself, and only the root partition. I use normal copying, either cp in
archive mode or filemanager copying (mc used to do this well, but the new
version copies sparse files un-sparse, making them take FAR more room on
the new partition than they did on the old one, so I don't use it for that
any more), from the old working partition to the new, freshly mkfsed
partition.
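The sparse-file problem comes down to whether the copier writes out runs of zeros or skips over them. A minimal sketch of the skipping approach follows, assuming a 4096-byte block size and a hypothetical function name; it is not how cp or mc are implemented, and whether holes are actually stored sparsely depends on the destination filesystem.

```python
import os


def sparse_copy(src: str, dst: str, block: int = 4096) -> None:
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero = bytes(block)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if chunk == zero[:len(chunk)]:
                # Skip ahead without writing, leaving a hole in dst.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # Set the final length, in case the file ends in a hole.
        fout.truncate(fin.tell())
```

The copy reads back byte-for-byte identical to the source, since holes read as zeros, but on a filesystem that supports sparse files it should occupy far fewer blocks than a plain write of every zero.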
To test the backups at a grub-and-/boot-still-available level, I simply
reboot and feed grub a different kernel append line root= parameter,
pointing it at the backup root partition (same disk or external, I have
both, just in case...).
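After booting the backup this way, it is worth confirming which root the kernel was actually handed. A small sketch that pulls root= out of a kernel command line; /proc/cmdline is the standard Linux location, while the function names and the device names in the usage note are hypothetical, and the "last occurrence wins" behaviour is an assumption about how repeated parameters are handled.

```python
def root_from_cmdline(cmdline: str):
    """Return the value of the last root= parameter, or None if absent."""
    root = None
    for word in cmdline.split():
        if word.startswith("root="):
            # Assumption: a later root= overrides an earlier one.
            root = word[len("root="):]
    return root


def running_root(path: str = "/proc/cmdline"):
    """Read the booted kernel's command line (Linux only)."""
    with open(path) as f:
        return root_from_cmdline(f.read())
```

For a command line such as "ro root=/dev/sdb5 quiet", root_from_cmdline returns "/dev/sdb5", which can be compared against the partition the backup was copied to.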
To test the grub-and-/boot-eaten scenario, I boot to the grub enabled
/boot I installed on a thumb drive, feeding it parameters as needed. The
same thumb drive grub installation has kernels and sample grub menu
options that I can modify as needed, for both my main 64-bit workstation,
and my 32-bit netbook, and in all /boot locations (netbook, main
workstation, and thumbdrive) I keep a grubnotes file listing various
additional kernel append parameters that I've found useful over the years,
while testing live-git kernels, etc. I can cat that file from grub if I
need to, thus reminding me whether it's for instance noapic or apic=off,
if I suspect I'm having problems with it.
So I don't worry about backup software at all, because my backup software
is mkfs, mount and cp, with the recovery software being grub and/or mount,
depending on whether it's the root/system or a data partition. =:^)
What's great about it is that since the backup is a snapshot of my working
system just as it was when I made the copy, I have a fully functional
rescue partition too. No limited function thing, the real deal, including
all the software I had installed when I made the backup, everything. So
if my main working copy doesn't run, I can simply boot the backup, google
a solution if I need to, and fix the working copy. If it takes a while, no
big deal, I have the usual /home (or a backup if it was lost too) mounted,
and can do all the usual stuff I'd normally do, checking mail and
newsgroups, keeping up on my rss feeds, all that, in the meantime. And if
the former main working copy isn't recoverable, I simply update from the
backup I'm already running, no problem.
> Anyway, I haven't fallen over, yet, just occupied trying to get a new
> install that didn't use the partitions I gave it, sorted, but now I have
> /home, /opt, /root, and /var back on their proper partitions.
>
> One positive side effect seems to be that a lot of my kde4 problems have
> vanished too. And I think I have it all back to 4.6.2 now too. As I've
> rebooted 20+ times as I get the data moved & the only time I lost any
> config was when I mounted the new /home partition over the top of the
> install directory and rebooted after fixing fstab. But give me time,
> I'm sure I can screw it up again. :-)
Gotta make it on topic! =:^)
Glad it did seem to fix things up for you, kde-wise.
> Cheers, Duncan & now to see if my recovered kmailrc can still send an
> email.
Seems it worked. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
___________________________________________________
This message is from the kde mailing list.
Account management: https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.