[PATCH] Add support for JEDEC/metric standards to KLocale::formatByteSize
Michael Pyne
mpyne at kde.org
Sun Jul 12 05:48:20 BST 2009
Hi all,
First off I'd like to say that if you're subscribed to kde-devel you may have
noticed I was embroiled in quite a flamewar over the past 10 days or so (which
even leaked onto here a bit).
I've been disabused of some preconceived notions I had (for instance, I regret
typing "NO ONE NEEDS DECIMAL UNITS" (in all caps no less)). Hopefully I've
made some of my technical points in the midst of it all.
Anyway, I've taken the time to do some more research, so by way of
justification, here's what I've learned:
* Units KB/MB etc in powers of 2 came about due to memory sizes typically
being powers of 2, and are predominant in computing at this point, -but-:
* Hard disk sizes and other kinds of storage devices not tied to binary sizes
have been powers of 10 as well, even before marketing dweebs turned it into a
science.
* Networking has, from what I can tell, always used powers of 10 when referring
to e.g. bandwidth and bitrate.
* Floppy disks are just stupid because their "1.44 MB" is really 1440 * 1024
bytes (i.e. mixing base-2 and base-10).
* Beyond that, KB have traditionally been powers-of-2, as confusing as that is
to new computer users.
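To make the floppy arithmetic concrete, here is a small sketch in standard C++. The constant and function names are made up for illustration and have nothing to do with the patch; the only fact relied on is that a "1.44 MB" floppy holds 1440 * 1024 bytes.

```cpp
#include <cstdint>

// A "1.44 MB" floppy holds 1440 blocks of 1024 bytes each.
constexpr std::uint64_t kFloppyBytes = 1440ULL * 1024ULL;
static_assert(kFloppyBytes == 1474560ULL, "1440 * 1024 bytes");

// Hypothetical helpers: the same byte count under three conventions.
constexpr double inMetricMB(std::uint64_t bytes)  { return bytes / 1000000.0; }        // 10^6
constexpr double inBinaryMiB(std::uint64_t bytes) { return bytes / 1048576.0; }        // 2^20
constexpr double inFloppyMB(std::uint64_t bytes)  { return bytes / (1000.0 * 1024.0); } // mixed
```

The same disk comes out at about 1.47 in metric MB, about 1.41 in MiB, and 1.44 only under the mixed 1000 * 1024 definition.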
In order to erase some of the confusion, an IEC standard (addendum to IEC
60027-2) was developed in 1999 to add new prefixes specifically reserved for
base-2 units (this is what we see now in KDE 4; bug 57240 pertains). This is
where KiB, MiB, etc. come from.
Metric units have been standardized, with very minor revisions, for centuries
now, and therefore most of the world tends to get computer units confused upon
introduction to "kilobytes", "megabytes", etc.
The 'traditional' units, where KB == 1024 bytes, MB == 2^20, etc. actually
/have/ a standard, believe it or not. They are standardized by JEDEC Standard
100B.01 (JEDEC is a semiconductor engineering standardization body), which
defines the units KB, MB, and GB (2^{10,20,30}) for use with solid-state
semiconductor memory sizes. The standard explicitly refers to bit counts as
being decimal (i.e. 1kbit == 1000 bits) and doesn't actually define the
spelled-out forms "kilobyte", "megabyte", and "gigabyte" (thus leaving the
unwary user to form the association by mistake themselves). But it is, right
now, still the standard for units of memory measurement.
The point to all this is that I think now that it would make sense to have
support for the 3 different standards in kdelibs. Many people do prefer the
traditional/JEDEC-style units, and even without that (and, for that matter,
with that), there are areas where metric decimal units are needed (networking,
hard disk sizes).
So, I've attached a patch which adds support for these 3 standards (right now,
we only support IEC 60027-2). This support is currently not exposed to a GUI
(which would be the next step, if we get that far). In addition, the patch
adds a new formatByteSize() call that accepts two extra parameters, the
standard to use (which I refer to as binary unit dialect for lack of a better
term), and the specific unit to use (e.g. an application can request the
result in kilobytes of the user's dialect, metric megabytes, etc.).
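For anyone who wants to see the idea without opening the patch, here is a rough standalone sketch of a formatter that takes a dialect and an optional fixed unit. It assumes nothing about the patch itself: the enum names, the function name, and the signature are hypothetical, it uses only the standard library rather than KLocale, and it stops at giga since JEDEC only defines KB, MB, and GB.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical names for illustration; not the identifiers used in the patch.
enum class Dialect { IEC, JEDEC, Metric };
enum class Unit { Auto = -1, Byte = 0, Kilo = 1, Mega = 2, Giga = 3 };

std::string formatByteSizeSketch(double bytes, Dialect dialect,
                                 Unit unit = Unit::Auto, int precision = 1)
{
    assert(bytes >= 0.0 && precision >= 0);

    static const char *const iec[]    = {"B", "KiB", "MiB", "GiB"};
    static const char *const jedec[]  = {"B", "KB", "MB", "GB"};
    static const char *const metric[] = {"B", "kB", "MB", "GB"};

    // Only the metric dialect divides by 1000; IEC and JEDEC both use 1024
    // and differ only in the suffix they print.
    const double base = (dialect == Dialect::Metric) ? 1000.0 : 1024.0;
    const char *const *suffixes = iec;
    if (dialect == Dialect::JEDEC)  suffixes = jedec;
    if (dialect == Dialect::Metric) suffixes = metric;

    int index = 0;
    double value = bytes;
    if (unit == Unit::Auto) {
        // Pick the largest unit that keeps the value at or above 1.
        while (value >= base && index < 3) { value /= base; ++index; }
    } else {
        // The caller asked for a specific unit; scale down to exactly that one.
        index = static_cast<int>(unit);
        for (int i = 0; i < index; ++i) value /= base;
    }

    char buf[64];
    // Plain byte counts are printed without decimals.
    std::snprintf(buf, sizeof buf, "%.*f %s",
                  index == 0 ? 0 : precision, value, suffixes[index]);
    return buf;
}
```

With this sketch, 1474560 bytes formats as "1.4 MiB" in the IEC dialect, "1.4 MB" in JEDEC, and "1.5 MB" in metric; asking for metric kilobytes explicitly gives "1474.6 kB".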
Also included are test cases (obviously they all pass for me, but please let me
know if they don't for you ;)
The way I would like to go is that we get this support integrated into
kdelibs. From there we do three things:
1) Add a GUI to localization settings KCM to allow the user to select their
preferred unit style. Over the past week I've heard input from numerous users
who all want one of the three ways (including metric/decimal units), so I
think it would make sense to be in the GUI as it's (apparently :) a very
sensitive issue for many. I will work on this.
2) Go through the KDE applications we maintain and use formatByteSize where
appropriate. What I mean by that is that whenever possible we should respect
the user's units. However, when we report download speed, that should be in
decimal given existing practice, even for users with JEDEC. When reporting
hard disk capacity (e.g. in KDiskFree) we should use (or at least show)
decimal units even for IEC or JEDEC. Likewise, when a flash drive is inserted
it probably makes the most sense to display its total size in IEC units unless
the user has already selected JEDEC (individual file sizes would still be
displayed per the user's settings).
3) I can't find the email right now but I do remember reading an email from
someone on our i18n teams who was having trouble with the various applications
using their own units (sometimes they meant decimal, sometimes binary, etc.)
so the /other/ thing I'd like is to go through and correct applications to use
formatByteSize where appropriate. I can run grep as well as anyone else so I
suppose I can work on this.
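The per-context rules from item 2 could be captured in one small helper, roughly as sketched below. This is only an illustration of the policy as described in this mail: the enum and function names are hypothetical and nothing like this is in the attached patch.

```cpp
// Hypothetical names for illustration; not part of the patch.
enum class Dialect { IEC, JEDEC, Metric };
enum class Context { FileSize, TransferRate, DiskCapacity, FlashCapacity };

// Decide which dialect to display, given what is being shown and the
// user's configured preference.
inline Dialect dialectFor(Context context, Dialect userPreference)
{
    switch (context) {
    case Context::TransferRate:   // download speed: decimal by existing practice
    case Context::DiskCapacity:   // hard disk capacity: decimal as well
        return Dialect::Metric;
    case Context::FlashCapacity:  // IEC unless the user explicitly chose JEDEC
        return userPreference == Dialect::JEDEC ? Dialect::JEDEC : Dialect::IEC;
    case Context::FileSize:       // individual files: whatever the user selected
        return userPreference;
    }
    return userPreference;
}
```

So a user who picked JEDEC would still see download speeds in decimal units, while their file listings stay in KB/MB/GB.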
This variant of the patch is inspired by the work done by Marcel Partap (even
if he and I didn't agree 100% ;). Back in February he posted a patch to add
support for adjusting the precision displayed by formatByteSize in addition to
adding support for essentially the next thousand years or so of computing's
units. It didn't make it in for some reason (it didn't seem to have much if
any dissent) but to save him some work my patch adds support for adjustable
precision as well. He also added support for longer suffixes (e.g. spelling
out kB as kilobytes) but I can't see the reason we'd need it so I've left it
out.
I intend for this to be my last-gasp effort at making (mostly) everyone happy.
Please let me know if you have any technical comments or if I need to revise
my patch to meet the needs of i18n or l10n teams. (I have CC-ed i18n-doc but
please keep comments at the very least on core-devel).
Regards,
- Michael Pyne
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-kdelibs-klocale-binaryUnitDialects.patch
Type: text/x-patch
Size: 20237 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20090712/b3cca8c1/attachment.bin>