[PATCH] Add support for JEDEC/metric standards to KLocale::formatByteSize

Michael Pyne mpyne at kde.org
Sun Jul 12 05:48:20 BST 2009


Hi all,

First off I'd like to say that if you're subscribed to kde-devel you may have 
noticed I was embroiled in quite a flamewar over the past 10 days or so (which 
even leaked onto here a bit).

I've been disabused of some preconceived notions I had (for instance, I regret 
typing "NO ONE NEEDS DECIMAL UNITS" (in all caps no less)).  Hopefully I've 
made some of my technical points in the midst of it all.

Anyways, I've taken the time to do some more research and so for the benefit 
of justification, here's what I've learned:

* Units KB/MB etc in powers of 2 came about due to memory sizes typically 
being powers of 2, and are predominant in computing at this point, -but-:
* Hard disk sizes and other kinds of storage devices not tied to binary sizes 
have been powers of 10 as well, even before marketing dweebs turned it into a 
science.
* Networking has from what I can tell always used powers of 10 when referring 
to e.g. bandwidth and bitrate.
* Floppy disks are just stupid because their "1.44 MB" is really 1440 * 1024 
bytes (i.e. mixing base-2 and base-10).
* Beyond that, KB have traditionally been powers-of-2, as confusing as that is 
to new computer users.
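
To make the floppy arithmetic concrete, here is a small sketch (the names 
floppyBytes and hundredthsOf are made up for illustration and are not part of 
any patch).  It expresses the same byte count in hundredths of three different 
"megabytes":

```cpp
#include <cassert>
#include <cstdint>

// A "1.44 MB" floppy holds 1440 units of 1024 bytes each.
const std::int64_t floppyBytes = 1440 * 1024; // 1,474,560 bytes

// Returns the size in hundredths of the given unit, truncated toward zero.
// (Hypothetical helper, for illustration only.)
std::int64_t hundredthsOf(std::int64_t bytes, std::int64_t unitBytes)
{
    assert(unitBytes > 0);
    return bytes * 100 / unitBytes;
}
```

With the mixed unit of 1000 * 1024 bytes this gives 144 (the advertised 
"1.44"), with a decimal megabyte of 1000 * 1000 bytes it gives 147, and with a 
binary megabyte of 1024 * 1024 bytes it gives 140 (truncated).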

In order to erase some of the confusion, an IEC standard (an addendum to IEC 
60027-2) was developed in 1999 to add new units which were specifically 
reserved for base-2 binary quantities.  This is where KiB, MiB, etc. come 
from, and it is what we see now in KDE 4 (bug 57240 pertains).

Metric units have been standardized, with very minor revisions, for about two 
centuries now, so most of the world expects "kilo" to mean 1000 and tends to 
get computer units confused upon introduction to "kilobytes", "megabytes", 
etc.

The 'traditional' units, where KB == 1024 bytes, MB == 2^20, etc. actually 
/have/ a standard, believe it or not.  They are standardized by JEDEC Standard 
100B.01 (JEDEC is a semiconductor engineering standardization body), which 
defines the units KB, MB, and GB (2^{10,20,30}) for use with solid-state 
semiconductor memory sizes.  The standard explicitly refers to bit counts as 
being decimal (i.e. 1kbit == 1000 bits) and doesn't actually define the 
spelled-out forms "kilobyte", "megabyte", and "gigabyte" (thus leaving the 
unwary user to form the association by mistake themselves).  But right now it 
is still the standard for memory measurement units.
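
As a rough summary of the three conventions described above, here is a minimal 
sketch.  The enum and function names (Standard, bytesPerUnit, unitSuffix) are 
hypothetical and chosen only for illustration; they are not the identifiers 
used in kdelibs or in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names, for illustration only.
enum class Standard { IEC, JEDEC, Metric };

// Bytes in one unit at the given step (0 = byte, 1 = kilo, 2 = mega, 3 = giga).
std::uint64_t bytesPerUnit(Standard standard, int step)
{
    assert(step >= 0 && step <= 3);
    // Only the metric convention uses powers of 10; IEC and JEDEC share 1024.
    const std::uint64_t base = (standard == Standard::Metric) ? 1000 : 1024;
    std::uint64_t result = 1;
    for (int i = 0; i < step; ++i) {
        result *= base;
    }
    return result;
}

// Abbreviated unit name for the given standard and step.
std::string unitSuffix(Standard standard, int step)
{
    assert(step >= 0 && step <= 3);
    static const char *const table[3][4] = {
        { "B", "KiB", "MiB", "GiB" }, // IEC: binary values, distinct names
        { "B", "KB",  "MB",  "GB"  }, // JEDEC: binary values, traditional names
        { "B", "kB",  "MB",  "GB"  }, // Metric: decimal values
    };
    return table[static_cast<int>(standard)][step];
}
```

Note that JEDEC and metric share the abbreviations "MB" and "GB" while 
disagreeing on the value, which is exactly the ambiguity the IEC names avoid.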

The point to all this is that I think now that it would make sense to have 
support for the 3 different standards in kdelibs.  Many people do prefer the 
traditional/JEDEC-style units, and whichever units a user prefers, there are 
areas where metric decimal units are needed (networking, hard disk sizes).

So, I've attached a patch which adds support for these 3 standards (right now, 
we only support IEC 60027-2).  This support is currently not exposed to a GUI 
(which would be the next step, if we get that far).  In addition, the patch 
adds a new formatByteSize() call that accepts two extra parameters, the 
standard to use (which I refer to as binary unit dialect for lack of a better 
term), and the specific unit to use (e.g. an application can request the 
result in kilobytes of the user's dialect, metric megabytes, etc.).
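
For readers without the patch at hand, the following is a rough, self-contained 
sketch of what a call taking a dialect and a specific unit could look like.  
It is not the code from the patch: the names Dialect, Unit and 
formatByteSizeSketch are assumptions made for this example, it stops at giga, 
and it ignores localization entirely.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical names; the real patch may spell and structure these differently.
enum class Dialect { IEC, JEDEC, Metric };
enum class Unit { Auto = -1, Byte = 0, Kilo = 1, Mega = 2, Giga = 3 };

std::string formatByteSizeSketch(double bytes, int precision,
                                 Dialect dialect, Unit unit)
{
    assert(bytes >= 0 && precision >= 0);
    static const char *const suffixes[3][4] = {
        { "B", "KiB", "MiB", "GiB" }, // IEC
        { "B", "KB",  "MB",  "GB"  }, // JEDEC
        { "B", "kB",  "MB",  "GB"  }, // Metric
    };
    const double base = (dialect == Dialect::Metric) ? 1000.0 : 1024.0;

    int index = static_cast<int>(unit);
    if (unit == Unit::Auto) {
        // Pick the largest unit that keeps the displayed value at 1 or more.
        index = 0;
        for (double value = bytes; value >= base && index < 3; value /= base) {
            ++index;
        }
    }
    for (int i = 0; i < index; ++i) {
        bytes /= base;
    }

    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*f %s", precision, bytes,
                  suffixes[static_cast<int>(dialect)][index]);
    return buffer;
}
```

In this sketch, formatByteSizeSketch(5000000, 2, Dialect::Metric, Unit::Mega) 
yields "5.00 MB", while the same byte count requested as IEC megabytes yields 
"4.77 MiB".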

Also included are test cases (obviously they all pass for me, but please let 
me know if they don't for you ;)

The way I would like to go is that we get this support integrated into 
kdelibs.  From there we do three things:

1) Add a GUI to localization settings KCM to allow the user to select their 
preferred unit style.  Over the past week I've heard input from numerous users 
who all want one of the three ways (including metric/decimal units), so I 
think it would make sense to be in the GUI as it's (apparently :) a very 
sensitive issue for many.  I will work on this.

2) Go through the KDE applications we maintain and use formatByteSize where 
appropriate.  What I mean by that is that whenever possible we should respect 
the user's units.  However, when we report download speed, that should be in 
decimal given existing practice, even for users with JEDEC.  When reporting 
hard disk capacity (e.g. in KDiskFree) we should use (or at least show) 
decimal units even for IEC or JEDEC.  Likewise, when a flash drive is inserted 
it probably makes the most sense to display its total size in IEC units unless 
the user has already selected JEDEC (individual file sizes would still be 
displayed per user's settings).

3) I can't find the email right now but I do remember reading an email from 
someone on our i18n teams who was having trouble with the various applications 
using their own units (sometimes they meant decimal, sometimes binary, etc.) 
so the /other/ thing I'd like is to go through and correct applications to use 
formatByteSize where appropriate.  I can run grep as well as anyone else so I 
suppose I can work on this.
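
The per-context rules from item 2 could be captured in one place along these 
lines.  This is only a sketch of the policy as described there; Context, 
Dialect and dialectFor are hypothetical names and nothing like this exists in 
the patch.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
enum class Dialect { IEC, JEDEC, Metric };
enum class Context { FileSize, TransferRate, DiskCapacity, FlashCapacity };

// Chooses which dialect to display, given what is being reported and the
// user's configured preference.
Dialect dialectFor(Context context, Dialect userPreference)
{
    switch (context) {
    case Context::TransferRate:  // download speed: decimal by existing practice
    case Context::DiskCapacity:  // hard disk capacity: decimal
        return Dialect::Metric;
    case Context::FlashCapacity: // flash total size: IEC unless user chose JEDEC
        return userPreference == Dialect::JEDEC ? Dialect::JEDEC : Dialect::IEC;
    case Context::FileSize:
        break;
    }
    // Only ordinary file sizes reach this point; they follow the user's setting.
    assert(context == Context::FileSize);
    return userPreference;
}
```

For example, a user who prefers JEDEC would still see download speeds in 
decimal units, but file sizes in JEDEC units.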

This variant of the patch is inspired by the work done by Marcel Partap (even 
if he and I didn't agree 100% ;).  Back in February he posted a patch to add 
support for adjusting the precision displayed by formatByteSize in addition to 
adding support for essentially the next thousand years or so of computing's 
units.  It didn't make it in for some reason (it didn't seem to have much if 
any dissent) but to save him some work my patch adds support for adjustable 
precision as well.  He also added support for longer suffixes (i.e. spelling 
out kB as kilobytes) but I can't see the reason we'd need it so I've left it 
out.

I intend for this to be my last gasp effort at making (mostly) everyone happy.  
Please let me know if you have any technical comments or if I need to revise 
my patch to meet the needs of i18n or l10n teams.  (I have CC-ed i18n-doc but 
please keep comments at the very least on core-devel).

Regards,
 - Michael Pyne
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-kdelibs-klocale-binaryUnitDialects.patch
Type: text/x-patch
Size: 20237 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20090712/b3cca8c1/attachment.bin>