Fwd: "International Domain Names" support in KDE
Marc Mutz
Marc.Mutz at uni-bielefeld.de
Fri Jan 24 13:05:06 GMT 2003
On Friday 24 January 2003 23:20, Martin Konold wrote:
<snip>
> The encoding used is going to be the well known escape character
> approach.
<snip>
Not at all. It first removes all non-ASCII characters (roughly
everything outside [A-Z0-9-], ignoring case) from the KC-normalized and
casemapped Unicode string and then outputs what is left with a prefix
prepended (currently "zz-", which is presumably what you meant by
"escape character") and a hyphen appended.
www.müller.com -> zz-www.mller.com-
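
A minimal sketch of this stripping step in Python, for illustration only. The function name split_basic is hypothetical, the "zz-" prefix is the placeholder mentioned above, and treating "characters to remove" as "everything at or above code point 128" is an assumption that matches the example rather than something taken from the draft:

```python
import unicodedata


def split_basic(name, prefix="zz-"):
    """Return (prefix + ASCII-only string + "-", list of removed characters).

    Assumption: "removed" means every code point >= 128 after NFKC
    normalization and lower-casing.
    """
    normalized = unicodedata.normalize("NFKC", name).lower()
    basic = "".join(ch for ch in normalized if ord(ch) < 128)
    removed = [ch for ch in normalized if ord(ch) >= 128]
    return prefix + basic + "-", removed
```

With this, split_basic("www.m\u00fcller.com") gives ("zz-www.mller.com-", ["\u00fc"]); the removed characters still have to be encoded as increments.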
To get back the original string, increments are appended. The increments
encode both the char to insert and its position, AFAIR:
curstring = stripped;   // the ASCII-only string, e.g. "www.mller.com"
curposition = 0;
char_to_insert = 128;   // first code point above ASCII
foreach( increment ) {
    curposition += increment;
    // a string of length n has n + 1 possible insert positions
    char_to_insert += curposition / ( curstring.length() + 1 );
    curposition %= ( curstring.length() + 1 );
    curstring.insert( curposition, toUnicode( char_to_insert ) );
    ++curposition;
}
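
The insertion loop can be made runnable roughly as follows. This is a sketch under assumptions, not the draft's reference code: the name apply_increments is hypothetical, the increments are taken as already-decoded integers, and the details (start from the stripped ASCII string, count code points from 128, divide by length + 1) follow the Punycode decoder as later published in RFC 3492:

```python
def apply_increments(basic, increments):
    """Re-insert non-ASCII characters into the ASCII-only string `basic`.

    Each increment advances a combined (code point, position) counter:
    the quotient bumps the code point, the remainder is the insert position.
    """
    out = list(basic)
    code_point = 128  # first code point above ASCII
    position = 0
    for increment in increments:
        position += increment
        slots = len(out) + 1  # number of possible insert positions
        code_point += position // slots
        position %= slots
        out.insert(position, chr(code_point))
        position += 1  # continue just after the inserted character
    return "".join(out)
```

For example, apply_increments("mller", [745]) returns "müller": 745 = 124 * 6 + 1, so the code point becomes 128 + 124 = 252 (ü) and it goes in at position 1.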
The increments are encoded as a "generalized variable length integer"
that employs a variable base and uses the [0-9A-Z] alphabet and the
result is appended to the encoded string.
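
To give an idea of what such a variable-length integer looks like, here is a sketch of decoding a single one. It assumes the parameters of the later-published Punycode specification (RFC 3492): base 36 with a-z = 0..25 and 0-9 = 26..35, thresholds clamped to 1..26, and a fixed bias of 72. The draft discussed here may use different values, and the bias adaptation between integers is left out. The function names are hypothetical:

```python
BASE, TMIN, TMAX = 36, 1, 26  # assumed values, taken from RFC 3492


def digit_value(ch):
    """Map an ASCII letter or digit to 0..35 (letters first, then digits)."""
    if ch.isdigit():
        return ord(ch) - ord("0") + 26
    return ord(ch.lower()) - ord("a")


def decode_integer(text, start=0, bias=72):
    """Decode one integer starting at text[start]; return (value, next index).

    A digit below the current threshold terminates the integer; otherwise
    the weight of the next digit grows by (BASE - threshold).
    Raises IndexError if the text ends before a terminating digit.
    """
    value, weight, k, pos = 0, 1, BASE, start
    while True:
        digit = digit_value(text[pos])
        pos += 1
        value += digit * weight
        threshold = min(max(k - bias, TMIN), TMAX)
        if digit < threshold:
            return value, pos
        weight *= BASE - threshold
        k += BASE
```

Under these assumptions decode_integer("kva") returns (745, 3): 10 + 21 * 35 = 745, and the final "a" (value 0) is below the threshold of 26, so it ends the integer.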
It is _really_ ugly. Thank god they publish reference source in the RFC:
zz-www.mller.com-s3c0c4a
Marc
--
Never is there as much lying as before the election, during the war and
after the hunt -- Otto von Bismarck