Encoding problem in the file ioslave

Thiago Macieira thiago at kde.org
Tue Sep 2 07:35:31 BST 2008


Oswald Buddenhagen wrote:
>On Mon, Sep 01, 2008 at 08:27:36AM +0200, Thiago Macieira wrote:
>> QFile won't let you because there's no way you can represent a broken
>> encoding in QString.
>
>how about it encoding illegal chars? some high-up, super-unlikely
>unicode char (22C7 certainly qualifies :D) followed by the hex-encoded
>byte value (or the char again to quote itself). that would break only if
>somebody directly c&p'd such a string from/to non-qt applications, which
>should be permissible for that broken case.

That's the transition solution we used to have. But it lived in
QString::fromUtf8 and QString::toUtf8, which was the wrong level: it made
those functions accept invalid UTF-8 and slowed down string encoding and
decoding, and it only helped users of UTF-8 locales. The problem affects
every locale in which an arbitrary byte sequence is not necessarily a
valid string.

It was removed in Qt 4.3.0.
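
For illustration only, here is a minimal sketch of the kind of escaping
described in the quoted suggestion: bytes that are not valid UTF-8 become
the marker character followed by the byte's two hex digits, and a genuine
marker character is doubled so that it quotes itself. This is not the code
that was in Qt; the names kMarker, decodeOne and decodeLenient are
hypothetical, and it uses only the C++ standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical marker, taken from the quoted suggestion (U+22C7).
constexpr char32_t kMarker = 0x22C7;

// Try to decode one UTF-8 sequence starting at s[i]. On success, store the
// code point in cp and return the sequence length; on failure return 0.
static std::size_t decodeOne(const std::string &s, std::size_t i, char32_t &cp)
{
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    char32_t min = 0;
    if (b0 < 0x80) { cp = b0; return 1; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else return 0;                       // stray continuation or invalid lead byte
    if (i + len > s.size()) return 0;    // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates and out-of-range values.
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    return len;
}

// Decode bytes as UTF-8, escaping every undecodable byte as
// kMarker + two hex digits, and doubling any real kMarker.
std::u32string decodeLenient(const std::string &bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        char32_t cp = 0;
        const std::size_t len = decodeOne(bytes, i, cp);
        if (len == 0) {
            const unsigned char b = static_cast<unsigned char>(bytes[i]);
            out += kMarker;
            out += static_cast<char32_t>(hex[b >> 4]);
            out += static_cast<char32_t>(hex[b & 0x0F]);
            i += 1;
        } else {
            out += cp;
            if (cp == kMarker)
                out += kMarker;          // the marker quotes itself
            i += len;
        }
    }
    return out;
}
```

With this sketch, the Latin 1 bytes of "Código" (where "ó" is the single
byte 0xF3) decode to "C", the marker, "F3", "digo". The sketch also shows
both drawbacks: every decode pays for the extra checks, and nothing here
helps a locale that is not UTF-8 to begin with.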

The clean solution I'm thinking of here is to add support for QUrl to
QFile. Since URLs can carry arbitrary binary data in percent-encoded
form, a QUrl could hold an undecodable file name (e.g.,
file:///home/thiago/C%F3digo.txt).
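
As a rough sketch of why a URL can carry such a name: every byte of the
raw on-disk path that is not plain ASCII can be written as %XX, so no
decoding to Unicode is ever needed. The function name fileUrlFromRawPath
is hypothetical, and this is standard-library C++ rather than anything
QUrl or QFile provides.

```cpp
#include <cassert>
#include <string>

// Build a file: URL from the raw bytes of an absolute path, without
// interpreting them in any encoding. Bytes outside the URL "unreserved"
// set (plus '/') are percent-encoded.
std::string fileUrlFromRawPath(const std::string &rawPath)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string url = "file://";
    for (const char ch : rawPath) {
        const unsigned char b = static_cast<unsigned char>(ch);
        const bool keep =
            (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
            (b >= '0' && b <= '9') ||
            b == '-' || b == '.' || b == '_' || b == '~' || b == '/';
        if (keep) {
            url += ch;
        } else {
            url += '%';
            url += hex[b >> 4];
            url += hex[b & 0x0F];
        }
    }
    return url;
}
```

Given the path /home/thiago/C\xF3digo.txt (byte 0xF3 for the "ó"), this
yields file:///home/thiago/C%F3digo.txt, the URL used as the example
above.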

However, that's not a complete solution either. In locales that don't use
UTF-8, url.setPath("/Código.txt"); file.setUrl(url); would behave
differently from file.setFileName("/Código.txt");, since the IRI standard
(RFC 3987) requires the "ó" to be encoded as %C3%B3 (its UTF-8 bytes),
while QFile::encodeName would turn it into the single byte 0xF3 in a
Latin 1 locale.
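
To make the mismatch concrete, here is a small sketch comparing the two
byte sequences. The helper names (utf8Bytes, latin1Bytes, percentEncode)
are hypothetical; latin1Bytes only stands in for what a Latin 1 locale
encoding would produce and is not QFile::encodeName itself.

```cpp
#include <cassert>
#include <string>

// Encode code points as UTF-8, as RFC 3987 requires before percent-encoding.
std::string utf8Bytes(const std::u32string &s)
{
    std::string out;
    for (const char32_t cp : s) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Encode code points as Latin 1; anything above U+00FF becomes '?'.
std::string latin1Bytes(const std::u32string &s)
{
    std::string out;
    for (const char32_t cp : s)
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
    return out;
}

// Percent-encode every byte that is not ASCII alphanumeric or one of "-._~/".
std::string percentEncode(const std::string &bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : bytes) {
        const unsigned char b = static_cast<unsigned char>(ch);
        const bool keep =
            (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
            (b >= '0' && b <= '9') ||
            b == '-' || b == '.' || b == '_' || b == '~' || b == '/';
        if (keep) {
            out += ch;
        } else {
            out += '%';
            out += hex[b >> 4];
            out += hex[b & 0x0F];
        }
    }
    return out;
}
```

For "/Código.txt", percentEncode(utf8Bytes(...)) gives /C%C3%B3digo.txt,
while percentEncode(latin1Bytes(...)) gives /C%F3digo.txt. On a Latin 1
file system those two would name different files.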

Also note that, despite being Unicode-friendly, Windows is not a UTF-8 
locale. It retains a legacy 8-bit encoding that isn't UTF-8 (except for 
Vietnamese).

-- 
  Thiago Macieira  -  thiago (AT) macieira.info - thiago (AT) kde.org
    PGP/GPG: 0x6EF45358; fingerprint:
    E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358


More information about the kde-core-devel mailing list