Character encodings (UTF16)

Waldo Bastian bastian at kde.org
Wed Feb 9 16:01:33 GMT 2005


On Wednesday 09 February 2005 15:50, Andras Mantia wrote:
> Hi,
>
>  I'm sending this to core-devel, as it affects many applications
> including Kate (and everything using Katepart), KEdit, Konqueror and
> maybe others. There seems to be a problem with handling certain
> UTF16 encoded files. The question is whether the problem is in KDE/Qt
> or the files in question are broken. Attached is a file that renders
> fine in Firefox and Opera, the reporter says that it was saved in NVU,
> while it shows up as garbage in Konqueror, Kate if opened as UTF16.  In
> KEdit it's the same as in Kate when opened in UTF8 mode ("space" after
> every character), while Konqueror in UTF8 mode shows the source.
>  Does anybody know if this is a real problem (wrong handling of such
> files) or a problem in the file itself? Certainly for the user it
> looks like a real problem, especially since there are applications out
> there that can work with the file. If I run
> "recode utf16LE..utf16 filename" on it, the resulting file can be opened
> in every KDE application.

I assume that the LE designation stands for "little endian" and that Qt 
defaults to "big endian". I believe one is supposed to insert a BOM (byte 
order mark) so that applications can distinguish reliably between utf16LE 
and utf16BE. The spaces that you see in utf8 mode are the NUL values from 
the high bytes of each 16-bit character.
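
As a rough illustration of the BOM idea (in Python, not Qt's actual codec 
code): the sketch below checks the first two bytes of a buffer for a UTF-16 
byte order mark. The function name detect_utf16_bom is made up for this 
example, and it assumes the data is already known to be some form of UTF-16.

```python
import codecs


def detect_utf16_bom(data: bytes):
    """Return 'utf-16-le' or 'utf-16-be' based on a leading BOM, else None.

    Hypothetical helper for illustration; assumes the input is UTF-16 of
    some kind (a UTF-32 LE BOM also begins with the bytes FF FE).
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


# ASCII characters in little-endian UTF-16 carry a NUL high byte after each
# character, which is what shows up as a "space" when read as UTF-8.
text_le = "<html>".encode("utf-16-le")
with_bom = codecs.BOM_UTF16_LE + text_le
without_bom = text_le
```

With these inputs, detect_utf16_bom(with_bom) reports little endian, while 
detect_utf16_bom(without_bom) returns None, which is the case where an 
application has to fall back on a default or a heuristic.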

I think it would be possible for konqueror to detect LE and BE by looking for 
"<NUL" versus "NUL<" and adjust accordingly. It would be easier if there were 
a separate "utf16le" codec.
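
A minimal sketch of that heuristic, again in Python rather than anything 
konqueror actually does: it walks the buffer in 16-bit units and reports the 
byte order implied by the first "<". The name guess_utf16_byte_order is 
hypothetical, and the approach assumes a BOM-less HTML-like file whose first 
"<" comes before any character that could be mistaken for it.

```python
def guess_utf16_byte_order(data: bytes):
    """Guess the UTF-16 byte order from the first '<' found in the data.

    Hypothetical heuristic for illustration only. Scans aligned 2-byte
    units so that "<NUL" (little endian) and "NUL<" (big endian) are not
    confused with each other at odd offsets.
    """
    for i in range(0, len(data) - 1, 2):
        unit = data[i:i + 2]
        if unit == b"<\x00":   # '<' followed by NUL: little endian
            return "utf-16-le"
        if unit == b"\x00<":   # NUL followed by '<': big endian
            return "utf-16-be"
    return None  # no '<' seen; caller must fall back to a default


le_bytes = "  <html>".encode("utf-16-le")
be_bytes = "  <html>".encode("utf-16-be")
guessed = guess_utf16_byte_order(le_bytes)
decoded = le_bytes.decode(guessed)
```

The guessed codec name can be passed straight to decode(), as in the last 
line. This is only a guess: a file with no "<" at all, or with unusual 
characters before it, would need a different fallback.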

Cheers,
Waldo
-- 
bastian at kde.org   |   Free Novell Linux Desktop 9 Evaluation Download
bastian at suse.com  |   http://www.novell.com/products/desktop/eval.html
