Character sets / encoding

Anne Wilson cannewilson at googlemail.com
Thu Sep 10 09:51:55 BST 2009


On Thursday 10 September 2009 07:54:25 Patrick Nagel wrote:
> The real problem with charsets and encodings is that you always have to
>  tell the interpreting program (browser, mail/news reader, ... whichever
>  program wants to show the bits from the net in a readable form) which
>  charset (and encoding) was actually used to encode the message, so
>  that it can choose the matching decoder.
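As a rough illustration of a program choosing the decoder from the declared charset, here is a minimal Python sketch using the standard library's email package. The message bytes are made up for the example (they are not James's actual mail), and the Content-Transfer-Encoding header is an assumption added so the 8-bit body is well-formed.

```python
from email import message_from_bytes

# Hypothetical raw message: one header block, then six ISO-8859-1 bytes.
raw = (
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"\xe4\xeb\xef\xf6\xfc\xf1\r\n"
)

msg = message_from_bytes(raw)

# The charset the sender declared in the Content-Type header.
charset = msg.get_content_charset()

# The undecoded body bytes, then the text obtained with the declared decoder.
payload = msg.get_payload(decode=True)
text = payload.decode(charset).strip()

print(charset, text)
```

Without the charset parameter the reader would have nothing but the six raw bytes to go on.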
> 
> If this information is not given, there is no option other than guessing, and
> everybody knows that computers are not good at that. How would a computer
>  know what the string 'äëïöüñ' from James should actually look like if he
>  hadn't specified the encoding in the header (open the source of
>  his mail and you will see the following line: Content-Type: text/plain;
>  charset="iso-8859-1")? The computer could then (for example) have guessed
>  that those bits were supposed to mean "潆秭" ("eddy billion" in Chinese)...
>  OK, I admit, I cheated a bit on this one - it wouldn't have been a valid
>  byte sequence for a GBK decoder, which any sane guessing algorithm would
>  have detected... but still, I think you get the point.
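To make the guessing problem concrete, the following Python sketch decodes the same six bytes with several decoders. The list of candidate codecs is an arbitrary choice for illustration, not what any particular mail reader tries; errors="replace" is used so that invalid sequences show up as U+FFFD instead of raising.

```python
# The bytes a reader would receive for 'äëïöüñ' sent as ISO-8859-1.
data = "äëïöüñ".encode("iso-8859-1")

# What the sender meant, recovered with the declared charset.
declared = data.decode("iso-8859-1")

# What a reader might show if it had to guess (candidate list is arbitrary).
guesses = {}
for codec in ("cp1252", "iso-8859-5", "gbk", "utf-8"):
    guesses[codec] = data.decode(codec, errors="replace")

for codec, shown in guesses.items():
    verdict = "same text" if shown == declared else "different text"
    print(f"{codec:12} {shown!r} ({verdict})")

# A strict UTF-8 decoder can at least tell that its guess is wrong.
try:
    data.decode("utf-8")
    utf8_valid = True
except UnicodeDecodeError:
    utf8_valid = False
```

Some guesses happen to agree with the intended text, others silently produce different characters, and only some decoders can detect that the bytes are invalid for them.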
> 
> So, people, use Unicode (the "universal charset") encoded as UTF-8 for
> everything - and maybe in a few years we can all forget about all this
> charset/encoding mess :)
> 
That explains a lot, thanks.
> 
> P.S.: I used Unicode/UTF-8 in this mail (and of course it's specified in
>  the mail's header), otherwise it wouldn't even have been possible to put
>  both Chinese characters and umlauts in one mail.
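The point about mixing scripts can be checked with a short Python sketch. The sample string simply reuses the characters quoted earlier in this mail; ISO-8859-1 stands in here for any single-byte legacy charset.

```python
# Umlauts and Chinese characters in one string.
text = "äëïöüñ 潆秭"

# UTF-8 can represent every character, so the round trip is lossless.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# ISO-8859-1 has no code points for the Chinese characters.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(round_trip == text, latin1_ok)
```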
> 
At least that looks hopeful for the future ;-)

Anne
-- 
New to KDE4? - get help from http://userbase.kde.org
Just found a cool new feature?  Add it to UserBase