Character sets / encoding
lewisp at avex.co.uk
Tue Sep 8 08:48:01 BST 2009
On Monday 07 Sep 2009 Peter Lewis sent:
> On Sunday 06 Sep 2009 Anne Wilson sent:
> > In KMail I have problems with accented characters, resulting in things like
> > L�ck. I assume this is a problem of character encoding. There doesn't
> > seem to be anywhere in systemsettings that I can check and possibly alter
> > that. Any suggestions?
> I find this sort of behaviour on several websites as viewed in Firefox,
> often where a British pound sign should be. When I view the source in a
> tool that shows the hexadecimal value of the characters (okteta for example)
> I find that they all have byte values greater than 0x7f, that is, beyond
> the 7-bit ASCII range.
> I just assumed that the funny negative question mark is a way of saying
> "what the heck".
> The characters that you sent were 0x4c 0xef 0xbf 0xbd 0x63 0x6b so I am not
> surprised that nothing much could be done with it.
May I enlarge on my rather hasty post of last night?
The sequence 0xef 0xbf 0xbd is the UTF-8 encoding of U+FFFD, the Unicode
replacement character. It is the decoding application's way of saying that
it met a byte sequence that did not fit the legal UTF-8 character encoding.
I have found two pages in Wikipedia that describe it better than I can:
http://en.wikipedia.org/wiki/UTF-8 will show the utf-8 encoding technique.
is the mapping of character sets onto the unicode number plane.
http://en.wikipedia.org/wiki/Unicode_Specials is the key page that describes
what can go wrong to give you the "what the heck?".
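
To make this concrete, here is a minimal Python sketch of how I think such a
sequence can arise. It assumes, and I cannot confirm this from the message
alone, that the original text was "Lück" encoded as ISO-8859-1 (Latin-1) and
then read by something expecting UTF-8. The helper name show_hex is my own,
purely for illustration.

```python
def show_hex(data: bytes) -> str:
    """Format bytes the way they were quoted above, e.g. '0x4c 0xef'."""
    return " ".join(f"0x{b:02x}" for b in data)


# ASSUMPTION: the sender's text was "Lück" in Latin-1, where u-umlaut
# is the single byte 0xfc.
latin1_bytes = "L\u00fcck".encode("latin-1")

# A lone 0xfc is never valid in UTF-8, so a tolerant decoder substitutes
# U+FFFD REPLACEMENT CHARACTER instead of failing.
decoded = latin1_bytes.decode("utf-8", errors="replace")

# Re-encoding that text as UTF-8 turns U+FFFD into three bytes.
reencoded = decoded.encode("utf-8")

print(show_hex(latin1_bytes))
print(show_hex(reencoded))
```

The second line printed should match the bytes I quoted from the message.
Note that once the substitution has happened the original 0xfc is gone, so
the receiving end has no way of recovering the intended character.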
I hope that this clears everything up, and gives haters of the big
red-mondster the smug feeling that it was all caused by a dirty "quick fix"
colliding with a well thought out solution!
This message is from the kde mailing list.
Account management: https://mail.kde.org/mailman/listinfo/kde.
More info: http://www.kde.org/faq.html.