Character sets / encoding

Peter Lewis lewisp at
Tue Sep 8 08:48:01 BST 2009

On Monday 07 Sep 2009 Peter Lewis sent:

> On Sunday 06 Sep 2009 Anne Wilson sent:
> > In KMail I have problems with accented characters, resulting in things like
> > L�ck.  I assume this is a problem of character encoding.  There doesn't
> > seem to be anywhere in systemsettings that I can check and possibly alter
> > that. Any suggestions?
> I find this sort of behaviour on several websites as viewed in Firefox,
> often where a British pound sign should be. When I view the source in a
> tool that shows the hexadecimal values of the characters (okteta for example)
> I find that they all have values greater than 0x7f, that is, beyond the
> 7-bit ASCII range that most character sets share.
> I just assumed that the funny negative question mark is a way of saying
> "what the heck".
> The characters that you sent were 0x4c 0xef 0xbf 0xbd 0x63 0x6b so I am not
> surprised that nothing much could be done with it.

May I enlarge on my rather hasty post of last night?

The sequence 0xef 0xbf 0xbd is the UTF-8 encoding of U+FFFD, the Unicode 
replacement character. It is the decoding application's way of saying that it 
met a byte sequence that did not fit the legal UTF-8 encoding.
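
As a rough illustration of the above, here is a small Python sketch. It decodes the six bytes quoted earlier in the thread, then shows one way they could have come about. The original word "Lück" and the Latin-1 transport are my assumptions for the sake of the example; the thread does not say what was actually typed.

```python
# Bytes quoted earlier in this thread for the garbled word.
received = bytes([0x4C, 0xEF, 0xBF, 0xBD, 0x63, 0x6B])

# They are valid UTF-8; the middle three bytes decode to U+FFFD.
text = received.decode("utf-8")
print([f"U+{ord(ch):04X}" for ch in text])

# Assumption (not stated in the thread): the sender typed "Lück"
# and it was encoded as Latin-1, giving 4C FC 63 6B.
original = "Lück".encode("latin-1")

# 0xFC can never start a legal UTF-8 sequence, so a decoder using
# errors="replace" substitutes U+FFFD for it.
decoded = original.decode("utf-8", errors="replace")

# Re-encoding the result as UTF-8 yields the bytes that were received.
reencoded = decoded.encode("utf-8")
print(" ".join(f"0x{b:02x}" for b in reencoded))
```

Once the substitution has happened the original byte is gone, which is why nothing much can be done with the text at the receiving end.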

I have found some pages in Wikipedia that describe it better than I can. One 
shows the UTF-8 encoding technique. Another covers the mapping of character 
sets onto the Unicode number plane. The key page is the one that describes 
what can go wrong to give you the "what the heck?".
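
For readers who would rather see the encoding technique as code, here is a minimal hand-rolled sketch in Python. The function name utf8_encode is my own, it skips validation (surrogates, range checks), and it is only meant to show the bit layout, not to replace the built-in codec. The pound sign example assumes the page was saved as Latin-1, where the pound sign is the single byte 0xa3.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (sketch, no validation)."""
    if cp < 0x80:                       # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                      # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # 4 bytes: 11110xxx then three 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

pound = utf8_encode(0x00A3)         # pound sign, two bytes in UTF-8
replacement = utf8_encode(0xFFFD)   # replacement character, three bytes

# A lone 0xa3 (the Latin-1 pound sign) is a continuation byte with no
# lead byte before it, so a strict UTF-8 decoder rejects it.
try:
    b"\xa3".decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False
```

A lenient decoder would put U+FFFD in place of that lone byte instead of raising an error, which is the funny question mark seen in the browser.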

I hope that this clears everything up and, for haters of the big red-mondster, 
provides the smug feeling that it was all caused by a dirty "quick fix" 
colliding with a well thought out solution!

Kind regards,
Peter Lewis

This message is from the kde mailing list.