Messed-up encoding in doc-comments (in the popup-widget)

Andreas Pakulat apaku at gmx.de
Tue Jul 7 22:16:55 UTC 2009


On 07.07.09 23:58:53, David Nolden wrote:
> On Tuesday 07 July 2009 14:53:59, Milian Wolff wrote:
> > Hey guys, I want to fix bug https://bugs.kde.org/show_bug.cgi?id=183182
> > but am a bit lost. Can someone give me a bit of insight?
> >
> > What I found out is that it only affects doc-comments and that it is not
> > due to formatComment().
> >
> > 1) How can I properly debug this? kDebug() doesn't seem to work
> > properly. Could I use GDB to print the text and see where it
> > gets corrupted?
> >
> > 2) Could the popup-widget be the culprit? Where are its sources again?
> Generally, the comments are supposed to be UTF-8 encoded within the duchain,
> and kDebug() should work properly with them. If it doesn't, then that's
> probably already part of the problem.
> 
> One thing comes to mind. Look at
> kdevelop/languages/cpp/preprocessjob.cpp: I think there's a @todo somewhere
> in there, saying something like "convert the file to utf-8 if it isn't yet",
> and I think that todo is still there.
> 
> That would need a check like "If the local encoding is not utf-8, convert the 
> text before processing it".
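
On the debugging question (1) quoted above: when the output channel itself
might be re-encoding the text, looking at the raw bytes is usually more
reliable than printing the string. A minimal sketch in plain C++, where
hexDump() is a hypothetical helper and not something that exists in KDevelop:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of KDevelop): render every byte of a
// string as two lowercase hex digits, separated by single spaces.
std::string hexDump(const std::string& bytes)
{
    std::string out;
    char buf[4];
    for (unsigned char c : bytes) {
        std::snprintf(buf, sizeof buf, "%02x ", c);
        out += buf;
    }
    if (!out.empty())
        out.pop_back(); // drop the trailing space
    return out;
}
```

Dumping the comment at each stage (after reading the file, after the
preprocessor, when handing it to the widget) should show where it changes.
For example, "ä" is "c3 a4" in UTF-8 and "e4" in Latin-1; if it shows up as
"c3 83 c2 a4", UTF-8 bytes were decoded as Latin-1 and encoded again. With
Qt types, QByteArray::toHex() should give similar output.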

I'd just like to add that "use the local encoding" might not necessarily
help. Most distros use UTF-8 as the default encoding, but the user may have
files from "back then" when he was using KOI-8, latin9 or whatever.

As selecting the right encoding for each file is not going to work for the
background parser, I'm wondering whether maybe we should try to use the
KEncodingProber class from kdelibs (IIRC that's the name of the newer one,
using the algorithms developed by Mozilla to detect the encoding)?
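
To illustrate the general shape of "detect, then convert", here is a minimal
sketch in plain C++ that deliberately does not use kdelibs. It only checks
whether the bytes are well-formed UTF-8 and otherwise assumes Latin-1; that
fixed fallback is an assumption made for the example and is exactly the part
a real prober would replace. All function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8. Rejects truncated sequences,
// overlong forms, surrogates and code points above U+10FFFF.
bool isValidUtf8(const std::string& s)
{
    static const unsigned long minCp[5] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (c < 0x80) { ++i; continue; }                  // plain ASCII
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;            // stray continuation or invalid lead
        if (i + len > n) return false;                    // truncated
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;        // not 10xxxxxx
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minCp[len]) return false;                // overlong form
        if (cp > 0x10FFFF) return false;                  // out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        i += len;
    }
    return true;
}

// Latin-1 to UTF-8: every byte is the code point U+0000..U+00FF itself.
std::string latin1ToUtf8(const std::string& s)
{
    std::string out;
    for (unsigned char c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Keep valid UTF-8 untouched; otherwise assume Latin-1 (an assumption).
std::string toUtf8(const std::string& raw)
{
    return isValidUtf8(raw) ? raw : latin1ToUtf8(raw);
}
```

This would leave UTF-8 files alone and repair Latin-1 ones, but a KOI-8 file
would be converted wrongly, because validity checking alone cannot tell the
legacy 8-bit encodings apart. That is where statistical detection would come
in.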

Andreas

-- 
Of course you have a purpose -- to find a purpose.




More information about the KDevelop-devel mailing list