Messed-up encoding in doc-comments (in the popup-widget)

Andreas Pakulat apaku at gmx.de
Wed Jul 8 08:44:57 UTC 2009


On 08.07.09 09:22:24, David Nolden wrote:
> Am Mittwoch 08 Juli 2009 08:46:15 schrieb Andreas Pakulat:
> > On 08.07.09 08:07:38, David Nolden wrote:
> > > Am Mittwoch 08 Juli 2009 00:16:55 schrieb Andreas Pakulat:
> > > > As selecting the right encoding for each file is not going to work for
> > > > the background parser, I'm wondering whether maybe we should try to use
> > > > the KEncodingProber class from kdelibs (IIRC that's the name of the newer
> > > > one, using the algorithms developed by Mozilla to detect the encoding)?
> > >
> > > This would slow down the parsing even more. So I guess we will have to
> > > bother the user with the decision at some point.
> >
> > Hmm, good point, would be nice to have some stats though on how much the
> > encoding-probing influences the parsing process.
> >
> > In addition to what I answered to Milian, we could also offer a way of just
> > converting a given file into utf-8 if the user tells us the source
> > encoding? That should actually be pretty easy with Qt's classes...
> The question is whether probing even works correctly with source files
> that contain many non-plain characters.

Hmm, that probably depends on the source encoding. Encodings that are based
on ASCII should be a lot easier to detect properly than others.

Another problem is a file that mixes different encodings. I'm not sure if
there's an encoding that is completely separate from ASCII (i.e. the first
128 byte values don't map to the ASCII characters), but if there is, there
could be a C++ file with the code in plain ASCII and the comments in that
other encoding. Such a file cannot be decoded correctly with any single
encoding.
 
> We should definitely convert anything we get into utf-8 if we know it's 
> something else.

Yeah, hence a context menu option to "convert this file to utf-8",
"convert this directory and subdirs to utf-8".

> That encoding-conversion thing sounds like a useful utility,

It's as easy as

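// f is a QFile for the file to convert
f.open( QIODevice::ReadOnly );
QTextStream is(&f);
is.setCodec( QTextCodec::codecForName( <user-given-name> ) );
QString data = is.readAll();
f.close();
f.open( QIODevice::WriteOnly | QIODevice::Truncate );
QTextStream os(&f);
os.setCodec( QTextCodec::codecForName( "UTF-8" ) );
os << data;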

(no testing done, but should work)
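
For readers without Qt at hand, the same idea can be sketched in plain
C++ for one specific source encoding. This assumes the input is
ISO-8859-1 (Latin-1), where each byte value equals its Unicode code
point, so no lookup table is needed. The function name latin1ToUtf8 is
made up for this example; other encodings would need a real codec such
as the one QTextCodec provides.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// Assumes the input really is Latin-1; no detection is attempted.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            // ASCII range: identical in Latin-1 and UTF-8.
            utf8.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8;
}
```

ASCII-only input comes back unchanged, so running this over a file that
contains no high bytes is harmless. Running it twice over the same
non-ASCII file is not, since the second pass would treat the UTF-8 bytes
as Latin-1 again.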

> although it could 
> create quite a mess on larger projects when done on a per-file basis. It's 
> better when such a thing is solved project-wide.

See above, we can easily provide an option that converts all project files
in addition to the single-file option. I'd just like to have the latter
available so you can convert individual files that use a different encoding.

Oh, and of course the option would need to ask for the source encoding, and
we somehow need to assemble a list of the encodings that Qt supports...

Andreas

-- 
You will be awarded the Nobel Peace Prize... posthumously.



