Patch v4: Fix KEncodingProber crashiness

Sat Mar 21 15:03:20 GMT 2009

On Samstag 21 März 2009, Jeff Mitchell wrote:
> Peter Oberndorfer wrote:
> > But what if the current prober has not detected anything?
> > Then the current confidece in the prober is still 0.0
> > 
> > The intention of the code is IMO when a unicode BOM is found
> > a confidence of 0.99 is returned.
> > 
> > At least i think it did this until 845343
> 
> Unsure...I haven't looked at the history of the code, just how it is
> now.  Regardless, in today's code, the variable was a total do-nothing,
> as the confidence in the prober is always what is returned.
> 

I think it is very important to look at the past and see how it got into the broken state, and how it worked before.

If we just look at the current state we can thow away the whole unicodeTest, since it does nothing useful now.

It says it found something.
But it does not set a probability.
And it does not tell us what it thinks it found.

It does not even feed the data to the normal probers if the unicodeTest found something.

The class description says "Always do Unicode probe regardless the ProberType"
so we probably should not just rip it out but fix it.

Maybe we could remove the unicodeTest, add another proberUnicode with
the same nsCharsetProber interface like the other probers.
And then first feed to proberUnicode and afterwards to prober.
Bit I haven't fully thought about it yet.

> >>> 5)
> >>> I think the prober is leaked on setProberType()?
> >> Yes, it is.  Fixed in the attached v3 patch.
> >>
> > 
> > I think a part of this patch is already applied now 942108 ?
> 
> Yes, you're right.  I wish Wang had not done this, so that the finalized
> patch could be committed in one SVN revision.
> 
> >> +        if (prober)
> >> +            delete prober;
> >> +
> > 
> > Why is the if needed?
> > it is legal to delete a 0 pointer.
> > The destructor above does this already.
> 
> It's not.  I can fix it if you care (attached).

Yes please.

> 
> >> +    QString encoding;
> > 
> > Why is this a QString?
> > it is only ever assigned static char strings.
> > and then it is only tested once with isEmpty but never used anywhere else?
> > I think it should somehow be returned in encodingNameByteArray
> > when unicodeTest() found something.
> 
> Better idea: remove it entirely, as it's basically a no-op.  In attached.

I don't think this patch is enough since it still sets proberState.
Without doing something useful actually.

Additionally David Faure posted some comments regarding the new funcion on kde-commits.

> * Missing @since 4.3
Or @since 4.2.? in case it is backported.

> * Why not name it encoding() ?
> encodingNameByteArray reads very hackish...
I also have to say the name sounds strange
now we still have a chance to choose a better name.

Also we should fix the class description with the usage example.
Which also has a typo:
QTextCodec::codeForName -> code_c_ForName

Meanwhile i will try to come up with some unit tests for the unicode stuff.
And see how(if at all) QTextCodec handles the encoding names unicodeTest wants to return.

Greetings Peter