Patch v3: Fix KEncodingProber crashiness
Peter Oberndorfer
kumbayo84 at arcor.de
Sat Mar 21 12:52:42 GMT 2009
On Samstag 21 März 2009, Jeff Mitchell wrote:
> Yes, these are some issues which I didn't look at as I was just focused
> on the crash and the encoding variable. :-)
>
> Peter Oberndorfer wrote:
> > 4)
> > currentConfidence is set when unicodeTest detects something.
> > But the value is never used.
> > So confidence 0.0 is returned i think?
>
> No, the confidence returned is the confidence of the prober itself.
> currentConfidence itself is, as you surmised, never used. It's removed
> in the attached v3 patch.
>
But what if the current prober has not detected anything?
Then the current confidece in the prober is still 0.0
The intention of the code is IMO when a unicode BOM is found
a confidence of 0.99 is returned.
At least i think it did this until 845343
For example katebuffer.cpp feeds in some data and then checks for confidence > 0.5
without ever looking at the found it state.
Maybe this breaks?
> > 5)
> > I think the prober is leaked on setProberType()?
>
> Yes, it is. Fixed in the attached v3 patch.
>
I think a part of this patch is already applied now 942108 ?
> > 6)
> > uselesss #define MINIMUM_THRESHOLD ?
>
> Yep. Also fixed.
>
> --Jeff
>
> Index: kencodingprober.cpp
> ===================================================================
> --- kencodingprober.cpp (revision 942070)
> +++ kencodingprober.cpp (working copy)
> @@ -34,15 +34,12 @@
>
> #include <string.h>
>
> -#define MINIMUM_THRESHOLD (float)0.2
> -
> class KEncodingProberPrivate
> {
> public:
> - KEncodingProberPrivate(): encoding(strdup("")), prober(NULL), mStart(true) {};
> + KEncodingProberPrivate(): prober(NULL), mStart(true) {};
> ~KEncodingProberPrivate()
> {
> - delete encoding;
> delete prober;
> }
> void setProberType(KEncodingProber::ProberType pType)
> @@ -53,6 +50,10 @@
> * for single-byte encodings (most western encodings), nsSBCSGroupProber is ok,
> * because encoding state machine can detect many such encodings.
> */
> +
> + if (prober)
> + delete prober;
> +
Why is the if needed?
it is legal to delete a 0 pointer.
The destructor above does this already.
> switch (proberType) {
> case KEncodingProber::None:
> prober = NULL;
> @@ -130,17 +131,13 @@
> break;
> } // switch
>
> - if (encoding && strlen(encoding))
> - {
> + if (!encoding.isEmpty())
> proberState = KEncodingProber::FoundIt;
> - currentConfidence = 0.99f;
> - }
> }
> }
> KEncodingProber::ProberType proberType;
> KEncodingProber::ProberState proberState;
> - float currentConfidence;
> - const char *encoding;
> + QString encoding;
Why is this a QString?
it is only ever assigned static char strings.
and then it is only tested once with isEmpty but never used anywhere else?
I think it should somehow be returned in encodingNameByteArray
when unicodeTest() found something.
> kencodingprober::nsCharSetProber *prober;
> bool mStart;
> };
> @@ -201,12 +198,18 @@
> return d->proberState;
> }
>
> +//DEPRECATED, do *not* use
> const char* KEncodingProber::encodingName() const
> {
> + return 0;
Shouldn't we return a strdup'ed text here?
It leaks memory but still works.
something like (warning untested!):
return strdup(encodingNameByteArray().constData());
or at least fix katebuffer.cpp to not call the old function?
> +}
> +
> +QByteArray KEncodingProber::encodingNameByteArray() const
> +{
> if (!d->prober)
> - return strdup("UTF-8");
> + return QByteArray("UTF-8");
>
> - return strdup(d->prober->GetCharSetName());
> + return QByteArray(d->prober->GetCharSetName());
> }
Greetings Peter
More information about the kde-core-devel
mailing list