Hi, some comments about encoding detection (KEncodingDetector)
Wang Hoi
zealot.hoi at gmail.com
Mon Aug 4 09:47:49 BST 2008
Here is a modified patch that introduces KEncodingProber (KEncodingDetector2
was a bad name), with a cleaner and more powerful interface:
class KDECORE_EXPORT KEncodingProber
{
public:
    enum ProberState {
        FoundIt, // sure
        NotMe,   // sure not
        Probing  // initial state or not sure
    };
    enum ProberType {
        Universal,
        Arabic,
        ..........
        Unicode,
        WesternEuropean
    };
    KEncodingProber(ProberType proberType=Universal);
    ~KEncodingProber();
    void reset();
    ProberState feed(const QByteArray &data);
    ProberState feed(const char* data, int len);
    ProberState getState() const;
    const char* getEncoding() const;
    float getConfidence() const; // 0.0 ~ 0.99
private:
    KEncodingProberPrivate* const d;
};
The user can feed data to it continuously until the ProberState changes
from Probing to FoundIt or NotMe. While the state is still Probing, the
user can also call getEncoding() and getConfidence() to get the most
likely encoding guessed from the data fed so far.
It is meant to *guess* the encoding of raw text; it cannot read the
encoding directly from HTML/XML declarations (such as
<?xml encoding="xxx" ?>).
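To make the intended calling pattern concrete, here is a rough sketch of the
feed loop. It is only an illustration under assumptions: StubProber is a
hypothetical stand-in that copies the method names from the class above so the
example builds without kdecore or Qt, and its "detection" is a toy that only
recognises a UTF-8 byte order mark. Guess and guessEncoding() are names made up
for this sketch, and returning an empty encoding for NotMe is my own choice,
not something the patch defines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in mirroring the proposed interface (not the real class).
// Toy logic: FoundIt once a UTF-8 BOM is seen, otherwise it keeps Probing.
class StubProber
{
public:
    enum ProberState { FoundIt, NotMe, Probing };

    void reset() { m_seen.clear(); m_state = Probing; }

    ProberState feed(const char *data, int len)
    {
        if (m_state != Probing)
            return m_state;
        // Only the first three bytes matter for the BOM check.
        const std::size_t room = 3 - m_seen.size();
        m_seen.append(data, std::min(room, static_cast<std::size_t>(len)));
        if (m_seen.size() == 3 && m_seen.compare(0, 3, "\xEF\xBB\xBF") == 0)
            m_state = FoundIt;
        return m_state;
    }

    ProberState getState() const { return m_state; }
    // Placeholder answers; a real prober would rank candidate encodings.
    const char *getEncoding() const { return m_state == FoundIt ? "UTF-8" : "ISO-8859-1"; }
    float getConfidence() const { return m_state == FoundIt ? 0.99f : 0.5f; }

private:
    std::string m_seen;
    ProberState m_state = Probing;
};

// Result of one probing run (illustrative type, not part of the patch).
struct Guess {
    std::string encoding; // empty if the prober answered NotMe
    float confidence;
    bool certain;         // true only when the prober reached FoundIt
};

// Feed raw bytes chunk by chunk and stop as soon as the prober leaves the
// Probing state. If the data runs out first, report the best guess so far.
template <typename Prober>
Guess guessEncoding(Prober &prober, const std::string &raw, int chunkSize)
{
    assert(chunkSize > 0);
    prober.reset();
    for (std::size_t pos = 0; pos < raw.size(); pos += chunkSize) {
        const int len = static_cast<int>(
            std::min<std::size_t>(chunkSize, raw.size() - pos));
        if (prober.feed(raw.data() + pos, len) != Prober::Probing)
            break;
    }
    if (prober.getState() == Prober::NotMe)
        return {"", 0.0f, true};
    return {prober.getEncoding(), prober.getConfidence(),
            prober.getState() == Prober::FoundIt};
}
```

A caller would construct a prober once, pass it to guessEncoding() together
with the raw bytes, and treat the result as a hint whenever certain is false.
With the real class the same loop should work through the QByteArray overload
of feed() as well.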
-------------- next part --------------
A non-text attachment was scrubbed...
Name: encodingDetection.patch.tar.bz2
Type: application/x-bzip2
Size: 126607 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20080804/7507b737/attachment.bin>