Bugfix and Improvements on encoding probe

Sat Mar 28 12:36:40 UTC 2009

This patch first fixes a bug:

TagLib::String::toCString() only works correctly when
TagLib::String::isLatin() returns true, because TagLib internally converts
each wchar directly to a char and drops the high bits (e.g. 0xec0b -> 0x0b).
So the patch skips the encoding probe when isLatin() returns false.
This fixes a bug where utf-8 encoding could not be detected, because the
source string fed to the prober was already wrong. With this fix, many mp3
files with utf-8 encoded id3v1 tags are detected correctly.
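
The truncation described above can be reproduced without TagLib. The following is only a sketch: narrowLossy() and fitsInLatin1() are hypothetical stand-ins written for illustration, not TagLib functions and not code from the patch.

```cpp
#include <string>

// Hypothetical stand-in for the lossy conversion: every wide character
// is cut down to its low 8 bits, so 0xec0b becomes 0x0b.
std::string narrowLossy(const std::wstring &wide)
{
    std::string out;
    for (wchar_t wc : wide)
        out.push_back(static_cast<char>(static_cast<unsigned long>(wc) & 0xFF));
    return out;
}

// Hypothetical guard: true when every character fits in one byte,
// which is the only case where narrowLossy() loses nothing.
bool fitsInLatin1(const std::wstring &wide)
{
    for (wchar_t wc : wide)
        if (static_cast<unsigned long>(wc) > 0xFF)
            return false;
    return true;
}
```

A caller would check fitsInLatin1() first and hand the narrowed bytes to a prober only when it returns true, since otherwise the bytes no longer match what was stored in the tag.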

Secondly, at the end of the readMetaData() function, the encoding probe is
performed again, in two steps:

1. isLatin() returns true (this skips probing for these encodings: ucs4, utf-16
le/be, and it is the situation in which the prober can work correctly)
2. while isLatin() returns false (single byte encoding), AND
errorsIfUtf8() returns true (to be sure that it is 100% not utf-8).

errorsIfUtf8() is a function that returns true when the string is 100% not
utf-8, and false when it has a good chance of being utf-8 encoded.
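
A check with the same contract as errorsIfUtf8() could look roughly like the sketch below. This is an assumption about how such a test might be written, not the implementation in the attached patch; the name looksInvalidAsUtf8() is hypothetical. It is deliberately simplified and does not reject every overlong or surrogate sequence.

```cpp
#include <cstddef>
#include <string>

// true  -> the bytes cannot be utf-8
// false -> the bytes are structurally valid utf-8 (plain ASCII included)
bool looksInvalidAsUtf8(const std::string &bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // continuation bytes expected after lead
        if (lead < 0x80)
            extra = 0;          // ASCII
        else if (lead >= 0xC2 && lead <= 0xDF)
            extra = 1;          // 2-byte sequence
        else if (lead >= 0xE0 && lead <= 0xEF)
            extra = 2;          // 3-byte sequence
        else if (lead >= 0xF0 && lead <= 0xF4)
            extra = 3;          // 4-byte sequence
        else
            return true;        // stray continuation byte or invalid lead

        if (extra > n - i - 1)
            return true;        // sequence is cut off at the end

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                return true;    // continuation bytes must be 10xxxxxx
        }
        i += extra + 1;
    }
    return false;
}
```

Text in a single byte encoding with accented characters usually fails this check quickly, because a high byte followed by an ASCII byte is never valid utf-8. That is what makes a true result a safe signal to run the prober.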

The above two steps solve the encoding problem for id3v2 or other
tags that are not utf-8/16 encoded.
They also avoid probing blindly in this situation: the probe runs only
when we are sure the encoding is wrong,
to avoid displaying strange glyphs to the user.

The patch also includes the same modifications to the collectionscanner.

Regards,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: amarok.patch
Type: application/octet-stream
Size: 14061 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/amarok/attachments/20090328/49e10b60/attachment.obj>