[Fwd: [Bug 200596] [Patch] id3v1 japanese characters encoding]
mook at songbirdnest.com
Mon Nov 16 19:52:53 CET 2009
On 2009-11-15 1:58 PM, Jeff Mitchell wrote:
> We've had some trouble with character set detection that Cesar has
> managed to narrow down to TagLib stripping the Unicode BOM from its
> strings -- see the attached message, and if you need more background the
> last few comments of the associated bug report. Would it be possible to
> provide one of these two solutions?
For reference, Songbird has a local patch to do B) - it exposes whether
the string was constructed with the Latin1 or one of the Unicode types.
That's an API change, of course, so it probably wouldn't be valid for
upstream until 2.0.
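
To make option B) concrete, here is a minimal sketch of the idea, not the actual Songbird patch. It is a string wrapper that remembers which encoding it was constructed from, so a caller only runs charset detection on text that arrived as Latin1. The names TaggedString, SourceEncoding and needsCharsetDetection are hypothetical and are not TagLib API.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical: the encoding the tag data claimed when the string was built.
enum class SourceEncoding { Latin1, UTF8, UTF16 };

// Hypothetical wrapper; TagLib::String itself does not expose this.
class TaggedString {
public:
    TaggedString(std::u16string text, SourceEncoding enc)
        : text_(std::move(text)), enc_(enc) {}

    const std::u16string &text() const { return text_; }

    // True only if the caller constructed the string from Latin1 bytes.
    bool constructedFromLatin1() const { return enc_ == SourceEncoding::Latin1; }

private:
    std::u16string text_;
    SourceEncoding enc_;
};

// Only strings that claimed to be Latin1 are candidates for charset guessing;
// anything that arrived as Unicode is trusted as-is.
bool needsCharsetDetection(const TaggedString &s) {
    return s.constructedFromLatin1();
}

// Usage example.
void exampleUsage() {
    TaggedString fromId3v1(u"abc", SourceEncoding::Latin1);
    TaggedString fromUtf16Frame(u"abc", SourceEncoding::UTF16);
    assert(needsCharsetDetection(fromId3v1));
    assert(!needsCharsetDetection(fromUtf16Frame));
}
```

The point of carrying the flag is that the decision no longer depends on the characters themselves, only on how the string was built.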
TagLib::String::isLatin1() can be used for now, but it will still give you
false positives: it _is_ valid to have metadata that was stored as Unicode
but happens to contain only Latin1 characters (some of them non-ASCII), and
that will be misdetected. At least, I don't trust my charset detector that
much :)
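
A rough illustration of why a content-only check is ambiguous, as a self-contained sketch. looksLatin1 is a hypothetical stand-in for the kind of test isLatin1() does (every character fits in one Latin1 byte), and fromLatin1 is a hypothetical helper. The byte values are the Shift-JIS encoding of "日本".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an isLatin1()-style check:
// true when every character is below 256.
bool looksLatin1(const std::u16string &s) {
    for (char16_t c : s) {
        if (c >= 256) return false;
    }
    return true;
}

// Hypothetical helper: decode bytes as Latin1 (byte value == code point).
std::u16string fromLatin1(const std::vector<unsigned char> &bytes) {
    std::u16string out;
    for (unsigned char b : bytes) out.push_back(static_cast<char16_t>(b));
    return out;
}

void exampleAmbiguity() {
    // Shift-JIS bytes for "日本" sitting in a tag that claims Latin1.
    std::u16string mislabeled = fromLatin1({0x93, 0xFA, 0x96, 0x7B});
    // "Café" read from a genuine Unicode frame.
    std::u16string genuine = u"Caf\u00e9";
    // Real Japanese text read from a Unicode frame.
    std::u16string japanese = u"\u65e5\u672c";

    assert(looksLatin1(mislabeled));  // wanted: run the charset detector
    assert(looksLatin1(genuine));     // false positive: also looks Latin1
    assert(!looksLatin1(japanese));   // correctly left alone
}
```

The first two strings are indistinguishable to the check, so the Unicode "Café" would be handed to the charset detector as well.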
(The patch is http://timeline.songbirdnest.com/vendor/changeset/10855
which doesn't make much sense until you realize we had a previous patch
in http://timeline.songbirdnest.com/vendor/changeset/10852 which felt
mook at songbirdnest