Generic API useless with Unicode for ID3v2?
Vitali Lovich
vlovich at gmail.com
Wed Oct 17 11:55:52 CEST 2007
Linus Walleij wrote:
> 2007/10/17, Vitali Lovich <vlovich at gmail.com>:
>
>
>> The better question is why UTF-8 isn't used as the default encoding
>> everywhere.
>>
>
> IIRC the main reason for not using UTF-8 is that there are many old
> programs out there which cannot handle it, since it was introduced with
> ID3v2.4, and taglib is actually one of the few libs which can handle
> it correctly.
>
> The interoperability and likeness to ASCII don't help with ID3v2: each
> frame is tagged with an encoding, and when that carries the enumeration
> value for UTF-8 (0x03), which is unknown to implementations written
> prior to 2.4, they will just bail out and set the string to <null>.
>
> Linus
>
Then the behaviour is as follows: for 2.4 frames, UTF-8 is used and the
frame identifies itself as such. For earlier versions, UTF-8 is still
used, but when saving back the frame is encoded and identified as UTF-16
if Latin1 would be a lossy encoding (otherwise Latin1 is used).
Additionally, all Latin1 frames read and all Latin1 strings passed in are
interpreted as UTF-8. This way, data from buggy or quirky implementations
that write UTF-8 and label it as Latin1 will still give correct and
consistent behaviour to the clients using the library.
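
To make the proposal concrete, here is a rough Python sketch of the rules as I read them. It is not taglib code: the function names are made up for illustration, and the fallback to Latin1 when the bytes are not valid UTF-8 is my own assumption, since the proposal does not say what should happen in that case. The encoding byte values are the ones ID3v2 text frames use.

```python
# Encoding byte values used by ID3v2 text frames.
LATIN1, UTF16, UTF16BE, UTF8 = 0x00, 0x01, 0x02, 0x03

# Python codec names for the encodings this sketch writes.
CODECS = {LATIN1: "latin-1", UTF16: "utf-16", UTF8: "utf-8"}


def choose_encoding(text, minor_version):
    """Hypothetical helper: pick the encoding byte to write for a frame."""
    if minor_version >= 4:
        return UTF8  # 2.4 frames use UTF-8 and say so
    try:
        text.encode("latin-1")
        return LATIN1  # lossless in Latin1, so keep Latin1
    except UnicodeEncodeError:
        return UTF16  # Latin1 would be lossy on a pre-2.4 tag


def encode_text(text, minor_version):
    """Hypothetical helper: encoding byte followed by the encoded text."""
    enc = choose_encoding(text, minor_version)
    # Python's "utf-16" codec prepends a BOM, as encoding 0x01 expects.
    return bytes([enc]) + text.encode(CODECS[enc])


def decode_latin1_frame(payload):
    """Hypothetical helper: read a frame labelled Latin1, trying UTF-8 first.

    Assumption: if the bytes are not valid UTF-8, treat them as real Latin1.
    """
    try:
        return payload.decode("utf-8")
    except UnicodeDecodeError:
        return payload.decode("latin-1")
```

With this, a frame labelled Latin1 that actually holds UTF-8 bytes comes back as the intended text, while genuine Latin1 bytes such as b"caf\xe9" are not valid UTF-8 and so take the fallback path.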