Generic API useless with Unicode for ID3v2?

Vitali Lovich vlovich at gmail.com
Wed Oct 17 11:55:52 CEST 2007



Linus Walleij wrote:
> 2007/10/17, Vitali Lovich <vlovich at gmail.com>:
>
>   
>> The better question is why UTF-8 isn't used as the default encoding
>> everywhere.
>>     
>
> IIRC the main reason for not using UTF-8 is that there are many old
> programs out there which cannot handle it, since it was introduced with
> ID3v2.4 and taglib is actually one of the few libs which can handle
> it correctly.
>
> The interoperability and likeness to ASCII doesn't help ID3v2: each
> frame is tagged with an encoding, and when that has the enumeration
> value for UTF-8 (0x03), which is unknown to old implementations done
> prior to 2.4, they will just bail out and set the string to <null>.
>
> Linus
>
>   
Then the behaviour should be: for 2.4 frames, UTF-8 is used and the
frame identifies itself as such. For earlier versions, UTF-8 is still
what the API uses, but when saving back the frame is encoded and
identified as UTF-16 if Latin1 would be a lossy encoding (otherwise
Latin1 is used).
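
A rough sketch of that save rule, in Python rather than taglib's own
C++ so that it stays short. The names choose_save_encoding and
encode_text are hypothetical and are not part of taglib; the encoding
byte values are the ones defined by the ID3v2 specifications.

```python
# ID3v2 text-encoding bytes (0x03 is only defined from ID3v2.4 on).
LATIN1, UTF16, UTF8 = 0x00, 0x01, 0x03


def choose_save_encoding(text: str, tag_minor_version: int) -> int:
    """Pick the encoding byte for a text frame (hypothetical helper)."""
    if tag_minor_version >= 4:
        # 2.4 frames: write UTF-8 and label it as such.
        return UTF8
    try:
        # Older tags: Latin1 is fine as long as nothing is lost.
        text.encode("latin-1")
    except UnicodeEncodeError:
        # Latin1 would be lossy, so fall back to UTF-16.
        return UTF16
    return LATIN1


def encode_text(text: str, tag_minor_version: int) -> bytes:
    """Return the encoding byte followed by the encoded text."""
    encoding = choose_save_encoding(text, tag_minor_version)
    # Python's "utf-16" codec prepends a BOM, as encoding 0x01 requires.
    codec = {LATIN1: "latin-1", UTF16: "utf-16", UTF8: "utf-8"}[encoding]
    return bytes([encoding]) + text.encode(codec)
```

With this, a plain "abc" in a 2.3 tag stays Latin1, while a Japanese
title in the same tag is written as UTF-16 and in a 2.4 tag as UTF-8.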

Additionally, all Latin1 frames read from a file and all Latin1 strings
passed in are interpreted as UTF-8. This way, data from buggy or quirky
implementations that write UTF-8 but identify it as Latin1 will still
be handled correctly and consistently for clients using the library.
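
The read side could look roughly like the following Python sketch.
The function name decode_latin1_labelled is hypothetical, and the
fallback to real Latin1 when the bytes are not valid UTF-8 is an
assumption added here; the proposal above only says that such data is
interpreted as UTF-8.

```python
def decode_latin1_labelled(data: bytes) -> str:
    """Decode the payload of a frame whose encoding byte says Latin1."""
    try:
        # Treat the bytes as UTF-8 first; pure ASCII decodes the same
        # either way, and mislabelled UTF-8 comes out correctly.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: bytes that are not valid UTF-8 are genuine Latin1.
        # Every byte value is valid Latin1, so this cannot fail.
        return data.decode("latin-1")
```

The known weak spot is genuine Latin1 text that happens to form valid
UTF-8 sequences, which this sketch would decode as UTF-8.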

