store text using UTF-8

moowei emailmoo at gmail.com
Sat Jun 17 09:35:04 CEST 2006


Does TagLib::ID3v2::FrameFactory::instance()->setDefaultTextEncoding(TagLib::String::UTF8)
have any effect on reading (parsing) an existing file?

The reason I ask is that many ID3 text frames are not written in one
of the standard encodings specified in the ID3 manual. For instance,
Big5, GB2312, Shift_JIS, EUC-KR and so on are used to encode text.
Thus, the encoding byte in a Text Identification Frame really doesn't
help TagLib during parsing. The encoding byte may be set to $00, $01,
or some other number. None of them is correct, because the text is
neither Latin1 nor Unicode.
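
To make the problem concrete, here is a minimal sketch of how the
encoding byte sits in front of the text in a frame body. It does not
use TagLib; the names DeclaredEncoding, declaredEncoding and rawText
are made up for illustration. The values $00 and $01 are defined in
ID3v2.3, while $02 and $03 were added in ID3v2.4.

```cpp
#include <cstdint>
#include <vector>

// Encodings that the first byte of an ID3v2 text frame body can declare.
// (Hypothetical enum for this sketch, not a TagLib type.)
enum class DeclaredEncoding { Latin1, Utf16WithBom, Utf16BE, Utf8, Unknown };

// Map the leading byte of a text frame body to the encoding it declares.
DeclaredEncoding declaredEncoding(const std::vector<std::uint8_t>& frameBody) {
    if (frameBody.empty()) return DeclaredEncoding::Unknown;
    switch (frameBody[0]) {
        case 0x00: return DeclaredEncoding::Latin1;        // ISO-8859-1
        case 0x01: return DeclaredEncoding::Utf16WithBom;  // UTF-16 with BOM
        case 0x02: return DeclaredEncoding::Utf16BE;       // ID3v2.4 only
        case 0x03: return DeclaredEncoding::Utf8;          // ID3v2.4 only
        default:   return DeclaredEncoding::Unknown;
    }
}

// Return the bytes that follow the encoding byte, without decoding them.
std::vector<std::uint8_t> rawText(const std::vector<std::uint8_t>& frameBody) {
    if (frameBody.size() < 2) return {};
    return std::vector<std::uint8_t>(frameBody.begin() + 1, frameBody.end());
}
```

A frame body such as { 0x00, 0xA4, 0xA4, 0xA4, 0xE5 } declares Latin1,
but nothing in the frame itself says whether the four bytes after it
really are Latin1 or a legacy multi-byte encoding.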

If I am correct, TagLib converts this "text" into UTF-16BE at the
very beginning of constructing the frame object, so text in a
non-standard encoding doesn't get parsed correctly. I understand that
TagLib shouldn't need to consider these non-standard cases, but:

Is there a way to access the "raw" text data (before it is converted
into UTF-16) using TagLib?

What I am trying to do is convert this non-standard encoded text into
Unicode through iconv, but I need to access the raw text in the first
place.
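
One case where the raw bytes might still be recoverable is when the
encoding byte is $00: decoding as Latin1 maps every byte to the code
point with the same value, so nothing is lost and the step can be
undone. The sketch below shows only that round trip in plain C++.
decodeLatin1 and recoverBytes are hypothetical helpers, not TagLib
functions. I assume TagLib::String::to8Bit(false) does the same kind of
narrowing on a real frame, but I have not verified that.

```cpp
#include <string>

// Decode bytes as ISO-8859-1: each byte becomes the UTF-16 code unit
// with the same numeric value (0x00..0xFF).
std::u16string decodeLatin1(const std::string& bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Undo decodeLatin1. Returns false if any code unit is above U+00FF,
// which means the text did not come from a Latin1 decode and the
// original bytes cannot be recovered this way.
bool recoverBytes(const std::u16string& text, std::string& bytes) {
    bytes.clear();
    for (char16_t unit : text) {
        if (unit > 0xFF) return false;
        bytes.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
    }
    return true;
}
```

The recovered bytes could then be handed to iconv with the real source
encoding (Big5, Shift_JIS, ...). This would not help for frames whose
encoding byte claims UTF-16, since those bytes are interpreted in pairs.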

Regards,

Moo


On 6/16/06, Scott Wheeler <wheeler at kde.org> wrote:
> On Thursday 15 June 2006 21:09, moowei wrote:
>
> > I seemed to be complicating the issue. Are there examples regarding
> > character encoding?
>
> You can set the default using:
>
> TagLib::ID3v2::FrameFactory::instance()->setDefaultTextEncoding(TagLib::String::UTF8);
>
> -Scott
>
> --
> For a successful technology, reality must take precedence over public
> relations, for nature cannot be fooled.
> --Richard Feynman
