Question about taglib abilities

Антон Сергунов setosha at gmail.com
Sun Jul 15 17:55:25 UTC 2012


Yes.

Usually an MP3 file has both ID3v1 and ID3v2 tags.

ID3v1 should have all text fields Latin-1 encoded.
But for historical reasons (thanks to the Winamp player) it usually has the
local Windows encoding instead, because that was the only way to save
non-Latin-1 strings.

So TagLib has the TagLib::ID3v1::Tag::setStringHandler() function to override
how ID3v1 strings are decoded and encoded.


ID3v2 has an encoding field and can use UTF-16 or UTF-8.
But I have seen with my own eyes ID3v2 fields saved the same way: bytes in
the local Windows encoding inside a field marked as Latin-1.
I think some software copies ID3v1 tags to ID3v2 as is.
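
To make the "encoding field" concrete, here is a minimal sketch that splits the payload of an ID3v2 text frame into the encoding the tag claims and the untouched text bytes. It uses only the standard library and is not TagLib code; the names ClaimedEncoding, TextFramePayload, splitTextFrame and encodingName are made up for illustration. It assumes you already have the frame payload (the bytes after the frame header) in memory. Values 0 and 1 are defined in ID3v2.2 and ID3v2.3; values 2 and 3 were added in ID3v2.4.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Text-encoding byte values from the ID3v2 specifications.
// 0 and 1 exist in ID3v2.2/2.3; 2 and 3 were added in ID3v2.4.
enum class ClaimedEncoding { Latin1, Utf16WithBom, Utf16BE, Utf8, Unknown };

// Hypothetical container: what the tag claims, plus the raw text bytes.
struct TextFramePayload {
    ClaimedEncoding encoding;
    std::vector<std::uint8_t> raw;  // bytes after the encoding byte, unconverted
};

// Hypothetical helper: read the first payload byte as the encoding marker
// and return the remaining bytes exactly as they are stored.
TextFramePayload splitTextFrame(const std::vector<std::uint8_t>& payload) {
    TextFramePayload out{ClaimedEncoding::Unknown, {}};
    if (payload.empty()) {
        return out;  // no encoding byte at all
    }
    switch (payload[0]) {
        case 0: out.encoding = ClaimedEncoding::Latin1; break;
        case 1: out.encoding = ClaimedEncoding::Utf16WithBom; break;
        case 2: out.encoding = ClaimedEncoding::Utf16BE; break;
        case 3: out.encoding = ClaimedEncoding::Utf8; break;
        default: out.encoding = ClaimedEncoding::Unknown; break;
    }
    out.raw.assign(payload.begin() + 1, payload.end());
    return out;
}

// Human-readable name for a claimed encoding.
const char* encodingName(ClaimedEncoding e) {
    switch (e) {
        case ClaimedEncoding::Latin1:       return "ISO-8859-1";
        case ClaimedEncoding::Utf16WithBom: return "UTF-16 with BOM";
        case ClaimedEncoding::Utf16BE:      return "UTF-16BE";
        case ClaimedEncoding::Utf8:         return "UTF-8";
        default:                            return "unknown";
    }
}
```

If the claimed encoding is ISO-8859-1 but the raw bytes contain values in the range 0x80..0x9F, that is a strong hint that the tagger really wrote a Windows code page, because ISO 8859-1 has only control characters there.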

That's why I told you to check TagLib::String::isLatin1() and then convert to
Unicode from the user's local Windows encoding.
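
The conversion step could look roughly like the sketch below, for the case where the local Windows encoding is cp1252. It is plain standard C++, not part of TagLib; kCp1252High, appendUtf8 and cp1252ToUtf8 are names invented for this example. It assumes you already have the 8-bit bytes of the string (with TagLib, I believe String::to8Bit(false) gives you those once isLatin1() returns true, but please check that against your TagLib version). For another code page such as cp1251 you would need a different, larger table.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Unicode code points for cp1252 bytes 0x80..0x9F.
// 0 marks the five bytes that Windows-1252 leaves undefined.
static const char32_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

// Append one code point as UTF-8. Everything cp1252 can produce fits
// in the Basic Multilingual Plane, so three bytes are the maximum.
static void appendUtf8(std::string& out, char32_t cp) {
    assert(cp <= 0xFFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: treat each input byte as a cp1252 character
// and return the same text encoded as UTF-8.
std::string cp1252ToUtf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        char32_t cp = b;  // 0x00..0x7F and 0xA0..0xFF match ISO 8859-1
        if (b >= 0x80 && b <= 0x9F) {
            cp = kCp1252High[b - 0x80];
            if (cp == 0) {
                cp = 0xFFFD;  // undefined in cp1252: use the replacement character
            }
        }
        appendUtf8(out, cp);
    }
    return out;
}
```

Note that German umlauts sit in the range 0xA0..0xFF, where cp1252 and ISO 8859-1 agree, so for German titles the difference only shows up in characters such as curly quotes, dashes or the euro sign.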

2012/7/16 Christian Convey <christian.convey at gmail.com>

> So is the following correct regarding libtag, and mp3 files using
> ID3v2.2?  (Please forgive me if I'm getting too off-topic of libtag.
> I'll stop asking whenever you like.)
>
> The "latin1" coding, strictly speaking, refers to ISO 8859-1.
>
> What Microsoft calls "cp1250" and "cp1252" are also eight-bit
> character encodings.  These two encodings are consistent with ISO
> 8859-1, but they are supersets of 8859-1.
>
> When an ID3v2.2 tag has a field (such as Title) encoded in cp1250 or
> cp1252, the field's "type" byte will indicate that the "latin1"
> encoding is being used.  However, there is not enough information in
> the ID3 metadata to know with certainty whether the actual code page
> is cp1250, cp1252, or something else.
>
> Is that it?
>
> Thanks again.
> - Christian
>
> On Sun, Jul 15, 2012 at 12:40 PM, Антон Сергунов <setosha at gmail.com>
> wrote:
> > Windows cp1250 or cp1252 for German.
> >
> >
> > 2012/7/15 Christian Convey <christian.convey at gmail.com>
> >>
> >> Thanks very much, that's a big help.
> >>
> >> Do you happen to know if it's common for MP3 tagging software to use
> >> character encodings *other than* the five valid ID3v2 encodings
> >> (latin1, UTF16, UTF16BE, UTF16LE, and UTF8) ?
> >>
> >> I'm trying to anticipate how many different character encodings I'll
> >> have to try out when debugging this MP3 file.
> >>
> >> Thanks,
> >> Christian
> >>
> >> On Sun, Jul 15, 2012 at 11:27 AM, Антон Сергунов <setosha at gmail.com>
> >> wrote:
> >> > TagLib doesn't convert strings. It reads the encoding (String::Type)
> >> > and the raw data (ByteVector) from the file.
> >> > You can then perform the conversion with String::toWString(), but
> >> > before that it contains the raw byte data from the file.
> >> >
> >> > But I can't find a function to get the type enum here.
> >> > So you can get the raw data with String::data(Type t).
> >> >
> >> >
> >> > 2012/7/15 Christian Convey <christian.convey at gmail.com>
> >> >>
> >> >> Thanks.  But this is actually a podcast run by someone else:
> >> >> http://www.dw.de/dw/0,,2548,00.html
> >> >>
> >> >> So actually fixing the problem is outside of my power.  What I'd like
> >> >> to do is research the problem with their mp3 files carefully, so that
> >> >> I can tell them precisely what the problem is.
> >> >>
> >> >> (For example, "Your mp3 tagging software is claiming that the text is
> >> >> encoded using UTF-8, but it's actually UTF-16.")
> >> >>
> >> >> On Sun, Jul 15, 2012 at 10:52 AM, Антон Сергунов <setosha at gmail.com>
> >> >> wrote:
> >> >> > The most common ID3 encoding problem is the use of a local 8-bit
> >> >> > Windows encoding in Latin1 fields. You can use a special Latin1
> >> >> > handler or (this works better for me), if the string is Latin1,
> >> >> > convert it from the local 8-bit Windows encoding to Unicode.
> >> >> >
> >> >> > On 15.07.2012 21:35, "Christian Convey"
> >> >> > <christian.convey at gmail.com> wrote:
> >> >> >>
> >> >> >> I'm new to ID3 tag handling.  Can you tell me if taglib can be used
> >> >> >> to solve a particular problem?
> >> >> >>
> >> >> >> I have MP3 files from a podcast, and I suspect that there's an
> >> >> >> inconsistency between the actual encoding of the ID3v2.2 Title
> >> >> >> field and the byte that states what encoding is used for that
> >> >> >> string.
> >> >> >>
> >> >> >> Can taglib tell me which encoding the file *claims* to have for
> >> >> >> that field?
> >> >> >>
> >> >> >> And can I get taglib to give me the bytes in the ID3v2.2 Title
> >> >> >> field *without* taglib automatically performing some kind of
> >> >> >> character-encoding translation?
> >> >> >> _______________________________________________
> >> >> >> taglib-devel mailing list
> >> >> >> taglib-devel at kde.org
> >> >> >> https://mail.kde.org/mailman/listinfo/taglib-devel