[Digikam-devel] [Bug 132244] Special Chars in Keywords decode wrong in IPTC

Loïc Brarda loic.brarda21 at fnac.net
Wed Oct 4 12:23:42 BST 2006


2006/10/4, Gilles Caulier <caulier.gilles at free.fr>:

> Note: my comment #1 is still right. UTF8 is not supported by IPTC. If an application tries to embed a UTF8 string in an IPTC tag, then the IPTC specification is not respected. Look here:
>
> http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf
>

To me, the specification is not that clear on this point.

The character set can be defined in the envelope record (dataset 1:90),
which is normally not used (as I understand the spec, the whole thing
was designed to encapsulate pictures in an IIMV file, not to embed IPTC
info in picture files).

Other sections of the specification make me think UTF8 is possible:

"Section 1.12 DataSet octet sizes do not imply character sizing. The number of
characters will depend on the encoding method specified. The number of octets
specified within a DataSet Data Field Octet Count will always be equal to or
greater than the number of characters of data represented."
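
To make the octets-versus-characters point concrete, here is a small Python
sketch. It is not taken from the spec; the keyword is just an illustrative
value. It compares the character count of a string with its octet count
under Latin-1 and under UTF-8:

```python
# Hypothetical keyword containing one non-ASCII character (i with diaeresis).
keyword = "Lo\u00efc"

latin1_octets = keyword.encode("latin-1")  # one octet per character here
utf8_octets = keyword.encode("utf-8")      # U+00EF takes two octets in UTF-8

# Character count, then octet count under each encoding.
print(len(keyword), len(latin1_octets), len(utf8_octets))  # prints: 4 4 5
```

With a multi-byte encoding the octet count exceeds the character count, which
is the situation the quoted sentence seems to allow for.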

There is also the definition of UTF8 in Section 1.75.

The more standard way would probably be to use a record 1 with a 1:90
dataset to declare UTF8, but I think most programs just write UTF8
directly into the text fields.
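
As a rough sketch of both approaches, the Python below builds standard IIM
DataSets (tag marker 0x1C, record number, dataset number, two-octet length,
data) and decodes keywords (2:25). It assumes that UTF8 is declared in 1:90
with the ISO 2022 escape sequence ESC % G. The function names are made up
for this example, and the "try UTF8, else Latin-1" fallback for files without
a 1:90 dataset is only a guess at a reasonable policy, not something any
particular program is known to do:

```python
import struct

TAG_MARKER = 0x1C        # every IIM DataSet starts with this octet
UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 escape sequence announcing UTF-8


def make_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Build a standard DataSet: marker, record, dataset, 2-octet length, data."""
    if len(data) > 0x7FFF:
        raise ValueError("extended DataSets are not handled in this sketch")
    return struct.pack(">BBBH", TAG_MARKER, record, dataset, len(data)) + data


def parse_datasets(blob: bytes):
    """Yield (record, dataset, data) for each standard DataSet in blob."""
    pos = 0
    while pos + 5 <= len(blob) and blob[pos] == TAG_MARKER:
        _, record, dataset, size = struct.unpack_from(">BBBH", blob, pos)
        if size & 0x8000:
            break  # extended DataSet: out of scope here
        pos += 5
        yield record, dataset, blob[pos:pos + size]
        pos += size


def decode_keywords(blob: bytes) -> list:
    """Decode 2:25 keywords, honouring a 1:90 UTF-8 declaration if present."""
    items = list(parse_datasets(blob))
    declared_utf8 = any(
        (r, d) == (1, 90) and data == UTF8_ESCAPE for r, d, data in items
    )
    keywords = []
    for r, d, data in items:
        if (r, d) != (2, 25):
            continue
        if declared_utf8:
            keywords.append(data.decode("utf-8", errors="replace"))
        else:
            # No declaration: guess UTF-8 first, then Latin-1 (an assumption).
            try:
                keywords.append(data.decode("utf-8"))
            except UnicodeDecodeError:
                keywords.append(data.decode("latin-1"))
    return keywords


# "Standard" way: declare UTF-8 in 1:90, then store the keyword as UTF-8.
declared = make_dataset(1, 90, UTF8_ESCAPE) + make_dataset(
    2, 25, "\u00e9t\u00e9".encode("utf-8")
)
# What many programs seem to do: UTF-8 in the text field, no 1:90 at all.
undeclared = make_dataset(2, 25, "\u00e9t\u00e9".encode("utf-8"))

print(decode_keywords(declared) == decode_keywords(undeclared))  # prints: True
```

The guess in the undeclared case can be wrong for some Latin-1 strings that
happen to be valid UTF8, which is one argument for writing the 1:90 dataset.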

After some googling, I found the following page
(http://bugs.php.net/bug.php?id=27238) with links to files containing
Record 1 charset info, but unfortunately the links are broken.
I also found some links to IPTC software advertising UTF8 support.

I'll try to do some tests with different IPTC-writing software.

   Loic


