<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi all<br>

    <br>

    I've been using digiKam for a long time, but one thing I keep

    stumbling over is interoperability between the

    various forms of JPEG comments.<br>

    I usually view my files in digiKam and Gwenview, as well as Photoshop

    and FastStone Image Viewer on Windows.<br>

    So far I haven't found an acceptable way to tag my images so that the

    comment displays correctly in most of these programs.<br>

    <br>

    I found this old thread explaining some charsets of the various

    fields:<br>

    <a class="moz-txt-link-freetext" href="http://mail.kde.org/pipermail/digikam-users/2006-October/002116.html">http://mail.kde.org/pipermail/digikam-users/2006-October/002116.html</a><br>

    It says:<br>

    - JFIF is converted from latin1<br>

    - EXIF UserComment may provide a charset, else some 'autodetection'

    takes place<br>

    - IPTC is converted from latin1<br>

    - XMP wasn't supported then...<br>

    <br>

    With some testing I found that digiKam reads the tags in the

    following order:<br>

    - Xmp.dc.description<br>

    - Xmp.exif.UserComment<br>

    - Xmp.tiff.ImageDescription<br>

    - JFIF Comment ("Jpeg comment")<br>

    - Exif.Photo.UserComment<br>

    - Iptc.Application2.Caption (envelope encoding not honored)<br>
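    <br>

    The read order above amounts to a first-match lookup. The following is only a sketch of the behaviour I observed, not digiKam's actual code; the name pick_comment and the dictionary layout are made up for illustration.<br>

```python
# Priority order as observed in my tests (not taken from digiKam's source).
READ_ORDER = [
    "Xmp.dc.description",
    "Xmp.exif.UserComment",
    "Xmp.tiff.ImageDescription",
    "JFIF Comment",
    "Exif.Photo.UserComment",
    "Iptc.Application2.Caption",
]


def pick_comment(tags):
    """Return (key, text) for the first non-empty field in READ_ORDER, else None."""
    for key in READ_ORDER:
        text = tags.get(key, "").strip()
        if text:
            return key, text
    return None


# Hypothetical metadata: the XMP description wins over the Exif comment.
example = {
    "Exif.Photo.UserComment": "from Exif",
    "Xmp.dc.description": "from XMP",
}
print(pick_comment(example))
```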

    <br>

    All the Xmp.*.* tags seem to be read and written as UTF-8, which is

    correct as far as I know.<br>

    However, the JFIF comment is also written as UTF-8, which is at least

    questionable, since the standard doesn't define any charset at all as

    far as I know (and this also seems to have changed since the above

    discussion in 2006).<br>
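    <br>

    To illustrate why an undeclared charset in the JFIF comment is a problem, here is a small sketch of what a reader sees when it guesses wrong. The helper names are my own; no real viewer is claimed to work exactly like this.<br>

```python
def read_as_latin1(raw):
    """Decode bytes the way a Latin-1-only reader would (never fails)."""
    return raw.decode("latin-1")


def read_as_utf8(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


text = "äöü"
utf8_bytes = text.encode("utf-8")      # 6 bytes, two per umlaut
latin1_bytes = text.encode("latin-1")  # 3 bytes, one per umlaut

# A Latin-1 reader turns every UTF-8 umlaut into two unrelated characters.
print(read_as_latin1(utf8_bytes))
# A UTF-8 reader cannot decode the Latin-1 bytes and shows replacement marks.
print(read_as_utf8(latin1_bytes))
```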

    <br>

    a) Now when we come to EXIF, things get hairy:<br>

    I've prepared a JPEG file with exiv2 and inserted an

    Exif.Photo.UserComment using Unicode (read back with exiv2 -pv

    image.jpg; I've added the complete tag name to the comment to

    recognize where it comes from later on):<br>

    0x9286 Photo        UserComment                 Undefined  88 

    charset="Unicode" Commentwithäöü. (Exif.Photo.UserComment)<br>

    <br>

    Now when viewing in digiKam, the Xmp.dc.description tag is used in

    the GUI since it's present as well. If I change the text and save

    again, the comment shows up as:<br>

    0x9286 Photo        UserComment                 Undefined  23 

    charset="Ascii" Commentwith���.<br>

    <br>

    Thus the text was converted to ISO-8859-1 and the charset specified

    as Ascii - isn't that wrong, since it's definitely not ASCII but

    ISO-8859-1? Why doesn't digiKam use charset="Unicode"?<br>
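    <br>

    For reference, the Exif UserComment value starts with an 8-byte character code ("ASCII", "JIS", "UNICODE" or eight zero bytes for undefined), followed by the text. The sketch below rebuilds both values and reproduces the sizes 88 and 23 reported by exiv2. The function name build_user_comment is hypothetical, and little-endian byte order for the Unicode case is an assumption (it does not affect the length).<br>

```python
# 8-byte character code prefixes defined by the Exif specification for UserComment.
CHARSET_CODES = {
    "Ascii": b"ASCII\x00\x00\x00",
    "Jis": b"JIS\x00\x00\x00\x00\x00",
    "Unicode": b"UNICODE\x00",
    "Undefined": b"\x00" * 8,
}


def build_user_comment(text, charset):
    """Build a raw UserComment value: 8-byte character code plus encoded text."""
    if charset == "Unicode":
        # Assumption: little-endian; the real byte order follows the Exif block.
        body = text.encode("utf-16-le")
    elif charset == "Ascii":
        # Strict ASCII: raises UnicodeEncodeError for the umlauts.
        body = text.encode("ascii")
    else:
        raise ValueError("charset not handled in this sketch: " + charset)
    return CHARSET_CODES[charset] + body


original = "Commentwithäöü. (Exif.Photo.UserComment)"
print(len(build_user_comment(original, "Unicode")))  # 8 + 2 * 40 = 88

# What the rewritten tag appears to contain: an "Ascii" code plus Latin-1 bytes.
rewritten = CHARSET_CODES["Ascii"] + "Commentwithäöü.".encode("latin-1")
print(len(rewritten))  # 8 + 15 = 23
```

    A strict writer would refuse to label the second value as "Ascii", which is why I think "Unicode" would be the safer choice whenever the text contains non-ASCII characters.<br>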

    <br>

    b) Iptc.Application2.Caption:<br>

    According to that mail from 2006, IPTC data is always

    encoded/decoded as latin1, though elsewhere I found that one

    can (or should) set Iptc.Envelope.CharacterSet to declare the

    character set used. This appears to be ignored by digiKam...<br>
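    <br>

    As far as I understand it, Iptc.Envelope.CharacterSet holds an ISO 2022 escape sequence, and ESC % G (bytes 1B 25 47) announces UTF-8. A reader that honored it could look roughly like the sketch below. The function name decode_iptc_text is hypothetical, and the Latin-1 fallback is an assumption that simply mirrors the behaviour described in the 2006 mail.<br>

```python
UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 escape sequence that announces UTF-8


def decode_iptc_text(raw, character_set=None):
    """Decode an IPTC text dataset, honoring the envelope character set if given.

    Assumption: anything other than the UTF-8 escape falls back to Latin-1.
    """
    if character_set == UTF8_ESCAPE:
        return raw.decode("utf-8")
    return raw.decode("latin-1")


caption = "Commentwithäöü.".encode("utf-8")
print(decode_iptc_text(caption, UTF8_ESCAPE))  # envelope honored
print(decode_iptc_text(caption))               # envelope ignored: garbled umlauts
```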

    <br>

    c) Question about XMP "lang"<br>

    One thing I still do not understand is the lang="..." attribute in

    XMP comments: what exactly does it mean? Is it just there to add

    multiple entries in different languages? Does it affect the

    encoding at all, or is it really always UTF-8?<br>

    <br>

    Thank you very much<br>

    <br>

    Matt<br>


  </body>

</html>