<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi all<br>
<br>
I've been using digiKam for a long time, but one thing I keep
stumbling over is interoperability between the various forms of
JPEG comments.<br>
I usually view my files in digiKam and Gwenview, as well as in Photoshop
and FastStone Image Viewer on Windows.<br>
So far I haven't found an acceptable way to tag my images so that the
comment displays correctly in most of these programs.<br>
<br>
I found this old thread explaining the charsets used for the various
fields:<br>
<a class="moz-txt-link-freetext" href="http://mail.kde.org/pipermail/digikam-users/2006-October/002116.html">http://mail.kde.org/pipermail/digikam-users/2006-October/002116.html</a><br>
It says:<br>
- JFIF is converted from latin1<br>
- EXIF UserComment may provide a charset, else some 'autodetection'
takes place<br>
- IPTC is converted from latin1<br>
- XMP wasn't supported then...<br>
<br>
With some testing I found that digiKam reads the tags in the
following order:<br>
- Xmp.dc.description<br>
- Xmp.exif.UserComment<br>
- Xmp.tiff.ImageDescription<br>
- JFIF Comment ("Jpeg comment")<br>
- Exif.Photo.UserComment<br>
- Iptc.Application2.Caption (envelope encoding not honored)<br>
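<br>
To make the order above concrete, here is a minimal Python sketch of such a lookup. It only mirrors what I observed in my tests, not digiKam's actual code; the function name and the plain-dict metadata are made up for illustration.<br>
<pre>
```python
# Order observed in the tests described above; not taken from digiKam's source.
COMMENT_TAG_ORDER = [
    "Xmp.dc.description",
    "Xmp.exif.UserComment",
    "Xmp.tiff.ImageDescription",
    "JFIF Comment",
    "Exif.Photo.UserComment",
    "Iptc.Application2.Caption",
]


def pick_comment(metadata):
    """Hypothetical helper: return (tag, text) for the first non-empty tag."""
    for tag in COMMENT_TAG_ORDER:
        text = metadata.get(tag)
        if text:
            return tag, text
    return None, None


# Both tags are present, so the XMP one wins regardless of dict order.
example = {
    "Exif.Photo.UserComment": "from Exif",
    "Xmp.dc.description": "from XMP",
}
print(pick_comment(example))
```
</pre>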
<br>
All the Xmp.*.* tags seem to be read and written as UTF-8, which is
correct as far as I know.<br>
However, the JFIF comment is also written as UTF-8, which is at least
questionable, since as far as I know the standard doesn't define any
charset at all (and this behaviour also seems to have changed since the
above discussion in 2006).<br>
<br>
a) Now when we come to EXIF, things get hairy:<br>
I've prepared a JPEG file with exiv2 and inserted an
Exif.Photo.UserComment using Unicode (output of exiv2 -pv
image.jpg; I've added the complete tag name to the comment to
recognize where it comes from later on):<br>
0x9286 Photo UserComment Undefined 88
charset="Unicode" Commentwithäöü. (Exif.Photo.UserComment)<br>
<br>
Now when viewing in digiKam, the Xmp.dc.description tag is used in
the GUI since it's present as well. If I change the text and save
again, the comment shows up as:<br>
0x9286 Photo UserComment Undefined 23
charset="Ascii" Commentwith���.<br>
<br>
Thus the text was converted to ISO-8859-1 and the charset specified
as Ascii - isn't that wrong, since it's definitely not ASCII but
ISO-8859-1? Why doesn't digiKam use charset="Unicode"?<br>
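<br>
For reference, here is a minimal Python sketch of how I understand the Exif UserComment layout: an 8-byte character code followed by the text. The helper names are made up, and I'm assuming little-endian UCS-2 for "Unicode" (the byte order may differ between files). The resulting sizes match the 88 and 23 bytes reported by exiv2 above.<br>
<pre>
```python
# Exif UserComment: 8-byte character code, then the encoded text.
CHARACTER_CODES = {
    "ascii": b"ASCII\x00\x00\x00",
    "unicode": b"UNICODE\x00",
    "jis": b"JIS\x00\x00\x00\x00\x00",
    "undefined": b"\x00" * 8,
}


def build_user_comment(text, code, encoding):
    """Hypothetical helper: prefix the encoded text with the character code."""
    return CHARACTER_CODES[code] + text.encode(encoding)


def is_valid_ascii_comment(raw):
    """True only if a comment labelled ASCII contains nothing but 7-bit bytes."""
    if not raw.startswith(CHARACTER_CODES["ascii"]):
        return False
    return raw[8:].isascii()


# The comment I wrote with exiv2 (byte order is an assumption).
unicode_comment = build_user_comment(
    "Commentwithäöü. (Exif.Photo.UserComment)", "unicode", "utf-16-le"
)

# What the file looks like after saving: ISO-8859-1 bytes labelled as ASCII.
latin1_comment = build_user_comment("Commentwithäöü.", "ascii", "latin-1")

print(len(unicode_comment), len(latin1_comment))
print(is_valid_ascii_comment(latin1_comment))
```
</pre>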
<br>
b) Iptc.Application2.Caption:<br>
According to that mail from 2006, IPTC data is always
encoded/decoded as latin1, though elsewhere I found that one
can (or should) set Iptc.Envelope.CharacterSet to declare the
character set used. This appears to be ignored by digiKam...<br>
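<br>
To illustrate what honoring the envelope would mean: as far as I understand, Iptc.Envelope.CharacterSet holds an ISO 2022 escape sequence, and ESC % G is the one that announces UTF-8. A minimal Python sketch, with a hypothetical helper name and latin1 assumed as the fallback (as in the 2006 mail):<br>
<pre>
```python
# ESC % G: the ISO 2022 escape sequence that announces UTF-8.
UTF8_ESCAPE = b"\x1b%G"


def decode_iptc_caption(caption_bytes, envelope_charset=None):
    """Hypothetical helper: decode a caption, honoring the envelope charset.

    Falls back to latin1 when no UTF-8 marker is present (an assumption).
    """
    if envelope_charset == UTF8_ESCAPE:
        return caption_bytes.decode("utf-8")
    return caption_bytes.decode("latin-1")


raw = "Commentwithäöü.".encode("utf-8")

# Honoring the envelope gives the original text back.
print(decode_iptc_caption(raw, UTF8_ESCAPE))

# Ignoring it decodes the UTF-8 bytes as latin1 and garbles the umlauts.
print(decode_iptc_caption(raw))
```
</pre>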
<br>
c) Question about Xmp "lang"<br>
One thing I still do not understand is the lang="..." attribute in
Xmp comments - what exactly does it mean? Is it just for adding
multiple entries in different languages? Does it affect the
encoding at all, or is it really always UTF-8?<br>
<br>
Thank you very much<br>
<br>
Matt<br>
</body>
</html>