[Digikam-devel] [Bug 205824] EXIF UserComments with special characters get tagged as ASCII

Andreas Huggel ahuggel at gmx.net
Fri Sep 25 05:31:59 BST 2009


https://bugs.kde.org/show_bug.cgi?id=205824





--- Comment #8 from Andreas Huggel <ahuggel gmx net>  2009-09-25 06:31:55 ---
I found only a hidden hint that seems to point to UTF-16 for a "UNICODE"
UserComment. It's in the comments for tag ImageDescription, on page 22 of the
Exif specs: "When a 2-byte code is necessary, the Exif Private tag UserComment
is to be used".

Exiv2 doesn't do any conversion (yet...), it leaves it to the application to do
the right thing.

For comparison, Exiftool writes the UserComment tag with an Exif character code
"ASCII" if the text consists of only 7-bit characters, else it uses the Exif
character code "UNICODE" and encodes the text in UTF-16.
It encodes the UTF-16 string using the same byte order as the rest of the
Exif/TIFF structure and without a BOM.
On read it expects UTF-16 encoded text, has some intelligence to guess the
byte order, and interprets a BOM if there is one. It doesn't seem to have any
provision for UTF-8 encoded UserComment text, though.
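
To make the write and read logic above concrete, here is a minimal Python
sketch. It is not Exiftool's or Exiv2's actual code: the function names are
made up for illustration, and the zero-byte count used to guess the byte order
is only an assumed heuristic, since I don't know exactly what Exiftool does
there. The 8-byte character code prefixes are the ones the Exif spec defines
for UserComment.

```python
ASCII_CODE = b"ASCII\x00\x00\x00"   # Exif character code, padded to 8 bytes
UNICODE_CODE = b"UNICODE\x00"


def encode_user_comment(text: str, tiff_byte_order: str) -> bytes:
    """Build a UserComment value; tiff_byte_order is 'II' (little) or 'MM' (big)."""
    if all(ord(ch) < 0x80 for ch in text):
        # Only 7-bit characters: tag the comment as ASCII.
        return ASCII_CODE + text.encode("ascii")
    # Otherwise use UTF-16 in the byte order of the Exif/TIFF structure.
    # The explicit-endian codecs do not emit a BOM.
    codec = "utf-16-le" if tiff_byte_order == "II" else "utf-16-be"
    return UNICODE_CODE + text.encode(codec)


def guess_utf16_codec(payload: bytes, tiff_byte_order: str) -> str:
    """Assumed heuristic: for mostly Latin text the high byte of each unit is 0."""
    even_zeros = payload[0::2].count(0)
    odd_zeros = payload[1::2].count(0)
    if even_zeros > odd_zeros:
        return "utf-16-be"
    if odd_zeros > even_zeros:
        return "utf-16-le"
    # No evidence either way: fall back to the byte order of the TIFF structure.
    return "utf-16-le" if tiff_byte_order == "II" else "utf-16-be"


def decode_user_comment(raw: bytes, tiff_byte_order: str) -> str:
    """Decode a UserComment value tagged as ASCII or UNICODE."""
    code, payload = raw[:8], raw[8:]
    if code == ASCII_CODE:
        return payload.decode("ascii", errors="replace")
    if code != UNICODE_CODE:
        raise ValueError("unsupported Exif character code: %r" % code)
    # A BOM, if present, decides the byte order.
    if payload[:2] == b"\xff\xfe":
        return payload[2:].decode("utf-16-le", errors="replace")
    if payload[:2] == b"\xfe\xff":
        return payload[2:].decode("utf-16-be", errors="replace")
    codec = guess_utf16_codec(payload, tiff_byte_order)
    return payload.decode(codec, errors="replace")
```

With this sketch, a comment that was written big-endian still decodes
correctly when the surrounding TIFF structure is little-endian, as long as the
text contains enough Latin characters for the guess to work.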

Exiv2 should probably follow a similar logic eventually, although I'd think
that there are images with UTF-8 encoded UserComment tags out there in the
wild.
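
One way a reader could tolerate such UTF-8 comments is sketched below in
Python. This is purely an assumption about how it might be done, not something
Exiv2 or Exiftool implements: the function names are hypothetical, and the
test (no NUL bytes and valid UTF-8) can misfire on UTF-16 text without Latin
characters.

```python
def looks_like_utf8(payload: bytes) -> bool:
    """Heuristic: UTF-16 text with Latin characters contains NUL bytes, UTF-8 does not."""
    if b"\x00" in payload:
        return False
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_unicode_payload(payload: bytes, fallback_codec: str = "utf-16-le") -> str:
    """Decode the bytes that follow a "UNICODE" character code.

    fallback_codec is the UTF-16 variant to use when the payload does not
    look like UTF-8 (for example, chosen from the TIFF byte order).
    """
    if looks_like_utf8(payload):
        return payload.decode("utf-8")
    return payload.decode(fallback_codec, errors="replace")
```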

Andreas
