[Digikam-users] unicode chars break xmp sidecars?
Phil
philtuckey at free.fr
Fri May 16 00:19:40 BST 2014
Thanks for looking Gilles. This made me think I might be causing the
problem by something I do to my images, and I found the cause.
The problem is triggered by setting the IPTC record CodedCharacterSet to
UTF8. For example, with image.jpg which contains no IPTC records, run
exiftool -tagsfromfile @ -iptc:all -codedcharacterset=utf8 image.jpg
This creates two IPTC records, CodedCharacterSet (= ESC % G) and
EnvelopeRecordVersion (= 4). After this, the
unicode-tag-breaking-sidecars behaviour appears for image.jpg. (One can
verify that the problem is not caused by the EnvelopeRecordVersion record.)
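
For reference, here is a minimal Python sketch of what those two records look like on disk. It is an illustration only, not code taken from exiftool, Exiv2 or digikam. It assumes the standard short IIM DataSet header (tag marker 0x1C, record number, dataset number, 2-byte big-endian length), and the helper name iim_dataset is hypothetical.

```python
import struct

# ISO 2022 escape sequence "ESC % G", which designates UTF-8.
ESC_PERCENT_G = b"\x1b%G"


def iim_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Encode one IIM DataSet using the standard (short) 5-byte header.

    Hypothetical helper for illustration; extended DataSets (length
    field with the top bit set) are deliberately not handled.
    """
    if len(data) > 0x7FFF:
        raise ValueError("extended DataSets are not handled in this sketch")
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data


# 1:90 CodedCharacterSet = ESC % G
coded_character_set = iim_dataset(1, 90, ESC_PERCENT_G)
# 1:00 EnvelopeRecordVersion = 4, stored as a 2-byte binary number
envelope_record_version = iim_dataset(1, 0, struct.pack(">H", 4))

print(coded_character_set.hex(" "))      # 1c 01 5a 00 03 1b 25 47
print(envelope_record_version.hex(" "))  # 1c 01 00 00 02 00 04
```

The only bytes that differ in substance from an image without IPTC data are the three-byte escape sequence 1b 25 47, so that is the value a reader of the file has to cope with.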
I was led to set IPTC:codedcharacterset=utf8 by advice in the exiftool FAQ:
http://www.sno.phy.queensu.ca/~phil/exiftool/faq.html#Q10
This usage appears to be consistent with the IPTC IIM specification
pointed to from that page:
http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf
(I quote the relevant part below.)
So it looks like digikam should continue to write the xmp sidecars as
usual, when this record is set to utf8. Am I missing something?
I tried tagging such images in darktable, which I believe also uses
exiv2, and it wrote the sidecars correctly, which suggests the problem
is specific to digikam.
Best Philip
Quote from IPTC IIM specification v.4 rev.1:
"1.90 Coded Character Set
Optional, not repeatable, up to 32 octets, consisting of one or more
control functions used for the announcement, invocation or designation
of coded character sets. The control functions follow the ISO 2022
standard and may consist of the escape control character and one or more
graphic characters. For more details see Appendix C, the IPTC-NAA Code
Library.
The control functions apply to character oriented DataSets in records
2-6. They also apply to record 8, unless the objectdata explicitly, or
the File Format implicitly, defines character sets otherwise.
If this DataSet contains the designation function for Unicode in UTF-8
then no other announcement, designation or invocation functions are
permitted in this DataSet or in records 2-6.
..."
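
To make the quoted rule concrete, here is a rough Python sketch of how a reader might honour DataSet 1:90 when decoding record 2 strings. This is an assumption about a reasonable reader, not a description of what Exiv2 or digikam actually do. The names parse_iim and decode_keywords are hypothetical, and falling back to Latin-1 when 1:90 is absent is a choice made for this example only. DataSet 2:25 is Keywords.

```python
import struct

UTF8_DESIGNATION = b"\x1b%G"  # ESC % G


def parse_iim(blob: bytes):
    """Yield (record, dataset, data) from a run of short-header DataSets."""
    pos = 0
    while pos + 5 <= len(blob):
        marker, record, dataset, size = struct.unpack_from(">BBBH", blob, pos)
        if marker != 0x1C or size & 0x8000:
            raise ValueError(f"not a short-header DataSet at offset {pos}")
        pos += 5
        yield record, dataset, blob[pos:pos + size]
        pos += size


def decode_keywords(blob: bytes, fallback: str = "latin-1") -> list:
    """Decode 2:25 Keywords, switching to UTF-8 if 1:90 designates it."""
    encoding = fallback  # assumption: used only when 1:90 is absent
    keywords = []
    for record, dataset, data in parse_iim(blob):
        if (record, dataset) == (1, 90) and data == UTF8_DESIGNATION:
            encoding = "utf-8"
        elif (record, dataset) == (2, 25):
            keywords.append(data.decode(encoding))
    return keywords


def _ds(record: int, dataset: int, data: bytes) -> bytes:
    # Build one short-header DataSet for the sample data below.
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data


keyword = _ds(2, 25, "café".encode("utf-8"))
with_charset = _ds(1, 90, UTF8_DESIGNATION) + keyword

print(decode_keywords(with_charset))  # ['café']
print(decode_keywords(keyword))       # ['cafÃ©'] (UTF-8 bytes read as Latin-1)
```

With the record present, the tag decodes cleanly as UTF-8, which is consistent with the view that setting it should not prevent a sidecar from being written.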
On 15/05/14 22:58, Gilles Caulier wrote:
> I tried to reproduce the dysfunction here (Linux) and "Café" appears fine
> in the sidecar file.
>
> Sounds like a dysfunction in Exiv2, to which writing the sidecar content is delegated.
>
> Best
>
> Gilles Caulier
>
> 2014-05-15 22:02 GMT+02:00 Phil <philtuckey at free.fr>:
>> Does anyone else see the following behaviour? If I assign a tag containing a
>> (non-ascii) unicode character to an image, for example "café", digikam will
>> write the tag to the image file perfectly well, but fails to write the xmp
>> sidecar correctly. Only the first line of the sidecar is written:
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> I am on OSX 10.9.2, digikam 3.5.0 (current macports).
>>
>> Thanks
>> _______________________________________________
>> Digikam-users mailing list
>> Digikam-users at kde.org
>> https://mail.kde.org/mailman/listinfo/digikam-users