[Digikam-devel] [digikam] [Bug 195508] Syncing IPTC with UTF-8 characters from XMP after conversion to printable ASCII

Alan Pater alan.pater at gmail.com
Sat May 16 00:47:06 BST 2015


https://bugs.kde.org/show_bug.cgi?id=195508

--- Comment #14 from Alan Pater <alan.pater at gmail.com> ---
I can't answer for Andreas, but my understanding is that UTF-8 is allowed,
though optional, in IPTC-IIM. My own tests with exiv2 show that Unicode
characters are preserved when syncing between XMP and IPTC. I probably missed
some cases, though, as I was not explicitly looking for ones where they were
not preserved. I don't think converting is needed. If Unicode exists in XMP,
it can be preserved in IPTC.

This is way over my head technically, but the IPTC spec (version 3, October
1995) says:

1:90 Coded Character Set
Optional, not repeatable, up to 32 octets, consisting of the
escape control character, and graphic characters.
One or more escape sequences for the announcement of the
code extension facilities used in the data which follows, for the
initial designation of the G0, G1, G2 and G3 graphic character
sets and the initial invocation of the graphic set (7 bits) or the
left-hand and the right-hand graphic set (8 bits) and for the initial
invocation of the C0 (7 bits) or of the C0 and the C1 control
character sets (8 bits) in use for data fields in records 2-6 and 8.
Follows the ISO 2022 standard. The recognised graphic
repertoire and control function repertoire are listed in Appendix
C.
The announcement of the code extension facilities, if
transmitted, must appear in this data set. Designation and
invocation of graphic and control function sets (shifting) may be
transmitted anywhere where the escape and the other
necessary control characters are permitted. However, it is
recommended to transmit in this data set an initial designation
and invocation, i.e. to define all designations and the shift status
currently in use by transmitting the appropriate escape
sequences and locking-shift functions.
If 1:90 is omitted, the default for records 2-6 and 8 is ISO 646
IRV (7 bits) or ISO 4873 DV (8 bits). Record 1 shall always use
ISO 646 IRV or ISO 4873 DV respectively.
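
To make the quoted rule concrete, here is a minimal sketch of how a reader
might honour 1:90 when decoding records 2-6 and 8. It assumes the standard IIM
dataset layout (tag marker 0x1C, record number, dataset number, two-byte
big-endian length, data) and that UTF-8 is announced with the ISO 2022 escape
sequence ESC % G. The function names are hypothetical and are not exiv2 or
digiKam API. Treating a missing 1:90 as plain ASCII is a simplification of the
ISO 646 IRV / ISO 4873 DV default, and extended (long) datasets are not
handled.

```python
import struct

# ISO 2022 escape sequence that announces UTF-8 (ESC % G).
ESC_UTF8 = b"\x1b%G"


def build_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Encode one standard IIM dataset (hypothetical helper for this sketch)."""
    if len(data) > 0x7FFF:
        raise ValueError("extended datasets are not handled in this sketch")
    # 0x1C tag marker, record number, dataset number, 2-byte big-endian length.
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data


def parse_datasets(blob: bytes) -> list:
    """Split a raw IIM byte string into (record, dataset, data) tuples."""
    datasets = []
    pos = 0
    while pos < len(blob):
        marker, record, dataset, length = struct.unpack_from(">BBBH", blob, pos)
        if marker != 0x1C:
            raise ValueError(f"bad tag marker at offset {pos}")
        pos += 5
        datasets.append((record, dataset, blob[pos:pos + length]))
        pos += length
    return datasets


def decode_fields(blob: bytes) -> dict:
    """Decode all non-envelope datasets according to 1:90, if present."""
    datasets = parse_datasets(blob)
    charset = next((d for r, n, d in datasets if (r, n) == (1, 90)), None)
    # Assumption: anything other than the UTF-8 announcement falls back to ASCII.
    encoding = "utf-8" if charset == ESC_UTF8 else "ascii"
    return {
        (r, n): d.decode(encoding, errors="replace")
        for r, n, d in datasets
        if r != 1  # record 1 (the envelope) is not affected by 1:90
    }


caption = "Zürich café".encode("utf-8")
with_charset = build_dataset(1, 90, ESC_UTF8) + build_dataset(2, 120, caption)
without_charset = build_dataset(2, 120, caption)

print(decode_fields(with_charset)[(2, 120)])
print(decode_fields(without_charset)[(2, 120)])
```

With 1:90 present the caption in 2:120 decodes cleanly. Without it, the
non-ASCII bytes cannot be interpreted under the 7-bit default and are replaced,
which is the kind of loss the conversion to printable ASCII was presumably
meant to avoid.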
