[Digikam-devel] [digikam] [Bug 159220] Non printable characters in IPTC keyword set by Digikam and displayed by Gallery 2

Eric Bayard ebayard63-projet at yahoo.fr
Mon Jan 6 14:17:12 GMT 2014


https://bugs.kde.org/show_bug.cgi?id=159220

Eric Bayard <ebayard63-projet at yahoo.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebayard63-projet at yahoo.fr

--- Comment #14 from Eric Bayard <ebayard63-projet at yahoo.fr> ---
(In reply to comment #10)
> Jean Marc,
> 
> digiKam > 0.9.x supports XMP. XMP replaces IPTC and supports UTF-8. IPTC has
> never supported UTF-8 and has several limitations on string size. XMP does
> not have these limitations.
> 
> Gallery server must support XMP by default and use it instead of IPTC.
> 
> Gilles Caulier

Hi Gilles,
Actually, this is wrong. UTF-8 has officially been part of the IPTC standard
since 1997 (XMP was first used by Adobe in Acrobat in 2001).

You can find many publications on this subject, but the best approach is to
check the standard directly on the IPTC website. Note that the latest IPTC
standards are based on the XMP implementation, but that is not what we are
discussing here.

Example:
http://www.wan-ifra.org/system/files/field_ifra_mag_file/F_tp980258.pdf

or quoted from: http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf 
(year 1999)

 "Coded Character Set
Optional, not repeatable, up to 32 octets, consisting of one or
more control functions used for the announcement, invocation or
designation of coded character sets. The control functions follow
the ISO 2022 standard and may consist of the escape control
character and one or more graphic characters. For more details
see Appendix C, the IPTC-NAA Code Library.
The control functions apply to character oriented DataSets in
records 2-6. They also apply to record 8, unless the objectdata
explicitly, or the File Format implicitly, defines character sets
otherwise.
If this DataSet contains the designation function for Unicode in
UTF-8 then no other announcement, designation or invocation
functions are permitted in this DataSet or in records 2-6.
For all other character sets, one or more escape sequences are
used...."
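
To make the quoted passage concrete, here is a minimal Python sketch of what the 1:90 DataSet looks like on the wire when it designates UTF-8. It assumes the standard IIM DataSet layout (tag marker 0x1C, record number, DataSet number, two-byte big-endian length) and the ISO 2022 designation for UTF-8, ESC % G. The function names are hypothetical, and extended (long) DataSets are deliberately not handled.

```python
import struct

IIM_TAG_MARKER = 0x1C          # every IIM DataSet starts with this octet
UTF8_ESCAPE = b"\x1b%G"        # ISO 2022 designation for UTF-8: ESC % G


def build_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Serialise one standard (non-extended) IIM DataSet."""
    if len(data) > 0x7FFF:
        raise ValueError("extended DataSets are not handled in this sketch")
    header = struct.pack(">BBBH", IIM_TAG_MARKER, record, dataset, len(data))
    return header + data


def iter_datasets(buf: bytes):
    """Yield (record, dataset, data) for each DataSet in a raw IIM block."""
    pos = 0
    while pos + 5 <= len(buf):
        marker, record, dataset, size = struct.unpack_from(">BBBH", buf, pos)
        if marker != IIM_TAG_MARKER:
            break  # not a DataSet header, stop parsing
        if size & 0x8000:
            raise ValueError("extended DataSets are not handled in this sketch")
        pos += 5
        yield record, dataset, buf[pos:pos + size]
        pos += size


def declares_utf8(buf: bytes) -> bool:
    """True if the block carries a 1:90 DataSet designating UTF-8."""
    return any(
        record == 1 and dataset == 90 and data == UTF8_ESCAPE
        for record, dataset, data in iter_datasets(buf)
    )
```

A reader that finds this DataSet can decode the record 2 strings (for example keywords in 2:25) as UTF-8. Without it, the character set of those strings is not announced and the reader has to guess.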

or from the Metadata Working Group, which sets the standards and of course
uses them:
www.metadataworkinggroup.com/pdf/mwg_guidance.pdf  page 28 (Note that the whole
section is very interesting for digiKam, as it covers metadata
reconciliation guidance)

"IPTC-IIM SHOULD be written using the Coded Character Set (1:90) as UTF-8 (see
“Section 1.6 Coded Character Set” in the IIM specification).

If the IPTC-IIM has not been written in UTF-8 before, a robust Changer SHOULD
convert all properties to UTF-8 and write the corresponding identifier for
UTF-8 to the 1:90 DataSet..."
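
As a rough illustration of the MWG recommendation quoted above, the following Python sketch re-encodes keyword values to UTF-8 and writes the UTF-8 identifier into 1:90 ahead of them. It is only a sketch under stated assumptions: the legacy encoding is assumed to be Latin-1 (in practice it is unknown when 1:90 is absent), the helper names are hypothetical, and only the Keywords DataSet (2:25) is handled.

```python
import struct

UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 designation for UTF-8: ESC % G


def dataset(record: int, number: int, data: bytes) -> bytes:
    """Serialise one standard IIM DataSet (0x1C, record, number, length)."""
    return struct.pack(">BBBH", 0x1C, record, number, len(data)) + data


def keywords_to_utf8_iim(raw_keywords, legacy_encoding="latin-1"):
    """Return IIM bytes: 1:90 set to UTF-8, then each keyword as 2:25.

    legacy_encoding is an assumption; without 1:90 the real encoding of
    the existing values cannot be known for certain.
    """
    out = [dataset(1, 90, UTF8_ESCAPE)]
    for raw in raw_keywords:
        try:
            text = raw.decode("utf-8")           # already valid UTF-8
        except UnicodeDecodeError:
            text = raw.decode(legacy_encoding)   # fall back to the assumed charset
        out.append(dataset(2, 25, text.encode("utf-8")))
    return b"".join(out)
```

For example, the Latin-1 bytes `b"\xe9t\xe9"` and the UTF-8 bytes of "été" both end up as the same UTF-8 encoded 2:25 DataSet, preceded by a single 1:90 DataSet.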

In short, digiKam is not standards-compliant here.
This makes it incompatible with other image management software, such as
Adobe's.
It also corrupts our metadata, which is a real shame, because it is otherwise a
really good piece of software.

Regards
Eric

-- 
You are receiving this mail because:
You are the assignee for the bug.