D10694: epubextractor: Handle multiple subjects better

Stefan Brüns noreply at phabricator.kde.org
Thu Apr 19 19:24:35 UTC 2018


bruns added inline comments.

INLINE COMMENTS

> michaelh wrote in epubextractor.cpp:85
> I think we should port away from libepub. Multiple titles result in one ';'-joined string.
> Also it seems to be unmaintained.

The joined titles are the fault of this epubextractor, AFAICS; see fetchMetadataString.

> michaelh wrote in epubextractor.cpp:97
> Right, this inconsistency is intentional, and it needs discussion. That's why I added a comment in D12197 <https://phabricator.kde.org/D12197> which was probably overlooked.
> DC and IDPF aren't very clear on how to use `dc:subject`. Calibre interprets it as tags, and my impression is that most providers do as well. Hence I prefer to use `Property::Keywords` only, because it comes closest, imo. That change would not really be breaking, as currently `Property::Subject` is one large string joined with ';'.

The distinction between Subject and Keywords is typically that keywords are just a bunch of words without further specification, while subjects, as specified by DC and as used by e.g. libraries, are taken from field-specific catalogs.

Baloo's properties documentation specifically mentions dc:subject for Property::Subject.

One of the file formats which has both keywords and subject is ODF, which uses dc:subject and meta:keywords.

DC specifies for //any// property:

> Recommendation 5. Multiple property values should be encoded by repeating the XML element for that property.

My opinion is to **always** use Property::Subject for dc:subject (as documented for baloo), and to add each property instance individually. If properties are already messed up in the originating document, there is nothing we can do, but we should not make things worse.
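
A rough sketch of what "add each instance individually" could look like, in plain C++. `SketchResult` and `addSubjects` are hypothetical names made up for illustration; this is not the actual KFileMetaData or libepub API, and it assumes the dc:subject element values have already been read into a list:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an extraction result; NOT the KFileMetaData API.
// It simply records (property name, value) pairs in insertion order.
struct SketchResult {
    std::vector<std::pair<std::string, std::string>> entries;

    void add(const std::string &property, const std::string &value)
    {
        entries.emplace_back(property, value);
    }
};

// Adds one "Subject" entry per dc:subject element value, skipping empty ones.
// Values are passed through unchanged: nothing is joined and nothing is split.
inline void addSubjects(const std::vector<std::string> &dcSubjects, SketchResult &result)
{
    for (const std::string &subject : dcSubjects) {
        if (!subject.empty()) {
            result.add("Subject", subject);
        }
    }
}
```

A value that is already joined in the document (say "History; Europe" inside a single element) stays one entry here, so the extractor neither repairs nor worsens what the document contains.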

REPOSITORY
  R286 KFileMetaData

BRANCH
  multi-subject

REVISION DETAIL
  https://phabricator.kde.org/D10694

To: michaelh, mgallien, dfaure
Cc: bruns, astippich, #frameworks, ashaposhnikov, michaelh, spoorun, navarromorales, isidorov, firef, andrebarros, emmanuelp