D10694: epubextractor: Handle multiple subjects better
Stefan Brüns
noreply at phabricator.kde.org
Thu Apr 19 19:24:35 UTC 2018
bruns added inline comments.
INLINE COMMENTS
> michaelh wrote in epubextractor.cpp:85
> I think we should port away from libepub. Multiple titles result in one ';'-joined string.
> Also it seems to be unmaintained.
The joined titles are the fault of this epubextractor, AFAICS; see fetchMetadataString.
> michaelh wrote in epubextractor.cpp:97
> Right, this inconsistency is intentional, and it needs discussion. That's why I added a comment in D12197 <https://phabricator.kde.org/D12197> which was probably overlooked.
> DC and IDPF aren't very clear on how to use `dc:subject`. Calibre interprets it as tags, and my impression is that most providers do as well. Hence I prefer to use `Property::Keywords` only, because it comes closest imo. That change would not really be breaking, as currently `Property::Subject` is one large string joined with ';'.
The typical distinction between Subject and Keywords is that keywords are just a bunch of words without further specification, while subjects, as specified by DC and as used by e.g. libraries, are taken from field-specific catalogs.
Baloo's property documentation specifically mentions dc:subject for Property::Subject.
One file format that has both keywords and subject is ODF, which uses dc:subject and meta:keywords.
DC specifies for //any// property:
> Recommendation 5. Multiple property values should be encoded by repeating the XML element for that property.
My opinion is to **always** use Property::Subject for dc:subject (as documented for Baloo), and to add each property instance individually. If properties are already messed up in the originating document, there is nothing we can do, but we should not make things worse.
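To illustrate what "add each property instance individually" could look like, here is a minimal sketch. It is not the actual extractor code: `SketchResult`, its `add()` method, the local `Property` enum and `addSubjects()` are hypothetical stand-ins for the Qt-based KFileMetaData classes, and it uses plain standard C++ only so that it stays self-contained.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the KFileMetaData property enum.
enum class Property { Subject, Keywords };

// Hypothetical stand-in for an extraction result; the real class has a
// different, Qt-based interface.
struct SketchResult {
    std::vector<std::pair<Property, std::string>> entries;

    void add(Property property, const std::string &value) {
        entries.emplace_back(property, value);
    }
};

// Add every dc:subject value as its own Subject entry instead of joining
// all values into one ';'-separated string.
void addSubjects(SketchResult &result, const std::vector<std::string> &dcSubjects) {
    for (const std::string &subject : dcSubjects) {
        result.add(Property::Subject, subject);
    }
}
```

With this shape, a document carrying two dc:subject elements yields two separate Subject entries, and a value that already contains ';' in the source document is passed through unchanged rather than being split or merged further.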
REPOSITORY
R286 KFileMetaData
BRANCH
multi-subject
REVISION DETAIL
https://phabricator.kde.org/D10694
To: michaelh, mgallien, dfaure
Cc: bruns, astippich, #frameworks, ashaposhnikov, michaelh, spoorun, navarromorales, isidorov, firef, andrebarros, emmanuelp