D21839: [TermGenerator] Use UTF-8 ByteArray for termList

Stefan Brüns noreply at phabricator.kde.org
Sun Jun 16 18:19:29 BST 2019


bruns added a comment.


  In D21839#480765 <https://phabricator.kde.org/D21839#480765>, @poboiko wrote:
  
  > Actually, there is an issue with that code right now, which I wanted to fix, but forgot.
  >  The trimming step `finalArr = finalArr.mid(0, maxTermSize);` should be performed on the `QString` rather than the `QByteArray`: in UTF-8 a Unicode character inside a term can occupy more than one byte, so cutting at `maxTermSize` bytes can split the last character in half. I end up with terms like `тождественно�` in the output of `balooshow -x`.
  >  Not to mention that Russian terms end up being quite short.
  
  
  As the limit is somewhat arbitrary, maybe we can just apply it to the QString instead, before the conversion to UTF-8? I don't think this has any serious side effects.
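
  To illustrate the problem described in the quote, here is a minimal sketch in plain C++ (standard library only, not the Baloo code). The helper names `truncateBytes` and `truncateUtf8` are hypothetical, and the input is assumed to be valid UTF-8. The first helper mirrors a byte-wise cut like `mid(0, maxTermSize)`; the second backs up to the previous character boundary so that no multi-byte sequence is split. This is a different approach from limiting the QString, shown only to make the failure mode concrete.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: naive byte-wise cut, comparable to mid(0, maxBytes)
// on a byte array. May split a multi-byte UTF-8 sequence.
std::string truncateBytes(const std::string &utf8, std::size_t maxBytes)
{
    return utf8.substr(0, maxBytes);
}

// Hypothetical helper: cut to at most maxBytes without splitting a
// UTF-8 sequence. Assumes the input is valid UTF-8.
std::string truncateUtf8(const std::string &utf8, std::size_t maxBytes)
{
    if (utf8.size() <= maxBytes) {
        return utf8;
    }
    std::size_t end = maxBytes;
    // Continuation bytes have the bit pattern 10xxxxxx. If the first
    // dropped byte is a continuation byte, the cut is inside a sequence,
    // so move back until the cut lands on the start of a sequence.
    while (end > 0 && (static_cast<unsigned char>(utf8[end]) & 0xC0) == 0x80) {
        --end;
    }
    return utf8.substr(0, end);
}
```

  With Cyrillic text, where each letter takes two bytes in UTF-8, an odd byte limit makes `truncateBytes` return a string ending in a dangling lead byte, while `truncateUtf8` returns one byte less and stays valid.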

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D21839

To: bruns, #baloo, ngraham, astippich, poboiko
Cc: kde-frameworks-devel, LeGast00n, fbampaloukas, domson, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns, abrahams