Review Request 117789: Optimize word count in PlainTextExtractor.

Milian Wolff mail at milianw.de
Sat Apr 26 14:15:26 BST 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/117789/
-----------------------------------------------------------

Review request for kdelibs and Vishesh Handa.


Repository: kfilemetadata


Description
-------

Optimize word count in PlainTextExtractor.

Counting words with a regular expression is slow. A simple hand-written
word count in C++ is roughly 5.6 times faster, as shown by the benchmark:

Before:
     702.0 msecs per iteration (total: 7,020, iterations: 10)
After:
     125.5 msecs per iteration (total: 1,256, iterations: 10)
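
For illustration only, a hand-written word count of this kind could look
roughly like the sketch below. This is not the code from the diff: the
function name countWords is hypothetical, and the sketch uses std::string
with std::isspace instead of the Qt string types the extractor works with.
It treats every maximal run of non-whitespace characters as one word and
makes a single pass over the input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (assumption, not the function from the patch).
// Counts maximal runs of non-whitespace characters in one pass.
std::size_t countWords(const std::string &text)
{
    std::size_t words = 0;
    bool inWord = false; // true while the scan is inside a word

    for (unsigned char c : text) {
        if (std::isspace(c)) {
            // Whitespace ends the current word, if any.
            inWord = false;
        } else if (!inWord) {
            // First non-whitespace character after a gap starts a new word.
            inWord = true;
            ++words;
        }
    }
    return words;
}
```

A single loop like this needs no regular expression engine and allocates
nothing per word, which is the kind of saving the numbers above suggest.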

Make the plaintext extractor benchmark more meaningful.

The benchmark now operates on a larger file and uses QBENCHMARK to produce actual timing data.


Diffs
-----

  autotests/indexerextractortests.cpp 1cb8e65da7d764eab1923054659ae5841104de2d 
  src/extractors/plaintextextractor.cpp 536e02d843f24dbbc19035029896b9e696e8b302 

Diff: https://git.reviewboard.kde.org/r/117789/diff/


Testing
-------


Thanks,

Milian Wolff
