Review Request 117789: Optimize word count in PlainTextExtractor.

Fri May 2 10:54:34 BST 2014

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/117789/
-----------------------------------------------------------

(Updated May 2, 2014, 9:54 a.m.)

Status
------

This change has been marked as submitted.

Review request for kdelibs and Vishesh Handa.

Repository: kfilemetadata

Description
-----------

Optimize word count in PlainTextExtractor.

Counting words with a regular expression is slow. Implementing a simple
word count directly in C++ is about 5.6 times faster, as shown by the benchmark:

Before:
     702.0 msecs per iteration (total: 7,020, iterations: 10)
After:
     125.5 msecs per iteration (total: 1,256, iterations: 10)
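
For illustration, the direct word count described above could look roughly
like the sketch below. This is not the code from the patch: the helper name
countWords is hypothetical, it works on std::string rather than the Qt string
type the extractor presumably uses, and it assumes a word is any maximal run
of non-whitespace characters, which may differ from the definition in
plaintextextractor.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the patch): counts words in a single
// pass without regular expressions. Assumption: a "word" is a maximal run
// of non-whitespace characters.
std::size_t countWords(const std::string &text)
{
    std::size_t words = 0;
    bool inWord = false;
    for (unsigned char c : text) {
        if (std::isspace(c)) {
            // Whitespace ends the current word, if any.
            inWord = false;
        } else if (!inWord) {
            // First non-whitespace character after whitespace starts a word.
            inWord = true;
            ++words;
        }
    }
    return words;
}
```

Under these assumptions the loop visits each character once and allocates
nothing, which is the kind of work a general-purpose regular expression
engine cannot easily match for such a simple pattern.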

Make the plaintext extractor benchmark more meaningful.

The benchmark now operates on a larger file and uses QBENCHMARK to produce actual timing data.

Diffs
-----

  autotests/indexerextractortests.cpp 1cb8e65da7d764eab1923054659ae5841104de2d 
  src/extractors/plaintextextractor.cpp 536e02d843f24dbbc19035029896b9e696e8b302 

Diff: https://git.reviewboard.kde.org/r/117789/diff/

Testing
-------

Thanks,

Milian Wolff
