Ad-filter loading and QRegExp performance

Robert Knight robertknight at gmail.com
Mon Jan 29 14:14:26 GMT 2007


Hello,

Out of curiosity, I recently profiled Konqueror (KDE 4) startup using
callgrind.
I was surprised to see that, according to the output, about 5-10% of
the startup time is spent processing regular expressions for ad filters.

QRegExp seems to have an internal mechanism that delays parsing the
expression until it is needed (i.e. until indexIn(), exactMatch() or
a similar method is called on the regexp).  However, whenever a QRegExp
is copied, the engine preparation is performed first, so copying a
QRegExp loses the benefit of this delayed parsing.
KHTML stores the ad filters in a QVector<QRegExp> internally, and
append() copies each new filter into the vector inside Qt.  This forces
all ad-filter regexps to be parsed on startup.
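
To make the effect concrete, here is a toy model in plain C++.  It is
not QRegExp's actual code: LazyPattern, compileCount and loadFilters are
hypothetical names, and std::regex stands in for the Qt engine.  The
only thing it is meant to show is that a "prepare before copy" rule
turns every append into a parse, even though construction alone is
free.

```cpp
#include <cassert>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a lazily parsed regexp (an assumption, not Qt code).
class LazyPattern {
public:
    static int compileCount;  // number of times a pattern was really parsed

    explicit LazyPattern(std::string pattern) : pattern_(std::move(pattern)) {}

    // Mimics the behaviour described above: the source object is
    // prepared (parsed) before its state is copied.
    LazyPattern(const LazyPattern &other) : pattern_(other.pattern_) {
        other.prepare();
        engine_ = other.engine_;
    }

    bool matches(const std::string &text) const {
        prepare();
        assert(engine_ != nullptr);
        return std::regex_search(text, *engine_);
    }

private:
    // Parse the pattern on first use only.
    void prepare() const {
        if (!engine_) {
            engine_ = std::make_shared<const std::regex>(pattern_);
            ++compileCount;
        }
    }

    std::string pattern_;
    mutable std::shared_ptr<const std::regex> engine_;
};

int LazyPattern::compileCount = 0;

// Appending copies each filter into the vector, which triggers a parse.
std::vector<LazyPattern> loadFilters(const std::vector<std::string> &patterns) {
    std::vector<LazyPattern> filters;
    for (const std::string &p : patterns)
        filters.push_back(LazyPattern(p));
    return filters;
}
```

In this model, constructing a LazyPattern leaves compileCount
unchanged, while loading N patterns through loadFilters() raises it by
N before any URL has been matched.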

Konqueror ships with almost 200 ad filters out of the box, and parsing
all of these takes some time.

Initially I tried replacing QVector<QRegExp> with QVector<QRegExp*>,
but I realised that KHTMLSettings objects need to be copyable, and so
these pointers would need to be shared somehow.  I am not sure of the
best way to do that.
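
One possible shape for the sharing, sketched in plain C++ rather than
Qt: keep reference-counted pointers in the list, so that copying the
settings object copies only the pointers.  SharedFilter, FilterSettings,
addFilter and isAdFiltered are hypothetical names used for
illustration, std::regex stands in for QRegExp, and the sketch ignores
thread safety.

```cpp
#include <cassert>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical filter: stores the pattern text and parses it on first use.
struct SharedFilter {
    explicit SharedFilter(std::string p) : pattern(std::move(p)) {}

    bool matches(const std::string &url) {
        if (!engine)
            engine = std::make_unique<std::regex>(pattern);
        return std::regex_search(url, *engine);
    }

    bool isCompiled() const { return engine != nullptr; }

    std::string pattern;
    std::unique_ptr<std::regex> engine;  // empty until first match
};

// Hypothetical stand-in for a copyable settings class.
struct FilterSettings {
    std::vector<std::shared_ptr<SharedFilter>> filters;

    // No regexp is parsed here; only the pattern text is stored.
    void addFilter(const std::string &pattern) {
        filters.push_back(std::make_shared<SharedFilter>(pattern));
    }

    // Filters are parsed one by one as they are first tried.
    bool isAdFiltered(const std::string &url) const {
        for (const auto &f : filters) {
            assert(f != nullptr);  // addFilter never stores a null pointer
            if (f->matches(url))
                return true;
        }
        return false;
    }
};
```

With this layout a copy of FilterSettings refers to the same filter
objects as the original, so a pattern parsed through one copy is
already parsed for the other.  The cost is that the copies are no
longer independent, which may or may not be acceptable for
KHTMLSettings.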

The alternative would be to modify the QRegExp code so that the
assignment operator does not automatically parse the regexp.  Who
should I get in touch with to discuss this?

There is also some discussion about whether it is right for us to ship
any ad filters with Konqueror at all, or as many as 200 of them, but I
want to leave that for another time.

Regards,
Robert.



More information about the kfm-devel mailing list