Review Request: Rewrite Google's tracking URLs in search results

Thomas Fischer fischer at unix-ag.uni-kl.de
Sun Dec 23 11:09:47 GMT 2012


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/107867/
-----------------------------------------------------------

Review request for kdelibs.


Description
-------

This patch adds a feature to KHTML that rewrites URLs used to track users. At the moment, only tracking URLs from Google's search results are supported; the list of supported patterns is hard-coded for now, but it can be extended.
Example: A search for "KDE" may result in a list of links, including a link like
http://www.google.com/url?q=http://www.kde.org/&sa=U&ei=YsYFfgOqAZzBQBC&ved=GEFANYNoNG&usg=Y8BfN6qj0QYNHYJQQBEB
When you follow this link, Google transparently redirects you to http://www.kde.org, but still records that you clicked it.
The patch rewrites such links during the HTML parsing phase, so you never see the tracking URL, only the final URL.
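
To illustrate the idea, here is a minimal sketch of such a rewrite. It is not the code from the patch: it uses only the C++ standard library instead of KHTML's own URL classes, and the names rewriteTrackingUrl and percentDecode as well as the two hard-coded prefixes are assumptions made for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: decode %XX escapes in a query parameter value.
static std::string percentDecode(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}

// Hypothetical function: return the final URL for a known tracking URL,
// or the input unchanged if it does not match a known pattern.
std::string rewriteTrackingUrl(const std::string &url)
{
    // Assumed hard-coded list of tracking prefixes (illustration only).
    static const std::string prefixes[] = {
        "http://www.google.com/url?",
        "https://www.google.com/url?"
    };
    for (const std::string &prefix : prefixes) {
        if (url.compare(0, prefix.size(), prefix) != 0)
            continue;
        // Walk the query string parameter by parameter, looking for "q=".
        std::size_t pos = prefix.size();
        while (pos < url.size()) {
            std::size_t end = url.find('&', pos);
            if (end == std::string::npos)
                end = url.size();
            if (url.compare(pos, 2, "q=") == 0) {
                const std::string target =
                    percentDecode(url.substr(pos + 2, end - pos - 2));
                // Only accept targets that are themselves http(s) URLs.
                if (target.compare(0, 7, "http://") == 0
                    || target.compare(0, 8, "https://") == 0)
                    return target;
                return url;
            }
            pos = end + 1;
        }
    }
    return url;
}
```

Called with the example link above, this sketch returns http://www.kde.org/; a URL that does not start with one of the listed prefixes is returned unchanged.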

The rewrite feature can be disabled through a setting, but there is no GUI for that yet.

I considered detecting tracking URLs automatically with a regular expression, but I suspect that running a regular expression check on every URL would be too time-consuming.
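
One way the cost might be kept down is to run a cheap substring test first and only evaluate the regular expression for URLs that pass it. The following is a sketch of that idea only; the function name looksLikeTrackingUrl, the "=http" pre-filter and the pattern itself are assumptions for illustration and are not part of the patch, and it uses std::regex rather than Qt's classes.

```cpp
#include <regex>
#include <string>

// Hypothetical check: does this URL look like a redirect-style tracking URL?
bool looksLikeTrackingUrl(const std::string &url)
{
    // Cheap pre-filter: a redirect of this kind must carry another URL
    // as a query parameter value, so most URLs are rejected here
    // without touching the regular expression at all.
    if (url.find("=http") == std::string::npos)
        return false;

    // Assumed pattern: any host, path "/url", and a "q" parameter whose
    // value starts with http: or https: (plain or percent-encoded colon).
    static const std::regex pattern(
        "^https?://[^/]+/url\\?(?:.*&)?q=https?(?::|%3A).*",
        std::regex::icase);
    return std::regex_match(url, pattern);
}
```

Whether this would be fast enough inside the parser would have to be measured; the sketch only shows how the regular expression could be skipped for the majority of links.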

I wrote the patch against 4.9.3, as this is the version I am using on my testing machine. I assume the affected classes have not changed much in recent months, so it should be fairly simple to port to HEAD or the future 4.11.


Diffs
-----

  khtml/khtml_settings.h 0faec6d 
  khtml/khtml_settings.cpp b5693b4 
  khtml/xml/dom_docimpl.cpp bb65a89 

Diff: http://git.reviewboard.kde.org/r/107867/diff/


Testing
-------


Thanks,

Thomas Fischer



More information about the kde-core-devel mailing list