[Kde-pim] Another linklocator patch
Ingo Klöcker
kloecker at kde.org
Mon Aug 24 21:30:32 BST 2009
On Sunday 23 August 2009, Martin Koller wrote:
> Hi,
>
> even after fixing a previous bug in the linklocator, we have another
> issue with it: https://bugs.kde.org/show_bug.cgi?id=202445
>
> So I decided to modify the linklocator algorithm (after consulting
> RFC3986) to the following concept:
>
> linklocator does not really parse an URL (and it never did), so it's
> not important if the URL in fact is really a completely valid url -
> this will be handled in the target application then.
> What is important is that the URL is discovered, even if it is
> enclosed with some sort of brackets and even interrupted by line
> breaks or white space. RFC3986 explains this in appendix C
>
> So the modified code now handles these cases and it does not check
> for explicit allowed characters in all the different parts of a URL,
> which also has the advantage of fixing the mentioned bug. As said
> above: linklocator is not about checking the validity of the URL
> syntax.
I like this approach.
> Attached is the patch including a modified unit test.
>
> P.S.:line breaks in URLs are still not working in kmail with this
> modification, because the LinkLocator::convertToHtml() has to be
> changed to not remove the line breaks before my new code gets a
> chance to use it. I'll look into this in the next days, if time
> permits.
Please submit your next patch to Review Board. It's a pain to comment on
a patch via mail.
One thing you should not do is
+ url.reserve( mText.length() - mPos ); // avoid allocs
If mText is very large (e.g. a message with a large inline patch on the
Linux Kernel mailing list) then you will waste lots of memory just to
save a few CPU cycles for allocations. If anything then restrict the
size you reserve to a sensible value, e.g.
+ url.reserve( qMin( 1024, mText.length() - mPos ) ); // avoid
allocs
OTOH, why don't you do it as before, i.e. look for the end and then do a
simple
- url = mText.mid( start, mPos - start );
Ah, okay, you skip whitespace. Hmm. Okay. In this case please use the
above reserve with fixed upper limit. I think even 128 would be enough
as sensible maximal value for the initially reserved size of url.
Your code assumes that a URL that is started by some bracket or quote is
closed by the corresponding closing bracket/quote. I'm not sure how
safe this assumption is. Hmm, I guess it's safe enough. OTOH,
apparently there is a maxUrlLen(). So I suggest to abort the URL
parsing when url.length() exceeds maxUrlLen(). And probably maxUrlLen()
could also be used as upper limit for the reserve call.
Regards,
Ingo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 197 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.kde.org/pipermail/kde-pim/attachments/20090824/d47adcf3/attachment.sig>
-------------- next part --------------
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/
More information about the kde-pim
mailing list