PATCH: kdelibs/kdecore/kstringhandler.cpp tagURLs() method

Stephan Hermann sh at kde-coder.de
Thu Jul 11 12:11:37 BST 2002


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,


On Thursday 11 July 2002 12:38, Marc Mutz wrote:
> On Thursday 11 July 2002 07:48, Stephan Hermann wrote:
> <snip>
>
> > > Also, for speed, the parentheses (foo) should be written as the
> > > non-capturing (and very much faster) version (?:foo). The use of
> > > cap(1) can be replaced with something like href.beginsWith( "www."
> > > ) and href.beginsWith( "ftp." ).
> >
> > Is it not the same ?
>
> <snip>
>
> No, with (foo), QRegExp is forced to keep track of the captured text,
> while (?:foo) is just a grouping construct and is optimized away at
> (regexp) compile time.

Well, I changed the regexp in this way:

QRegExp urlEx("(?:www\\.|ftp\\.|\\w+\\://)[\\d\\w\\.]+[:\\d]{0,}[/]{0,1}[~/\\.-?&=#:_\\d\\w]{0,}");

It now finds these types of (malformed!) URLs:

www.*
ftp.*
(the two hostname prefixes from which the protocol can be guessed, and
which are in some way standardized)

<protocol>://<fqdn>/
<protocol>://<fqdn>/<path>/<page>?<querystring>
<protocol>://<fqdn>:<port>/
<protocol>://<fqdn>:<port>/<path>/<page>?<querystring>

So URLs like

	irc://irc.kde.org:6667/#kde
	http://www.google.com/search?q=About%20KDE
will be found (hopefully; I haven't tested all possibilities).
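
For illustration, the pattern can be tried outside Qt. The following is
only a sketch: std::regex (ECMAScript syntax) stands in for QRegExp, and
findUrls() is a hypothetical helper name. It also deviates from the posted
pattern in two ways, both assumptions of the sketch: the hyphen in the
last character class is escaped (unescaped, ".-?" reads as a range from
'.' to '?'), and '%' is added to that class, since otherwise the match
for the second example would stop before "%20KDE".

```cpp
#include <regex>
#include <string>
#include <vector>

// Sketch only: std::regex (ECMAScript syntax) stands in for QRegExp.
// Differences from the posted pattern (assumptions of this sketch):
//   - the hyphen in the last class is escaped, so it is a literal '-'
//   - '%' is added to the last class, so escaped queries match whole
//   - {0,} and {0,1} are written as * and ?
std::vector<std::string> findUrls(const std::string &text)
{
    static const std::regex urlEx(
        R"re((?:www\.|ftp\.|\w+://)[\d\w.]+[:\d]*/?[~/.\-?&=#:_\d\w%]*)re");

    std::vector<std::string> urls;
    for (std::sregex_iterator it(text.begin(), text.end(), urlEx), end;
         it != end; ++it) {
        // The group is non-capturing, so only the whole match is available.
        urls.push_back(it->str());
    }
    return urls;
}
```

Because (?:...) only groups, there is no cap(1) equivalent to query; the
prefix has to be checked on the matched text itself.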

I also took your hint and replaced the use of ->cap(1) with
href.startsWith(...).
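
A rough sketch of that prefix check, in plain C++ rather than Qt:
startsWith() and withProtocol() are hypothetical helpers, and the mapping
of "www." to http and "ftp." to ftp is an assumption of the sketch, not
taken from the patch.

```cpp
#include <string>

// Hypothetical helper standing in for QString::startsWith().
bool startsWith(const std::string &s, const std::string &prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Assumption of this sketch: "www." implies http and "ftp." implies ftp;
// anything else is taken to carry its own "<protocol>://" already.
std::string withProtocol(const std::string &href)
{
    if (startsWith(href, "www."))
        return "http://" + href;
    if (startsWith(href, "ftp."))
        return "ftp://" + href;
    return href;
}
```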

Finally, I changed the replacement in hrefProtocol, too. Not in the way
you described in your last mail; instead I used sprintf and
QString::latin1().

> > Well, that is true. (Trolltechs Implementation mistake ;)) The only
> > safe way is to escape those placeholders, e.g. before you use the
> > QString::arg() method, you have to do something like this:
> > QString::replace(QRegExp("%\\d"),"");
> > (the empty string is a little bit to hard ;))
>
> No, the safe way is to use operator+:
> QString url = "<a href=\"" + hrefProtocol + "\">" + href + "</a>";

Hmmm... I like the sprintf way ;) but I can change it to your solution
if you disagree with the sprintf way.
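
To make the placeholder problem concrete, here is a minimal sketch in
plain C++. It assumes the issue under discussion is that text inserted by
one substitution step can itself contain something that looks like a
placeholder to the next step. substituteFirst() is a simplified,
hypothetical stand-in and not how QString::arg() is implemented; the
function names and the example.org URL used with them are made up for
illustration.

```cpp
#include <string>

// Hypothetical stand-in for one arg()-style substitution: replaces the
// first occurrence of `placeholder` in `s` with `value`.
std::string substituteFirst(std::string s, const std::string &placeholder,
                            const std::string &value)
{
    const std::string::size_type pos = s.find(placeholder);
    if (pos != std::string::npos)
        s.replace(pos, placeholder.size(), value);
    return s;
}

// Chained substitution: the second step rescans text inserted by the first.
std::string anchorBySubstitution(const std::string &hrefProtocol,
                                 const std::string &href)
{
    std::string s = "<a href=\"%1\">%2</a>";
    s = substituteFirst(s, "%1", hrefProtocol);
    s = substituteFirst(s, "%2", href);
    return s;
}

// Concatenation, as in the quoted suggestion: nothing is rescanned.
std::string anchorByConcatenation(const std::string &hrefProtocol,
                                  const std::string &href)
{
    return "<a href=\"" + hrefProtocol + "\">" + href + "</a>";
}
```

With a URL such as http://example.org/?q=%2, the substitution variant
rewrites the "%2" inside the URL and leaves the real placeholder behind,
while the concatenation variant passes the URL through unchanged.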

> > After all, I'm searching for a better URL regexp with increased
> > speed. But after all, please check the old version of
> > KStringHandler::tagURLs, you can found the same format string
> > vulnerability.
>
> <snip>
>
> I didn't say you introduced it ;-)

,-)

I'll make a patch and send it to you directly.

regards,

\sh

- -- 
St. Hermann, Troisdorf
One solution for a simple problem: A7 B4 C2 D5 E8 F1 G3 H6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9LWfsV8AnusWiV6wRAoVJAJ4mTcHAK7kywVuQxUJdRrRbqgkbKgCfZSPW
m3+IWjKeAUoIfA9223G7fvg=
=KE71
-----END PGP SIGNATURE-----





More information about the kde-core-devel mailing list