implicit QChar constructors
Dirk Mueller
kde-optimize@mail.kde.org
Wed, 22 Jan 2003 00:15:35 +0100
On Tue, 21 Jan 2003, Jesse Yurkovich wrote:
> The second piece of code was, of course, much much faster than the first
> because of all the constructor calls going on (even for small values of N).
Much faster?
The QChar() constructors are inlined, so there is no actual "work" to do.
To prove my point, I've analyzed the assembly output of:
#include <qstring.h>

int main()
{
    QChar* txt = new QChar[10];
    QChar space(' ');
    for ( unsigned int z = 0; z < 10; ++z)
        txt[z] = ' ';
    return 0;
}
I used g++ 3.2.0 with no switches other than -O2.
The "implicit constructor" variant compiled down to this code in the loop:
.L21:
        movw    $32, -24(%ebp)
        movw    $32, (%eax,%edx,2)
        incl    %edx
        cmpl    $9, %edx
        jbe     .L21
As you can see, gcc optimizes the repeated assignment quite well: it maps
it to the optimal x86 instruction:
        movw    $32, (%eax,%edx,2)
However, for reasons beyond me, gcc pessimized the code by not hoisting
this store out of the loop:
        movw    $32, -24(%ebp)
This is the implicit QChar constructor. There is absolutely no need to have
it in the loop. This appears to be a bug in gcc to me.
So I was curious and used "= space" instead. Here is the output:
.L21:
        movl    -24(%ebp), %eax
        movw    %ax, (%ecx,%edx,2)
        incl    %edx
        cmpl    $9, %edx
        jbe     .L21
As you can see, the resulting code is theoretically worse, because every
iteration ends up doing a memory read followed by a memory write. This can
hardly be faster in the average case.
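
For anyone who wants to look at the two loops without pulling in Qt, here
is a minimal sketch of both variants. Char16 is a hypothetical stand-in I
made up for illustration (an inline, implicit constructor from char around
a 16-bit value); it is not QChar's real definition, and fill_implicit and
fill_hoisted are assumed names. The generated assembly will depend on the
compiler and its version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for QChar (an assumption, not Qt's real class):
// a 16-bit code unit with an inline, implicit constructor from char.
class Char16 {
public:
    Char16() : ucs(0) {}
    Char16(char c) : ucs(static_cast<unsigned char>(c)) {}  // implicit on purpose
    unsigned short unicode() const { return ucs; }
private:
    unsigned short ucs;
};

// Variant 1: the char literal is converted through the implicit
// constructor on every iteration (the "txt[z] = ' '" case).
inline void fill_implicit(Char16* txt, std::size_t n) {
    assert(txt != 0);
    for (std::size_t z = 0; z < n; ++z)
        txt[z] = ' ';
}

// Variant 2: the object is constructed once before the loop and then
// copied (the "txt[z] = space" case).
inline void fill_hoisted(Char16* txt, std::size_t n) {
    assert(txt != 0);
    const Char16 space(' ');
    for (std::size_t z = 0; z < n; ++z)
        txt[z] = space;
}
```

Both functions store the same values; compiling this with "g++ -O2 -S"
should make it possible to compare the two loop bodies side by side.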
Well, gcc could have been intelligent enough to keep the loaded value in a
register, for example %ax, across iterations, but apparently it's not.
Again, I suspect a bug in gcc.
I'd be interested in analyzing gcc's internal parse tree to understand why
it thought it was unable to do this trivial optimization, but I don't have
much knowledge in that area.
Then, I was curious to see if specifying __attribute__((const)) on all
relevant QChar and QString methods and constructors would help.
Unfortunately, it made no difference at all for this testcase. This looks
like a bug in gcc to me, although the documentation is quite unclear on
whether it can optimize C++ methods at all (because of the implicit this
pointer).
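
For reference, this is roughly what the attribute looks like on a plain
free function, where no this pointer is involved. It is only a sketch:
to_ucs2 and fill_with are hypothetical names invented for the example, not
Qt functions, and whether gcc really evaluates the call only once depends
on the compiler version.

```cpp
#include <cassert>
#include <cstddef>

// Only GNU-compatible compilers understand the attribute; elsewhere the
// macro expands to nothing.
#if defined(__GNUC__)
#define ATTR_CONST __attribute__((const))
#else
#define ATTR_CONST
#endif

// Hypothetical helper (not part of Qt). The attribute promises that the
// result depends only on the argument and that the call has no side
// effects, so the compiler is allowed to merge repeated calls.
unsigned short to_ucs2(char c) ATTR_CONST;

unsigned short to_ucs2(char c) {
    return static_cast<unsigned char>(c);
}

// The call below is loop-invariant; with the attribute the compiler may
// compute it once instead of once per iteration.
void fill_with(unsigned short* out, std::size_t n, char c) {
    assert(out != 0);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = to_ucs2(c);
}
```

Calling fill_with(buf, 10, ' ') stores the value 32 in each of the ten
slots, whether or not the call gets hoisted.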
-- 
Dirk (received 5 mails today)