[kdepim-users] Tbird versus Kmail, performance

Daniel Vrátil dvratil at redhat.com
Fri Nov 14 13:31:15 GMT 2014


On Friday 14 of November 2014 12:36:47 René J.V. Bertin wrote:
> On Friday November 14 2014, Pablo Sanchez wrote regarding "Re: Tbird versus
> Kmail"
> 
> Hi again,
> 
> For giggles, here's the output from iostat for the last 20% or so of syncing [All Mail] after an akonadi vacuum:
> > zpool iostat Patux 3
> 
>                capacity     operations    bandwidth
> pool        alloc   free   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> Patux       89.0G  42.0G     11     24   266K   879K
> Patux       89.0G  42.0G      0     24  21.0K   681K
> Patux       89.0G  42.0G      0     47      0   892K
> Patux       89.0G  42.0G      0     25  18.3K   725K
> Patux       89.0G  42.0G      0     26      0   406K
> Patux       89.0G  42.0G      0     33  20.3K   778K
> Patux       89.0G  42.0G      0     16      0   292K
> Patux       89.0G  42.0G      0     32      0   466K
> Patux       89.0G  42.0G      0     14  18.7K   553K
> Patux       89.0G  42.0G      0     30      0   409K
> Patux       89.0G  42.0G      0     46  20.2K   999K
> Patux       89.0G  42.0G      0      1      0  67.0K
> Patux       89.0G  42.0G      6     50   337K  1.76M
> Patux       89.0G  42.0G      1     30   169K   413K
> Patux       89.0G  42.0G     52     83  3.10M  4.23M
> Patux       89.0G  42.0G     13    235   830K  13.1M
> 
> Which confirms there isn't much IO going on during these lengthy
> operations. But that stands to reason, doesn't it: the whole process is CPU
> bound, given the fact that we're only pulling headers (i.e. lots of
> small bits of data)?

Well, envelopes and headers are not as small as most people expect:

SELECT AVG(PartTable.datasize) AS avg,
      median(PartTable.datasize) AS median,
      MAX(PartTable.datasize) AS max,
      MIN(PartTable.datasize) AS min,
      COUNT(*) AS count
FROM PartTable
LEFT JOIN PimItemTable ON PimItemTable.id = PartTable.pimItemId
WHERE PartTable.partTypeId = 
      (SELECT id
            FROM PartTypeTable
            WHERE ns='PLD' AND name='$PAYLOADTYPE')
AND PimItemTable.mimeTypeId =
      (SELECT id
            FROM MimeTypeTable
            WHERE name='message/rfc822')

$PAYLOADTYPE       avg   median      max   min    count
-------------------------------------------------------
ENVELOPE        485.91      449   233354    45   320864
HEAD           3578.03     3574   171116     0   320867

* HEAD contains all email headers; ENVELOPE encodes the subset of headers needed
    to list emails in KMail when you open a folder, and to build the thread tree
* values are in bytes, except for "count"
* "count" should theoretically be the same; the fact that I have more HEAD parts
    than ENVELOPEs indicates that some messages are incomplete and/or orphaned
* median() is calculated using the function from https://wiki.postgresql.org/wiki/Aggregate_Median
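
To put the averages in perspective, here is a rough back-of-the-envelope
calculation. It is only a sketch: it assumes the total size of each part type
is simply average times count, using the figures from the table above, and the
helper name total_mib is made up for this example.

```python
# Figures copied from the table above (average size in bytes, number of parts).
stats = {
    "ENVELOPE": {"avg": 485.91, "count": 320864},
    "HEAD": {"avg": 3578.03, "count": 320867},
}


def total_mib(avg_bytes, count):
    """Approximate total payload size in MiB as average size times count."""
    return avg_bytes * count / (1024 * 1024)


totals = {name: total_mib(s["avg"], s["count"]) for name, s in stats.items()}

for name, mib in totals.items():
    print(f"{name}: ~{mib:.0f} MiB")
```

That works out to roughly 150 MiB of ENVELOPE parts and a bit over 1 GiB of
HEAD parts for this one database, so "only headers" still adds up.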


> It cannot exactly help that all this probably transits
> over the DBus rather than through a dedicated socket, though that would
> have to be confirmed (which is why this goes to the list).

Data actually go through a socket. DBus is only used to notify clients about
changes ("Item ABC added into collection XYZ", etc.); the clients then
request the actual Item from the Server via the socket anyway.
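
The notify-then-fetch split can be illustrated with a toy model. This is not
the Akonadi API: ToyServer, ToyClient and their methods are hypothetical names,
a plain callback list stands in for the DBus notification, and a direct method
call stands in for the data socket.

```python
class ToyServer:
    """Stand-in for the server; hypothetical, not the real Akonadi API."""

    def __init__(self):
        self.items = {}
        self.subscribers = []  # plays the role of DBus signal listeners

    def add_item(self, item_id, collection, payload):
        self.items[item_id] = payload
        # The notification carries only identifiers, never the payload.
        for callback in self.subscribers:
            callback({"op": "add", "item": item_id, "collection": collection})

    def fetch(self, item_id):
        # Plays the role of the data socket: the payload travels here.
        return self.items[item_id]


class ToyClient:
    """Stand-in for a client application such as a mail reader."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        server.subscribers.append(self.on_notification)

    def on_notification(self, note):
        # React to the lightweight notification by requesting the real data.
        self.cache[note["item"]] = self.server.fetch(note["item"])


server = ToyServer()
client = ToyClient(server)
server.add_item("ABC", "XYZ", b"Subject: hello\r\n\r\nbody")
```

The point of the split is that the notification channel only ever carries
small messages, while the bulk data takes the separate path.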

The collection sync (i.e. retrieving new data from the remote server) is slow in
several places: the internet connection (talking to the IMAP server), the IMAP
server itself, and parsing the IMAP responses. Then there's unfortunately a bit
of unnecessary but currently unavoidable deserialization into KMime and
serialization back to binary data, which is then serialized into ASAP and sent
to the Akonadi server, which has to deserialize the command again, do some
bookkeeping and finally write the data.

So yep, this takes time beyond just IO. We already know about several places
that could be optimized for better performance (mostly the parsers), and for
Frameworks we plan to get rid of some of the slow parts completely (most
notably by switching from the text-based ASAP to a binary data stream). I also
have some more plans regarding parallelization and batch processing of
requests.
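
To give a feel for why a binary stream is cheaper to handle than a text
protocol, here is a small sketch. Both wire formats are invented for this
example: the text one is loosely IMAP-like with a length-prefixed literal, the
binary one is a fixed header followed by the payload. Neither is the real ASAP
grammar nor the planned binary format.

```python
import struct


def encode_text(tag, item_id, payload):
    # IMAP-style command line with a {size} literal; hypothetical syntax.
    header = f"{tag} STORE {item_id} {{{len(payload)}}}\r\n"
    return header.encode("ascii") + payload


def decode_text(data):
    # The parser has to find the line end, split tokens and convert numbers.
    line, _, rest = data.partition(b"\r\n")
    tag, _command, item_id, literal = line.decode("ascii").split(" ")
    size = int(literal.strip("{}"))
    return int(tag), int(item_id), rest[:size]


# Fixed 16-byte header: tag (uint32), item id (uint64), payload length (uint32).
HEADER = struct.Struct("!IQI")


def encode_binary(tag, item_id, payload):
    return HEADER.pack(tag, item_id, len(payload)) + payload


def decode_binary(data):
    # No scanning or number parsing: read the header at known offsets.
    tag, item_id, size = HEADER.unpack_from(data)
    return tag, item_id, data[HEADER.size:HEADER.size + size]
```

Both decoders return the same (tag, item id, payload) tuple; the difference is
that the text version has to scan and tokenize every command, while the binary
version reads fixed-size fields directly.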


Daniel

> 
> Cheers,
> R.

-- 
Daniel Vrátil | dvratil at redhat.com | dvratil on #kde-devel, #kontact, #akonadi
Software Engineer - KDE Desktop Team, Red Hat Inc.

GPG Key: 0xC59D614F6F4AE348
Fingerprint: 4EC1 86E3 C54E 0B39 5FDD B5FB C59D 614F 6F4A E348