[Kde-pim] Akonadi: single database design mistake?

Dmitry Torokhov dmitry.torokhov at gmail.com
Tue Nov 29 18:38:19 GMT 2011


It looks like the majority of the data is envelopes and headers of the emails;
the idea is that you do not need to rescan your entire maildir folder,
reading possibly thousands of separate files, in order to display mailbox
contents. The issue, as I said, is that before Akonadi took over the
world such storage was a separate file on the filesystem taking a couple
of MB per mailbox, whereas now they are all lumped together into a single table,
so any query has to traverse the entire index for _all_ folders in _all_
accounts one might have.
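
To make the single-table concern a bit more concrete, here is a toy sketch.
It is only an illustration under loud assumptions: sqlite3 stands in for
MySQL, and the "items" table, its columns and the "items_by_folder" index
are made up for the example, not Akonadi's actual schema. It asks the
database how it would list one folder out of a shared table, first without
and then with an index on the folder column. How much of the other folders'
data a lookup touches depends on that.

```python
import sqlite3


def folder_query_plan(conn, folder_id):
    """Return SQLite's plan text for listing one folder from the shared table."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT subject FROM items WHERE folder = ?",
        (folder_id,),
    ).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table holding the headers of every folder.
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, folder INTEGER, subject TEXT)"
)
conn.executemany(
    "INSERT INTO items (folder, subject) VALUES (?, ?)",
    [(n % 50, f"message {n}") for n in range(10000)],
)

# No index on the folder column yet: the planner has to scan the whole table.
plan_without_index = folder_query_plan(conn, 7)

# With an index on the folder column the planner can search instead of scan.
conn.execute("CREATE INDEX items_by_folder ON items (folder)")
plan_with_index = folder_query_plan(conn, 7)

print(plan_without_index)
print(plan_with_index)
```

The exact plan wording differs between SQLite versions, and MySQL has its
own EXPLAIN, so take this as a way to frame the question rather than a
measurement of what Akonadi actually does.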

On Tue, Nov 29, 2011 at 07:17:54PM +0100, Anders Lund wrote:
> I find this message disturbing and worrying.
> 
> I have 
> 1.1G ~/Mail
> 1.5G ~/.local/share/akonadi
> 
> The majority of the akonadi directory is taken up by parttable. What is in 
> that, and why, assuming it is data that is already in my mail? I do not even 
> use akonadi for mail atm, is it a leftover from broken attempts at using 
> kdepim 4.7? Is there a way to remove that without losing data (kaddressbook 
> unfortunately uses akonadi to STORE data - namely groups, though iiutc that 
> was never meant to be done)?
> 
> The size of the table on the harddisk does not worry me, but possible 
> performance when attempting an update, planned for KDE 4.8, does!
> 
> 
> On Tirsdag den 29. november 2011, Dmitry Torokhov wrote:
> > Hi,
> > 
> > So I have upgraded from a very usable setup with Fedora 15/KDE 4.6.x to
> > Fedora 16 with shiny new KDE 4.7.3 and a brand new KDE PIM suite and, like
> > many users, found that the new Akonadi/KMail2 is pretty much
> > unusable. The conversion of my archive mailbox (mixed maildir) and 2
> > IMAP accounts ran for over 8 hours and I am not quite sure what state it
> > is in at the moment, as Kontact has been trying to open my work mailbox for
> > over 10 minutes now (ever since I got to my desk and woke up my laptop).
> > 
> > Looking at the Akonadi mysql database I see that I have 3 large tables:
> > - parttable:			587230 rows;
> > - pimitemflagrelation:		241688 rows;
> > - pimitemtable:			294840 rows;
> > 
> > So akonadi data takes about 700 MB on my hard drive (compared to 283 MB
> > of actual local mail), and this is with the "Full index" option disabled on
> > all mailboxes:
> > [dtor at dtor-d630 ~]$ du -hc .local/share/akonadi/
> > 5.3M    .local/share/akonadi/file_db_data
> > 4.0K    .local/share/akonadi/db_data/test
> > 212K    .local/share/akonadi/db_data/performance_schema
> > 1008K   .local/share/akonadi/db_data/mysql
> > 570M    .local/share/akonadi/db_data/akonadi
> > 681M    .local/share/akonadi/db_data
> > 4.0K    .local/share/akonadi/db_misc
> > 686M    .local/share/akonadi/
> > 686M    total
> > 
> > mysqld has used this much CPU time so far:
> > 
> > [dtor at dtor-d630 ~]$ ps -f -C mysqld
> > UID        PID  PPID  C STIME TTY          TIME CMD
> > dtor     14230 14227 84 Nov27 ?        09:43:05 /usr/libexec/mysqld
> > --defaults-file=/home/dtor/.local/share/akonadi//mysql.conf
> > --datadir=/home/d
> > 
> > CPU is pegged by mysqld/akonadi_imap_resource and other "agents" and
> > everything is _slow_. I hate to think what the experience would be if I
> > had a standard hard drive and not an SSD.
> > 
> > And this brings me to the question: is stuffing everything into a single
> > database such a good idea?
> > 
> > Previous versions of KMail and other MUAs use per-mailbox/per-folder
> > indices which are significantly smaller and thus take significantly fewer
> > resources to populate, read and update. If a mailbox grows too big (for
> > example my LKML folder is at ~100K mails at the moment) then the user pays
> > the penalty only if he actually needs to open that mailbox (adding new guids
> > to indices is fairly inexpensive so regular fetches are not really
> > noticeable); otherwise the entire index is most likely still in the buffer
> > cache and stays there while it is needed.  Re-fetching an index means reading
> > just a couple of MB from disk. Compare this with the single database of
> > akonadi, where we need to parse the entire index to select all mails for a
> > given folder. For decent performance mysqld needs to keep the indices in
> > memory, thus taking resources away from other applications.  This setup, I
> > guess, would work well for a dedicated server (like a Zimbra install) but
> > not for the individual desktop/laptop case.
> > 
> > So, while a unified API to access various kinds of data is a nice thing to
> > have, do all providers actually have to share a single storage? Would they
> > not be better off using dedicated storage in certain cases?
> > 
> > Thanks.
> 
> 
> -- 
> Anders
> _______________________________________________
> KDE PIM mailing list kde-pim at kde.org
> https://mail.kde.org/mailman/listinfo/kde-pim
> KDE PIM home page at http://pim.kde.org/

-- 
Dmitry


