[Akonadi] [Bug 334218] New: synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it

Martin Steigerwald Martin at Lichtvoll.de
Fri May 2 12:22:08 BST 2014


https://bugs.kde.org/show_bug.cgi?id=334218

            Bug ID: 334218
           Summary: synchronizations of large folders with filesystem
                    contents hogs a Sandybridge core for minutes stat()ing
                    every file in it
    Classification: Unclassified
           Product: Akonadi
           Version: GIT (master)
          Platform: unspecified
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: Maildir Resource
          Assignee: kdepim-bugs at kde.org
          Reporter: Martin at Lichtvoll.de

Even after working around Bug 332684 ("[Maildir] lots of stats calls to
/etc/localtime on synchronizing folders") by setting a TZ environment variable,
synchronizing large folders with filesystem contents hogs one CPU core for
minutes.


Reproducible: Always

Steps to Reproduce:
1. Have a large maildir folder.
2. Synchronize it.

Actual Results:  
akonadi_maildir_resource hogs one Sandybridge core for minutes. The SSDs are
underutilized and MySQL is barely visible.

Expected Results:  
Synchronizing large folders is faster.

Akonadi calls stat() on every file. Is that necessary? For a folder with 250000
mails, that is 250000 calls to stat().

Just listing the folder contents with

martin at merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory>
/usr/bin/time find kernel-ml | wc -l
0.21user 0.35system 0:00.68elapsed 82%CPU (0avgtext+0avgdata 59316maxresident)k
13648inputs+0outputs (1major+17920minor)pagefaults 0swaps
250167

is blazingly fast here. CPU usage is high here as well, but I bet that's
because Linux caches the directory entries and inodes:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
454480 453722  99%    0,98K  28405       16    454480K btrfs_inode
434616 418562  96%    0,19K  20696       21     82784K dentry

So, wouldn't it be sufficient to only stat() the files that are new or have
updated timestamps?

martin at merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory>
/usr/bin/time find kernel-ml -ls | wc -l
0.70user 0.36system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 59536maxresident)k
32inputs+0outputs (0major+18010minor)pagefaults 0swaps
250167

indicates that the timestamps can also be provided quickly. So I'd:

1) list the fs folder contents for filenames and timestamps (mtime).
2) compare with database.
3) only stat() the files that are new or have been updated meanwhile.

Result: Blazingly fast folder sync?
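
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not Akonadi code: list_folder(), plan_sync(), SyncPlan
and the "known" mapping (standing in for whatever filename and mtime the
database has stored) are hypothetical names. Note that on Linux, getting the
mtime in step 1 still needs one stat-like call per entry, just as find -ls
does; the sketch only limits which files get any further work.

```python
import os
from typing import Dict, Iterator, List, NamedTuple, Tuple


class SyncPlan(NamedTuple):
    new: List[str]      # on disk, but unknown to the database
    changed: List[str]  # known, but the mtime differs
    removed: List[str]  # known to the database, gone from disk


def list_folder(path: str) -> Iterator[Tuple[str, int]]:
    """Step 1: yield (filename, mtime in ns) for regular files in one directory."""
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                # DirEntry.stat() issues one system call per entry on Linux.
                yield entry.name, entry.stat(follow_symlinks=False).st_mtime_ns


def plan_sync(on_disk: Dict[str, int], known: Dict[str, int]) -> SyncPlan:
    """Step 2: compare the listing with the (hypothetical) database snapshot."""
    new = sorted(n for n in on_disk if n not in known)
    changed = sorted(n for n, m in on_disk.items() if n in known and known[n] != m)
    removed = sorted(n for n in known if n not in on_disk)
    return SyncPlan(new, changed, removed)


# Step 3: only the files in plan.new and plan.changed need further attention.
known = {"a": 100, "b": 200, "c": 300}    # assumed database contents
on_disk = {"a": 100, "b": 250, "d": 400}  # e.g. dict(list_folder(".../new"))
plan = plan_sync(on_disk, known)
print(plan.new, plan.changed, plan.removed)
```

For a folder where nothing changed, all three lists come back empty and no
per-file work beyond the listing would be needed.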


For part of the CPU time used, I see no activity from the akonadi maildir
resource in strace. The rest of the time it is stat()-ing files like this:

[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R234.merkaba",
    {st_mode=S_IFREG|0644, st_size=4079, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R322.merkaba",
    {st_mode=S_IFREG|0644, st_size=8056, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R342.merkaba",
    {st_mode=S_IFREG|0644, st_size=2771, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R608.merkaba",
    {st_mode=S_IFREG|0644, st_size=4492, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R665.merkaba",
    {st_mode=S_IFREG|0644, st_size=13036, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R738.merkaba",
    ^C{st_mode=S_IFREG|0644, st_size=6870, ...}) = 0


Related observations also indicate that Akonadi is doing this work needlessly:

Bug 334209 - synchronizes folder contents during runtime needlessly

Bug 334216 - synchronizes folder with filesystem after downloading and
filtering mails needlessly


Again, this is on a blazingly fast ThinkPad T520 with Sandybridge and a
dual-SSD BTRFS RAID 1 setup.
