[Akonadi] [Bug 338402] File system cache is inneficient : too many file per directory

Martin Steigerwald ms at teamix.de
Wed Jul 1 08:46:54 BST 2015


https://bugs.kde.org/show_bug.cgi?id=338402

--- Comment #14 from Martin Steigerwald <ms at teamix.de> ---
(In reply to Daniel Vrátil from comment #13)
> Hi all,
> 
> so the problem with files just endlessly piling up in file_db_data should
> finally be fixed. Now for the original bug report: the file_db_data
> containing too many files.
> 
> So I was thinking about how to decrease the file count - obviously the right
> solution is the levelled cache as Bastien pointed out. I think in our case
> one level should be enough. The filenames of the external payload parts
> consist of incremental database unique ID, so I am planning to use the last
> two digits of the ID for the folder name to ensure even distribution of
> files into the cache folders.

I don't know how the IDs are computed, so I'll defer to you on which part of
them to use.
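
To check my understanding of the scheme, here is a minimal sketch in Python. It
is not Akonadi code: the function name, its parameters and the idea of passing
the ID as a plain integer are my own assumptions for illustration.

```python
import os

def bucketed_path(cache_dir, part_id, file_name, buckets=100):
    """Return the path of a cache file inside its bucket folder.

    Hypothetical helper: part_id is assumed to be the incremental
    database ID as an integer, file_name the existing file name.
    """
    # With buckets=100 the folder name is the last two decimal digits
    # of the ID, zero-padded, so the folders are 00 .. 99.
    width = len(str(buckets - 1))
    bucket = str(part_id % buckets).zfill(width)
    return os.path.join(cache_dir, bucket, file_name)
```

Because the IDs are incremental, consecutive IDs end up in consecutive folders,
which is what should give the even distribution.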

> To have some numbers here, using modulo 100 for folder name means 100
> folders in file_db_data (file_db_data/00 - 99/). With 1 million emails in
> Akonadi (which is a performance baseline for me) and with average ratio of
> external vs internal cache being cca 1:2 we get cca 500 000 external files
> in file_db_data, so that means cca 5 000 files per folder. That sounds like
> a reasonable number to me. 
>
> With 2 levels of indirection (using the third-to-last and second-to-last
> digits for L1 and the last digit for L2, so file_db_data/00-99/0-9/) we
> would have 100 folders with 10 folders in each, so 1000 folders in total.
> That would give us cca 500 files per folder with 1 000 000 emails in
> Akonadi. For the baseline of 1 000 000 emails two levels of indirection
> seem to be unnecessary.

That sounds reasonable. 1 million mails sounds like a reasonable baseline. Many
users will have fewer. My accounts have a bit more, but even with 2 million
mails and the same ratio of 1:2 it will be just 10 000 files per folder.
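
The arithmetic behind these numbers, as a small sketch. It takes Daniel's
figure of about 500 000 external files per 1 million mails as given and assumes
the files spread evenly over the folders. The function name is mine.

```python
def files_per_folder(external_files, folders):
    """Average number of files per folder, assuming an even spread."""
    return external_files // folders

# Daniel's baseline: 1 million mails, about 500 000 external files.
one_level = files_per_folder(500_000, 100)        # 5 000 per folder
two_levels = files_per_folder(500_000, 100 * 10)  # 500 per folder

# My case: 2 million mails, so about 1 000 000 external files.
two_million = files_per_folder(1_000_000, 100)    # 10 000 per folder
```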

> Regarding migration, we cannot do automatic migration on start, that would
> take too much time and resources to perform during start, so only newly
> created files would be moved to the cache folders. It would be possible to
> implement the full migration as part of akonadictl fsck though, so users
> could run it manually.
> 
> What are your opinions?
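
A manual migration pass could look roughly like the following sketch. This is
only an illustration, not how akonadictl fsck works or would work: it assumes
that each file name starts with the decimal database ID, and it leaves any file
that does not match that assumption where it is.

```python
import os
import re
import shutil

def migrate_flat_cache(cache_dir, buckets=100):
    """Move files lying directly in cache_dir into bucket subfolders.

    Returns the number of files moved. Assumption: the file name
    starts with the numeric ID; other files are left untouched.
    """
    width = len(str(buckets - 1))
    moved = 0
    for name in os.listdir(cache_dir):
        src = os.path.join(cache_dir, name)
        if not os.path.isfile(src):
            continue  # skip bucket folders that already exist
        match = re.match(r"\d+", name)
        if match is None:
            continue  # name does not start with an ID, leave it alone
        bucket = str(int(match.group()) % buckets).zfill(width)
        bucket_dir = os.path.join(cache_dir, bucket)
        os.makedirs(bucket_dir, exist_ok=True)
        shutil.move(src, os.path.join(bucket_dir, name))
        moved += 1
    return moved
```

A real migration would also have to update whatever the database records about
the file location, which this sketch ignores.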

I like this idea in general. I just wonder whether to combine it with a larger
default size threshold for storing payloads externally, because I use

[%General]
Driver=QMYSQL
SizeThreshold=32768

and except for losing files, which you already fixed, this worked really well
for me. Maybe 32768 is a bit too aggressive, but I really wonder whether the
default of 4 KiB is the best value to choose. Maybe 8 KiB would be good, as I
bet many mails are just a tad bit larger than 4 KiB. I think I will try to dig
out a file size statistics tool to measure the mail sizes in my local Maildir
for my private POP3 account with more than 1 million mails.
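
Lacking such a tool, a few lines of Python would do for a first look. This is a
rough sketch: it walks a directory tree and reports which share of the files is
larger than each candidate threshold. Whether Akonadi treats a payload of
exactly the threshold size as external is something I have not checked, so the
strict "larger than" comparison is an assumption, and the size of a Maildir file
is only an approximation of the payload size.

```python
import os

def share_above_threshold(root, thresholds=(4096, 8192, 32768)):
    """Map each threshold (bytes) to the fraction of files larger than it."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            sizes.append(os.path.getsize(os.path.join(dirpath, name)))
    total = len(sizes)
    result = {}
    for threshold in thresholds:
        above = sum(1 for size in sizes if size > threshold)
        # Avoid dividing by zero for an empty directory tree.
        result[threshold] = above / total if total else 0.0
    return result
```

Calling share_above_threshold() on the Maildir path then shows how much the
number of external files would drop when going from 4 KiB to 8 KiB or 32 KiB.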

Of course, this change can be made independently, but as it directly affects
the number of files in the filesystem cache, I thought I would mention it here.
Either way, if the default value is raised, there will be fewer files, so one
level of indirection is still enough and the two changes really can be done
independently. That might be better as well, in order to test the impact of each
change separately. I would also remove my custom setting in order to test this
change.
