[Kde-pim] Akonadi MySQL backend: tuning for larger accounts or switching to MariaDB with a different storage engine?

Martin Steigerwald Martin at lichtvoll.de
Wed Mar 26 19:22:28 GMT 2014


On Wednesday, 26 March 2014, 15:04:27, Daniel Vrátil wrote:
> On Wednesday 26 of March 2014 13:06:23 Martin Steigerwald wrote:
> > Hi!
> > 
> > I reported
> > 
> > Bug 332626 - MySQL tuning: adaption of MySQL tuning options for larger
> > accounts
> > https://bugs.kde.org/show_bug.cgi?id=332626
> > 
> > 
> > But it's somewhat arguable, I know.
>
> I actually talked about optimizing mysql.conf with MariaDB people on FOSDEM
> in February.

Thanks for your prompt reply, Daniel.

What were the results of your talk? Did you receive any recommendations? I am 
a Linux performance analysis and tuning generalist, so I only have basic 
knowledge in this area.

> > I think I saw quite an improvement from raising Akonadi's default InnoDB
> > buffer pool size of 80 MiB, which is even below the MySQL default of
> > 128 MiB, to 256 MiB: it helped with an 11 GiB POP3 account, especially
> > when synchronizing the linux-kernel-ml folder with about 223000 unread
> > mails, and also when retrieving and filtering new mails there, which can
> > make KMail quite unresponsive on a quite powerful machine (Sandybridge,
> > 8 GB RAM, dual SSD BTRFS RAID 1). I will do some testing, yet it is
> > difficult to know for sure. The mysqltuner.pl script gave some clear
> > recommendations, though, which I think are too high for a desktop
> > machine (>1 GiB of InnoDB buffer pool size). It seems it recommends
> > adjusting to the actual data size in the database, which would only be
> > suitable if all of the data is accessed often.
> 
> The problem with innodb_buffer_pool_size is that it directly impacts the
> amount of RAM the mysqld process takes, which makes it difficult to find a
> good compromise between what LKML subscribers need and what users with a
> couple of calendars and an addressbook need.

Interestingly, after moving the Akonadi data to local storage (Ext4 software 
RAID 1), I did not find much difference between 80 MiB and 512 MiB of InnoDB 
buffer pool size.

Also, after reducing the POP3 setup to the 80 MiB default again, which I am 
still using now, I do not see any visible differences either. I will observe 
this for a bit longer, as I really had the impression that raising it on that 
setup helped. In any case, some benchmarks would make sense.
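For anyone who wants to experiment along: the override goes into Akonadi's 
local MySQL config (on my system ~/.local/share/akonadi/mysql-local.conf, but 
the exact path and file name may vary by distribution and Akonadi version), 
roughly like this:

```ini
# Hypothetical local override for Akonadi's MySQL server.
# Path and accepted options depend on your Akonadi/MySQL version.
[mysqld]
# Akonadi ships 80M by default; raise only for experiments.
innodb_buffer_pool_size=256M
```

Akonadi merges this with its shipped mysql-global.conf on server start, so a 
restart of Akonadi is needed for the change to take effect.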

Christian Mollekopf wrote in a bug report that he is even using just 8 MiB 
without issues, so…

> > What is the amount data Akonadi accesses frequently? Are there any
> > experiences with that? I bet it depends highly on how the user uses it and
> > the amount of mails in a single folder to be synchronized.
> 
> This is difficult to say and I don't think anyone ever tried to benchmark or
> measure it. Indeed it does depend on how you use Akonadi.
> 
> When you open a folder in KMail, we run
> 
> SELECT * FROM PartTable WHERE parentCollection = XY AND partType =
> "PLD:ENVELOPE"
> 
> which will return all envelopes in the collection (that's subject, date,
> from, to, cc, bcc, ...)
> 
> akonadi=# SELECT MIN(datasize), MAX(datasize), AVG(datasize) FROM PartTable
> WHERE partTypeId = 3;
>  min |  max  |         avg
> -----+-------+----------------------
>   45 | 23354 | 487.8883434774372617
> (1 row)
> 
> akonadi=# SELECT datasize AS median FROM PartTable WHERE partTypeId =3 ORDER
> BY datasize DESC OFFSET (SELECT (COUNT(*) + 1) / 2 FROM PartTable WHERE
> partTypeId = 3) LIMIT 1;
>  median
> --------
>     451
> 
> (partTypeId = 3 == SELECT id FROM PartTypeTable WHERE ns = 'PLD' AND
> name='ENVELOPE')
> 
> Which makes roughly 6.5MB of data in a collection of 15 000 items. I don't
> think we have any application (except for the indexer) that would request
> the full payload of all items in a collection of such size. I know
> KAddressbook and KOrganizer do that, but the collections there are much
> smaller, and so are the actual payloads.
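As a quick sanity check on the "roughly 6.5MB" figure, multiplying the item 
count by the average and median payload sizes from the query output above 
(these numbers are of course specific to Daniel's database):

```python
# Rough envelope-data estimate for a collection of 15 000 items,
# using the AVG() and median datasize values from the PartTable
# query output above.
items = 15_000
avg_size = 487.89   # bytes, AVG(datasize) for PLD:ENVELOPE
median_size = 451   # bytes, median datasize

print(f"mean-based:   {items * avg_size / 2**20:.2f} MiB")
print(f"median-based: {items * median_size / 2**20:.2f} MiB")
```

So the median-based estimate lands right at about 6.5 MiB for such a folder.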

Interesting. I think there are some MySQL status variables that may help with 
estimating this. Here are some examples from the POP3 setup with currently 
80 MiB of InnoDB buffer pool size[1][2]:

[1] https://dev.mysql.com/doc/refman/5.5/en/server-status-variables.html#statvar_Innodb_buffer_pool_pages_dirt

[2] https://mariadb.com/kb/en/xtradbinnodb-server-status-variables/#innodb_buffer_pool_bytes_dirty

| Innodb_buffer_pool_pages_data            | 4873         |
| Innodb_buffer_pool_bytes_data            | 79839232     |

Actual data in the pool.

| Innodb_buffer_pool_pages_dirty           | 89           |
| Innodb_buffer_pool_bytes_dirty           | 1458176      |

Dirty (modified) memory in InnoDB pool.

| Innodb_buffer_pool_pages_flushed         | 24448        |
| Innodb_buffer_pool_pages_free            | 0            |
| Innodb_buffer_pool_pages_misc            | 247          |
| Innodb_buffer_pool_pages_total           | 5120         |
| Innodb_buffer_pool_read_ahead_rnd        | 0            |

Well obviously, given my giant POP3 account:

martin at merkaba:~/.local/share> du -sh local-mail     
11G     local-mail
martin at merkaba:~/.local/share> find local-mail -type f | wc -l           
815046
martin at merkaba:~/.local/share> find local-mail -type d | wc -l
1001

Obviously the directory count has to be divided somewhat, since each maildir 
folder contains cur / new / tmp subdirectories.

| Innodb_buffer_pool_read_ahead            | 698532       |
| Innodb_buffer_pool_read_ahead_evicted    | 98723        |

Those are unused read-aheads: pages read ahead, but evicted before use.

| Innodb_buffer_pool_read_requests         | 914046494    |
| Innodb_buffer_pool_reads                 | 15625921     |

Now this is interesting: this is the number of read requests InnoDB couldn't 
satisfy from the buffer pool. Quite low actually, only 1.7% of the total read 
requests.

After clicking between some folders of 5000-30000 mails each, I get:

| Innodb_buffer_pool_read_requests         | 945120475    |
| Innodb_buffer_pool_reads                 | 16222353     |

Now I switch between three of those for a while (ext4-ml, btrfs-ml and 
fsdevel-ml) and introduce a new one, xfs-ml (all with between 10000 and 30000 
unread mails):

| Innodb_buffer_pool_read_requests         | 962699391    |
| Innodb_buffer_pool_reads                 | 16520208     |
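Since these counters are cumulative since mysqld start, the interesting number 
for a given workload is the delta between two snapshots rather than the 
absolute values. A quick sketch using the three snapshots above:

```python
# (read_requests, reads) pairs, cumulative since mysqld start
snap1 = (914_046_494, 15_625_921)  # initial snapshot
snap2 = (945_120_475, 16_222_353)  # after folders of 5000-30000 mails
snap3 = (962_699_391, 16_520_208)  # after switching between *-ml folders

def delta_miss_ratio(before, after):
    """Buffer pool miss ratio for just the interval between
    two (read_requests, reads) snapshots."""
    d_requests = after[0] - before[0]
    d_reads = after[1] - before[1]
    return d_reads / d_requests

print(f"{delta_miss_ratio(snap1, snap2):.1%}")
print(f"{delta_miss_ratio(snap2, snap3):.1%}")
```

Even for just these intervals the miss ratio stays below 2%, which matches the 
overall picture.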

Hmmm, I start to doubt the usefulness of raising the buffer pool size 
considerably. Maybe I am missing something here, though.

| Innodb_buffer_pool_wait_free             | 0            |

Also, there are no waits for free pages at all, even after my quick tests 
above. And here are the write requests:

| Innodb_buffer_pool_write_requests        | 1682523      |

I ponder whether raising the buffer pool size much would really bring 
benefits… a slight increase when more than 4 GB or 8 GB of RAM is available 
may make sense, but even there I have no actual data to back this up. I bet 
this needs longer-term observation. I restarted Akonadi today to reduce the 
buffer pool size to 80 MiB again, so it has not been running for long so far.

I will observe this a bit and also try the kernel-ml case with 220000 mails 
in one folder, though I'd not optimize for this one by default :). Maybe it 
gives an impression of the relation between folder mail count and InnoDB 
buffer behavior.

> > What do you think?
> 
> I think we can try to play with the values a bit, having some kind of
> benchmark or test tool would be nice :-)

I bet there are test cases for Akonadi, and one could probably be used. But 
one would need to create an account with some mails in it first (for POP3) or 
use an IMAP account with some mails in it.

> > Do you have any plans to switch to MariaDB and probably investigating
> > storage engines which may be more suitable for tuning to desktop loads?
> 
> MariaDB is a drop-in replacement for MySQL, so it's supported by default
> (I had been using MariaDB instead of MySQL for over a year before I
> switched to PSQL).
> 
> AFAIK the only engine that we can really use is XtraDB, which is just an
> enhanced fork of InnoDB.

I see you already dug further into this.

> > InnoDB is difficult, as innodb_buffer_pool_size needs to take free memory
> > into account, which can change quite rapidly on desktops or anything
> > other than a dedicated database server. An engine which uses the Linux
> > page cache for most of its caching would be interesting, I think.
> 
> Maybe we could have some kind of "initial setup" tool that would check
> available memory and would update innodb_buffer_pool_size depending on that
> (taking into account some max limit, min limit, max percentage, ...)
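The sizing logic of such a tool could be fairly simple. A minimal sketch; the 
2% factor and the 80 MiB / 512 MiB bounds are made-up values purely to 
illustrate the clamping Daniel describes, not recommendations:

```python
MIB = 1024 * 1024

def suggest_buffer_pool_size(total_ram_bytes: int,
                             pct: float = 0.02,
                             min_bytes: int = 80 * MIB,
                             max_bytes: int = 512 * MIB) -> int:
    """Suggest an innodb_buffer_pool_size as a fraction of total RAM,
    clamped between a minimum and a maximum. All constants are
    illustrative placeholders."""
    return max(min_bytes, min(max_bytes, int(total_ram_bytes * pct)))

# On Linux, total RAM could be read from MemTotal in /proc/meminfo.
for gib in (2, 8, 64):
    size = suggest_buffer_pool_size(gib * 1024 * MIB)
    print(f"{gib:3d} GiB RAM -> {size // MIB} MiB buffer pool")
```

Small machines would stay at the shipped default, while big-RAM machines would 
get a somewhat larger pool without mysqld ever eating gigabytes.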

Well, I am backing off a bit at the moment. Maybe those read bursts I observed 
were not related to the InnoDB buffer pool size but to something else. If I 
see it again, I will try to look in Akonadiconsole to see what Akonadi is 
doing there.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/

