T7846: akonadi > MySQL ERD Review

Fri Feb 2 01:38:47 GMT 2018

pablo added a comment.

  **Round #2 **

  I've spot-checked the ERD's and they appear to be correct based on the provided SQL.  In fact, I pointed the ERD tool to my Neon VM and told it to reverse engineer the run-time akonadi database.

  **Observations**

  **[partTypeTable]**

  - The Alternate Key is on (name, ns).  Is it necessary to  use both columns?

  //Associative Tables do not have a Surrogate Key - slight anti-pattern//

  A logical many-to-many relationship is physically implemented by creating an //Associative Table//.  For example, in the ERD //PIMItemTagRelation// is such a table as it resolves the many-to-many between **[PIMItemTable]** and **[TagTable]**.

  The anti-pattern is that most (not all) of these tables do not have a //surrogate key// (e.g. **id** defined as an //identity// data type) as the other tables.

  Furthermore, If an associative table is a parent to another table, then it **does** gets a surrogate id.  I understand the variance.

  An additional issue with these tables, those without a surrogate id, is the primary key is composed of the foreign keys.  Nearly all of these tables have two parents so we end up with an index like the following:  (FK1, FK2).  The problem is there is no supporting index should we Join only with FK.  To solve that, the model defines an index on FK2.

  The exception to the above is **[RelationTable]**, it has three parents.  As expected, the composite key is on its three FK's and two additional indexes, one on FK2 and one on FK3 is created.

  What I recommend we (eventually) do is:

  - Ensure we can drop the existing, composite Primary Key
  - Add a surrogate id to each table and make it the Primary Key
  - Create separate indexes on each Foreign Key

  By doing the above, the optimizer/planner can determine the most selective index to use when determining an execution plan for a query.  Additionally, all the tables are consistent which means any functionally dependent code is also consistent.

  **[CollectionTable]**

  - I would investigate further whether the index on **(name, parentId)** is necessary.  It is possible it's necessary if we have an extremely high fan-out from the parent to the next level of children (e.g. 20, 50, 100 next-level children)?
  - Generally indexes on low cardinality columns (e.g. gender) are not helpful.  Unless of course one is fetching only by that column **and** perhaps a small percentage of the data qualifies by the value.  With this in mind, I would investigate if we need indexes on //enabled//, //syncPref//, //displayPref// and //indexPref//.  I suspect we should be able to drop them.  I'm curious to know the SQL which uses these columns.  I suspect other columns are used which would have existing supporting indexes.

  F5687811: akonadi-mysql.pdf <https://phabricator.kde.org/F5687811>

  F5687812: akonadi-mysql-birds-eye.pdf <https://phabricator.kde.org/F5687812>

TASK DETAIL
  https://phabricator.kde.org/T7846

To: pablo
Cc: dvratil, kde-pim, pablo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-pim/attachments/20180202/063fe4b0/attachment.html>