[Kroupware] OpenGroupware.org formed through SKYRiX open sourcing its Groupware server

Helge Hess helge.hess at skyrix.com
Sun Jul 13 03:21:58 CEST 2003


On Sunday, 13 July 2003, at 1:46, Andrew Kohlsmith wrote:
>> AFAIK the problem with E4L is, that they store *every* MAPI attribute
>> as a single SQL row which obviously doesn't scale.
> I'm not sure what you mean here -- There are basically two tables;
> object and properties.  If you wanted to pull all the properties for
> an object you do the appropriate select on the property table.

Exactly! There should be a table for each MAPI object class. In MAPI, 
each message has an associated message class, such as 
IPM.Appointment or IPM.Contact. Those message classes actually map 
pretty well to SQL tables (and indeed, Exchange itself is becoming more 
and more a full-featured SQL server).

So there should at least be tables like
   appointment   (MAPI IPM.Appointment)
   contact       (MAPI IPM.Contact)
   task          (MAPI IPM.Task)

E.g. if you have 1000 appointments in a folder, with a table-based 
approach (which requires "understanding" MAPI records) you need to 
store 1000 rows in the database. To find records, you do a select on a 
single table over 1000 rows, with no joins.
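
A minimal sketch of the table-per-class idea, using Python's sqlite3 
module with an in-memory database. The column names (folder_id, 
subject, start_time, end_time) and the index are assumptions made up 
for illustration; they are not taken from any real MAPI store or from 
the OpenGroupware.org schema.

```python
import sqlite3

def build_appointment_table(n=1000):
    """Create a hypothetical 'appointment' table holding n rows in one folder."""
    conn = sqlite3.connect(":memory:")
    # One column per attribute the server "understands" (illustrative subset).
    conn.execute(
        "CREATE TABLE appointment ("
        " id INTEGER PRIMARY KEY,"
        " folder_id INTEGER NOT NULL,"
        " subject TEXT,"
        " start_time TEXT,"
        " end_time TEXT)"
    )
    conn.execute(
        "CREATE INDEX appointment_folder ON appointment (folder_id, start_time)"
    )
    rows = [
        (1, f"Meeting {i}",
         f"2003-07-{1 + i % 28:02d}T09:00",
         f"2003-07-{1 + i % 28:02d}T10:00")
        for i in range(n)
    ]
    conn.executemany(
        "INSERT INTO appointment (folder_id, subject, start_time, end_time)"
        " VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn

conn = build_appointment_table()

# 1000 appointments are stored as exactly 1000 rows.
row_count = conn.execute("SELECT COUNT(*) FROM appointment").fetchone()[0]

# Finding the appointments of one day touches a single table, no join.
on_13th = conn.execute(
    "SELECT subject FROM appointment"
    " WHERE folder_id = ? AND start_time LIKE ?"
    " ORDER BY start_time",
    (1, "2003-07-13%"),
).fetchall()

print(row_count, len(on_13th))
```

The point of the sketch is only that sorting and filtering can run 
directly on typed columns of one table.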

In the E4L approach you don't store MAPI attributes in the columns of a 
table but rather as separate rows in the property table. This implies 
that if an appointment has, say, 30 properties, you need to store and 
query 30000 rows for 1000 appointments! And, even worse, you need to 
make the database perform a join between those two tables.
This alone makes the E4L data model at least 5 times slower (taking 
indexes into account; with a "dumb" implementation or an improper setup 
of SQL indexes it would be 30 times slower, having to query 30 times 
the amount of data).

But this is made worse by the fact that properties of *all* messages 
are stored in a single table. So the database has to perform a join on 
at least three times the data (which is completely unnecessary, since 
the properties of the various MAPI message classes are distinct).
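
For comparison, a rough sketch of the two-table layout described above, 
again with sqlite3. The table and column names (object, property, 
object_id, name, value) are guesses for illustration only; the actual 
E4L schema may look different.

```python
import sqlite3

PROPS_PER_APPOINTMENT = 30  # the "say 30 properties" figure from the text

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object (
    id INTEGER PRIMARY KEY,
    folder_id INTEGER,
    message_class TEXT
);
-- Every attribute of every message class ends up as one row here.
CREATE TABLE property (
    object_id INTEGER,
    name TEXT,
    value TEXT
);
CREATE INDEX property_lookup ON property (object_id, name);
""")

for obj_id in range(1, 1001):
    conn.execute(
        "INSERT INTO object VALUES (?, 1, 'IPM.Appointment')", (obj_id,)
    )
    conn.executemany(
        "INSERT INTO property VALUES (?, ?, ?)",
        [(obj_id, f"prop{p}", f"value{p}")
         for p in range(PROPS_PER_APPOINTMENT)],
    )

# 1000 appointments become 30000 property rows.
property_rows = conn.execute("SELECT COUNT(*) FROM property").fetchone()[0]

# Reading the appointments of a folder back requires a join between
# the object table and the property table.
joined = conn.execute(
    "SELECT o.id, p.name, p.value"
    " FROM object o JOIN property p ON p.object_id = o.id"
    " WHERE o.folder_id = 1 AND o.message_class = 'IPM.Appointment'"
).fetchall()

print(property_rows, len(joined))
```

Sorting by a single attribute such as the start time would additionally 
mean filtering the property table by name, since the value is not a 
column of its own.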

> Seems like it'd scale _wonderfully_ well.

I don't see how.

Simple calculation: 1000 users each having 1000 appointments, 100 jobs, 
100 contacts and 4000 messages, with 20 attributes on average, which is 
few. The properties table will have to deal with 
1000 x (1000 + 100 + 100 + 4000) x 20 rows:
   104,000,000

No wonder it's slow even for 10 users.
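
As a rough check of the row counts, assuming one property row per 
attribute per item (so the item counts add up per user and are then 
multiplied by the number of users and attributes), the workload above 
can be tallied like this. The variable names are made up for the 
sketch.

```python
USERS = 1000
ITEMS_PER_USER = {
    "appointments": 1000,
    "jobs": 100,
    "contacts": 100,
    "messages": 4000,
}
ATTRIBUTES_PER_ITEM = 20  # "20 attributes on average"

items_per_user = sum(ITEMS_PER_USER.values())

# One row per item, spread over one table per message class.
table_per_class_rows = USERS * items_per_user

# One row per attribute of every item, all in a single property table.
property_table_rows = table_per_class_rows * ATTRIBUTES_PER_ITEM

print(f"{table_per_class_rows:,} rows across per-class tables")
print(f"{property_table_rows:,} rows in one property table")
```

Under these assumptions the single property table holds 20 times as 
many rows as all the per-class tables together.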

> I've been playing with it some more, and the performance issue is
> more to do with the current Python server than anything else; they
> are doing sorting and whatnot in the Python server instead of in the
> DB, and have stated why and that it'll be done correctly in the new
> version, scheduled to be released at the end of the month.

Even if they use SQL sorting they will quickly run into scalability 
problems with such a table.

> While you're correct that there is extra overhead, I don't think
> that it is going to be very significant.  Time will tell.

Sure ;-)

>> The problem is that the E4L connector does not "understand" the MAPI
>> messages. It just splits them up into separate rows and stores them as
>> raw attributes. Since a MAPI message has about 50-100 individual
>> attributes you end up with 50-100 times the rows actually required (in
>> other words, with a MAPI plugin which properly handles MAPI messages
>> you can serve about 50 times more users on the same hardware
>> [oversimplified]).
> Yes, this is exactly what it appears that E4L is doing.

Well, IMHO it obviously isn't since it can't scale. The approach 
doesn't map well to how SQL databases work, which results in the speed 
problem.

Even if it works well for a certain number of users, any solution 
actually using the capabilities of an SQL database will be able to host 
way more users on the same hardware.

BTW: Kolab's IMAP approach actually might map very well to how Outlook 
works and probably is way faster than E4L.

best regards,
   Helge
-- 
OpenGroupware.org	- http://www.opengroupware.org/


