21 août 2007
Drupal 5.2 - authmap DB Table
Yes, I picked the second table on my list, authmap. The first one is probably an error from someone that copy-paste bad code? Or a newbie? Or a moron?
This one seems to be too complex to design too : there's one primary key and 3 columns. Ouch!
They indexed the authname (name of authentified user), so it is not a problem...
But on user.module / user_set_authmaps(), there's one full tablescan on update or delete. Each time an user authmap entry will be deleted or modified, a full authmap table-scan will be generated on MySQL.
Sorry I didn't read it well at first, in fact, there's a loop...
The truth is, for each authmap entry of an user, on per Drupal module at least, there will be one full table-scan of authmap. Two full table-scan of authmap for each removed right! Wow!
Imagine having 100 000 user accounts, 40 Drupal Modules, and removing or changing one user right. MySQL will traverse more than 4 millions DB records! Ouch!
Quick fix is: to create a combined index on (uid, module) to avoid table-scan.
Real solution: create index on (uid), rewrite this part, reading all authmap entry for the user, generate one grouped INSERT to insert added rights, and one DELETE to delete at once all removed rights.
Anyway you have a similar problem on user.module / user_delete() where a table scan occurs each time you delete an user! Solution or Quick-Fix above will fix that too.
The two first tables used by Drupal 5.2 have been examined.
Two simple tables with only 4 columns in each.
And albeit their simplicity, both are potential disaster waiting for your website to fly!
Drupal 5.2 DB Design isn't correct, nor even bad, this is a pure disaster that shouldn't happen outside beginner's DB courses.
Wanna learn more? How scalability has been thrown away? stay online!
Drupal 5.2 - Access DB Table
First table I look for on Drupal 5.2 is Access Table. First table on the list and firsts problems.
Wrong design, poor SQL, no MySQL or general DB understanding, and it's a table involved in EACH PAGE VIEWED, either CACHED or NON-CACHED.
There's 2 problems, one a total disaster on the MySQL point of view, the other just a burden.
Burden is on statistics.module / statistics_top_visitors(), there's a condition on Type that is not indexed. Each and every entry on the access log will need a table scan on access table.
Poor design, and poor statistics performances!
Solution: put an index on Type at least, and you could use a combined index on (Type, Mask) to avoir accessing data while filtering.
The disaster is on bootstrap.inc / drupal_is_denied(). This function is called on each and ever ypage, even cached ones, and is commented as "we tried to optimize it as much as possible".
Using a SORT order and a condition that is not index lead to a full table-scan, and a physical filesort, on MySQL, for each and every page viewed!
Obvious to say that either table-scan or filesort should be avoided as much as possible on a DB server, and that both together is a serious problem, even on statistically rare sql statement.
But having both table-scan and filesort on a statement that occurs on each and every page results on a total performance hog!
You ask yourself why Drupal couldn't handle load? Why MySQL seems to be overloaded albeit concurrent user number is poor? Poorly designed database and badly written SQL queries.
Solution: use a MAX() for mysql instead of a sort, and add a combined index on (Type, Mask).
(This could lead from 10X to 100X speed improvment on this unique query, that occurs at each page request!)
Should I give you a look to all other Tables?
I mean if the first table with 1 primary key and 3 columns could drag down the entire web site, with "optimized" sql code, is it because of an error here, or all the DB Design and queries as to be examined and redone?!?
Drupal 5.2 - Ouch!
I used Drupal 4.7 for time on a precedent employment and it didn't fulfill expectations, as I pointed out.
Security and Performances are largely beyond what you expect for personal homepage or blogs, not talking about a real web site, we discovers that one (1) user with an ADSL or Cable broadband access could overload 8 bi-cpu PHP servers, if we could a strong enough MySQL server to handle this load!
From the start Drupal 4.7 lacked support for MySQL replication or clustering, the two main technologies to scale up popular websites. Worse, the DB Design choices unpair the ability to implement clustering, and the PHP architecture impair the ability to use master-slaves replication. Even putting session in MEMORY table was impossible. (yes Sessions are on DB, and on MyISAM Tables!)
On my new hire, we use Drupal 5.2 to create "Fast To Market" website, and they expect me to rewrite it (using PHP5) or strengthen existing developments
I thought Drupal 5 is Object. I thought Drupal 5 is rewritten cleanly. I thought Drupal 5 has been reviewed to have a correct DB schema from start. And I hoped Modules where cleanly written and their tables almost corrects.
I was wrong: Drupal 5 is cleanly designed, from an user point of view, or a web administrator.
But it's PHP code is poorly written and the Database Design is a nightmare!
I will take example after example, as I cycle from one table to another, only convering Drupal 5.2 core code (no external Module involved).