My JavaScript book is out! Don't miss the opportunity to upgrade your beginner or average dev skills.

Sunday, June 07, 2009

MySQL UPDATE DELAYED - Just My Solution

The Problem

MySQL supports DELAYED option as INSERT statement but, for some obscure reason nobody got so far, it does not support UPDATE DELAYED or DELETE DELAYED.

Why DELAYED Is Necessary

As quick summary, INSERT DELAYED is something truly useful when we would like to let MySQL decide when it is time to perform that insert without blocking tables and users. This is a problem almost specific for engines like MyISAM, MEMORY, ARCHIVE, and BLACKHOLE. This post is not about "why not InnoDB", this is about DELAYED with these engines, specially with MyISAM one.
To better understand a common case where an UPDATE DELAYED could be extremely useful, just think about a counter, how many users visited that id. Is such information that useful to be not approximated and to possibly block every other select because of an update that nobody will verify with a microscope? In my opinion, such information is "just a plus" and if the page execution could be faster without blocking anybody and putting those UPDATES in a queue as is for INSERT DELAYED in order to perform it in a shot and in "the best moment ever" ... why on earth we should have not such option?

UPDATE LOW_PRIORITY Is NOT The Same

Unfortunately, we could choose to let the user wait a bit more thanks to LOW_PRIORITY option with an UPDATE statement but still: why that user should wait more for such irrelevant information?

Split The Problem! That's My Idea

Apparently there are no good options to emulate an UPDATE DELAYED with MySQL so I tried to use INSERT DELAYED plus some trick able to delegate the LOW_PRIORITY update to somebody else (crontab or database via event), without blocking any user and making page response faster with hundreds of thousands of rows for a single static table.

Example Tables

Try to imagine we have a generic service based on an id, which is unique and used in every other part of the database as main relation. Now for this id we could have other N fields, it does not matter, as long as we use a total field to know how many users visited that id.

CREATE TABLE IF NOT EXISTS counter (
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
/* one or more fields (possibly static) */
total INT(3) UNSIGNED NOT NULL
) ENGINE=MyISAM;

/* let's populate the counter table with 3 ids */
INSERT INTO counter (total) VALUES (0), (0), (0);

The REPLACE DELAYED query will be a suicide for a big table with thousands of records, while INSERT DELAYED cannot be performed in the main table or every key will be duplicated for N times. To perform an INSERT DELAYED we need then another table, with just a single field, possibly the same type used for the main key, in this example the id.

/* specific table for delayed inserts */
CREATE TABLE IF NOT EXISTS counter_total (
id INT(10) UNSIGNED NOT NULL,
KEY id (id)
) ENGINE=MyISAM;


/* visits for ids, 3 users in id 1, 2 in id 2, 1 in id 3 */
INSERT DELAYED INTO counter_total VALUES (1);
INSERT DELAYED INTO counter_total VALUES (2);
INSERT DELAYED INTO counter_total VALUES (1);
INSERT DELAYED INTO counter_total VALUES (3);
INSERT DELAYED INTO counter_total VALUES (2);
INSERT DELAYED INTO counter_total VALUES (1);

Last INSERTs are to emulate 6 users visiting id 1 3 times, id 2 twice, and id 3 only once. Since we are using INSERT DELAYED it does not matter how many users are surfing the page and how many are in the same id, we simply do not care and we let MySQL decide when it is about the time to shot every insert in one go for the WRITE ONLY table, counter_total. Moreover, since the main id is the key for this solution, the query is so simple that its execution time will be extremely fast, even in my laptop.

Every N Seconds, Minutes, Hours ...

At this point the counter_total table will contain N main ids, what we have to do know is:

  1. block possible insert delayed locking the counter_total table

  2. have a snapshot of visited ids and total visit, via function COUNT

  3. reset counter_total to keep it small and fast

  4. release counter_total to allow queued INSERT DELAYED to be executed

  5. find a way to quickly update with low_priority the main counter table


To solve above points we could use a TEMPORARY TABLE created directly with ENGINE=MEMORY. This will ensure us CPU+RAM speed performances without using that big amount of RAM in any case (if we do, we need to reduce the delay between one call and another one).
Here is the sequence to perform the entire operation in a flash!

/* WebReflection UPDATE DELAYED Workaround */
CREATE TEMPORARY TABLE tmp_counter (
id INT(10) UNSIGNED NOT NULL,
total INT(3) UNSIGNED NOT NULL
) ENGINE=MEMORY;
LOCK TABLE counter_total WRITE;
INSERT INTO
tmp_counter (
SELECT
counter_total.id,
COUNT(counter_total.id)
FROM
counter_total
GROUP BY
counter_total.id
)
;
DELETE FROM counter_total;
UNLOCK TABLE;
UPDATE LOW_PRIORITY
counter AS c
LEFT JOIN
tmp_counter AS tc
ON(
c.id = tc.id
)
SET
c.total = (
c.total + tc.total
)
;

That's it, tested over hundred of thousands of records in a reasonable 0.2 time and with an old laptop that apparently, if "used" properly, can still give me a lot :P

As Summary

A counter is a classic example where an UPDATE DELAYED could be useful. Unfortunately MySQL does not support it (probably version 6) so we can use the supported INSERT DELAYED (please note with replication will be a normal INSERT due to synchronization problems but it will be for a write only table). To do this, we need to create another table and lock it when we decide that it is about the time to update the main one. With a truly simple sequence of queries, this could be a super fast operation, rather than a bottleneck if for some reason 100 users are visiting our website with "inline update".

Now, do you have a better solution? If Yes, please share it, thanks ;)

P.S. I need to say thanks to Olga for the temporary suggestion, I was almost struggling with missed MyISAM transactions :D

6 comments:

Mikuso said...

I'm not a guru on this by far, but have you tried:

INSERT DELAYED ... ON DUPLICATE KEY UPDATE ...

?

Mikuso said...

Sorry, ignore my last comment, ON DUP is ignored when combined with DELAYED

Andrea Giammarchi said...

Sorry Mikuso, I did not ignore your comment because I am sure a lot of people think it's that simple ... well, it's not, otherwise I would not have spent a minute for this ;)

Pablo said...

hi there, Ive readed your entire post and found it very usefull. I work on a website with huge traffic and too many visits per items (so your examples comes right up for me).
Many thanks .. im building some test around your examples.

Cheers!

Mrten said...

i think that should be an inner join in the update, not a left join.

Gunja said...

I have implemented this, and whilst it works. I cannot insert into the Counter table (the one with just the id), when the totals are being calculated as I get the error:

oh no... INSERT DELAYED can't be used with table 'counter' because it is locked with LOCK TABLES

So whilst your updating (creating the temp table, you run into this problem. It IS doucmented on this page: http://dev.mysql.com/doc/refman/5.0/en/insert-delayed.html

"An error occurs for INSERT DELAYED if used with a table that has been locked with LOCK TABLES because the insert must be handled by a separate thread, not by the session that holds the lock."