Update multiple MySQL rows with a single query
This tip deals largely with sortable lists. When creating a list that will be sorted using an effects library like Script.aculo.us, I usually want to save the results of the sorting to the database. This is a usually simple task due to methods like Sortable.serialize() which take the contents of my list and store them in a form to be appended to my AJAX query string. Here’s a simple example:
adambondurant.com
oaklog.com
...
After creating my sortable and moving around an object from my list (which I won’t get into here), I serialize the list and get a query string like this:
feeds[]=27&feeds[]=28...
When I submit those parameters to my server, my parameters might look like this (in PHP):
[feeds] => Array
(
[0] => 27
[1] => 28
...
)
From here, I have a few options for storing the order of the feeds in my database: I can iterate through the array, assigning each position in the array to its corresponding value (feed ID). This option works for small lists, but sometimes I need to update 100 items at a time, and performing 100 UPDATE statements doesn’t sit well with me.
Another option is to leverage MySQL’s expression syntax to wrap the update into a single query. I can do this using MySQL’s CASE statement like so:
UPDATE components
SET position = CASE component_id WHEN 27 THEN 0 WHEN 28 THEN 1 END
WHERE parent_id = 19
AND component_type = 'Feed'
AND user_id = 1
AND component_id IN (27, 28)
The question I had, however, is which option performs better. In theory, I figured the latter was better because it didn’t have to hit the database n times, it was hit only once. This is why profiling exists, so I tried my best to do some testing.
I started with 100,000 rows in the components table, and my test script looped 10,000 times, updating the positions on 80 items each time. I purposefully constructed my SQL queries outside the profiled loop so I could see a comparison of SQL to SQL without much PHP overhead. Granted, I used a database abstraction layer (Zend DB) but this test is not so much about eeking out as much performance as it is comparing two similar approaches.
Here are the numbers I came up with when issuing the CASE-based approach (1 query) 10,000 times:
Lowest: 0.00206017494202 Highest: 0.221321105957 Average: 0.0034388064146
Here are the numbers performing 80 queries, 10,000 times:
Lowest: 0.0084331035614 Highest: 0.227936029434 Average: 0.0145111063004
It’s a bit strange to me that the highest time to issue the queries was pretty consistent, but what I really garnered from the testing was that it was about 422% faster to use the CASE-based approach than to issue individual queries. Maybe that was to be expected, but it’s still nice to see some data.
July 17th, 2008 at 4:21 pm
This was exactly what I was searching for. I reorganised some databasetables and I managing multiple rows now. I will use the CASE syntax, thank you.
August 18th, 2008 at 10:51 am
How does this compare to the following:
UPDATE components SET position = (@pos := @pos - 1) WHERE (@pos := 0) = 0 ORDER BY LOCATE(CONCAT(’|', id, ‘|’), ‘|28|27|’);