Quick note on bulk performance using on duplicate key update

Had an interesting thought today: I wondered how much faster (if at all) updating in bulk via insert ... on duplicate key update is compared to individual update statements. The client app I was looking at receives updates to a counter from various servers around the world, and they update a row that contains time metrics. For instance: update traffic set visitcount = visitcount + 1000 where customer = 123 and hour = '12' and date = '2008-07-15'. These statements are driven by feeds from the various servers, so the workload lends itself to bulk operations very easily. It seemed like batching these up would, at a minimum, save the parse and network round-trip time.
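
Applied to that traffic counter, the batched version might look like the sketch below. The table and values are illustrative, and it assumes a unique key on (customer, date, hour), since on duplicate key update only fires when the inserted row collides with a primary or unique key:

insert into traffic ( customer, date, hour, visitcount )
values
    ( 123, '2008-07-15', '12', 1000 ),  -- each row folds one feed's delta into the counter
    ( 123, '2008-07-15', '13', 250 ),
    ( 456, '2008-07-15', '12', 75 )
on duplicate key update visitcount = visitcount + values(visitcount);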

Reduced to a simple test table, instead of:

update bulk_update set val=10+val where val1 = 20;
update bulk_update set val=10+val where val1 = 21;
update bulk_update set val=10+val where val1 = 22;
update bulk_update set val=10+val where val1 = 23;

use something like this:

insert into bulk_update ( val, val1, testdate )
values ( 10, 20, now() ), ( 10, 21, now() ), ( 10, 22, now() ), ( 10, 23, now() )
on duplicate key update val = val + values(val);
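
As above, the statement only acts as an update because val1 is covered by a primary or unique key; without one, it would just insert new rows. A minimal table sketch that makes the example runnable (the column types are my assumption, not from the original test):

create table bulk_update (
    val1 int not null,            -- lookup column; must be the primary or unique key for the upsert to fire
    val int not null default 0,   -- the counter being incremented
    testdate datetime,
    primary key ( val1 )
);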

A real quick test showed that the bulk on duplicate key syntax was roughly 30% faster than running the equivalent individual update statements.
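
If you want to repeat the comparison yourself, MySQL's statement profiler (available since 5.0.37) is one quick way to get per-statement timings; this is just a possible harness, not necessarily the test used here:

set profiling = 1;

-- the four individual updates
update bulk_update set val=10+val where val1 = 20;
update bulk_update set val=10+val where val1 = 21;
update bulk_update set val=10+val where val1 = 22;
update bulk_update set val=10+val where val1 = 23;

-- the batched equivalent
insert into bulk_update ( val, val1, testdate )
values ( 10, 20, now() ), ( 10, 21, now() ), ( 10, 22, now() ), ( 10, 23, now() )
on duplicate key update val = val + values(val);

show profiles;  -- per-statement durations for everything run this session

Keep in mind that show profiles measures server-side execution only, so the network round-trip savings from batching come on top of whatever difference it reports.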

One Response to Quick note on bulk performance using on duplicate key update

  1. burtonator says:

    It can be a lot faster in other ways too… it can reduce ethernet latency, for example.

    It also means that there are fewer transactions overall, and if you’re on InnoDB this means fewer fsyncs if you’re running innodb_flush_log_at_trx_commit=1.
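
The same fsync saving is also available without the upsert by wrapping the individual updates in a single transaction; a sketch of that alternative, using the bulk_update table from above:

-- with innodb_flush_log_at_trx_commit=1, autocommit flushes the log once per
-- statement; one explicit transaction flushes once at commit instead
start transaction;
update bulk_update set val=10+val where val1 = 20;
update bulk_update set val=10+val where val1 = 21;
update bulk_update set val=10+val where val1 = 22;
update bulk_update set val=10+val where val1 = 23;
commit;

The batched insert still saves the per-statement parse and network round trips, though, since it is a single statement.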