Fun with innodb stats

I scrubbed the SQL here to protect the innocent, but check this fun stuff out.  Working with a client who randomly started seeing huge spikes in CPU and disk activity on their server after weeks of seemingly running fine.  I tracked it down to a subset of long running queries.  These queries typically run in around a second per run, but out of the blue they started taking 600 seconds.

Here is the explain for the first query:

Explain:


+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
| id | select_type | table | type  | possible_keys                 | key        | key_len | ref                           | rows   | Extra                                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
|  1 | SIMPLE      | t2    | range | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 4       | NULL                          | 886558 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | ref   | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 8       | const,database1.t2.ProdItmID |      1 | Using where                                  |
|  1 | SIMPLE      | t3    | ref   | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID       |     30 | Using where; Distinct                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
3 rows in set (0.00 sec)

The explain was vastly different then it was when I checked this query a few weeks ago.

The query itself was such that the t1.Primary=NNN and t2.Primary in ( NNNN, NNNNN, NNNNN ) so logically t1.Primary should have been the driver here.  So I did a straight_join.

Explain straight join:


+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
| id | select_type | table | type | possible_keys                 | key        | key_len | ref                     | rows  | Extra                                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
|  1 | SIMPLE      | t1    | ref  | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 4       | const                   | 63766 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    30 | Using where; Distinct                        |
|  1 | SIMPLE      | t3    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    30 | Using where; Distinct                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
3 rows in set (0.00 sec)

Well that looks better, doesn’t it?  But what’s up with the optimizer.  You see according to the docs innodb stats are collected with “10 random dives into each index tree to determine cardinality”  operative word  is random.  It also collects these behind the scenes.  So what happens when I do analyze on my table?


mysql> analyze table product_items1;
+————————–+———+———-+———-+
| Table                    | Op      | Msg_type | Msg_text |
+————————–+———+———-+———-+
| database1.product_items1 | analyze | status   | OK       |
+————————–+———+———-+———-+
1 row in set (0.11 sec)
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
| id | select_type | table | type | possible_keys                 | key        | key_len | ref                     | rows  | Extra                                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
|  1 | SIMPLE      | t1    | ref  | PRIMARY,ProdSource,ProdItmID | ProdSource | 4       | const                   | 12676 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    15 | Using where; Distinct                        |
|  1 | SIMPLE      | t3    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    15 | Using where; Distinct                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
3 rows in set (0.00 sec)
mysql> analyze table product_items1;
+————————–+———+———-+———-+
| Table                    | Op      | Msg_type | Msg_text |
+————————–+———+———-+———-+
| database1.product_items1 | analyze | status   | OK       |
+————————–+———+———-+———-+
1 row in set (0.17 sec)
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
| id | select_type | table | type  | possible_keys                 | key        | key_len | ref                           | rows   | Extra                                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
|  1 | SIMPLE      | t2    | range | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 4       | NULL                          | 886558 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | ref   | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 8       | const,database1.t2.ProdItmID |      1 | Using where                                  |
|  1 | SIMPLE      | t3    | ref   | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID       |     29 | Using where; Distinct                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
3 rows in set (0.01 sec)

mysql> analyze table product_items1;
+————————–+———+———-+———-+
| Table                    | Op      | Msg_type | Msg_text |
+————————–+———+———-+———-+
| database1.product_items1 | analyze | status   | OK       |
+————————–+———+———-+———-+
1 row in set (0.11 sec)

+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
| id | select_type | table | type  | possible_keys                 | key        | key_len | ref                           | rows   | Extra                                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
|  1 | SIMPLE      | t2    | range | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 4       | NULL                          | 886558 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | ref   | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 8       | const,database1.t2.ProdItmID |      1 | Using where                                  |
|  1 | SIMPLE      | t3    | ref   | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID       |     41 | Using where; Distinct                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
3 rows in set (0.00 sec)

mysql> analyze table product_items1;
+————————–+———+———-+———-+
| Table                    | Op      | Msg_type | Msg_text |
+————————–+———+———-+———-+
| database1.product_items1 | analyze | status   | OK       |
+————————–+———+———-+———-+
1 row in set (0.13 sec)

+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
| id | select_type | table | type  | possible_keys                 | key        | key_len | ref                           | rows   | Extra                                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
|  1 | SIMPLE      | t2    | range | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 4       | NULL                          | 886558 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | ref   | PRIMARY,ProdSource,ProdItmID | PRIMARY    | 8       | const,database1.t2.ProdItmID |      1 | Using where                                  |
|  1 | SIMPLE      | t3    | ref   | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID       |     39 | Using where; Distinct                        |
+—-+————-+——-+——-+——————————-+————+———+——————————-+——–+———————————————-+
3 rows in set (0.00 sec)

mysql> analyze table product_items1;
+————————–+———+———-+———-+
| Table                    | Op      | Msg_type | Msg_text |
+————————–+———+———-+———-+
| database1.product_items1 | analyze | status   | OK       |
+————————–+———+———-+———-+
1 row in set (0.11 sec)

+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
| id | select_type | table | type | possible_keys                 | key        | key_len | ref                     | rows  | Extra                                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
|  1 | SIMPLE      | t1    | ref  | PRIMARY,ProdSource,ProdItmID | ProdSource | 4       | const                   | 12676 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    16 | Using where; Distinct                        |
|  1 | SIMPLE      | t3    | ref  | PRIMARY,ProdSource,ProdItmID | ProdItmID | 4       | database1.t1.ProdItmID |    16 | Using where; Distinct                        |
+—-+————-+——-+——+——————————-+————+———+————————-+——-+———————————————-+
3 rows in set (0.00 sec)

Thoughts? I think maybe we should look into a hack to turn off auto-stats gathering, and do a deeper dive then the 10 random pages. I think many people would be willing to schedule an analyze once a week for more consistent executions. thoughts?

This entry was posted in mysql, performance, rant. Bookmark the permalink.

7 Responses to Fun with innodb stats

  1. Mark Callaghan says:

    The hack that I want is a stats table in the mysql database into which table-index stats can optionally be added. Then a background job can update the table as needed and changes will propagate to slaves.

    Bad stats cause bad query plans for us all of the time. This is a huge problem and today’s workaround is to put hints into too many queries to force a good join order.

  2. matt says:

    I agree a table or a way to shutoff stats per table would be great… Been looking at this, an easy but not as useful patch here would be to allow you globally shutoff stats collection ( would have to allow it if no stats are set ) and then allow a global var to set the page depth used by the stats update. I think that could be relatively easy to put in.

  3. Mark Callaghan says:

    In the big Google patch we have code to set the number of samples via a SET command. But I prefer a better ugly hack.

  4. Shantanu Oak says:

    Try using “select straight_join” syntax to help the optimizer select the best index plan while joining tables.

  5. KevinF says:

    yes please, a table allowing us to set the stats the way we want them would be wonderful. Imagine being able to set stats to reflect a terabyte DB when you have a 100MB test environment? App Devlopers could much better gauge their code’s performance.

    Maybe the XtraDB folk want to take on something like this. But yeah, shutting off the background threads would be nice. I hate surprises in the night when stats go awry.

  6. matt says:

    I have got a hack that shuts down the background stats collection ( with the exception of when you run analyze ) … but its sort of flaky, so I am trying to figure out why before I push out to launchpad.

  7. Pingback: Big DBA Head! - Database Brain Power! » Patch to Disable Innodb’s Dynamic Stat Generation