You know you have really screwed up when …
You crash MySQL/Innodb and you trace the error to a function that says:
“This function checks the consistency of an index page when we do not
know the index. This is also resilient so that this should never crash
even if the page is total garbage. ”
Oops… I guess its not that resilient after all:)
Waffle Grid Like Features in EnterpriseDB
Yves pointed this out to me:
http://www.internetnews.com/software/article.php/3825426/
“Moving beyond just providing Oracle compatibility, EnterpriseDB is adding new scalability features to their database. One of them is a feature called ‘”Infinite Cache,” which is based on the open source memcached application.
Mlodgenski commented that distributed memory caches are common, though management from a system management and a database perspective is often difficult.
“What we did is we took the distributed memory cache that was sitting in front of the database and put it behind the database,” Mlodgenski said. “This allows a simple SQL (define) interface for developers that allows access to the distributed memory cache and now the system management overhead is handed by the database itself.”
Mlodgenski added that Infinite Cache leads to a 10x to 20x performance gain for read application loads. ”
You can read more here:
http://www.enterprisedb.com/docs/en/8.3R2/perf/Performance_Guide-02.htm
and here:
http://www.enterprisedb.com/products/postgres_plus_as/overview.do
I guess it justifies that Waffle is a good idea.
Testing Waffle
So Ronald Bradford has been getting some EC2 AMI’s setup for Waffle the last couple of days. Really he is the first person to work with us on testing out the 0.5 cream release. It’s funny but when you are heads down in the code you tend to overlook some of the easy and obvious bugs. Just by having him walk through the install we have opened up several new bugs and come up with some new Idea’s on how to make things easier in the future. I hope by early next week we can have some of annoying bugs fixed and code pushed to launchpad. When that happens I would encourage others to give it a whirl and help us make Waffle Grid better performing, more user friendly, and more stable.
You can check out some of the bugs here.
Thanks to Ronald for his help on this!
MyISAM with key_buffer larger than 4 GB
In many applications, MyISAM can be used successfully if the proportion of write operations is only a small fraction of the read operations. As the tables grow, the 4 GB limitation in the key_buffer size caused performance issues and other strategies. Wait… did I miss something… From a recent comment to a post from Matt I re-read the MySQL documentation and realized I missed a big change that occurred with versions 5.0.52 and 5.1.23 (http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_key_buffer_size)
As of MySQL 5.1.23, values larger than 4GB are allowed for 64-bit platforms
So, key_buffer as large or even larger than 32 GB are possible. That is an important design consideration that needs to be taken into account when tuning a database.
5 Minute DBA MyISAM Example Config Files
I have been asked this numerous times ever since I posted my InnoDB sample configuration files, Do I have sample configuration parameters for a MyISAM setup? We are seeing less and less people use MyISAM, but it is still popular ( especially in prepackaged form i.e. wordpress). So i figured why not adjust my sample InnoDB configs and make them suitable for MyISAM.
The biggest limitation to MySIAM is used to be the default key buffer only can could only be sized up to 4GB ( This was fixed in 5.0.52) . While you can create separate key buffers and assign indexes to them, it’s not very common ( Common as in present in low-end shops who need 5 minute dba help) in part because I think people do not fully understand it and you do have to plan for it. Setting up secondary caches is a database/application specific setup, it can not really be generalized, so I am not going detailing adding additional key caches here. My assumption is if you at the point of adding multiple key caches, you probably should not be copying your config of someone’s blog:)
As for the configs themselves they should look very similar to the InnoDB configs I published. In fact I change very little between the InnoDB and MyISAM configs.
Let me publish my disclaimers here. These are not the optimal settings for all environments, they are just meant to be starting points. The intended audience here are those developers and sysadmins who are DBA’s for 5 minuts at a time. They want a config that’s better then the default, “set it and forget it”. I may have missed a couple parms, so feel free to comment and I will update as needed.
Let get down to it:
|
32 bit system 2GB of memory Dedicated DB Box All MyISAM [mysqld] user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /mysql/ datadir = /data01/data tmpdir = /tmp thread_cache_size = 128 table_cache = 512 key_buffer = 768M sort_buffer_size = 256K read_buffer_size = 256K read_rnd_buffer_size = 256K max_allowed_packet = 1M tmp_table_size=16M max_heap_table_size=16M query_cache_size=64M query_cache_type=1 log_output=FILE slow_query_log_file=/mysql/slow1.log slow_query_log=1 long_query_time=3 log-error=/mysql/error.log myisam_recover = force,backup myisam_sort_buffer_size=128M skip-innodb |
32 bit system 4GB of memory Dedicated DB Box All MyISAM [mysqld] user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /mysql/ datadir = /data01/data tmpdir = /tmp thread_cache_size = 256 table_cache = 1024 key_buffer = 1500M sort_buffer_size = 256K read_buffer_size = 256K read_rnd_buffer_size = 256K max_allowed_packet = 1M tmp_table_size=32M max_heap_table_size=32M query_cache_size=128M query_cache_type=1 log_output=FILE slow_query_log_file=/mysql/slow1.log slow_query_log=1 long_query_time=3 log-error=/mysql/error.log myisam_recover = force,backup myisam_sort_buffer_size=256M skip-innodb |
64 bit system 8GB+ of memory Dedicated DB Box All MyISAM [mysqld] user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /mysql/ datadir = /data01/data tmpdir = /tmp thread_cache_size = 256 table_cache = 1024 key_buffer = 4000M sort_buffer_size = 256K read_buffer_size = 256K read_rnd_buffer_size = 256K max_allowed_packet = 1M tmp_table_size=64M max_heap_table_size=64M query_cache_size=128M query_cache_type=1 log_output=FILE slow_query_log_file=/mysql/slow1.log slow_query_log=1 long_query_time=3 log-error=/mysql/error.log myisam_recover = force,backup myisam_sort_buffer_size=512M skip-innodb |
On machines with more then 8GB, feel free to increase the key buffer accordingly.
I am probably being overly cautious on some of these, but these are not supposed to be optimal… they are supposed to be better then default. Adjust as needed.
Leaving Sun/MySQL but not the community
I thought I would put out a quick note about this right now, I am going to be leaving Sun next week. I do have some plans which will start to materialize over the next few weeks and months so stick around for that. Even though I am leaving the company, I will still be lurking and working on all think database and performance related. This should not have any bearing on BigDBAHead.com or Waffle Grid, hopefully this will only mean I will post more and spend more time doing crazy and cool things!
I wanted to thank everyone at MySQL who I worked with and learned from over the past few years.
WaffleGrid: Cream Benchmarks, stable and delivering a 3x boost
Lets get down to how the latest version of Waffle Grid performs.
Starting off simple lets look at the difference between the wafflegrid modes. As mentioned before the LRU mode is the “classic” Waffle Grid setup. A page is put into memcached when the page is removed from the buffer pool via the LRU process. When a page is retrieved from memcached it is expired so its no longer valid. In the New “Non-LRU” mode when a page is read from disk, the page is placed in memcached. When a dirty page is flushed to disk, this page is overwritten in memcached. So how do the different modes perform?

| 4GB Memcached, Read Ahead Enabled | TPM | % Increase |
| No Waffle | 3245.79 | Baseline |
| Waffle LRU | 10731.34 | 330.62% |
| Waffle NoLRU | 10847.52 | 334.20% |
You can see here that with 100% of the data fitting in memcached we get about a 3x boost in performance over a non-waffle enabled setup. Note these tests here have the read-ahead enabled as well as the doublewrite buffer. With the ability to have all the data pages in memcached, the Non LRU solution shows up just a small touch faster the the classic LRU.
WaffleGrid: 0.5 Cream Release
I wanted to let everyone know that we are releasing Waffle Grid 0.5 code named Cream. This release fixes the nasty secondary index bug that plagued the butter release. I have been running tests on this code base for about a week straight with no errors. While I think this release is much more stable I would remind everyone this is still not a fully GA release. This release includes the ability to choose the mode of Waffle grid. By setting innodb_memcached_enable to 1, we will push pages to memcached when a disk read is done or when a page write is done, setting this to 2 will enable the classic LRU mode. If you decide to set this to 1 ( non-lru) I would recommend using the standard memcached, as with previous versions the LRU mode works better with our slightly altered memcached ( expire from memcached on get ). I will be posting benchmarks and more details within the next couple of days. Right now you can grab the patch on launchpad.
Be Wary of Large Log File Sizes
As some people have mentioned here and here Increasing the innodb log file size can lead to nice increases in performance. This is a trick we often deploy with clients so their is not anything really new here. However their is a caveat, please be aware their is a potentially huge downside to having large log file sizes and that’s crash recovery time. You trade real-world performance for crash recovery time. When your expecting your shiny Heartbeat-DRBD setup to fail-over in under a minute this can be disastrous! In fact I have been some places were recovery time is in the hours. Just keep this in mind before you change your settings.
Waffle: Progress and a Rearchtecture?
So I spent several hours over the last few days on the Secondary index bug. Out of frustration I decided to try and bypass the LRU concept all together and try going to a true secondary page cache. In standard Waffle a page is written to memcached only when it is expunged ( or LRU’d ) from the main buffer pool. This means anything in the BP should not be in memcached. Obviously with this approach we missed something, as Heikii pointed out in a comment to a previous post, it seems likely we are getting an old version of a page. Logically this could happen if we do not correctly expire a page on get or we bypass a push/lru leaving an old page in memcached to be retrieved later on.
So I was thinking why not bypass the LRU process? While I feel this is the most efficient way to do this, its not the only way. I modified innodb to use the default LRU code and then modified the page get to push to memcached on any disk read. Additionally I added a second push to memcached when we flush a dirty page. This should ensure ( hopefully ) that memcached is always up to date. This way really means all of your pages will have some limited persistence in memcached, in that they never expire ( unless you run out of space ).
We decided against doing Waffle this way previously for a couple of reasons. First we are duplicating the memory footprint. The LRU method is efficient in its either in memcached or in the BP, but should not be in both. The “persistent” way can have pages in both… that seems like a waste. Additionally it would be possible in the “persistent” way to have a BP that is 32G and a memcached pool of 32G and each is an exact copy of each other. The second reason is recoverability. In the LRU process we do not care if we miss a set. Since a get expires the cache, a get from memcached wipes out the page. So if the set fails, nothing is put into memcached and its read from disk the next time. The “persistent” way means a failed set means your data is potentially out of sync. So we need to code around it, we either need to retry the missed set, build a black list of pages to read from disk, or disable the entire cache when an error occurs.
Now on the flip side. Having the cache be somewhat persistent means less memcached sets, especially during the LRU process which should speed things up. After reading from disk once, you should never have to go back to disk for a read, so once again faster performance. Hindsight I am not sure that a set failure when using the LRU method is as safe as it needs to be anyways. If you miss a get, and a set, then come back online your data is probably going to be corrupt, so we probably need to add a blacklist/disable on failure feature anyways. Another plus is you can use the standard memcached ( no need for an MRU ) with the persistent method.