Speaking @ Percona Live in London Next Week!

A quick note: I am speaking at Percona Live in London next week… it should be a rip-roaring time. I have two topics I am speaking on.

The first is on building a MySQL Data Access Layer with Ruby and Sinatra. While this may seem a bit odd, it's actually very cool and useful. With fewer than 100 lines of code you can build some pretty awesome web services to expose your data to the outside world, or better yet, build web services to replace developer-maintained ORMs! Additionally, if you use this as part of a broader framework you can achieve all kinds of coolness. In fact, all the code I will show was written in under 2 hours, and it only took that long because I was trying to make it flow into the presentation.

The second presentation, "How I Learned to Stop Worrying and Love Big Data", should be fun. I promise the following: an interesting take on Big Data, Dr. Strangelove, Simpsons humor, Daleks, wacky signs, Darth Vader, and a Monty Python reference… oh yeah, this should be fun.
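
To give a flavor of the data access layer idea from the first talk — hedged heavily: the actual talk code is not published here, and `TinyDAL`/`expose_table` are names I invented for illustration — the core trick is just mapping URL patterns onto parameterized SQL. This dependency-free sketch leaves out Sinatra and the live MySQL connection and only builds the route table:

```ruby
# Sketch of a tiny data access layer: each exposed table becomes a
# route pattern that maps to a parameterized SQL statement. In a real
# version, a Sinatra route would run the query (e.g. via the mysql2
# gem) and render the rows as JSON; here we only build the mapping.
class TinyDAL
  def initialize
    @routes = {}
  end

  # Register a table; only whitelisted identifiers reach the SQL text,
  # while the id value itself stays a bound parameter ("?").
  def expose_table(table, pk: "id")
    raise ArgumentError, "bad identifier" unless table =~ /\A\w+\z/ && pk =~ /\A\w+\z/
    @routes["/#{table}/:#{pk}"] = "SELECT * FROM #{table} WHERE #{pk} = ?"
  end

  # Resolve an incoming path to its SQL template and the bound value.
  def resolve(path)
    @routes.each do |pattern, sql|
      prefix = pattern.split("/:").first + "/"
      next unless path.start_with?(prefix)
      return [sql, path.delete_prefix(prefix)]
    end
    nil
  end
end

dal = TinyDAL.new
dal.expose_table("users")
sql, id = dal.resolve("/users/42")
puts sql   # SELECT * FROM users WHERE id = ?
puts id    # 42
```

Add one `expose_table` call per table and you can see how a whole schema gets exposed in very few lines — the rest is Sinatra boilerplate.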

Posted in Matt, mysql, NOSQL, performance | Comments Off on Speaking @ Percona Live in London Next Week!

Innodb Compression: When More is Less

So Vadim posted on the MySQL Performance Blog about poor benchmark results when running InnoDB compressed pages. I ran some tests a few weeks ago and did not see the same results, so I went back and compared my previous tests to his numbers. In a roundabout way, while verifying his thoughts on mutex contention, I found that increasing the buffer pool size with compressed data decreases transactional throughput. The test was run read-only with an uncompressed data set size of 6GB, 3.1GB compressed.

Config (BP size, compression, # of buffer pools)   Throughput
2G, NOZIP             3217.19
8G, NOZIP             4479.81
8G, NOZIP, 16 BPs     4424.3
1G, ZIP               1120.3
2G, ZIP               1181.8
4G, ZIP                 38
8G, ZIP                 33.6
8G, ZIP, 4 BPs         226
8G, ZIP, 8 BPs         544.7
8G, ZIP, 12 BPs       3009.79
8G, ZIP, 16 BPs       3026.1


You can see that adding memory to the InnoDB buffer pool slows things down. Looking at the InnoDB status, you can see things getting locked up. What's interesting, though, is that you can mitigate a lot of this simply by making use of multiple buffer pools.
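
As a hedged sketch (the exact sizing is workload-dependent, and these values simply mirror the 8G/16-BP test case above), splitting the buffer pool is a my.cnf change. `innodb_buffer_pool_instances` is the stock MySQL 5.5 variable, and it only takes effect at server startup:

```
# my.cnf — split an 8G buffer pool into 16 instances so LRU and
# page-hash mutex contention spreads across pools (restart required)
[mysqld]
innodb_buffer_pool_size      = 8G
innodb_buffer_pool_instances = 16
```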

Here is where it's waiting:

OS WAIT ARRAY INFO: reservation count 529411, signal count 133604
--Thread 140360422000384 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 3483 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x412fb68 '&buf_pool->page_hash_latch'
a writer (thread id 140360421799680) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 3483
Last time write locked in file /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0lru.c line 1626
--Thread 140355466110720 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0lru.c line 813 for 0.0000 seconds the semaphore:
Mutex at 0x412fb28 '&buf_pool->LRU_list_mutex', lock var 1
waiters flag 1
--Thread 140355465910016 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 4398 for 0.0000 seconds the semaphore:
Mutex at 0x412fb28 '&buf_pool->LRU_list_mutex', lock var 1
waiters flag 1
--Thread 140360422201088 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 4398 for 0.0000 seconds the semaphore:
Mutex at 0x412fb28 '&buf_pool->LRU_list_mutex', lock var 1
waiters flag 1
--Thread 140355466311424 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 4398 for 0.0000 seconds the semaphore:
Mutex at 0x412fb28 '&buf_pool->LRU_list_mutex', lock var 1
waiters flag 1
--Thread 140355466512128 has waited at /var/lib/buildbot/slaves/percona-server-51-12/DEB_Ubuntu_maverick_amd64/work/Percona-Server-5.5.10-rc20.1/storage/innobase/buf/buf0buf.c line 4398 for 0.0000 seconds the semaphore:
Mutex at 0x412fb28 '&buf_pool->LRU_list_mutex', lock var 1
waiters flag 1
Mutex spin waits 455865, rounds 16856962, OS waits 524701
RW-shared spins 29949, rounds 192489, OS waits 3591
RW-excl spins 499, rounds 26672, OS waits 117
Spin rounds per wait: 36.98 mutex, 6.43 RW-shared, 53.45 RW-excl

Long story short: if you're using compression with InnoDB, you may want to look into using multiple buffer pools until this is fixed.

Posted in benchmark, innodb internals, mysql | Comments Off on Innodb Compression: When More is Less

MySQL Bootcamp at Collaborate 2011 pt 2

Hi All,

I am going through some of the sessions for IOUG's Collaborate 2011 Conference and trying to fill in slots for the bootcamp; while we have some great sessions, we could use a few more. Specifically, I would love to get a couple of sessions on the following topics:

  • InnoDB in General
  • InnoDB Internals & scalability
  • General Overview of available Storage Engines (Pros/Cons)
  • MySQL options for very large databases (a la partitioning, sharding)
  • MySQL Monitoring Options
  • NDB Cluster

These are just a few suggestions; please feel free to submit any topic that's related to MySQL… even if it's not a fit for the bootcamp, there is a MySQL track that it would fit into. IOUG extended the MySQL deadline until next Monday for us, so let's get some more papers in! You can submit here

Now I know many folks are torn because the MySQL UC falls during this same time… but there should be enough room for both conferences. I know many DBAs and developers who are torn over which conference to attend, as they are forced into supporting multiple database technologies (Oracle, SQL Server, DB2, MySQL, etc.). Collaborate gives these people a place to go and learn about more than just MySQL. Recently I have seen many "classic Oracle DBAs" asking more questions and trying to learn more about MySQL. This is one opportunity where the MySQL community has the chance to leave a lasting impression on Oracle users who have not had any MySQL experience before.

This is untamed territory!  Come on down to Orlando and help Oracle users understand the benefits of MySQL!


Posted in Matt, mysql | Comments Off on MySQL Bootcamp at Collaborate 2011 pt 2

Helping to Organize a MySQL Bootcamp @ IOUG’s COLLABORATE 11

I am helping IOUG organize a MySQL bootcamp at their Collaborate conference in Orlando. This is actually a great opportunity to reach a lot of Oracle talent looking for more information and training on MySQL. An IOUG bootcamp is compiled from several 1-hour technical sessions, starting with introductory-level topics on day 1 and building to advanced topics on the final day. The idea is that this format will give those not familiar with MySQL a crash course while also providing a wide range of targeted sessions that people can come in and out of as they see fit. It's like the tagline of most carnivals: fun for all ages…

The reason I am posting is that we are going to need lots of help from you. We are looking for speakers for the conference at the moment… you can submit your papers here: http://www.ioug.org/callforspeakers. Think of it like this: it is an opportunity to be a pioneer… reaching people who may have never attended a MySQL-related conference, heard the good word of MySQL, or may not even know much about open source. Hmmm, maybe I should dress up as a cowboy or pioneer for the event! The question is whether that would attract people or turn them away. Maybe we should vote on it!

Posted in Matt, mysql | 1 Comment

Speaking in Chicago on Tuesday September 28th

I wanted to drop a quick note and let everyone know I am going to be speaking at an IOUG event on 9/28/2010 in downtown Chicago. I will be targeting DBAs, developers, and users who want to know more about MySQL but cannot devote a ton of time to learning every little thing. I will be covering DBA 101 tasks in my 5 Minute DBA talk, common developer and DBA mistakes, common high availability architectures, and the various versions, forks, and patches of MySQL that are floating around in the community.

You can register here:

I hope to see you there.

Posted in 5 minute dba, mysql | Comments Off on Speaking in Chicago on Tuesday September 28th

Great to see everyone at the UC

It was awesome to see everyone at the 2010 MySQL UC. Sorry if I did not get a chance to chat with everyone; time just flew by! I had a great turnout for my two sessions and a lot of great conversations with people. If you are looking for my slides, they are posted on the User Conference website here: http://en.oreilly.com/mysql2010/public/schedule/speaker/75377 … Thanks everyone!

Posted in 5 minute dba, Matt, mysql | Comments Off on Great to see everyone at the UC

Quick Cassandra Notes Part 1

I am trying to use the Ruby bindings to do benchmarking; so far things are going rather slowly compared to other benchmarks in Python. This could be the size of the data I am testing with as well. Still looking into it; however, so far loading 1 million rows into Cassandra takes ~4.6GB, while loading the exact same data into MySQL takes ~950MB. A roughly 4.8x increase in storage is a lot; I am not sure whether that will hold as I get more data into the system, or if there is just a lot more overhead at the start. I will load 2M and 3M rows to see.

Also, you have to "warm" Cassandra like other databases… after loading my 1M rows, I ran some quick tests: 125 ops/s, 288 ops/s, 311 ops/s, 1530 ops/s, 1872 ops/s, 1868 ops/s…
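
The actual benchmark scripts aren't shown in the post; as a minimal sketch of how you can watch a warm-up curve like the one above, here is a sampler that reports ops/s for successive fixed-size batches. The `FakeClient` is a stand-in stub of my own invention, not the real Thrift binding:

```ruby
# Measure throughput in successive windows to watch a cache warm up:
# each sample is (ops completed / elapsed seconds) for that batch.
def sample_throughput(client, keys, batch_size: 1000)
  keys.each_slice(batch_size).map do |batch|
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    batch.each { |k| client.get(k) }
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
    elapsed = [elapsed, 1.0e-9].max        # guard against clock granularity
    (batch.size / elapsed).round(1)        # ops/s for this window
  end
end

# Stand-in client so the sketch runs without a Cassandra cluster.
FakeClient = Struct.new(:store) do
  def get(k)
    store[k]
  end
end

client = FakeClient.new((1..5000).to_h { |i| [i, "row#{i}"] })
puts sample_throughput(client, (1..5000).to_a).inspect
```

Swapping `FakeClient` for a real Cassandra (or memcached, or Tyrant) client gives you the same per-window numbers I quoted above.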

It looks like I am really bottlenecked by the Thrift calls in Ruby (per the profile)… strangely, I am seeing the CPU tap out at one core when testing with this data set, while with a smaller dataset, or with Python, I use multiple cores… it must just be a red herring.

Getting occasional Ruby socket timeouts from Thrift… need to look into that.

Posted in Uncategorized | Tagged , , , | Comments Off on Quick Cassandra Notes Part 1

More Debate, More Flame, More Choosing the correct tool for the job

You have to love all the debating going on over NOSQL -vs- SQL, don't you? With my UC session on choosing the right data storage tools (does this sound better than SQL -vs- NOSQL?), I have been trying to stay current with the mood of the community so I can make my talk more relevant. Today I was catching up on a few blog posts and thought I would pass along these two: Pro SQL and Pro NoSQL… these represent the two very different views on this subject. (Note: I think there are misleading facts and figures in both that should be fleshed out more, but they are a good sample of what I am talking about.) Sure, lots of people have posted and even talked on this (I am sure you have all seen Brian's NOSQL -vs- MySQL presentation from OpenSQL Camp last year). You see, there is a huge angry, bitter flame war over who is right and who is wrong. People have very strong opinions on whether SQL or NOSQL is the antichrist. We should organize a debate at some point. So who is right? My opinion is that no one is.

The fact is, if a solution meets your needs and it works, it is not wrong (though it may have flaws or risks to different degrees). In the case of an RDBMS -vs- NOSQL, for some applications one is better than the other. The issue I think we all run into is not really the merit of NOSQL -vs- a traditional RDBMS; it's the willingness to accept alternative views. Too many shops out in the world are all about the new hotness and not about what's best for their application or organization, while other people would rather die than allow their database to be taken away from them. For some apps durability is not a big deal; for others it is. Everyone has different requirements. Just because Digg or Twitter or Rackspace is doing NOSQL and it works for them does not mean you have to use it, or that it will even work for you. In fact, if you leap without thinking you may hurt yourself more than solve your problems. Every situation is unique, and before you jump head first into one solution or another, take a breath and analyse the situation. Ask questions like: Why are we thinking about NOSQL? Is it just because of HA (hey, RDBMSs can handle that!)? Is it to replace sharding? Is it to do something else? Ask yourself about the work you need to do: Do you need complex joins? How much data will you really have? What sort of workload do you have? Really define your goal, then research and test solutions. I am sure that the big names using Cassandra or HBase did not read a blog post somewhere and start converting everything that day, and you should not either.

Posted in benchmark, linux, Matt, mysql, NOSQL, performance | 1 Comment

New Benchmark I am working on that tests MYSQL -vs- NOSQL

I am giving a talk in a couple of weeks at the 2010 MySQL User Conference that will touch on use cases for NOSQL tools -vs- more relational tools; the talk is entitled "Choosing the Right Tools for the Job, SQL or NOSQL". This talk is NOT supposed to be a deep dive into the good, bad, and ugly of these solutions, but rather a way to discuss potential use cases for various solutions and where they may make a lot of sense. Still, being me, I felt a need to at least do some minor benchmarking. The series of posts I wrote last year over on mysqlperformanceblog.com comparing Tokyo Tyrant to both MySQL and Memcached was fairly popular. In fact, the initial set of benchmark scripts I used for that series has been put to good use since then, testing things like a pair of Gear6 appliances, memcachedb, new memcached versions, and various memcached APIs.

When I started really digging into some of the other popular NOSQL solutions to expand my benchmarks, it became apparent that most of these tools have fairly well-defined APIs for Ruby, while the Perl APIs in some cases do not exist at all or are rather immature at this point. So I decided to rewrite my initial benchmark suite in Ruby. With the help of my co-presenter for this talk (Yves), we are writing a tool that will hopefully be able to run the same basic tests against a wide variety of solutions. Currently I have tests written for Tyrant, Memcached, Cassandra, and MySQL. We will be expanding these to include Redis and MongoDB for sure (maybe NDB)… beyond that I am not 100% sure. The challenge is going to be writing code that tests not only basic features but also the advanced features of these solutions. After all, a simple PK lookup can be done on all of these solutions, but that's not necessarily the bread and butter of a solution like MongoDB or even Cassandra; it's the extra features that make them more compelling. We will be releasing the code when it's ready.

I have not started my more exhaustive benchmarks yet, as I am still writing parts of the benchmark, but I have run a few quick ones. I generally hate publishing or mentioning results until I have taken the time to analyse them and ensure I did not miss anything, but what the hell. In a very short read-only test using PK-based lookups to compare InnoDB -vs- Cassandra -vs- memcached (a really small data set that easily fits into memory, on my laptop, **single node**), I averaged ~1.2K reads per second from Cassandra, ~4K reads per second from InnoDB, and ~17K reads per second from memcached. As I set up more benchmarks I will test multi-node performance, tune the configs for the workload, etc., but it is interesting to see the early performance difference.

More later.

Posted in benchmark, linux, Matt, mysql, NOSQL, performance, Tokyo | 3 Comments

A few key Tokyo Cabinet Notes

I wanted to publish a few interesting gotchas, facts, and settings that people who use or want to use Tokyo Cabinet/Tyrant should know.

A quick overview: Tokyo Tyrant is the network daemon that sits on top of Tokyo Cabinet. This means that in order to access Cabinet from another server you have to go through Tyrant. In the context of this post, when I say "Tokyo" I mean the entire stack.

#1. Tokyo Cabinet allows only a single write thread. Multiple processes can try to write through Tyrant, but they will wait. To get around this limitation you need to shard your data. Using something like a memcached API on top of a set of hash tables is one effective way to do this.
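
The post doesn't include code for this, so as a hedged sketch (the server names and hashing scheme here are my own illustration, not a prescribed layout): client-side sharding is just a deterministic key-to-server mapping, so each single-writer cabinet only ever sees its slice of the keys.

```ruby
require 'zlib'

# Minimal client-side sharding: hash each key to one of N Tyrant
# instances so writes spread across several single-writer cabinets.
SHARDS = ["tyrant1:1978", "tyrant2:1978", "tyrant3:1978"].freeze

def shard_for(key)
  # CRC32 is stable across processes, unlike Ruby's seeded String#hash.
  SHARDS[Zlib.crc32(key.to_s) % SHARDS.size]
end

puts shard_for("user:42")   # always routes the same key to the same shard
```

The memcached-protocol support in Tyrant means an off-the-shelf memcached client with consistent hashing can do this routing for you instead.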

#2. Tokyo is not durable. This means that in the event of a system crash you will lose data. You can call a sync process to flush data to disk, but this locks the writer. Your best bet is to use replication to ensure you have a copy of the data, and to back up often.

#3. Settings for Tokyo Cabinet files can be set via Tokyo Tyrant by appending the settings after the cabinet file name.
Some of these settings only take place on file creation or on optimize so make sure you check the documentation.
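
For illustration only (the file name and values here are placeholders of my choosing, not from the original post), the suffix syntax looks like this when starting Tyrant — tuning parameters ride along after a `#` on the cabinet file name:

```
# Start Tyrant on a hash-table cabinet, passing bnum (bucket count) and
# xmsiz (mmap size, bytes) through to Tokyo Cabinet via the file name:
ttserver -port 1978 casket.tch#bnum=2000000#xmsiz=268435456
```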

#4. By default there is a 2GB per-file limit on Cabinet files. This can be worked around by setting the #opt setting for your table type; for instance, #opt=HDBTLARGE enables large files for the hash table. This setting takes effect on creation or when you optimize. You will corrupt your file if you hit 2GB without this setting. If you experience this, your best bet is to restore from a backup that is < 2GB and switch the large file flag. (Note: if I am correct, you can only change the file to support large tables by using the Cabinet mgr tools, i.e. running tchmgr -tl cabinet.tch against an offline file.)

#5. Run optimize on a regular basis; I have seen files shrink by as much as 90% from running optimize.

  • To optimize a table through Tyrant, run tcrmgr optimize -port xxx localhost (this will lock writes).
  • To optimize a table from Cabinet, use the mgr tool for the correct table type (i.e. tchmgr for the hash table).

#6. Increase the number of Tyrant threads from the default of 8 if you're having issues with refused connections. This is done on the command line when starting Tyrant: ttserver -thnum 16

#7. Log your Tyrant errors to a log file by using the -log flag when starting Tyrant. By default this also logs info/warning messages; disable that with the -le flag, which tells Tyrant to log only errors.

#8. If you're using a Cabinet "table" database, make sure you build the indexes you need; otherwise you're probably going to get rather slow performance.

#9. In terms of performance, the BNUM setting typically has the largest impact. According to the docs, it "specifies the number of elements of the bucket array". Every table type is a bit different, so check the docs for the exact settings.

#10. For hash tables, setting xmsiz can make a huge difference. This defines the memory allocated to mapping objects.

Posted in 5 minute dba, NOSQL, Tokyo | Comments Off on A few key Tokyo Cabinet Notes