Reducing latency of queries with NDB Cluster realtime extensions
2009-02-09 at 09:05 am Yves Trudeau
I was recently involved in a project where the main requirement was the lowest possible query latency. The queries were simple insert and update statements, the updates being by primary key. The cluster used regular Gigabit Ethernet and consisted of 3 servers with 8 cores each: one management node and 2 data nodes. Since throughput was not a concern, we made the design decision to run the MySQL daemons on the same servers as the data nodes, in order to save network hops. MySQL chooses the closest NDB data node as its transaction coordinator, and in this setup the closest one is reachable on localhost.
Apart from the architectural decision to host MySQL and the NDB nodes on the same boxes, we decided to benchmark the new real time options of NDB. The options are the following:
From the definitions, the impact of the latter two parameters (SchedulerExecutionTimer and SchedulerSpinTimer) might be obvious to some, but I could not make an educated guess as to their best values. Since I had at hand a simulator that could replicate the production load, I decided to run a variational study of both parameters. The execution time of the queries was extracted from the slow query log, with long_query_time set to 1 µs.
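For readers who want to repeat this kind of measurement, here is a minimal sketch of how query times could be pulled out of a slow query log. It is not the tooling used for this study. It assumes each log entry carries a header line of the form `# Query_time: <seconds>`, and the function names and the 2 ms threshold default are illustrative choices only.

```python
import re
from statistics import median

# Assumed slow-log header format: "# Query_time: 0.000150  Lock_time: ..."
QUERY_TIME_RE = re.compile(r"^# Query_time:\s*([0-9]+(?:\.[0-9]+)?)")


def parse_query_times(lines):
    """Return the Query_time values (in seconds) found in slow-log lines."""
    times = []
    for line in lines:
        match = QUERY_TIME_RE.match(line)
        if match:
            times.append(float(match.group(1)))
    return times


def summarize(times, slow_threshold=0.002):
    """Return (median latency in ms, fraction of queries above the threshold)."""
    if not times:
        raise ValueError("no Query_time entries found")
    slow = sum(1 for t in times if t > slow_threshold)
    return median(times) * 1000.0, slow / len(times)
```

Passing an open log file to `parse_query_times` and the result to `summarize` gives one median and one slow-query fraction per run, which can then be compared across parameter values.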
The first variational study was on the SchedulerExecutionTimer parameter, with SchedulerSpinTimer set to its default value of 0. As presented below, varying this parameter had very little impact on the latency of the queries. It would probably have an impact on throughput, but that was out of the scope of the engagement.
Being a bit disappointed by the variational study of SchedulerExecutionTimer, I decided to move on to the variational study of SchedulerSpinTimer, leaving SchedulerExecutionTimer at its last value of 400. With the variation of SchedulerSpinTimer, the results are much more interesting, as one can see below.
The variation of SchedulerSpinTimer has a large impact, and the latency is pushed toward lower values as the parameter is increased from 0 to 400. Another interesting aspect: in all the tests, a small fraction of the queries behaved badly, with times above 2 ms, but that fraction also goes down as SchedulerSpinTimer increases. In fact, with a value of 400, there are fewer than half as many of those long queries as with a value of 0.
In conclusion, if you are concerned about latency, look at SchedulerSpinTimer. To further reduce the query time, high-speed interconnects like Dolphin can be used, and with such high-speed NICs maybe 0.2 to 0.3 ms could be removed.


Ah, thanks. I’m sure this saved me (or Johan) a couple of days work to find out the same!
If you use Dolphin driver, even for localhost communication it will decrease communication overhead even more.
Does “LockMaintThreadsToCPU” make any difference? What if you leave it unset?
nice post!