Reducing latency of queries with NDB Cluster realtime extensions

I was recently involved in a project where the main requirement was the lowest possible query latency. The queries were simple insert and update statements, the updates being by primary key. The cluster used regular Gigabit Ethernet and consisted of 3 servers with 8 cores each: one management node and 2 data nodes. Since throughput was not a concern, a design decision was made to locate the MySQL daemons on the same servers as the data nodes, in order to save network hops. MySQL chooses the closest NDB data node as its transaction coordinator, and here the closest one is reachable on localhost.
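To make the layout concrete, here is a minimal sketch of what the cluster's config.ini could look like with the SQL nodes co-located on the data node hosts. The host names are hypothetical, and NoOfReplicas=2 is an assumption (the two data nodes forming a single node group); neither is taken from the actual project.

```ini
# config.ini on the management node (sketch only; host names are hypothetical)

[ndbd default]
# Assumption: the two data nodes form one node group holding 2 replicas.
NoOfReplicas=2

[ndb_mgmd]
HostName=mgm.example.com

[ndbd]
HostName=data1.example.com

[ndbd]
HostName=data2.example.com

# One SQL node per data node host, so that each mysqld
# has a data node available on its own machine.
[mysqld]
HostName=data1.example.com

[mysqld]
HostName=data2.example.com
```

With this arrangement only the replication traffic between the two data nodes has to cross the Ethernet link; the hop between mysqld and its transaction coordinator stays on the local host.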

Apart from the architectural decision to host MySQL and the NDB nodes on the same boxes, we decided to benchmark the new real time options of NDB. The options are the following:

  • RealtimeScheduler: enables the real-time scheduler; was set to 1
  • LockExecuteThreadToCPU: assigns a given CPU to the NDB execute thread; was set to 7, a CPU not servicing interrupts
  • LockMaintThreadsToCPU: assigns a given CPU to the NDB maintenance threads; was set to 6, a CPU not servicing interrupts
  • SchedulerExecutionTimer: This parameter specifies the time in microseconds for threads to be executed in the scheduler before being sent. Setting it to 0 minimizes the response time; to achieve higher throughput, you can increase the value at the expense of longer response times.
  • SchedulerSpinTimer: This parameter specifies the time in microseconds for threads to be executed in the scheduler before sleeping.

    From their definitions, the impact of the latter two parameters might be obvious to some, but I could not make an educated guess as to their values. Since I had at hand a simulator that could replicate the production load, I decided to run a variational study of both parameters. The execution time of the queries was extracted from the slow query log, with long_query_time set to 1 µs.
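For readers who want to capture timings the same way, the slow query log setup could look roughly like the fragment below. This is a sketch, not the configuration used in the project: the log file path is hypothetical, and it assumes a MySQL version that provides the slow_query_log variable and accepts fractional long_query_time values (expressed in seconds, so 1 µs is 0.000001).

```ini
# my.cnf on each SQL node (sketch only)
[mysqld]
# Turn on the slow query log and write it to a file.
slow_query_log=1
log_output=FILE
# Hypothetical path; adjust to the local layout.
slow_query_log_file=/var/log/mysql/slow.log
# Threshold in seconds: 0.000001 s = 1 microsecond,
# which in practice logs nearly every statement with its execution time.
long_query_time=0.000001
```

A threshold this low makes the log grow quickly, so it is better suited to a benchmark run than to a production server.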

    The first variational study was on the SchedulerExecutionTimer parameter, with SchedulerSpinTimer set to its default value of 0. As presented below, varying this parameter had very little impact on the latency of the queries. It would probably have an impact on throughput, but that was out of the scope of the engagement.

    Variation of SchedulerExecutionTimer

    Being a bit disappointed by the SchedulerExecutionTimer results, I moved on to a variational study of SchedulerSpinTimer, leaving SchedulerExecutionTimer at its last value of 400. With the variation of SchedulerSpinTimer, the results are much more interesting, as one can see below.

    Variation of SchedulerSpinTimer

    The variation of SchedulerSpinTimer has a large impact: the latency is pushed toward lower values as the parameter is increased from 0 to 400. Another interesting aspect: in all the tests a small fraction of the queries behaved badly, with times above 2 ms, but that fraction also goes down as SchedulerSpinTimer increases. In fact, with a value of 400, there are fewer than half as many of those long queries as with a value of 0.
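The settings reported in this post can be gathered into a single data node section of config.ini, sketched below. The values are the ones quoted in the tests above; the CPU numbers in particular are specific to the 8-core test machines and their interrupt assignment, so treat them as placeholders rather than recommendations.

```ini
# config.ini, data node defaults (sketch using the values quoted in this post)
[ndbd default]
# Run the data node threads under the real-time scheduler.
RealtimeScheduler=1
# CPU IDs chosen on the test machines because they were not servicing interrupts.
LockExecuteThreadToCPU=7
LockMaintThreadsToCPU=6
# Last value used in the first study (microseconds).
SchedulerExecutionTimer=400
# Highest value tested in the second study (microseconds); lowest latency observed.
SchedulerSpinTimer=400
```

Changes to config.ini generally only take effect once the management node has reloaded the file and the data nodes have been restarted, so a rolling restart is the safer way to apply them.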

    In conclusion, if you are concerned about latency, look at SchedulerSpinTimer. To further reduce the query time, high-speed interconnects like Dolphin can be used; with such high-speed NICs, maybe 0.2 to 0.3 ms could be removed.

    About Yves Trudeau

    I work as a senior consultant in the MySQL professional services team at Sun. My main areas of expertise are DRBD/Heartbeat and NDB Cluster. I am also involved in the WaffleGrid project.

    4 Responses to Reducing latency of queries with NDB Cluster realtime extensions

    1. Henrik Ingo says:

      Ah, thanks. I’m sure this saved me (or Johan) a couple of days work to find out the same!

    2. If you use the Dolphin driver, even for localhost communication, it will decrease communication overhead even more.

    3. JD Duncan says:

      Does “LockMaintThreadsToCPU” make any difference? What if you leave it unset?