Using local tables and replication in a clever way with NDB

Tuning queries for MySQL NDB cluster is way trickier than tuning for any other storage engines. This is especially true for highly normalized schema. Look at the following query trying to retrieve all male account from a given city. Since a city is not unique, we also need to specify the state.

SELECT a.First_name, a.Last_name
FROM Account a INNER JOIN Cities c ON a.City_id = c.Id
  INNER JOIN States s ON c.State_id = s.Id
  INNER JOIN Gender g ON a.Gender_id = g.Id
WHERE c.Name = ‘Columbia’ AND s.Name = ‘South Caroline’ and g.Name = ‘Male’;

If you look at how the query will be executed with NDB, you realize that many hops over the network will be needed, at least one per table. If you push the cluster a lot, these extra hops will limit the performance. Since tables like Cities, States and especially Gender are nearly static, there is little value to use the NDB storage engine for these. A better approach would be to another storage engine like InnoDB or MyISAM. Because of the way NDB cluster works, we often end up with a very large number of possible connections to the MySQL servers and at such high number of connections, InnoDB performs very poorly so MyISAM is the best choice.

The recommendation here is to either maintain identical copies of these tables on each MySQL servers or to setup replication for one given database, called “Dictionary” and moved these tables there. Of course, using replication for the “Dictionary” tables will prevent the use of the cluster replication.

About Yves Trudeau

I work as a senior consultant in the MySQL professional services team at Sun. My main areas of expertise are DRBD/Heartbeat and NDB Cluster. I am also involved in the WaffleGrid project.
This entry was posted in mysql, NDB Cluster, yves. Bookmark the permalink.