High performance replacement of the MySQL Memory storage engine with NDB
2009-03-13 at 09:17 am Yves TrudeauPeople often wants to use the MySQL memory engine to store web sessions or other similar volatile data.
There are good reasons for that, here are the main ones:
- Data is volatile, it is not the end of the world if it is lost
- Elements are accessed by primary key so hash index are good
- Sessions tables are accessed heavily (reads/writes), using Memory tables save disk IO
Unfortunately, the Memory engine also has some limitations that can prevent its use on a large scale:
- Bound by the memory of one server
- Variable length data types like varchar are expanded
- Bound to the CPU processing of one server
- The Memory engine only supports table level locking, limiting concurrency
Those limitations can be hit fairly rapidly, especially if the session payload data is large. What is less known is that NDB Cluster can creates tables that behave similarly to a Memory table without the limitations. With NDB, if you create a table with the following syntax:
mysql> set ndb_table_no_logging=1; Query OK, 0 rows affected (0.00 sec) mysql> create table WebSession (session_key char(30), session_payload varchar(4000), primary key (session_key) using hash) engine=ndb; Query OK, 0 rows affected (0.75 sec) mysql> set ndb_table_no_logging=0; Query OK, 0 rows affected (0.00 sec)
Describing the table using the ndb_desc tool shows us the following:
root@yves-laptop:/usr/local/mysql-5.1.29-ndb-6.3.19# ./bin/ndb_desc -d test WebSession -- WebSession -- Version: 31 Fragment type: 5 K Value: 6 Min load factor: 78 Max load factor: 80 Temporary table: yes Number of attributes: 2 Number of primary keys: 1 Length of frm data: 291 Row Checksum: 1 Row GCI: 1 SingleUserMode: 0 ForceVarPart: 1 TableStatus: Retrieved -- Attributes -- session_key Char(30;latin1_swedish_ci) PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY session_payload Longvarchar(4000;latin1_swedish_ci) NULL AT=MEDIUM_VAR ST=MEMORY -- Indexes -- PRIMARY KEY(session_key) - UniqueHashIndex NDBT_ProgramExit: 0 - OK
The important point here is the “Temporary table: yes” which means the table will not be logged to disk in the RedoLog and will not be dumped in the LCP. So basically, no disk activity for this table. What are the characteristics of those NDB temporary tables:
- Row level locking
- Hash index for the primary key
- Varchar not expanded, true varchar size
- Scalable, can be spread across a large number of servers, more memory, more CPU
- Many MySQL servers can access the session table in a consistent way. A good practice is a local MySQL daemon on each application server
- Less volatile then the Memory engine, as long as the cluster is up, the table is there
Too good to be true… I recently experimented with NDB temporary tables for session of a popular online game. We pushed the cluster very hard, with 10 servers simulating 4 millions active simultaneous game sessions and the cluster handled that very easily. The cluster was large, 24 data nodes (2 per server) and 20 SQL nodes. At full load, the ndbd processes were running at 20% of CPU usage. Even with such a large cluster, the session load would have too large for the disks of the ndb nodes without the use of “set ndb_table_no_logging=1;”.
So, one thing to remember, if you need a large temporary table, have a look to MySQL NDB cluster.
Nice tip!
never heard of this variable…I should have…is it possible it has been documented only recently?
kind regards,
Roland
So you think people will be better off running NDB rather than memcached for this?
Roland: Unfortunately, many recent NDB features are not documented. I learned about this one from Johan.
Mark: It depends some people don’t want to add another moving parts. NDB can be a full database, not only be use for web session. Other tables can be persistent on disk. Performance wise, NDB, at least with the NDB API, is almost on par with memcached.
This is interesting….
Perhaps the NDB team can create a new handler class for as a replacement to Memory so that MySQL could use it for all it’s temporary tables?
The problem with NDB cluster is that a global checkpoint GCP is needed for a create/alter table. Normally, there is a GCP at each 2 seconds, so a creating and dropping a temporary table can take up to 4 seconds. See http://www.bigdbahead.com/?p=204 for another way of handling efficiently temporary files.
I am sure that performance with NDB is awesome. But memcached is much simpler. And for this use case, most of the awesome-ness of NDB is not being used but you still get the complexity.
Yves correct me if I am wrong, but you can still have replicas built in here as well. This would provide some fault tolerance for this solution. So even if a data node goes done your data/database is still up.
Mark I agree memcached is sooooo much simplier. But their is a lot of memory database love out their, not 100% sure why. But I have some customers who refuse to use memcached.
Additionally this solution does offer a few other features not found in memcached ( security, the ability to search on multiple keys, the ability to join, etc. )… so their maybe a place for this.
My preference is for the query cache and the HEAP engine to be fixed to scale on SMP. As you mention, there is a lot of value to having the data in the database. Any data subject to audit controls (SOX, HIPPA, PCI) requires access control and keeping it in a database table or the query cache provides that. Also, if the data is kept in the query cache or a HEAP table, then you don’t have the overhead of managing separate memcached processes.
I don’t think too much work is required to fix both the HEAP engine and the query cache. But I have not heard of anyone doing this. Someone? Anyone? From the outside looking in MySQL seems to be investing much more energy in memcached than in fixing these problems.
matt, sure you have replicas. With Cluster you always have (unless you have set NoOfReplicas=2). We have implemented this with quite a few customers using Cluster for session management (but then we have also stored a “shard catalog/index” on Cluster as well):
http://johanandersson.blogspot.com/2008/12/cluster-session-management-tips-and.html
BR
johan
Johan, I obviously missed your post… Mine is indeed very similar.