Waffle Grid: Async IO Concerns

Last night I was on a plane to a client site, reviewing the Waffle Grid code with an eye to using multi-gets when calling the read-ahead functions (and potentially in other areas as well), and there I noticed something that may be slowing Waffle Grid down. You see, we read from memcached in the buf_read_page_low function. This function is responsible for checking whether a page exists in the buffer pool and, if not, making an IO request for it via the function fil_io. What I saw in buf_read_page_low was: we check the buffer, then check memcached, then go get the page off disk… you follow? Alright. The get-the-page-from-disk part is the challenge. Let me do a quick deep dive on the internal function calls.

You see, buf_read_page_low is called by a few different functions in a few different ways. The ones I am concerned with are the callers that pass sync as false (you do not want synchronous IO, so the request is sent to a background thread). If sync is false, fil_io receives this and, for an ordinary data page, sets a mode variable to OS_AIO_NORMAL:

if (sync) {
    mode = OS_AIO_SYNC;
} else if (type == OS_FILE_READ && !is_log
           && ibuf_page(space_id, block_offset)) {
    mode = OS_AIO_IBUF;
} else if (is_log) {
    mode = OS_AIO_LOG;
} else {
    mode = OS_AIO_NORMAL;
}

fil_io then calls the os_aio function and passes mode to it, see here:

ret = os_aio(type, mode | wake_later, node->name, node->handle, buf,
             offset_low, offset_high, len, node, message);

Mode is then used in os_aio to either queue the OS_AIO_NORMAL requests or execute the OS_AIO_SYNC requests. Here is the check and call for OS_AIO_SYNC:

if (mode == OS_AIO_SYNC
    && !os_aio_use_native_aio
    ) {
    /* This is actually an ordinary synchronous read or write:
    no need to use an i/o-handler thread. NOTE that if we use
    Windows async i/o, Windows does not allow us to use
    ordinary synchronous os_file_read etc. on the same file,
    therefore we have built a special mechanism for synchronous
    wait in the Windows case. */

    if (type == OS_FILE_READ) {
        return(os_file_read(file, buf, offset,
                            offset_high, n));
    }

    ut_a(type == OS_FILE_WRITE);

    return(os_file_write(name, file, buf, offset, offset_high, n));
}

So anything calling this with OS_AIO_NORMAL should get a response back in microseconds, because the request is only queued for a background thread, while anything calling this with OS_AIO_SYNC will wait milliseconds for the disk to respond.
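To make the difference concrete, here is a toy model of that dispatch. It is only a sketch and not InnoDB's code: every toy_ name, the fixed-size queue, and the counter are assumptions made up for illustration. A sync request does its read before returning, while a normal request is just recorded for a background thread to pick up later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for InnoDB's mode constants; values are arbitrary. */
enum toy_mode { TOY_AIO_NORMAL, TOY_AIO_SYNC };

#define TOY_QUEUE_SLOTS 8

/* Pending requests that a background i/o thread would pick up later. */
struct toy_queue {
    long   page_no[TOY_QUEUE_SLOTS];
    size_t n_pending;
};

/* Counts the blocking reads performed on the calling thread. */
static int toy_disk_reads = 0;

/* Stand-in for os_file_read(): in real life the caller waits here. */
static int toy_blocking_read(long page_no)
{
    (void) page_no;
    toy_disk_reads++;
    return 1;
}

/* Returns 1 on success, 0 if the queue is full.
   SYNC:   the read happens before we return.
   NORMAL: the request is only recorded, no disk access on this thread. */
static int toy_aio(enum toy_mode mode, struct toy_queue *q, long page_no)
{
    assert(q != NULL);

    if (mode == TOY_AIO_SYNC) {
        return toy_blocking_read(page_no);
    }

    if (q->n_pending == TOY_QUEUE_SLOTS) {
        return 0;
    }

    q->page_no[q->n_pending++] = page_no;
    return 1;
}
```

Calling toy_aio with TOY_AIO_NORMAL only grows the queue and never touches toy_disk_reads, which is the cheap path the async callers are counting on.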

Ok… now back to Waffle and memcached. We are checking memcached for InnoDB pages even when the IO is async. This means that calls that use buf_read_page_low and typically take microseconds are going to be stuck waiting for memcached hits or misses. So I think that, at a minimum, we have to add a check in the code so the memcached call is only made for sync IO. However, we are also discussing moving the memcached code down a level into fil_io, the os_aio call, or maybe even the os_file_read call. That would ensure that async IO does its memcached lookup in the background and does not hinder other functions.
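The minimal fix could look something like the sketch below. This is not the Waffle Grid patch itself: toy_cache_get, toy_fil_io, toy_read_page_low, the 16-byte page, and the counters are all hypothetical stand-ins, and the only point is that the cache lookup is gated on sync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

static int toy_cache_lookups = 0;  /* times the caller waited on the cache */
static int toy_io_requests   = 0;  /* times we fell through to the i/o layer */

/* Hypothetical stand-in for a memcached get: only page 7 is cached. */
static int toy_cache_get(long page_no, char *buf)
{
    toy_cache_lookups++;
    if (page_no == 7) {
        memset(buf, 'c', TOY_PAGE_SIZE);
        return 1;
    }
    return 0;
}

/* Hypothetical stand-in for fil_io(): a sync read fills buf right away,
   an async read is only counted here as a queued request. */
static void toy_fil_io(int sync, long page_no, char *buf)
{
    (void) page_no;
    toy_io_requests++;
    if (sync) {
        memset(buf, 'd', TOY_PAGE_SIZE);
    }
}

/* Returns 1 if the page came from the cache, 0 if an i/o request was issued. */
static int toy_read_page_low(int sync, long page_no, char *buf)
{
    assert(buf != NULL);

    /* Only a caller that would block on disk anyway pays for the lookup. */
    if (sync && toy_cache_get(page_no, buf)) {
        return 1;
    }

    toy_fil_io(sync, page_no, buf);
    return 0;
}
```

With this gate an async caller never waits on the cache, but it also never benefits from it, which is why pushing the lookup down toward the background i/o path is the more interesting option.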
