Well, it's about time I posted this :) This is part 1 of 3 in my RamSAN series.
Anyone who has paid attention to my blog knows I love talking IO! I also love performance. Absolutely love it. Love disk, disk capacity, IO performance, solid state... So as I march towards my UC session on MySQL Performance on Solid State Disk, my goal is to test as many high-end solid state disk systems as possible. All the vendors have been great, giving me access to some really expensive and impressive toys. I finished up testing Texas Memory Systems' flash appliance, the RamSAN-500, this week and wanted to post some numbers and some thoughts.

TMS makes RamSAN appliances that merge disk and RAM into a really fast SAN. First, I go a ways back with TMS: I deployed an Oracle RAC installation on one of their RamSAN 300s several years back and was impressed by the sheer raw power of the device. The two main flavors of devices they ship are DDR-based and flash-based SANs. The DDR-based SAN backs up its data from memory to disk at intervals to ensure data retention, which is awesome. The flash device, meanwhile, uses DDR as a cache on top of flash. The idea is that your hot data is served by lots of DDR (16-64 GB), while everything else comes off the flash devices.

TMS was very generous to provide me with access to a system connected to a RamSAN-500 (flash-based) for a couple of weeks. The system had 32 GB of cache and a TB of flash.
Testing a device like this is challenging, because the standard benchmark tools I generally use just don't push the disk enough. In my dbt2 and sysbench tests I ended up hitting CPU limits long before I hammered the disk. So for these tests, I started building a new benchmark. This benchmark was not finished in time to fully flex the RamSAN (it still is not finished), but it did show some interesting numbers. So let's jump right in.
I used the fileio tests in sysbench to test various data sizes. The fileio test is a staple of my benchmarking, but at higher concurrencies I started to see CPU spikes that may have limited how hard I could drive the system. I still wanted to run this test because a lot of flash out there starts to bottleneck as you get closer to filling the drive.
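For anyone who wants to run a similar sweep, here is a rough sketch of how the fileio runs could be scripted. This is not the exact setup used for these tests: the option names follow the sysbench 0.4-style command line (newer releases rename some of them), and the test mode, block size, run time, data sizes and the helper name sysbench_fileio_cmd are all assumptions picked for illustration. The script only builds and prints the commands, it does not execute them.

```python
import shlex


def sysbench_fileio_cmd(stage, total_size, threads,
                        mode="rndrw", block_size=16384, seconds=300):
    """Build a sysbench 0.4-style fileio command as a list of arguments.

    stage is "prepare", "run" or "cleanup". The mode, block_size and
    seconds defaults are assumptions, not values taken from the post.
    """
    cmd = [
        "sysbench",
        "--test=fileio",
        f"--file-total-size={total_size}",
        f"--file-test-mode={mode}",
        f"--file-block-size={block_size}",
        f"--num-threads={threads}",
    ]
    if stage == "run":
        # Run for a fixed time instead of a fixed number of requests.
        cmd += [f"--max-time={seconds}", "--max-requests=0"]
    cmd.append(stage)
    return cmd


# Hypothetical data sizes; the thread counts match the 16 and 32 thread
# runs discussed in the post.
for size in ["4G", "20G", "100G"]:
    for threads in [16, 32]:
        for stage in ["prepare", "run", "cleanup"]:
            print(shlex.join(sysbench_fileio_cmd(stage, size, threads)))
```

Sweeping the data size is the important part: small sizes stay mostly inside the DDR cache, while larger ones are the ones that actually reach the flash.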
The smaller datasets saw a nice boost in performance from the internal DDR cache on the RamSAN. You will notice that as the size of the test increased, we started hitting the flash rather than the cache more and more. Still, 16-17K IOPS (16192-byte) is excellent, nearly 8x higher than my single Intel devices were capable of. All of these tests used 16 threads. But what if we have more threads? Can we push this a bit higher? Let's try more threads!
The test that used 32 threads actually saw a nice jump in IOPS, showing I had not maxed out the device yet. Unfortunately, 32 threads started to hit CPU bottlenecks... but I think I could have driven the system even more, possibly hitting 40+ threads before the CPU cried uncle.
These numbers are awesome. As a point of reference, I recently tested a midrange 16-disk RAID 10 SAN at a client site and came away with only 2,500 IOPS on a 16-thread, 20 GB test, so 30,000 IOPS is impressive.
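To put those IOPS figures in more familiar terms, here is a quick back-of-the-envelope sketch that converts IOPS to throughput and works out the ratio against the client SAN. The function names are made up for illustration. The block size is the 16192-byte figure quoted above, and applying it to the 30,000 IOPS run is an assumption, since the block size for that run was not stated.

```python
def iops_to_mb_per_s(iops, block_bytes):
    """Convert an IOPS figure to throughput in MB/s (1 MB = 1,000,000 bytes)."""
    return iops * block_bytes / 1_000_000


def speedup(new_iops, old_iops):
    """How many times more IOPS the new system delivered."""
    return new_iops / old_iops


# Block size as quoted in the post; assumed to apply to every run below.
BLOCK_BYTES = 16192

for label, iops in [("16 threads, low end", 16_000),
                    ("16 threads, high end", 17_000),
                    ("32 threads", 30_000)]:
    print(f"{label}: {iops} IOPS is about "
          f"{iops_to_mb_per_s(iops, BLOCK_BYTES):.0f} MB/s")

# RamSAN-500 at 30,000 IOPS versus the 16-disk RAID 10 SAN at 2,500 IOPS.
print(f"RamSAN vs client SAN: {speedup(30_000, 2_500):.0f}x the IOPS")
```

The ratio only compares raw IOPS. The two tests ran on different hosts and the client test details beyond thread count and size are not known, so treat it as a rough yardstick rather than a controlled comparison.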
Stay tuned for part two…