I have been out on several client sites over the last 3-4 months, and the one thing I see time and time again is the proliferation of commodity hardware. Not only are smaller servers popping up all over the place, but these servers are getting more and more powerful. Now you cannot order hardware without getting dual- or quad-core machines capable of running a 64-bit OS. The machines built 2 or 3 years ago are nothing more than eBay fodder now, items that end up collecting dust in a back room somewhere. Look at the evolution we have gone through. Since 2005, look at what has become affordable in the commodity space: dual-core AMD and Intel chips, quad-core Intel chips, AMD Opterons, 64-bit processing, and a slew of chipsets and manufacturing processes in between. Cache on these processors has jumped from 512KB to 8MB in some cases. And it is not just the processor that has seen marked improvements; so has the architecture on the motherboard. The front-side bus has moved from the older 533MHz, 667MHz, and 800MHz speeds to 1066MHz or 1333MHz, and memory has moved to higher operating frequencies as well.
With all these advances in computing power, it is no wonder commodity hardware is growing. I can go to Dell right now and order the following for under $4K:
Dell PowerEdge 2900
2 Quad-Core Xeons @ 2.33GHz (8 cores total!)
8GB of memory
292GB of disk space in RAID 5 (4 x 15K SAS drives)
That seems like one powerful machine. Much, much more powerful than you could have had for the same price 3 years ago. Managers feel that commodity hardware like this will outperform anything they could have had previously for 10X the price.
The problem here is that while you have increased the speed and performance of most of the computer, your disks are still little better than they were 5 years ago.
I headed over to storagereview.com to look at some of their articles and benchmarks on drive performance. The last Seagate Cheetah drives released (2006) were the 15K.5 versions, which are still used in many of today's servers. StorageReview has a great article here on 15K.5 performance:
http://www.storagereview.com/ST3300655LW.sr … The differences between the 15K.4 and 15K.5 versions are not that great.
They actually end the review with:
"The story in the end is not about groundbreaking performance but rather a steady and reliable increase in capacity. The Cheetah 15K.5 offers double the capacity of previous-generation units while slightly improving upon the 15K.4 under multi-user loads. This higher capacity comes at the cost of higher power and noise levels."
So while in the last 3 years or so you have, in some cases, quadrupled the amount of data that can be processed by your computer's processors, you have done little to increase the performance of the physical hard drives themselves. What I am seeing more and more is that this trend is causing an old bottleneck to come to light once again. Instead of 2 active CPU threads hitting disk, you now have 8 active CPU threads hitting disk at one time. I used to see this all the time on large multi-core machines; I was fighting these fires 7 or 8 years ago on large HP machines with 12 or 16 cores, where performance on my Oracle servers suffered even with large disk arrays with dozens of drives behind them. Now that commodity hardware is reaching the processing power of the big box servers of yesteryear, it is time to once again look into the old problems that plagued us back then.
What can be done? In high-processing environments you must plan things out carefully. Here are some general guidelines; these may vary from workload to workload.
#1. 4 disks is not enough; the more disks the better. 4 disks may be sufficient for a low-traffic site with a limited amount of data, but not for high volume.
#2. DO NOT size for space, size for IO throughput. Who cares if you only need 100GB of disk space for your database? That number is secondary to how much of that 100GB is going to be read, written, or updated. You may have to purchase a terabyte of storage over 12 disks to support the IO throughput of your 100GB database... it's a sad fact of life.
#3. A good controller card is your friend, and battery-backed cache is also your friend. Spend some money on them; a lot of cache can cover a multitude of sins.
#4. Memory can also cover a multitude of sins. 2 or 4GB is not enough; having this little memory in the system only ensures you are going to read from disk more often.
#5. Index. Better indexes can result in a lot less disk IO.
#6. Use the correct data types. You do not need to use BIGINTs. Repeat after me: you do not need to use BIGINTs.
#7. You may have to add an external disk array; many servers only have room for 4-6 drives.
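To put guideline #2 into numbers, here is a back-of-the-envelope sketch of sizing for IO throughput instead of capacity. The per-disk IOPS figure and RAID 5 write penalty below are rough assumptions, not measured values; real numbers vary by drive, controller, and workload:

```python
# Rough spindle-count estimate for a random-IO workload.
# Assumed figures -- tune these for your own hardware:
IOPS_PER_15K_DISK = 180      # ballpark random IOPS for one 15K SAS drive
RAID5_WRITE_PENALTY = 4      # one logical write ~= 4 physical IOs in RAID 5

def disks_needed(read_iops, write_iops,
                 iops_per_disk=IOPS_PER_15K_DISK,
                 write_penalty=RAID5_WRITE_PENALTY):
    """Estimate how many spindles it takes to sustain the load."""
    physical_iops = read_iops + write_iops * write_penalty
    # Round up: you cannot buy a fraction of a disk.
    return -(-physical_iops // iops_per_disk)

# A small database pushing 1,000 reads/s and 250 writes/s:
print(disks_needed(1000, 250))   # -> 12
```

Notice that capacity never appears in the calculation: a modest write load on RAID 5 already demands a dozen spindles, which is exactly why you can end up buying far more space than you need.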
Another note: your disk subsystem may end up costing 2-3 times as much as the server itself. That is OK... there really is no such thing as a commodity disk system :)
I really want to try the MTRON SSDs (http://www.dvnation.com/) against the standard SATA Raptors and SCSI Cheetah drives. I am really eager to see if adding 4 of these will equal the performance of X regular drives. This may be a way to stay within the server chassis and avoid an external disk array.
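As for guideline #6, the BIGINT habit has a real cost you can count. In MySQL an INT is 4 bytes and a BIGINT is 8, and InnoDB secondary indexes each carry a copy of the primary key, so the waste multiplies with every index. A quick sketch (row counts and index counts here are just illustrative):

```python
# Per-row cost of a BIGINT (8-byte) vs INT (4-byte) primary key.
INT_BYTES, BIGINT_BYTES = 4, 8

def key_overhead_mb(rows, secondary_indexes):
    """Extra storage (MB) from choosing BIGINT over INT for the PK,
    counting the PK itself plus its copy in each secondary index."""
    copies = 1 + secondary_indexes
    wasted_bytes = rows * (BIGINT_BYTES - INT_BYTES) * copies
    return wasted_bytes / (1024 * 1024)

# 100 million rows and 3 secondary indexes:
print(round(key_overhead_mb(100_000_000, 3)))   # -> 1526 (MB)
```

That is roughly 1.5GB of pure overhead on one table. It is not just disk space: those extra bytes flow through your buffer pool and your controller cache too, so a fatter key means fewer rows cached per megabyte of the memory guideline #4 told you to buy.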