Friday, March 4, 2011

Balancing Power, Price, and Performance in the Server CPU World


Selecting an Appropriate Server CPU
For many businesses, performance isn't the top priority when it comes to selecting a server; chances are that low power and CAPEX budget are higher on the list. AMD's newest Opteron 4100 series is targeting exactly those businesses. The 4100 is the little brother of the Opteron "Magny-Cours" 6100. The Opteron 6100 crams two hex-core or quad-core chips into one package. In contrast, the "Lisbon" Opteron package contains only one chip. The "Lisbon" Opteron with C32 socket thus comes with the same improvements that the Opteron 6100 had over the hex-core "Istanbul":
  • Support for DDR3 memory (low voltage also supported)
  • Higher HyperTransport speeds.
  • Improved C1E sleep state.

The dual socket capable Opteron 4100 tries to find a place between the relatively cheap but single socket Xeon 3500/3600 series and the more expensive dual socket Xeon 5600 series. We chose three AMD Opterons and two Intel Xeons for a closer look.
The hex-core Opteron 4162 EE promises to consume no more than 32W (35W TDP), or an amazingly low ~5W per core. The chip runs at a modest 1.7GHz and comes with an affordable $316 price tag. You can get a slightly faster 1.8GHz version, the 4164 EE, but that chip costs more than twice as much ($698). As we are searching for low power and inexpensive CPUs, it didn't make the cut. The only disadvantage of the 4162 EE other than the lower clock speed is the lower clocked HT3 link at 2GT/s instead of 6.4GT/s.
If that is still too expensive for you, AMD also has a quad-core 2.2GHz Opteron 4122 at probably the lowest price ever for a dual socket server CPU: $99. The CPU needs 75W on average according to AMD (95W TDP). You'll probably want to pay a little more for the 2.6GHz 4130 ($125), but unfortunately we didn't get that CPU in our labs. Adding about 15-18% to the performance numbers of the 4122 should tell you what the 4130 is capable of.
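The 15-18% estimate for the 4130 follows from the clock speed ratio. A minimal sketch of that arithmetic, assuming performance scales at most linearly with clock speed (the function name is illustrative, not from the article):

```python
def clock_scaling_uplift(base_ghz: float, target_ghz: float) -> float:
    """Return the best-case performance uplift (as a fraction) from clock speed alone.

    Assumption: performance scales at most linearly with clock speed; real
    workloads usually gain somewhat less because memory and I/O do not speed up.
    """
    return target_ghz / base_ghz - 1.0


# Opteron 4122 (2.2GHz) versus Opteron 4130 (2.6GHz), clock speeds from the text.
uplift_4130 = clock_scaling_uplift(base_ghz=2.2, target_ghz=2.6)
print(f"Opteron 4130 vs 4122: up to {uplift_4130:.1%} faster")
```

The ratio gives an upper bound of roughly 18%, so the lower end of the quoted range allows for less-than-perfect scaling.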
Most of you are probably looking for a good balance between power, throughput, single threaded performance, and price. The hex-core 2.1GHz Opteron 4170 HE is a good candidate at only $174. AMD promises that average power should be around 50W under load, with a maximum of 65W.
Simply stated, Intel does not like to play in those price ranges. The cheapest Xeon is priced at $188, and offers you the four cores of the E5603. At 1.6GHz, without Hyper-Threading, and with the L3 cut in half (4MB) we doubt that it will be a good alternative. It also needs a bit more power: 80W.
The only "decent" Xeon in the low price ranges is the Xeon E5606 (four "Westmere" cores at 2.13GHz, 8MB L3, no HT). Unfortunately, we didn't have this chip in the lab. To give you an idea where it would land, we added a Xeon E5506 at 2.13GHz, which is based on the older "Gainestown/Nehalem" architecture and has less L3 (4MB). Based on our past experiences you should add about 10 to 20% of performance to get an idea where the E5606 would land. In general, the Opterons will need to surpass this older chip to be compelling.
The low power Intel chips are priced a bit higher. We asked Intel, and the "slowest" low power chip they would send is the Xeon L5630. It offers four cores with Hyper-Threading (eight threads) at 2.13GHz, 12MB of L3, and has a very low 40W TDP. It will need to beat all the Opterons by a decent margin to justify the rather heavy $550 price tag.
In summary, it looks like AMD might have found some unclaimed territory here, as Intel does not offer low power and cheap Xeons. The question of course is whether the performance/watt/price ratio is interesting enough, and that's what we're here to find out.

Server Benchmark Configurations
We used two relatively basic servers, both made to be affordable and low power. The Intel server comes recommended by Intel, and the Opteron based server is similarly recommended by AMD.
The Intel server is an Intel SR1690WB 1U server:

CPU: 2x Xeon E5506 2.13GHz or 2x Xeon L5630 2.13GHz
RAM: 8x4GB (32GB) Samsung DDR3-1333 CH9
Motherboard: Intel S5500WBV
Chipset: Intel 5500
BIOS version: S5500.86B.01.00.0054,092820101104
PSU: Delta Electronics DPS-650SB B Rev
The AMD server was the Tyan YR190B8228, a 1U "Twin" server. The twin server consists of a 1U chassis containing two completely separate servers.
CPU: 2x Opteron 4162 EE 1.7GHz, 2x Opteron 4122 2.2GHz, or 2x Opteron 4170 HE
RAM: 8x4GB (32GB) Samsung DDR3-1333 CH9
Motherboard: Tyan B8228Y190X2-045V4H
Chipset: AMD SR5650
BIOS version: YR190-B8228-x2_v101
PSU: 3Y Power Technology YM-2451C RevA 450W

The disk system was identical for each server. We equipped each with a Western Digital 64GB SSD SSC-D0064SC-2100 as the boot disk with an Adaptec 5085 PCI-E 8x SAS controller connected to a Promise JBOD J300s. We placed the VMs on six SAS disks (Fujitsu MAX3073RC) in RAID-0. The Oracle OLTP databases are on two Intel SLC X25-E SSDs.

Virtual Performance on vSphere 4
Performance might not be the top priority when you buy this type of server, but it is still important. The Tyan server can hold up to 96GB of RAM, so you don't want the CPU to become a bottleneck when you are consolidating your workloads. You also want quite a bit of headroom for the future in case you like postponing the capital investment in new servers.
So we turned to our vApusMark II, which runs eight virtual machines. Each virtual machine contains a real-world application, not just benchmarks. 

vApusMark II score

Interestingly, the L5630 wins with a 17% margin thanks to Hyper-Threading. The Opteron 4170 HE holds its own pretty well considering it costs less than a third as much ($174 versus $550). However, it is important to understand that the lower the price of the CPU, the less impact it has on the overall capital investment in the servers.

SQL Server 2008
The Flemish/Dutch site Nieuws.be is a web 2.0 website launched in 2008. It gathers news from many different sources and allows the reader a personalized view on all the news. The Nieuws.be site sits on top of a large database—more than 100GB and growing. This database consists of a few hundred separate tables, which have been carefully optimized by our lab (the Sizing Server Lab). We use a log from their site for this test.
99% of the loads on the database are selects and about 5% of them are stored procedures. Network traffic is 6.5MB/s average and 14MB/s at the most, so our Gigabit connection still has a lot of headroom. Disk Queue Length (DQL) is at 2 in the first round of tests, but we only report the results of the next rounds where the database is in a steady state. We measured a DQL close to 0 during these tests, so there is no tangible impact from the hard disk speed.
Note: Starting with our twelve-core Opteron review, we are using a new, heavier log. The Nieuws.be application has become more popular and more complex, the database has grown, and the queries have become more complex too. The results are therefore no longer comparable to previous results: the pattern is similar, but the numbers are much lower.

Nieuws.be MS SQL Server 2008 (new heavy log)

At first you might be surprised that the Intel chips win here, as this was one of the benchmarks where AMD's high-end CPUs excel. But of course, we should not forget that the Magny-Cours chips like the Opteron 6174 had twice as many cores as their Xeon competitors. The "Lisbon" CPU only has 50% more cores than its typical Xeon alternatives.

Idle Power Use
According to AMD, the key market for the low power "Lisbon" Opteron EE and HE is the cloud and web hosting space, where achieving the lowest power consumption possible is priority number one. So we measured the idle power draw running vSphere 4.1. The power policy selected was "Balanced".

idle power test

Before we start analyzing the numbers, we must say that the Intel platform can probably go lower. It is possible to use a 450W PSU in this server, but we only had the 650W version available. This is an 80 Plus Silver certified PSU, so it should be at least 75% efficient at 10% load and 85% efficient at 20% load, and we're at around 15% load for this test. Although the idle numbers might be somewhat high for the Intel platform, our real-world power numbers in the Real-World Power Use section shouldn't be off by more than a few watts, and those real-world numbers are the ones that count.
The AMD system came equipped with 16GB of low power DIMMs, but we did not use them for this test. The reason is that our vApusMark II takes 14GB per tile, and our standard test for dual socket systems is to use two tiles. Those two tiles need at least 28GB of RAM, plus some space for the ESX console and hypervisor (0.3 to 0.8GB of RAM for the console + 0.5GB to 1GB for the vmkernel). We can show you what effect these low power DIMMs have at idle:
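The memory requirement described above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the text; the variable and function names are illustrative.

```python
# Figures quoted in the text.
TILE_GB = 14.0            # vApusMark II memory footprint per tile
TILES = 2                 # standard test for dual socket systems
CONSOLE_GB = (0.3, 0.8)   # ESX console, low and high estimate
VMKERNEL_GB = (0.5, 1.0)  # vmkernel, low and high estimate

required_low_gb = TILES * TILE_GB + CONSOLE_GB[0] + VMKERNEL_GB[0]
required_high_gb = TILES * TILE_GB + CONSOLE_GB[1] + VMKERNEL_GB[1]


def fits(installed_gb: float) -> bool:
    """True if the installed RAM covers the high-end estimate."""
    return installed_gb >= required_high_gb


print(f"Required RAM: {required_low_gb:.1f} to {required_high_gb:.1f} GB")
for installed in (16, 32):
    print(f"{installed}GB installed: {'enough' if fits(installed) else 'not enough'}")
```

With roughly 29 to 30GB needed, the 16GB low power kit falls short while the 32GB configuration leaves a small margin.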

idle power test, Opteron 4162 with different RAMs

Even at idle, a hypervisor has more work to do than a supervisor OS like Windows Server 2008 R2. The result is a 10% higher power draw at idle. Each regular 1.5V DDR3 DIMM seems to take 0.5W at idle (2W / 4). If you replace four regular DDR3 DIMMs with four low power DIMMs at 1.35V, you can cut the RAM power draw by more than half: low power DIMMs seem to save 0.3W per DIMM (1.2W / 4) at idle. So all in all, the RAM subsystem seems to draw little power at idle.
When SQL Server 2008 was running at 90% CPU load, we measured a difference of 4.9W between four low power DIMMs and four 1.5V DDR3 DIMMs. So low power DIMMs save you about 1.2W per 4GB DIMM (4.9W / 4) under load.
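The per-DIMM figures above are simple divisions of the measured totals by the number of DIMMs swapped. A short sketch of that arithmetic, using only the totals quoted in the text (names are illustrative):

```python
DIMMS_SWAPPED = 4

# Totals for four DIMMs, as quoted in the text (watts).
idle_regular_total_w = 2.0   # four regular 1.5V DIMMs at idle
idle_saving_total_w = 1.2    # saved at idle by switching to 1.35V DIMMs
load_saving_total_w = 4.9    # saved at 90% CPU load under SQL Server 2008


def per_dimm(total_w: float, dimms: int = DIMMS_SWAPPED) -> float:
    """Spread a measured total evenly over the swapped DIMMs."""
    return total_w / dimms


idle_regular_per_dimm_w = per_dimm(idle_regular_total_w)
idle_saving_per_dimm_w = per_dimm(idle_saving_total_w)
load_saving_per_dimm_w = per_dimm(load_saving_total_w)

print(f"Regular DIMM at idle:   {idle_regular_per_dimm_w:.2f} W")
print(f"Saving per DIMM, idle:  {idle_saving_per_dimm_w:.2f} W")
print(f"Saving per DIMM, load:  {load_saving_per_dimm_w:.2f} W")
```

Dividing evenly assumes all four DIMMs behave alike, which the measurements cannot confirm on their own.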

Real-World Power Use
In the real world you do not run your virtualized servers at their maximum just to measure the potential performance, but neither do they run idle. The user base will create a certain workload and expect this workload to be performed with the lowest response times. The general idea behind this new benchmark scenario is that each server runs exactly the same workload and that we then measure the amount of energy consumed. 
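To make the idea of "energy consumed for the same workload" concrete, here is a minimal sketch of how logged power readings could be turned into energy. It is not the lab's actual tooling: it assumes timestamped wattage samples from a power meter and integrates them with the trapezoidal rule. The sample values are invented purely for illustration and are not measurements from this review.

```python
from typing import Sequence


def energy_wh(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (watts) over time (seconds); return watt-hours.

    Uses the trapezoidal rule, so it tolerates unevenly spaced samples.
    """
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two samples and equal-length inputs")
    joules = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3600.0  # 1 Wh = 3600 J


# Hypothetical readings for two servers running the same fixed workload.
timestamps = [0.0, 30.0, 60.0, 90.0]
server_a_watts = [120.0, 140.0, 135.0, 125.0]
server_b_watts = [150.0, 165.0, 160.0, 155.0]

print(f"Server A: {energy_wh(timestamps, server_a_watts):.2f} Wh")
print(f"Server B: {energy_wh(timestamps, server_b_watts):.2f} Wh")
```

Because both servers process the same workload over the same interval, the server with the lower watt-hour total is the more efficient one for that load.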

vApusMark Equal load power test

You get what you pay for, of course. The Opteron 4122 is extremely cheap, but these are the leakiest chips that still pass the quality tests. Remember that the quad-core 2.2GHz chip will consume at most 95W (almost 24W per core), while the hex-core 4170 HE will never pass 65W (11W per core), despite the fact that they run at almost the same clock speed (2.1GHz). So judged by maximum power per core, the 4170 HE is more than twice as efficient.
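The per-core comparison can be reproduced from the TDP figures and core counts given in the text. This sketch only divides maximum power by core count; it ignores the small clock speed difference and any power used outside the cores, so it is a rough indicator rather than a measured efficiency.

```python
# (maximum power in watts, core count), both from the text.
cpus = {
    "Opteron 4122": (95.0, 4),
    "Opteron 4170 HE": (65.0, 6),
}

watts_per_core = {name: tdp_w / cores for name, (tdp_w, cores) in cpus.items()}
ratio = watts_per_core["Opteron 4122"] / watts_per_core["Opteron 4170 HE"]

for name, value in watts_per_core.items():
    print(f"{name}: {value:.2f} W per core")
print(f"The 4122 budgets {ratio:.2f}x as much power per core as the 4170 HE")
```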

Conclusion

A 2P server CPU for less than $100 sounds great, but if that quad-core CPU—the Opteron 4122—consumes more than CPUs that are more powerful, it's probably a bad idea for a server. Since servers run 24/7, electricity and cooling costs contribute almost as much as the CAPEX costs to the TCO.
The Tyan YR190B8228 for example is a very inexpensive server, with a barebone price of only ~$1500 for two servers. Add to that the CPUs, the storage card that accesses the external storage, a few SATA disks, and lots of RAM, and those two servers will cost anywhere between $2500 and $4000. Saving $100 per CPU or $400 per two servers is less than 15% of the CAPEX costs. Saving a few hundred dollars just to waste them later on more electricity and cooling costs just doesn't make sense.
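The share of the CAPEX that the CPU saving represents can be worked out from the figures above. A minimal sketch, assuming the $2500 to $4000 range is the total for both servers and that the saving covers four CPUs (two servers with two sockets each); the result shifts slightly depending on whether that total already includes the cheaper CPUs.

```python
SAVING_PER_CPU_USD = 100.0
CPUS = 4  # two dual socket servers in the twin chassis

total_saving_usd = SAVING_PER_CPU_USD * CPUS


def saving_share(total_cost_usd: float) -> float:
    """Fraction of the total purchase cost represented by the CPU saving."""
    return total_saving_usd / total_cost_usd


share_low_config = saving_share(2500.0)   # cheapest configuration in the text
share_high_config = saving_share(4000.0)  # most expensive configuration

print(f"Saving: ${total_saving_usd:.0f}")
print(f"Share of a $2500 build: {share_low_config:.0%}")
print(f"Share of a $4000 build: {share_high_config:.0%}")
```

Either way the saving stays in the range of roughly 10 to 16% of the purchase price, which is the point the article makes: it is small compared with years of electricity and cooling.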
AMD targets those running "Small Business Servers", but even SMEs do not turn off their servers very often. That means that the Opteron 4122 and 4130 only have a place in cheap workstations that are on eight hours per day. In that case, the 2.6GHz Opteron 4130 ($125) might prove to be a better option than using a low-end Phenom X4. However, we advise against using these CPUs in any 24/7 server.
The Opteron 4170 HE is a much better deal. It is worth investing a few dollars more in getting an HE Opteron instead of a non-HE Opteron 4000. A low-end Xeon E560x series CPU is also viable.
The Opteron 4000 HE consumes very little at idle, which is good for SMB servers as they idle a lot. As a low budget virtualization server, it consumes about 20% more power than the Intel L5630 but saves you almost $400 per CPU. It does this while performing "good enough" in many situations.
Concerning the Opteron 4162 EE, we agree with AMD that this is a good CPU for hosting and cloud environments, but not always. The Opteron 4162 EE makes sense for "budget hosting", which is a pretty large market. (By "budget hosting" we mean that people accept the possibility of lower availability as they just want to pay as little as possible to get their website on the internet, i.e. Tier 3.)
The moment you start looking at "enterprise class hosting" (i.e. Tier 2 and better), where the hosting providers invest in redundancy features to guarantee higher availability, the Xeon L5630 is the CPU to get. The extra capital investment in more expensive CPUs will be noise in the TCO calculation, and the Xeon consumes slightly less while offering up to 40% more CPU power.
Our conclusion is that if you are looking for very cheap server CPUs, the Opteron 4170 HE (2.1GHz) and 4174 HE (2.3GHz) are very interesting options. Resist the urge to go for the Opteron 4xxx without the HE markings. The moment performance comes into play, the Xeon L5630 is the performance/watt champion, without any doubt. However, there are situations where you are completely power limited and care very little about CPU performance, as the number of VMs is limited by memory or disk I/O. In that case the Opteron 4162 EE offers the lowest power consumption for the lowest price.

