Thursday, February 24, 2011

GeForce GTX 580 review




Lord of the Cards - The return of the king?

When NVIDIA launched the GeForce GTX 480 in Q1 2010, their worst fears became reality. The high-end Fermi part was launched and then gutted and slaughtered over three aspects: high power consumption, loud noise levels and a GPU that ran far too hot.
The flipside of that coin was that the performance was actually spot on. To date the GeForce GTX 480 is the fastest kid on the DX11 block, offering stunning performance. Yet the dark clouds that started hovering above the Fermi launch were something NVIDIA never got rid of, up until the GeForce GTX 460 launch.
That made the GeForce GTX 480 probably the worst selling high-end graphics card series to date for NVIDIA. Throughout the year we've reviewed a good number of GTX 480 cards and we've always tried to be very fair. We firmly (not Fermi) believe that had NVIDIA addressed the heat and noise levels from the get-go, the outcome and overall opinion of the GTX 480 would have been much more positive, as enthusiast-targeted end users can live with a somewhat high TDP. Good examples of this are KFA2's excellent GeForce GTX 480 Anarchy, more recently the MSI GTX 480 Lightning, and soon Gigabyte's GTX 480 SOC.
However, the damage was done and NVIDIA needed to refocus, redesign and improve the GF100 silicon. They went back to the drawing board, made the design more efficient and made some significant changes at the transistor level. As a result they were able to slightly lower the TDP, increase the shader processor count and raise the clock frequencies in both the core and memory domains.
The end result is the product you've all been hearing about for weeks now, the GeForce GTX 580. A product that is quieter than the GTX 280/285/480 you are so familiar with, that keeps its temperatures under control a little better, and whose noise levels overall are genuinely low. All that still on the 40nm fabrication node, while offering roughly 20% more performance than the reference GeForce GTX 480.
Did NVIDIA get it right this time? They'd better hope so, as AMD's Cayman, aka the Radeon HD 6970, will be released very soon as well. These two cards will go head to head with each other in both price and performance, at least that's what we hope.
Exciting times with an exciting product; head on over to the next page where we'll start the review of the product NVIDIA unleashes today, the GeForce GTX 580.


The GeForce GTX 580 graphics processor

So, as we already stated, for the GeForce GTX 580 NVIDIA went back to the drawing board and introduced a new revision of the GF100 ASIC, now labeled GF110.
With this release, NVIDIA now has a full range of products on the market from top to bottom. All the new graphics adapters are of course DirectX 11 ready. With Windows 7 and Vista also being DX11 ready, all we need are some games that take advantage of DirectCompute, multi-threading, hardware tessellation and the new Shader Model 5.0 extensions. DX11 is going to be good, and once tessellation kicks into games, much better looking.
  • GeForce GTX 580 : 512 SP, 384-bit, 244W TDP
  • GeForce GTX 480 : 480 SP, 384-bit, 250W TDP
  • GeForce GTX 470 : 448 SP, 320-bit, 215W TDP
The GPU that powers it all has small architectural changes: some things were stripped away, and additional functional units for tessellation, shading and texturing have been added. Note that the GPU is still big, as the fabrication node is still 40nm; TSMC canceled its 32nm node, preventing this chip from being any smaller.
Both the GF100 and GF110 graphics processors have sixteen shader clusters (called SMs) embedded in them. For the GeForce GTX 480 one such cluster was disabled, and on the GeForce GTX 470 two were disabled. The GTX 580 has the full 512 shader processors activated, meaning a notch more performance based on that alone. So that's 512 shader processors, 32 more than the GTX 480 had.
Finally, to find some additional performance, the card is also clocked a chunk faster at 772 MHz, whereas the GeForce GTX 480 was clocked at 700 MHz.
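To make that arithmetic explicit, here is a minimal Python sketch (using only the figures quoted above, with Fermi's 32 shader processors per SM) that reproduces the shader counts from the number of active SMs:

    # Fermi GF100/GF110: each SM (shader cluster) holds 32 shader processors
    SHADERS_PER_SM = 32

    cards = {
        "GeForce GTX 470": 14,  # two SMs disabled
        "GeForce GTX 480": 15,  # one SM disabled
        "GeForce GTX 580": 16,  # full GF110, nothing disabled
    }

    for name, active_sms in cards.items():
        total = active_sms * SHADERS_PER_SM
        print(f"{name}: {active_sms} SMs x {SHADERS_PER_SM} = {total} shader processors")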


Specification | GeForce 9800 GTX | GeForce GTX 285 | GeForce GTX 295 | GeForce GTX 470 | GeForce GTX 480 | GeForce GTX 580
Stream (Shader) Processors | 128 | 240 | 240 x2 | 448 | 480 | 512
Core Clock (MHz) | 675 | 648 | 576 | 607 | 700 | 772
Shader Clock (MHz) | 1675 | 1476 | 1242 | 1215 | 1400 | 1544
Memory Clock (effective MHz) | 2200 | 2400 | 2000 | 3350 | 3700 | 4000
Memory amount | 512 MB | 1024 MB | 1792 MB | 1280 MB | 1536 MB | 1536 MB
Memory Interface | 256-bit | 512-bit | 448-bit x2 | 320-bit | 384-bit | 384-bit
Memory Type | gDDR2 | gDDR3 | gDDR3 | gDDR5 | gDDR5 | gDDR5
HDCP | Yes | Yes | Yes | Yes | Yes | Yes
Two Dual-link DVI | Yes | Yes | Yes | Yes | Yes | Yes
HDMI | No | No | No | Yes | Yes | Yes

For Fermi NVIDIA made their memory controllers GDDR5 compatible, which was not the case on GT200 based GeForce GTX 260/275/285/295, hence their GDDR3 memory.
Memory-wise, NVIDIA ends up with large, expensive memory volumes due to their architecture; 1 GB has become the standard for most of NVIDIA's series 400 and 500 graphics cards. Each memory partition utilizes one memory controller on the respective GPU, with 256MB of memory tied to it.
  • The GTX 470 has five memory controllers (5x256MB) = 1280 MB of GDDR5 memory
  • The GTX 480 has six memory controllers (6x256MB) = 1536 MB of GDDR5 memory
  • The GTX 580 has six memory controllers (6x256MB) = 1536 MB of GDDR5 memory
As you can understand, the large memory partitions, wide bus and GDDR5 memory (quad data rate) combined allow the GPU to work with a very high (effective) framebuffer bandwidth. Let's put most of the data in a chart to get an overview of the changes:


Graphics card | GeForce GTX 470 | GeForce GTX 480 | GeForce GTX 580
Fabrication node | 40nm | 40nm | 40nm
Shader processors | 448 | 480 | 512
Streaming Multiprocessors (SM) | 14 | 15 | 16
Texture Units | 56 | 60 | 64
ROP units | 40 | 48 | 48
Graphics Clock (Core) | 607 MHz | 700 MHz | 772 MHz
Shader Processor Clock | 1215 MHz | 1401 MHz | 1544 MHz
Memory Clock / Data rate | 837 MHz / 3348 MHz | 924 MHz / 3696 MHz | 1000 MHz / 4000 MHz
Graphics memory | 1280 MB | 1536 MB | 1536 MB
Memory interface | 320-bit | 384-bit | 384-bit
Memory bandwidth | 134 GB/s | 177 GB/s | 192 GB/s
Power connectors | 2x 6-pin PEG | 1x 6-pin PEG, 1x 8-pin PEG | 1x 6-pin PEG, 1x 8-pin PEG
Max board power (TDP) | 215 Watts | 250 Watts | 244 Watts
Recommended Power supply | 550 Watts | 600 Watts | 600 Watts
GPU Thermal Threshold | 105 degrees C | 105 degrees C | 97 degrees C
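As a quick sanity check on that chart, here is a small Python sketch that reproduces the memory size and bandwidth figures from the controller count and effective data rate (each controller drives a 64-bit slice of the bus and 256 MB of memory; bandwidth = data rate x bus width / 8):

    # Reproduce the memory figures from the chart above.
    # Each memory controller drives a 64-bit slice of the bus and 256 MB of GDDR5.
    cards = {
        # name: (memory controllers, effective data rate in MHz)
        "GeForce GTX 470": (5, 3348),
        "GeForce GTX 480": (6, 3696),
        "GeForce GTX 580": (6, 4000),
    }

    for name, (controllers, data_rate_mhz) in cards.items():
        bus_width_bits = controllers * 64      # 320-bit / 384-bit
        memory_mb = controllers * 256          # 1280 MB / 1536 MB
        bandwidth_gbs = data_rate_mhz * bus_width_bits / 8 / 1000
        print(f"{name}: {memory_mb} MB, {bus_width_bits}-bit, {bandwidth_gbs:.0f} GB/s")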

So we've talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to go through. At the end of the pipeline we run into the ROP (Raster Operations) engine, and the GTX 580 again has 48 units for features like pixel blending and AA.
There are a total of 64 texture filtering units available on the GeForce GTX 580. The math is simple here: each SM has four texture units tied to it.
  • GeForce GTX 470 has 14 SMs X 4 Texture units = 56
  • GeForce GTX 480 has 15 SMs X 4 Texture units = 60
  • GeForce GTX 580 has 16 SMs X 4 Texture units = 64
Though still a 40nm chip, the GF110 GPU comes with almost 3 billion transistors embedded in it. The TDP remains roughly the same at 240~250 Watts, while performance goes up by ~20%.
TDP = Thermal Design Power. Roughly translated, when you stress everything on the graphics card 100%, your maximum power consumption is the TDP.
The GeForce GTX 580 comes with both a 6-pin and an 8-pin power connector to get enough current, with a little to spare for overclocking. This boils down to: 8-pin PEG = 150W + 6-pin PEG = 75W + PCIe slot = 75W, which is 300W available (in theory).
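In code form, a trivial sketch of that power budget (the 75/75/150 Watt figures are the theoretical PCIe limits quoted above; the 244 Watt TDP comes from the specification chart):

    # Theoretical power budget for the GTX 580's connector configuration
    PCIE_SLOT_W = 75   # delivered through the PCI Express slot itself
    PEG_6PIN_W  = 75   # one 6-pin PEG connector
    PEG_8PIN_W  = 150  # one 8-pin PEG connector

    available_w = PCIE_SLOT_W + PEG_6PIN_W + PEG_8PIN_W   # 300 W in theory
    board_tdp_w = 244                                     # GTX 580 max board power (TDP)
    print(f"Available: {available_w} W, TDP: {board_tdp_w} W, "
          f"headroom: {available_w - board_tdp_w} W")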

Geforce GTX 580 reference graphics card, powered by the GF110 GPU



Video processor

Now, we are not going to explain PureVideo all over again, but Fermi based graphics cards, thus the GeForce 400/500 series, have the latest model video processor embedded, which is actually similar to the one used in the GT220/240/ION2 in terms of video capabilities. The VP4 engine now also supports MPEG-4 ASP (MPEG-4 Part 2; DivX, Xvid) decoding in hardware, an improvement over the previous VP3 engine used in, for example, ION based systems.
In short, NVIDIA can offload the decoding of pretty much any MPEG format; the only thing not supported is MPEG-1, which I doubt anyone still uses.
What is also good to mention is that HDMI audio has finally been solved. The clumsy S/PDIF cable that had to connect the card to an audio codec in order to get sound over HDMI is gone. That also means NVIDIA is no longer bound to two channel LPCM or 5.1 channel DD/DTS for audio.
Passing audio over the PCIe bus brings along enhanced support for multiple formats. VP4 can now handle 8 channel LPCM, DD+ and 6 channel AAC. Dolby TrueHD and DTS-HD Master Audio bitstreaming are not yet supported in software, yet the hardware supports them (this needs a driver update).

DXVA 1080P videos processed by your GPU


The x.264 name is often used as a synonym for Matroska MKV, a much admired media container format which typically holds H.264 encoded content. Especially 1920x1080p movies usually carry some form of H.264 encoding inside that MKV container. As a result, you need a very beefy PC with a powerful processor to play back such movies error free, without dropped frames and nasty stutters, as PowerDVD or other PureVideo HD supporting software by itself will not handle it. Any popular file format (XviD/DivX/MPEG2/MPEG4/H.264/MKV/VC1/AVC) movie can be played with this little piece of software without the need to install codecs and filters, and where it can, it will enable DXVA accelerated playback.

DXVA is short for DirectX Video Acceleration, and as the name suggests, it will try whenever it can to accelerate content on the GPU, offloading the CPU. Which is what we are after.
There's more to this software though:
  • A much missed feature with NVIDIA's PureVideo and ATI's UVD is a very simple yet massively important function: pixel (image) sharpening.
If you watch a movie on a regular monitor, PureVideo playback is great. But if you display the movie on a larger HD TV, you'll quickly wish you could enable little extras like sharpening. I remember the GeForce 7 series having this natively supported within the Forceware drivers. After the GeForce 8 series was released, that feature was stripped away, and to date it has to be the most missed HTPC feature ever.
Media Player Classic has yet another advantage: not only does it try to enable DXVA through the video processor where possible, it can also utilize the shader processors of your graphics card to post-process content.
A lot of shaders (small pieces of pixel shader code) can be executed on the GPU to enhance image quality.
Media Player Classic Home Cinema has this feature built in; you can even select several shaders like image sharpening and de-interlacing, combine them, and thus run multiple shaders (enhancements) simultaneously. Fantastic features for high quality content playback. In the screenshot in the upper right corner (click it) you can see MPC Home Cinema accelerating an x.264 version of Bounty Hunter in 1080P. Thanks to the massive number of shader cores we can properly post-process and enhance image quality as well; shader based image sharpening (Complex 2) is applied here.
Download Media Player Classic HC (this actually is free open source software). The GPU is doing all the work; as you can see, the h.264 content inside the MKV container barely touches the CPU at all. Read more about this feature right here in this article. You can click on the image to see a full 1080P screenshot.

Hardware installation


Installation of the GeForce GTX 580 is really easy. Once the card is installed and seated in the PC, we connect the 6-pin and 8-pin PEG power connectors to the graphics card. Preferably your power supply is compatible; most high-end PSUs built after 2008 have these connectors as standard:
  • GeForce GTX 580 needs one 6-pin PEG connector and one 8-pin PCIe PEG connector.
Preferably the PEG headers come directly from the power supply and are not converted from 4-pin Molex peripheral connectors. Don't forget to connect your monitor. You can now turn on your PC, boot into Windows, install the latest compatible NVIDIA GeForce (Forceware) driver, and after a reboot all should be working. No further configuration is required.


Power consumption

Let's have a look at how much power draw we measure with this graphics card installed.
The methodology: we have a device constantly monitoring the power draw of the PC. We stress the GPU only, not the processor. The before and after wattage tells us roughly how much power a graphics card consumes under load.
Our test system is based on a power hungry Core i7 965 / X58 setup, overclocked to 3.75 GHz. On top of that, energy saving features are disabled on this motherboard and processor (to ensure consistent benchmark results). On average we use roughly 50 to 100 Watts more than a standard PC due to the higher CPU clocks, water cooling, additional cold cathode lights, etc.


Keep that in mind. Our normal system power consumption is higher than your average system.
Measured power consumption
  1. System in IDLE = 187W
  2. System Wattage with GPU in FULL Stress = 447W
  3. Difference (GPU load) = 260W
  4. Add average IDLE wattage ~ 20W
  5. Subjectively obtained GPU power consumption = ~ 280 Watts
Mind you that the system wattage is measured at the wall socket and covers the entire PC. Below is a chart of measured wattages per card. Overall this is much higher than reference; this is due to an increased GPU voltage to allow easy overclocking and the standard higher clock frequencies.
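For clarity, here is that arithmetic in a few lines of Python (the wall-socket readings from above; the ~20 Watt figure is the rough idle correction used in step 4):

    # Estimate the card's power draw from the wall-socket measurements above
    system_idle_w = 187   # entire PC idling
    system_load_w = 447   # entire PC with only the GPU fully stressed
    card_idle_w   = 20    # approximate idle draw of the card itself

    gpu_load_delta_w = system_load_w - system_idle_w   # 260 W
    estimated_card_w = gpu_load_delta_w + card_idle_w  # ~280 W
    print(f"Estimated GPU power consumption under load: ~{estimated_card_w} W")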

Power Consumption Cost Analysis

Based on the wattage we can now check what a card like this will cost you per month and per year. We calculate with 0,23 EUR (or dollars) per kWh, which is the standard rate here.
Graphics card measured TDP | 0,28 kW
Price per kWh | € 0,23
Cost per day (2 hrs gaming) | € 0,13
Cost per day (4 hrs gaming) | € 0,26
Cost per week (5 days / 4 hrs per day) | € 1,29
Cost per month | € 5,58
Cost per year (5 days a week / 4 hrs per day) | € 66,98
We estimate and calculate here based on four hours of GPU intensive gaming per day, five days a week, with this card.
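The same estimate, spelled out in a small Python sketch (0,28 kW measured draw, 0,23 EUR per kWh, four hours a day, five days a week):

    # Yearly cost estimate for the measured ~280 W (0.28 kW) draw
    card_kw       = 0.28   # measured card power in kilowatts
    price_per_kwh = 0.23   # EUR per kWh
    hours_per_day = 4
    days_per_week = 5

    cost_per_day  = card_kw * hours_per_day * price_per_kwh   # ~0.26 EUR
    cost_per_week = cost_per_day * days_per_week              # ~1.29 EUR
    cost_per_year = cost_per_week * 52                        # ~66.98 EUR
    print(f"Per week: {cost_per_week:.2f} EUR, per year: {cost_per_year:.2f} EUR")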
Above is a chart of relative power consumption. This is for the entire PC (not just the card as measured above) with the GPU(s) stressed 100% and the CPU(s) left idle. Mind you that these are extreme peak measurements; overall power consumption will obviously be lower.
What is interesting is that power consumption was actually higher than the GTX 480, not lower as NVIDIA claims. We do need to mention that the board used (an engineering sample) had an older BIOS, and power consumption on this board might be a tad higher as a result.
Here is Guru3D's power supply recommendation:
GeForce GTX 580
  • On your average system the card requires you to have a 650 Watt power supply unit.
GeForce GTX 580 in SLI
  • A second card requires you to add another 250 Watts. You need a 900+ Watt power supply unit if you use it in a high-end system (1 KiloWatt recommended if you plan on any overclocking).
For each additional card (3-way SLI), add another 250 Watts plus 20A on the 12V rails as a safety margin. What would happen if your PSU can't cope with the load?
  • bad 3D performance
  • crashing games
  • spontaneous reset or imminent shutdown of the PC
  • freezing during gameplay
  • PSU overload can cause it to break down
There are many good PSUs out there, please do have a look at our many PSU reviews as we have loads of recommended PSUs for you to check out in there.
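For convenience, the PSU sizing rule of thumb above can be written as a tiny helper. This is just an illustrative sketch of the recommendation (650 Watts for one card, roughly 250 Watts extra per additional GTX 580), not an official formula:

    def recommended_psu_watts(num_cards: int) -> int:
        """PSU sizing per the recommendation above: 650 W for a single
        GeForce GTX 580, plus roughly 250 W for every additional card in SLI."""
        return 650 + 250 * (num_cards - 1)

    for n in range(1, 4):
        print(f"{n}x GeForce GTX 580: ~{recommended_psu_watts(n)} W power supply")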

3DMark Vantage (DirectX 10)


3DMark Vantage focuses on the two areas most critical to gaming performance: the CPU and the GPU. With the emergence of multi-package and multi-core configurations on both the CPU and GPU side, the performance scale of these areas has widened, and the visual and game-play effects made possible by these configurations are accordingly wide-ranging. This makes covering the entire spectrum of 3D gaming a difficult task. 3DMark Vantage solves this problem in three ways:
1. Isolate GPU and CPU performance benchmarking into separate tests,
2. Cover several visual and game-play effects and techniques in four different tests, and
3. Introduce visual quality presets to scale the graphics test load up through the highest-end hardware.
To this end, 3DMark Vantage has two GPU tests, each with a different emphasis on various visual techniques, and two CPU tests, which cover the two most common CPU-side tasks: Physics Simulation and AI. It also has four visual quality presets (Entry, Performance, High, and Extreme) available in the Advanced and Professional versions, which increase the graphics load successively for even more visual quality. Each preset will produce a separate, official 3DMark Score, tagged with the preset in question.
The graphics load increases significantly from the lowest to the highest preset. The Performance preset is targeted for mid-range hardware with 256 MB of graphics memory. The Entry preset is targeted for integrated and low-end hardware with 128 MB of graphics memory. The higher presets require 512 MB of graphics memory, and are targeted for high-end and multi-GPU systems.



Overclocking & Tweaking the graphics card

As most of you know, with most videocards you can apply a simple series of tricks to boost the overall performance a little. You can do this at several levels: tweaking via registry or BIOS hacks, or, quite bluntly, tampering with image quality. And then there is overclocking, which by far gives the best possible results.
What do we need?
One of the best tools for overclocking NVIDIA and ATI videocards is our own RivaTuner, which you can download here. If you own an ATI or NVIDIA graphics card, the manufacturer also offers some very nice built-in options that can be found in the display driver properties. Alternatively you can use the RivaTuner based MSI Afterburner, which works with 90% of the graphics cards out there. We can really recommend it; download it here.
Where should we go?
Overclocking: by increasing the frequency of the videocard's memory and GPU, we can make the videocard perform more calculation cycles per second. It sounds hard, but it really can be done in less than a few minutes. I always recommend that novice users and beginners not increase the frequency any more than 5% on the core and memory clocks. Example: if your card runs at 600 MHz (which is pretty common these days), then I suggest you don't increase the frequency by any more than 30 to 50 MHz.
More advanced users often push the frequency way higher. Usually when your 3D graphics start to show artifacts such as white dots ("snow"), you should back down 10-15 MHz and leave it at that. When you overclock too hard, the card will start to show artifacts or empty polygons, or it will even freeze. Carefully find that limit and then back down at least 20 MHz from the point where you noticed an artifact. Look carefully and observe well. I really wouldn't know why you'd need to overclock today's tested card anyway, but we'll still show it.
All in all... do it at your own risk.
Clock | Original | This sample | Overclocked
Core Clock | 772 MHz | 772 MHz | 855 MHz
Shader Clock | 1544 MHz | 1544 MHz | 1710 MHz
Memory Clock | 4008 MHz | 4008 MHz | 4424 MHz
Note that we left the fan RPM control at default in all circumstances. We reached a very decent overclock, guaranteeing even better results.
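Expressed in numbers, here is a small Python sketch of what this overclock amounts to relative to the reference clocks, next to the conservative ~5% guideline for novices mentioned earlier:

    # Overclock achieved on this sample versus the reference clocks
    reference   = {"core": 772, "shader": 1544, "memory": 4008}   # MHz
    overclocked = {"core": 855, "shader": 1710, "memory": 4424}   # MHz

    for domain in reference:
        gain_pct = (overclocked[domain] / reference[domain] - 1) * 100
        print(f"{domain}: {reference[domain]} -> {overclocked[domain]} MHz (+{gain_pct:.1f}%)")

    # The conservative guideline for novices mentioned earlier: stay within ~5%
    print("Novice-safe core limit:", round(reference["core"] * 1.05), "MHz")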
This is a reference card; without voltage tweaking your limit will be roughly 850~875 MHz on the core (1700~1750 MHz on the shader processors). The memory can be clocked at 4400 MHz (effective).
Should you tweak the GPU voltage a little with Afterburner (download here) and set it 0.1V higher, you can take it up a notch more towards 900~950 MHz - but with GPU voltage tweaks your temperatures will rise to over 90 degrees C real fast.
With the 'normal' overclock our temperature now hovers at roughly 87 degrees C under load, and that's okay. dBA levels remain normal, slightly higher at 42~43 dBA, and that's under full stress. Here's what that does to your overall performance.
3DMark Vantage - set up in Performance mode
Here we have the card with Call of Duty: Modern Warfare 2, with maxed out image quality settings as before and 4xAA 16xAF. Keep in mind that the dark blue line represents baseline GeForce GTX 580 default performance, while the red line is the added value in overall framerate.
But let's make it really heavy on the graphics card -- Battlefield: Bad Company 2. We maxed out image quality settings as before with 8xAA 16xAF, which is way more GPU stringent, and with 8xAA applied we see that even this title benefits nicely from our overclock.


Final words and conclusion

Nice, yeah, I certainly like what NVIDIA has done with the GF110 GPU. No matter how you look at it, it is new silicon that runs much more efficiently, and thanks to more shader processors, higher clocks, faster memory and tweaks and optimizations at the transistor level we get extra performance as well. The end result is a product that is, give or take, 20% faster than the GTX 480, which of course was already a blazingly fast product.
Now, that GTX 480 was already the fastest chip on the globe, yet it was haunted by high noise levels and heat issues. We stated it in all our reviews: had the GTX 480 been quieter and run less hot, everybody would have been far milder in their opinion of that product.
The GeForce GTX 580 is exactly that; it is the GTX 480 in a new jacket, now with vapor chamber cooling, an improved PCB and, most of all, a more refined GPU. Don't get me wrong here, the GPU itself is still huge, but who cares about that when the rest is right? And the rest is right... much higher performance at very acceptable noise levels with a GPU that runs at decent temperatures. The one downside we measured was an increase in power consumption, slightly higher than the GTX 480. We really do need to mention that the board used for this article (an engineering sample) had an older BIOS, and power consumption on this board might be a tad higher as a result.
The cooling performance is better thanks to the vapor chamber cooler, a technology that has already been widely adopted in many CPU and VGA cooling solutions. Even with an overclock towards 850 MHz on the GPU core and the card ridiculously stressed, we still did not pass 85~87 degrees C, and trust me, that GPU got humped, stressed and dominated with a whip. Mind you that in your overall gaming experience the temperatures will definitely be somewhat lower, as we really give the GPU a kick in the proverbial nuts here. 80 degrees C on average is a number I can safely state is what you'll be seeing.
If you take a reference baseline GTX 480 and compare it to this product, we already have 20% higher performance. At an overclocked 850 MHz clock frequency that adds up to easily 25% to 30% extra performance over that baseline GTX 480, and in the world of high-end that is a mighty amount of extra performance. Of course any game to date will play fine at the highest resolutions with a minimum of 4x antialiasing enabled and the very best image quality settings. So performance is just not an issue, and neither are heat and noise.
Now, if you come from a factory overclocked GTX 480 like the KFA2 Anarchy or MSI Lightning, then the difference is really NIL, as these cards are clocked faster by default. There's no reason to upgrade whatsoever. However, if you're in the market for either a pre-overclocked GTX 480 or this reference 580, then obviously the 580 should have your preference, as that card at default is as fast as the overclocked GTX 480 cards and then has more room for tweaking. That certainly is a bitter message for KFA2, MSI and Gigabyte, who all recently released their supa-dupa OC models of the GTX 480. It's the reality though, the GTX 580 is the logical choice here.
The new advanced power monitoring management function is, well... disappointing. If the ICs were there as overprotection against drawing too much power, it would have been fine. But it was designed and implemented to detect specific applications such as FurMark and then throttle down the GPU. We really dislike the fact that ODMs like NVIDIA try to dictate how we as consumers or press should stress the tested hardware. NVIDIA's defense here is that ATI has been doing this on the R5000/6000 series as well, yet we think the difference is that ATI does not enable it for specific stress tests; it is simply a common safety feature for when you go way beyond spec. We have not seen ATI cards clock down with FurMark recently, unless we clocked, say, the memory too high, after which it clocked down as a safety feature. No matter how you look at it or try to explain it, this is going to be a sore topic now and in the future. There are, however, many ways to bypass this feature and I expect that any decent reviewer will do so. Much like any protection, if one application does not work, we'll move on to the next one.
Alright, time to round up this review. Saying that the GeForce GTX 580 is merely a respin would not do NVIDIA justice; this certainly is a newly taped out revision that has been tweaked and made more efficient. The end result is the card we expected in early 2010, only faster. The only thing that can spoil NVIDIA's all new release is AMD's upcoming Cayman (Radeon HD 6970); the performance and pricing of that card are still unknown. Anyway, if priced right and if it falls within your budget, then we do like to recommend the GeForce GTX 580, but we are afraid that the 479 EUR (499 USD) price tag will scare away many people. High-end anno 2010 should be 400 EUR tops, imho.
The GeForce GTX 580... well, this is the product that really should have been launched in Q1; it would have made all the difference in the world for NVIDIA. Though we do not see any groundbreaking new features, the performance went up by a good enough margin and 'the feel' of the product is just so much better compared to the GeForce GTX 480 launch. Definitely a card I wouldn't mind having in my PC. Now then... Call of Duty: Black Ops, bring it on!





