RV770: AMD ATI Radeon HD 4870 Review - RV770: The Architecture Review

As well as the difference in texturing configuration, the SFUs in nVidia’s architecture only appear at this point, whereas there is an SFU in each MSPU in ATI’s architecture. So, for every ATI SIMD you get 16 SFUs, whereas each nVidia SM only has two.

(centre)”nVidia combines 10 TPCs to create the shader power of GT200”(/centre)

Moving out a further step, things look even more different. nVidia combines three SMs and adds texturing capabilities (eight texture units per cluster) to create a Texture/Processing Cluster (TPC), then combines 10 of these. ATI, on the other hand, skips straight to adding 10 SIMD cores together. The result is that RV770 ends up with 640 SPUs, 160 SFUs (remember these can also do everything the SPUs can), and 40 texture units, whereas GT200 has 240 SPUs, 60 SFUs, and 80 texture units. In comparison, R600 and RV670 had four SIMDs, making for a total of 256 SPUs, 64 SFUs (four SIMDs of 16 SFUs each), and 16 texture units.
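To make the arithmetic behind those totals explicit, here is a minimal Python sketch that rebuilds them from the per-block figures quoted in this article. The function and parameter names are our own, and the "four SPUs per unit", "eight SPs per SM" and "four texture units per SIMD" values are simply what the quoted totals imply when divided back down, not figures taken from a vendor datasheet.

```python
def rv770_totals(simds=10, units_per_simd=16, spus_per_unit=4,
                 sfus_per_unit=1, tex_per_simd=4):
    """Unit counts for an ATI-style chip built directly from SIMD cores."""
    units = simds * units_per_simd  # each unit carries one SFU
    return {
        "spus": units * spus_per_unit,
        "sfus": units * sfus_per_unit,
        "texture_units": simds * tex_per_simd,
    }


def gt200_totals(tpcs=10, sms_per_tpc=3, sps_per_sm=8,
                 sfus_per_sm=2, tex_per_tpc=8):
    """Unit counts for an nVidia-style chip built from TPCs of SMs."""
    sms = tpcs * sms_per_tpc
    return {
        "spus": sms * sps_per_sm,
        "sfus": sms * sfus_per_sm,
        "texture_units": tpcs * tex_per_tpc,
    }


if __name__ == "__main__":
    print("RV770:", rv770_totals())
    print("GT200:", gt200_totals())
```

Changing a single argument, such as the number of SIMDs or TPCs, shows how each vendor scales the same building block up or down.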

(centre)”ATI uses 10 SIMDs to create the shader power of RV770”(/centre)

Now it only takes a brief glance at those numbers to realise that somewhere along the way nVidia and ATI’s tactics have produced massively differing results, as ATI doesn’t even pretend the 800 (640 + 160) SPUs in RV770 can be compared like-for-like with the 240 on GT200. We already mentioned that ATI’s SPUs can only work on one thread at a time, so in many ways the 800 total SPUs could be considered to have only the same processing power as 160 SPs, which does more closely reflect average real-world performance figures.

However, it’s not quite as simple as that. If software is written in such a way that it benefits from the extra processors in ATI’s architecture then it will run considerably faster; if it isn’t, it will be slower than on nVidia’s cards. The only real conclusion we can draw at the moment is that nVidia’s ”simpler” approach will likely give more consistent performance in the short term. If software developers begin to embrace more complicated routines, then ATI’s hardware could pull away in the long term.
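The trade-off above can be put into rough numbers with a toy model. This is only a sketch under loud assumptions: it treats RV770 as 160 five-wide units (800 / 5), lets software fill between one and five lanes of each unit per clock, and ignores clock speeds, texturing and memory entirely. The names are hypothetical and the model is not a performance prediction.

```python
RV770_UNITS = 160      # 800 SPUs grouped into five-wide units (assumption: 800 / 5)
LANES_PER_UNIT = 5     # four SPUs plus one SFU per unit
GT200_SPS = 240        # scalar SPs, assumed always usable in this toy model


def rv770_effective_lanes(avg_lanes_filled):
    """Lanes doing useful work per clock for a given average fill level."""
    if not 1 <= avg_lanes_filled <= LANES_PER_UNIT:
        raise ValueError("average fill must be between 1 and 5 lanes")
    return RV770_UNITS * avg_lanes_filled


if __name__ == "__main__":
    for filled in (1, 1.5, 3, 5):
        lanes = rv770_effective_lanes(filled)
        verdict = "ahead of" if lanes > GT200_SPS else "level with or behind"
        print(f"{filled} lanes filled: {lanes:g} busy lanes, {verdict} GT200's {GT200_SPS}")
```

In this simplified view, code that keeps only one lane per unit busy leaves RV770 with the 160-SP equivalent mentioned earlier, while code that fills every lane uses all 800.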

The same is true when you consider RV770 vs GT200 for GPGPU applications. While RV770 theoretically has more compute power, at 1.2 TeraFLOPs compared to GT200’s 933 GigaFLOPs, it requires programs to be written in such a way that they take full advantage of it. Time, and future benchmarks, will tell which turns out to be the ”best” method.
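For reference, those headline figures can be reproduced with the usual peak-throughput formula: ALUs × floating-point operations per ALU per clock × clock speed. The clock speeds and per-clock operation counts below are not stated in this article; they are assumptions based on the reference Radeon HD 4870 (750MHz core, one multiply-add counted as two operations) and GeForce GTX 280 (1,296MHz shader clock, a multiply-add plus a multiply counted as three operations).

```python
def peak_gflops(alus, flops_per_clock, clock_mhz):
    """Theoretical peak in GigaFLOPs; assumes every ALU is busy every clock."""
    return alus * flops_per_clock * clock_mhz / 1000.0


# Assumed reference-card figures (not taken from the article text).
rv770_peak = peak_gflops(alus=800, flops_per_clock=2, clock_mhz=750)
gt200_peak = peak_gflops(alus=240, flops_per_clock=3, clock_mhz=1296)

if __name__ == "__main__":
    print(f"RV770: {rv770_peak:.0f} GigaFLOPs")
    print(f"GT200: {gt200_peak:.0f} GigaFLOPs")
```

Both numbers are ceilings rather than measurements, which is why how well software keeps the units fed matters more than the raw totals.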
