Fermi, then, is the underlying architecture on which the chips inside the GTX 480, GTX 470 and future nVidia cards are built. It shares some of the basic elements of the last few generations of nVidia designs, but thanks to the demands of DirectX 11 quite a few elements have been rethought.
Starting with what is the same, the basic building block of Fermi is still the CUDA Core, or Stream Processor as it used to be known. This little processor is the basic number-crunching unit that does the donkey work of calculating all those pretty graphics in your games, or churning through data for other GPU-accelerated tasks like video encoding and ray tracing. However, moving up a level, while things still look vaguely similar, they are fundamentally different.
In G80 and GT200 – the chips that powered the 8800 GTX and GTX 280, respectively – these Cores were clustered into groups of eight called Streaming Multiprocessors (SMs). Above this sat the Texture/Processor Cluster (TPC), which added texture units and more memory to the proceedings. With Fermi, though, an SM includes 32 Cores and four texture units, as well as something called the PolyMorph Engine, which requires us to go back to basics to explain.
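The shift from eight to 32 Cores per SM is easiest to see with a little arithmetic. As a rough sketch, the snippet below multiplies out the published SM counts for each full chip (figures not given in the text above, so treat them as our own assumption; note also that the shipping GTX 480 has one of GF100's 16 SMs disabled, leaving 480 Cores):

```python
# Per-chip CUDA Core totals, derived from SM count x Cores per SM.
# SM counts are the published figures for each full chip (our assumption).
chips = {
    "G80 (8800 GTX)":     {"sms": 16, "cores_per_sm": 8},
    "GT200 (GTX 280)":    {"sms": 30, "cores_per_sm": 8},
    "GF100 (full Fermi)": {"sms": 16, "cores_per_sm": 32},
}

for name, c in chips.items():
    # Total Cores = SMs on the die x Cores packed into each SM
    print(f"{name}: {c['sms'] * c['cores_per_sm']} CUDA Cores")
```

The point of the exercise: Fermi gets its huge core count not from many more SMs, but from packing four times as many Cores into each one.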
Contrary to what you might think, a graphics card doesn't do everything when it comes to rendering a 3D scene on your computer. The CPU actually sets up the wireframe model onto which all the fancy effects you see are then plastered. However, because the CPU is doing so many other things at the same time – AI, physics, animation – these wireframes have to be kept quite simple to maintain a decent level of performance. This is why, despite all the advances in graphics we've seen in recent years, you still get game characters with pointy heads, and corrugated iron that, when you get up close, you realise is completely flat – it's just too computationally intensive to construct all the triangles required to accurately represent the complex surfaces of a realistic world.
The solution (in part) is to hand some of the work of creating the basic geometry of a scene over to the GPU. This is done using two techniques called tessellation and displacement mapping, which make their debut for DirectX-based games with the new DirectX 11 API.
Tessellation works by subdividing the triangles of a basic wireframe model into many smaller ones, filling in the gaps between its vertices and creating a much smoother, more realistic surface. It doesn't add new detail in itself; it just gets the model to a stage where it has a more natural look.
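To make the idea concrete, here's a minimal sketch of uniform tessellation: each pass splits every triangle into four by inserting a new vertex at the midpoint of each edge. (Real hardware tessellators use programmable per-edge tessellation factors; this fixed one-to-four split is just the simplest possible illustration.)

```python
def midpoint(a, b):
    """Vertex halfway between two 3D points."""
    return tuple((a[i] + b[i]) / 2 for i in range(3))

def tessellate(triangles):
    """One subdivision pass: each triangle in becomes four triangles out."""
    out = []
    for v0, v1, v2 in triangles:
        m01, m12, m20 = midpoint(v0, v1), midpoint(v1, v2), midpoint(v2, v0)
        # Three corner triangles plus the middle one
        out += [(v0, m01, m20), (v1, m12, m01), (v2, m20, m12), (m01, m12, m20)]
    return out

# Start from a single triangle and subdivide three times: 1 -> 4 -> 16 -> 64
mesh = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
for _ in range(3):
    mesh = tessellate(mesh)
print(len(mesh))  # 64
```

The triangle count grows geometrically with each pass, which is exactly why this work is farmed out to dedicated hardware rather than done on the CPU.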
Meanwhile, a displacement map is a texture (a 2D image) that encodes height information; when applied to a model, it is used to offset the positions of the model's vertices. This adds all the little details that really bring a model to life. The result is far more realistic 3D models that inherently require less of the usual graphical trickery to make them look life-like. And because the core geometry itself is changed, other effects such as shadows are greatly improved too, as they follow the accurate outline of the complex model rather than the basic one – i.e. you don't get pointy shadows.
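Again, a rough sketch helps. Below, a flat grid of vertices (one per texel, so sampling is trivial) is displaced along its surface normal – straight up, for a flat plane – by the height sampled from a tiny made-up heightmap. The heightmap values and the scale factor are illustrative assumptions, not anything from a real game asset:

```python
# A 3x3 "height texture" describing a bump in the middle (made-up values)
heightmap = [
    [0.0, 0.2, 0.0],
    [0.2, 1.0, 0.2],
    [0.0, 0.2, 0.0],
]

def displace_grid(heights, scale=0.5):
    """Build a flat z=0 grid with one vertex per texel, then push each
    vertex along the +z normal by the sampled height times a scale factor."""
    verts = []
    for y, row in enumerate(heights):
        for x, h in enumerate(row):
            verts.append((float(x), float(y), h * scale))
    return verts

verts = displace_grid(heightmap)
print(verts[4])  # centre vertex is raised: (1.0, 1.0, 0.5)
print(verts[0])  # corner stays flat:      (0.0, 0.0, 0.0)
```

In a real pipeline this runs after tessellation, so there are enough vertices for the heightmap's detail to show up in the silhouette – which is precisely why the two techniques are paired.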