
Speculative Threading and Ray Tracing

Speculative Parallel Threading

One of the most exciting of these developments is called Speculative Parallel Threading (SPT). It aims to accelerate single-threaded applications - i.e. nearly all current software - by splitting calculations up and performing them on different cores.

Normally, to take advantage of multiple cores, a program has to be specially engineered to do so, which is not easy and, for a large number of applications, isn't practical. It's so difficult for a lot of programs because one calculation often depends on the result of another. Let's look at a basic example.

x + y = z, a + b = c

These two calculations can be run simultaneously on different cores because all the elements of each are completely independent.
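As a rough illustration of the independent case, here is a minimal Python sketch that hands each sum to its own worker. The names add and run_independent are made up for this example, and in standard CPython the two threads take turns rather than truly running on separate cores, so treat it as a picture of the structure rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def add(p, q):
    """One self-contained calculation with no outside dependencies."""
    return p + q


def run_independent(x, y, a, b):
    """Hypothetical helper: compute z = x + y and c = a + b side by side."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        z_future = pool.submit(add, x, y)  # first calculation
        c_future = pool.submit(add, a, b)  # second calculation, needs nothing from the first
        return z_future.result(), c_future.result()


if __name__ == "__main__":
    print(run_independent(1, 2, 3, 4))
```

Neither task has to wait for the other, which is exactly why this kind of work splits across cores so easily.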

( x + y ) / z = u

This sort of calculation, on the other hand, is impossible to split across multiple cores using conventional methods because each portion of the equation is dependent on another; x + y must be calculated before dividing by z.

So, what SPT does is calculate the value of x + y on the first core, as usual (this is the main thread - see diagram below). At the same time, the value of x + y is speculated (guessed) and used to calculate the value of ( x + y ) / z on a second core (this is the speculative thread). Then, if the calculated value of x + y equals the guessed value, ( x + y ) / z is also correct, so the processor can move on to the next calculation, effectively skipping a calculation and speeding up the program as a whole. If the guessed value of x + y is incorrect, the speculative result is simply dropped and the calculation continues on the first core, so no performance hit is taken. Of course, this is a very simplified version of how things work, but it demonstrates the concept.
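The guess-then-check flow described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not how Intel's SPT is implemented: the function name speculative_divide and the guessed_sum parameter are invented for the example, the guess is passed in by hand rather than predicted, and Python threads stand in for the two cores.

```python
from concurrent.futures import ThreadPoolExecutor


def speculative_divide(x, y, z, guessed_sum):
    """Compute (x + y) / z while speculating on the value of x + y.

    Returns (result, guess_was_right). Hypothetical helper for illustration.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Main thread: the real x + y.
        main = pool.submit(lambda: x + y)
        # Speculative thread: runs ahead using the guessed value of x + y.
        speculative = pool.submit(lambda: guessed_sum / z)

        actual_sum = main.result()
        if actual_sum == guessed_sum:
            # Guess confirmed: the division has already been done.
            return speculative.result(), True
        # Guess wrong: drop the speculative result and carry on as normal.
        return actual_sum / z, False


if __name__ == "__main__":
    print(speculative_divide(6, 4, 2, guessed_sum=10))  # correct guess
    print(speculative_divide(6, 4, 2, guessed_sum=9))   # wrong guess
```

Either way the answer is the same; a correct guess just means the division was finished early.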

The red patches in each bar graph below show when the speculated value was incorrect so you can see it's quite an accurate method.

The demo we were shown of this had a quad-core CPU taking less than a third of the time of a single-core CPU to perform a given task, which is quite an impressive return. Of course, being just a demo, it's difficult to say how it will perform in everyday applications, but I was given assurances the technology would be just as effective in the real world. A question remains over how scalable this technique is because, as more and more threads are speculated, the likelihood of a guess being incorrect increases. However, for the near future, with quad- and eight-core CPUs, this is a very exciting prospect.
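To put rough numbers on that, here is a small Python sketch of the arithmetic. The speed-up and efficiency figures follow from the "a third of the time on four cores" claim; the scaling part rests on two assumptions of mine that were not part of the demo: that each guess is right with the same probability p, independently of the others, and the illustrative value p = 0.9.

```python
def speedup(single_core_time, multi_core_time):
    """How many times faster the multi-core run is."""
    return single_core_time / multi_core_time


def efficiency(single_core_time, multi_core_time, cores):
    """Speed-up per core: 1.0 would be perfect scaling."""
    return speedup(single_core_time, multi_core_time) / cores


def all_guesses_correct(p, speculative_threads):
    """Chance that every speculative thread guessed right (assumes independence)."""
    return p ** speculative_threads


if __name__ == "__main__":
    # A third of the time on four cores: 3x speed-up, 75 per cent efficiency.
    print(speedup(3.0, 1.0), efficiency(3.0, 1.0, 4))
    # Illustrative p = 0.9 with 3 and 7 speculative threads.
    print(all_guesses_correct(0.9, 3), all_guesses_correct(0.9, 7))
```

Under those assumptions the chance of a fully correct chain drops quickly as threads are added, which is the scalability worry in a nutshell.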

There is just one, rather large, problem though. Unfortunately this technique relies on transactional memory, a mechanism that lets a group of memory operations be committed or rolled back as a single unit. It is not yet available in hardware (the demo was performed on software that emulated it) and likely won't be for a while yet - if ever. So, for the moment, SPT is just a pipe dream.

Interactive Ray Tracing

Ever since the creation of computer generated 3D graphics, there have been two distinct methods for generating lighting/colouring for the models and environments of a scene. The one that you find in all computer games, called rasterisation, uses a poor approximation of the way real light works, which is why games look rubbish when you turn all the graphics extras off, but is simple enough to be calculated on the fly. The other method, called ray tracing, actually calculates the paths of the light rays that reach the eye, from all the light sources in a scene, as they travel through, bounce off and get absorbed by the objects and surfaces in a scene. This method makes even the most basic scene look very realistic but it is far more intensive to calculate and, until now, has had to be done offline (i.e. not in real-time). However, with the proliferation of multi-core CPUs it is now possible to calculate ray traced scenes in real-time.
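For a feel of what the calculation involves, here is a minimal Python sketch of a single ray hitting a single sphere. It is nothing like the Quake 4 renderer: the function names are invented, there is one sphere, one light and no bounces, and, as practical ray tracers usually do, it follows the ray backwards from the eye into the scene rather than forwards from the light.

```python
import math


def hit_sphere(origin, direction, centre, radius):
    """Distance along the ray to the nearest hit on the sphere, or None for a miss."""
    oc = [o - c for o, c in zip(origin, centre)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - 4.0 * a * c
    if discriminant < 0:
        return None  # the ray passes the sphere without touching it
    t = (-b - math.sqrt(discriminant)) / (2.0 * a)
    return t if t > 0 else None


def shade(origin, direction, centre, radius, to_light):
    """Brightness from 0 to 1 for one eye ray, using simple diffuse lighting."""
    t = hit_sphere(origin, direction, centre, radius)
    if t is None:
        return 0.0  # background
    point = [o + t * d for o, d in zip(origin, direction)]
    normal = [(p - c) / radius for p, c in zip(point, centre)]
    # Surfaces facing the light are bright, those facing away are dark.
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))


if __name__ == "__main__":
    eye = (0.0, 0.0, 0.0)
    sphere = ((0.0, 0.0, -3.0), 1.0)
    print(shade(eye, (0.0, 0.0, -1.0), sphere[0], sphere[1], (0.0, 0.0, 1.0)))
```

A full frame repeats this for every pixel and every object, plus extra rays for reflections and shadows, which is where the heavy cost comes from.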



Using a pair of quad-core CPUs, Intel was demonstrating the Quake 4 engine running at a healthy frame rate with ray tracing enabled. The really exciting thing, though, is that performance scales completely linearly with the number of cores, so the more cores you add, the better performance you get. So, when 8-, 16- and 32-core processors start to appear, ray tracing will truly be an option for us all.
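One plausible reason the scaling is so clean is that each pixel's ray can be worked out without reference to any other pixel, so the image divides neatly between cores. The Python sketch below shows that division by rows. It is my own illustration, not Intel's code: trace_pixel is a stand-in that just draws a disc, the tiny 8 x 8 image is arbitrary, and Python threads only mimic the structure rather than delivering a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 8  # tiny image, purely for illustration


def trace_pixel(x, y):
    """Stand-in for tracing one eye ray: 1 inside a disc, 0 outside."""
    dx = x - WIDTH / 2 + 0.5
    dy = y - HEIGHT / 2 + 0.5
    return 1 if dx * dx + dy * dy <= 9 else 0


def render_row(y):
    """One row of pixels; it needs nothing from any other row."""
    return [trace_pixel(x, y) for x in range(WIDTH)]


def render(workers):
    """Share the rows out between the given number of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, range(HEIGHT)))


if __name__ == "__main__":
    for row in render(workers=4):
        print("".join("#" if pixel else "." for pixel in row))
```

The picture comes out identical however many workers are used; only the way the work is shared changes.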
