Monday 23rd July 2007

Dissecting DirectX 10

Posted at: Monday 23rd July 2007 by Stuart Andrews

Stuart Andrews takes a journey through a DirectX 10 3D graphics pipeline, and explains how GPU architecture has changed since the DirectX 9 days.

DirectX 10 architecture is confusing. You aren't just sending instructions through a pipeline, but also circulating them through a complex interconnected system. Vertex data that has passed through the vertex portion of the setup engine, the dispatch processor and the ALUs will end up back at the setup engine, ready to be rasterized and sent through the pixel portion of the setup engine, then on to the dispatch processor and the ALUs once more. However, this heavily threaded, unified approach offers a huge advantage.

In a DirectX 9 GPU, everything ran smoothly, providing the mix of pixel and vertex work matched the one for which the architects had designed the GPU. If a game running on an R580 was performing about six times as much work with the pixel engines as it was with the vertex engines, then 100 per cent of the shader units were being used and everything was running at full speed. If, however, the pixel engines were performing ten times the work of the vertex engines, the latter would be doing nothing, while the former became clogged up. Worse, if the vertex engines had the same workload as the pixel engines, they would be overloaded, while the pixel engines were starved of data.

This doesn't happen with a unified architecture. Providing the command processor, setup engine and dispatch processor (or the setup engine and global/local schedulers in the G80) work properly, all the units should be busy all the time. In the words of ATi's Richard Huddy, 'a piece of hardware that used to occasionally dip well below 50 per cent efficiency because it was just one part of the pipeline, and even that wasn't used 100 per cent efficiently' has become one 'where it's hard to see any part of the chip ever spending available work cycles not doing anything.'
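The difference can be put into rough numbers. The following is a minimal sketch, not a model of any real chip: it assumes 48 pixel units and 8 vertex units (chosen to match the 'about six times' ratio quoted for the R580), assumes every unit does the same amount of work per cycle, and assumes a unified design schedules perfectly. The function names are made up for illustration.

```python
# Assumed unit counts: 48 pixel and 8 vertex units, matching the article's
# "about six times" ratio for the R580. Treat them as illustrative only.
PIXEL_UNITS = 48
VERTEX_UNITS = 8
TOTAL_UNITS = PIXEL_UNITS + VERTEX_UNITS


def fixed_utilisation(pixel_work, vertex_work):
    """Busy fraction of all units when each pool can only take its own work."""
    # The frame is finished only when the slower of the two pools is done.
    elapsed = max(pixel_work / PIXEL_UNITS, vertex_work / VERTEX_UNITS)
    return (pixel_work + vertex_work) / (elapsed * TOTAL_UNITS)


def unified_utilisation(pixel_work, vertex_work):
    """Same measure when any unit can take any work (idealised scheduling)."""
    elapsed = (pixel_work + vertex_work) / TOTAL_UNITS
    return (pixel_work + vertex_work) / (elapsed * TOTAL_UNITS)


# Pixel:vertex workload ratios taken from the three cases described above.
for pixel, vertex in [(6, 1), (10, 1), (1, 1)]:
    print(f"{pixel}:{vertex}  "
          f"fixed={fixed_utilisation(pixel, vertex):.0%}  "
          f"unified={unified_utilisation(pixel, vertex):.0%}")
```

Under these assumptions the fixed split is fully busy only at 6:1. It drops to roughly 94 per cent at 10:1, where the vertex units idle for part of each frame, and to roughly 29 per cent at 1:1, where the pixel units wait on the vertex units. The idealised unified pool stays at 100 per cent in every case, which is the best case Huddy describes rather than a guaranteed result.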

Got all that? Good. Now let's follow a typical 3D workload through the various departments of a DirectX 10 GPU.

The Front End

The first thing that graphics data exiting the CPU hits is the GPU's 'front end' - the block devoted to organising instructions and incoming data, and then streaming them to the rest of the GPU. In a DirectX 9 GPU such as the R580, the incoming vertex data pretty much went straight to the vertex shaders for processing; in the R600, however, it arrives at the command processor. This checks that the hardware is correctly configured for the task ahead - a job that has shifted from CPU to GPU for bandwidth and latency reasons - then batches incoming instructions into sensible chunks in order to minimise latency later on. Nvidia has employed equivalent logic in the G80. The batched data then hits a global setup engine, which looks at the incoming instructions and separates them into three queues: pixel, geometry and vertex work.
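As a rough illustration of the two front-end steps described above, batching and then sorting into three queues, here is a minimal sketch. Everything in it is hypothetical: the Command type, the batch and route functions and the batch size are invented for the example and do not correspond to any real driver or hardware interface.

```python
from collections import deque
from dataclasses import dataclass

# The three kinds of work the setup engine is described as separating.
KINDS = ("vertex", "geometry", "pixel")


@dataclass(frozen=True)
class Command:
    """Hypothetical stand-in for one unit of incoming work."""
    kind: str      # one of KINDS
    payload: str   # placeholder for the actual data


def batch(commands, size):
    """Command-processor stand-in: group the incoming stream into chunks."""
    for start in range(0, len(commands), size):
        yield commands[start:start + size]


def route(batches):
    """Setup-engine stand-in: sort batched commands into one queue per kind."""
    queues = {kind: deque() for kind in KINDS}
    for chunk in batches:
        for command in chunk:
            queues[command.kind].append(command)
    return queues


stream = [
    Command("vertex", "v0"), Command("pixel", "p0"),
    Command("vertex", "v1"), Command("geometry", "g0"),
    Command("pixel", "p1"),
]
queues = route(batch(stream, size=2))
for kind in KINDS:
    print(kind, [command.payload for command in queues[kind]])
```

Each queue keeps its commands in arrival order, so vertex work can be handed on first while pixel and geometry work waits its turn.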

Vertex Setup

For now, let's concentrate on the last of these. The command processor takes the vertex data coming from the CPU and passes it to the vertex assembler, ready for vertex shader operations in the ALUs. In the R600, it also passes through the tessellator beforehand. The vertex shaders then carry out their work and shunt the results off to one side, where instructions from the setup engine can use them for the next stage; alternatively, the shaders can write the results to the 'stream out' cache, so that other vertex or geometry shaders can use them. For example, by combining 'stream out' and a geometry shader, the vertex data can be used to generate a displacement map and create surface relief, or build fully 3D volumetric shadows.

Comments

It's stupid to make everybody in the world upgrade to Vista eventually, and it's even more annoying for us gamers and PC modders. I think it's all just a waste of time.

Comment by Maddwilz at 7:56pm 10th August 2007



WAGHHHHHH VISTA NO WORKEEEE

There's talk about Microsoft doing a turnaround and offering DX10 as an update, because nobody wants to spend a fortune on an upgrade, just like me, lol. I'm going back to my Amiga 1200 and Wipeout 2097, yasss!

Comment by GUMBANATOR at 7:23pm 5th August 2007



A Vista Work Around

If you want all the DX10 benefits in Company of Heroes without getting a Vista machine, here is what you do. Take out half your RAM, underclock your CPU to about 75% of its speed and then replace your graphics card with an X1300 or similar. That should nicely replicate the prolapsed frame rate and compromised graphics settings enjoyed by DX10 Company of Heroes players (unless they have beta drivers).

Comment by Grotmonkey at 9:46pm 31st July 2007



Do I really need Vista for DX10?

I REALLY don't want to buy Vista just to play a DX10 game. Anyone know any way around it?

Comment by clipkilla at 6:04pm 30th July 2007


