GPU Zen: Advanced Rendering Tech
For modern games, it is important to utilize a rendering system that can handle increasingly complex mesh geometry and realistic surface materials. Forward rendering systems support high material diversity but either suffer from overdraw or require a depth pre-pass, which can be expensive for meshes with a high triangle count, GPU hardware tessellation, alpha-testing, or vertex-shader skinning. Deferred rendering systems manage to run efficiently without a depth pre-pass but support only a limited range of materials and therefore often require additional forward rendering for more diverse materials. Our practical approach to deferred texturing combines the strengths of both rendering systems by supporting a high diversity of materials while performing only a single geometry pass. We go one step further than traditional deferred rendering and completely decouple geometry from materials and lighting. In an initial geometry pass, all mesh instances that pass the GPU culling stage are rendered indirectly, and their vertex attributes are written in compressed form into a set of geometry buffers. No material-specific operations or texture fetches are performed (except for alpha-testing and certain kinds of GPU hardware tessellation techniques). A subsequent full-screen pass transfers a material ID from the geometry buffers into a 16-bit depth buffer. Finally, in the shading pass, a screen-space rectangle that encloses the boundaries of all visible meshes is rendered for each material. The depth of the rectangle vertices is set to a value that corresponds to the currently processed material ID, and early depth-stencil testing is used to reject pixels from other materials. All standard materials that use the same shader and resource binding layout are rendered in a single pass via dynamically indexed textures. At this point, material-specific rendering and lighting (e.g., tiled [Billeter et al. 13] or clustered [Olsson et al. 12]) are done simultaneously.
Figure 2 gives an overview of the rendering process.
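The material classification step described above can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, material ID 0 is assumed to mean "no geometry", the depth buffer is assumed to be 16-bit UNORM, and the depth test is assumed to use an EQUAL comparison. The text only states that early depth-stencil testing rejects pixels from other materials.

```python
from typing import List, Optional, Tuple

DEPTH_BITS = 16
MAX_DEPTH_CODE = (1 << DEPTH_BITS) - 1  # 65535, largest 16-bit UNORM code
NO_GEOMETRY = 0  # assumption: ID 0 marks pixels not covered by any mesh

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive


def material_id_to_depth(material_id: int) -> float:
    """Map a material ID to a normalized depth value in [0, 1]."""
    if not 0 <= material_id <= MAX_DEPTH_CODE:
        raise ValueError("material ID does not fit into a 16-bit depth buffer")
    return material_id / MAX_DEPTH_CODE


def quantize_depth(depth: float) -> int:
    """Emulate storing a normalized depth in a 16-bit UNORM depth buffer."""
    return round(depth * MAX_DEPTH_CODE)


def build_material_depth_buffer(material_ids: List[List[int]]) -> List[List[int]]:
    """Full-screen pass: transfer per-pixel material IDs into the depth buffer."""
    return [[quantize_depth(material_id_to_depth(m)) for m in row]
            for row in material_ids]


def bounding_rect(material_ids: List[List[int]]) -> Optional[Rect]:
    """Smallest screen-space rectangle enclosing all pixels covered by geometry."""
    covered = [(x, y)
               for y, row in enumerate(material_ids)
               for x, m in enumerate(row)
               if m != NO_GEOMETRY]
    if not covered:
        return None
    xs = [x for x, _ in covered]
    ys = [y for _, y in covered]
    return min(xs), min(ys), max(xs), max(ys)


def shading_pass(depth_buffer: List[List[int]], rect: Rect,
                 material_id: int) -> List[Tuple[int, int]]:
    """Return the pixels inside rect that pass an EQUAL depth test.

    The rectangle is 'rendered' at the depth that encodes material_id, so
    every pixel belonging to a different material is rejected.
    """
    rect_depth = quantize_depth(material_id_to_depth(material_id))
    x0, y0, x1, y1 = rect
    return [(x, y)
            for y in range(y0, y1 + 1)
            for x in range(x0, x1 + 1)
            if depth_buffer[y][x] == rect_depth]


if __name__ == "__main__":
    # Tiny 4x4 "screen": two materials (1 and 2) surrounded by empty pixels.
    ids = [[0, 0, 0, 0],
           [0, 1, 2, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0]]
    depth = build_material_depth_buffer(ids)
    rect = bounding_rect(ids)
    for material in (1, 2):
        print(material, shading_pass(depth, rect, material))
```

On a GPU the rejection would happen in fixed-function hardware before the pixel shader runs, which is what makes one rectangle per material affordable. The loop over pixels here only models which pixels would be shaded.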
Wolfgang is the CTO of The Forge Interactive. The Forge Interactive is a think-tank for advanced real-time graphics research and a service provider for the video game and movie industry. Over nearly the last 13 years, we have worked on many AAA IPs such as Tomb Raider, Battlefield 4, Murdered Soul Suspect, Star Citizen, Dirt 4, Vainglory, Transistor, Call of Duty Black Ops 3, Battlefield 1, Mafia 3, Call of Duty Warzone, Supergiant's Hades, and others. Wolfgang is the founder and editor of the ShaderX and GPU Pro book series, a Microsoft MVP, the author of several books and articles on real-time rendering, and a regular contributor to websites and the GDC. One of the books he edited, ShaderX4, won the Game Developer Front Line Award in 2006. He is on the advisory boards of several companies. He is an active contributor to several future standards that drive the game industry. You can find him on Twitter at
From there, the WoW team worked to productize the changes, porting the effect to multiple platforms and ensuring the technique could be brought to as many users as possible. While merging the shader and graphics code for the technique, Blizzard was able to harden some of their internal code, adding more robust support for rendering to MIP levels, which the higher settings of ASSAO require. They also had to handle shader-based limitations across the various supported graphics APIs. For example, DirectX Shader Model 4.0 and earlier have no support for Unordered Access Views and only limited support for the gather operation. Some buffer types used in the original ASSAO sample code were also not directly compatible with other shader languages such as Metal. WoW developers were able to work around these issues with various approaches, either making minor algorithm changes or limiting the technique on some platforms.