Dynamite Combiners

The next phase of hardware came, not from 3Dfx, but from a new company, NVIDIA. While 3Dfx's Voodoo II was much more popular than NVIDIA's product, the NVIDIA Riva TNT (released in 1998) was more interesting in terms of what it brought to the table for programmers.

Like other graphics cards of the day, the TNT hardware had no vertex processing. Vertex data arrived in clip space, as normal, so the CPU had to do all of the transformation and lighting. Where the TNT shone was in its fragment processing. The power of the TNT is in its name; TNT stands for TwiN Texel. It could access two textures at once. And while the Voodoo II could do that as well, the TNT had much more flexibility in its fragment processing pipeline.

In order to accommodate two textures, the vertex input was expanded. Two textures meant two texture coordinates, since each texture coordinate was directly bound to a particular texture. And while they were doubling things, NVIDIA also allowed for two per-vertex colors. The idea here has to do with lighting equations.

For regular diffuse lighting, the CPU-computed color would simply be dot(N, L), possibly with attenuation applied. Indeed, it could be any complicated diffuse lighting function, since it was all on the CPU. This diffuse light intensity would be multiplied by the texture, which represented the diffuse absorption of the surface at that point.

This becomes less useful if you want to add a specular term. The specular absorption and diffuse absorption are not necessarily the same, after all. And while you may not need to have a specular texture, you do not want to add the specular component to the diffuse component before you multiply by their respective colors. You want to do the addition afterwards.

This is simply not possible if you have only one per-vertex color. But it becomes possible if you have two. One color is the diffuse lighting value. The other color is the specular component. We multiply the first color by the diffuse color from the texture, then add the second color as the specular reflectance.
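As a sketch of the arithmetic this two-color scheme enables, consider a single channel. The function name and clamping behavior here are illustrative, not actual hardware code:

```c
/* Per-channel combination made possible by two per-vertex colors:
   the interpolated diffuse light scales the texture's diffuse
   absorption, and the interpolated specular light is added after
   that multiply, not before. Values are floats in [0, 1]. */
float combine_channel(float diffuse_light, float tex_diffuse,
                      float specular_light)
{
    float c = diffuse_light * tex_diffuse + specular_light;
    return c > 1.0f ? 1.0f : c; /* clamp to the framebuffer's range */
}
```

With only one per-vertex color, the addition would have to happen before the multiply, incorrectly tinting the specular term by the diffuse texture.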

Which brings us nicely to fragment processing. The TNT's fragment processor had 5 inputs: 2 colors sampled from textures, 2 colors interpolated from vertices, and a single constant color. The last of these, in modern parlance, is the equivalent of a shader uniform value.

That's a lot of potential inputs. The solution NVIDIA came up with to produce a final color was a bit of fixed functionality that NVIDIA called texture combiners. It was directly analogous to the OpenGL 1.1 fixed-function pipeline texture environment concept, but with extensions for multiple textures and some TNT-specific features.

The idea is that each texture has an environment. The environment is a specific math function, such as addition, subtraction, multiplication, or linear interpolation. The standard GL fixed-function pipeline only allowed the environment functions to use as parameters the per-vertex color, the color sampled from that particular texture, and a constant color. For multiple textures, the environments are executed in sequence: the environment function for texture 0 executes, then the one for texture 1. The texture 1 environment uses the output of texture 0's environment in place of the per-vertex color.
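The chaining described above can be modeled as a pair of function applications. This is a simplified single-channel sketch using a multiply-style environment (in GL terms, GL_MODULATE); the function names are illustrative, not driver code:

```c
/* One fixed-function texture environment: modulate the "previous"
   color by the sampled texel (one channel shown). */
float env_modulate(float previous, float texel)
{
    return previous * texel;
}

/* Two texture units chained in sequence: unit 0 takes the
   per-vertex color as its "previous" input, while unit 1 takes
   unit 0's output instead. */
float run_two_units(float vertex_color, float texel0, float texel1)
{
    float stage0 = env_modulate(vertex_color, texel0);
    float stage1 = env_modulate(stage0, texel1);
    return stage1;
}
```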

NVIDIA's texture combiners augmented this significantly. The standard environment functions were very limited in terms of operations. For example, the previous color could be multiplied by or added to the texture color, but it could not simply ignore the texture color and multiply by the constant color instead. NVIDIA's texture combiners could do this.

If you look at it from a hardware perspective, what texture combiners provide is a two-opcode assembly language. The available registers for the language are two vertex colors, a single uniform color, the current opcode's texture color, and a zero register. There is also a single temporary register to hold the output from the first opcode.
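A toy model of that register file and its two opcode slots might look like the following. The register names and opcode set are illustrative, not NVIDIA's actual interface, and only one color channel is shown:

```c
/* Toy model of the TNT combiner "assembly language": two vertex
   colors, a constant (uniform) color, the current opcode's texture
   color, a zero register, and a temporary written by opcode 0. */
enum reg { REG_COL0, REG_COL1, REG_CONST, REG_TEXTURE,
           REG_ZERO, REG_TEMP, REG_COUNT };
enum op  { OP_MUL, OP_ADD };

float eval_op(enum op op, float a, float b)
{
    return op == OP_MUL ? a * b : a + b;
}

/* Execute the two opcodes in sequence. Opcode 0 writes REG_TEMP,
   which opcode 1 may then read; each opcode sees the texture color
   sampled by its own texture unit. */
float run_combiners(float regs[REG_COUNT],
                    enum op op0, enum reg a0, enum reg b0, float tex0,
                    enum op op1, enum reg a1, enum reg b1, float tex1)
{
    regs[REG_TEXTURE] = tex0;
    regs[REG_TEMP] = eval_op(op0, regs[a0], regs[b0]);
    regs[REG_TEXTURE] = tex1;
    return eval_op(op1, regs[a1], regs[b1]);
}
```

Setting opcode 0 to multiply the first vertex color by the texture, and opcode 1 to add the second vertex color, reproduces the diffuse-times-texture-plus-specular setup described earlier.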

The combiner language was powerful, but quite limited; two opcodes really were not enough. Graphics programmers, by this point, had also gotten used to multipass-based algorithms. After all, until the TNT, that was the only way to apply multiple textures to a single surface. And even with the TNT, there was a pretty confining limit of two textures and two opcodes.

The TNT cards also provided something else: 32-bit framebuffers and depth buffers. While the Voodoo cards used high-precision math internally, they still wrote to 16-bit framebuffers, using a technique called dithering to make them look like higher precision. But dithering was nothing compared to actual high-precision framebuffers. And it did nothing for the artifacts of a 16-bit depth buffer.

While the original TNT could do 32-bit rendering, it lacked the memory and overall performance to really show it off. That had to wait for the TNT2. Thanks to 3Dfx's product delays and some poor strategic moves on their part, NVIDIA became one of the dominant players in the consumer PC graphics card market. And that position was cemented by their next card, which had real power behind it.
