There are various techniques you can use to improve the performance of texture accesses.

Image Formats

The smaller the data, the faster it can be fetched into a shader. As with vertex formats, try to use the smallest texture format that you can get away with. And as before, what you can get away with tends to be defined by what you are trying to store in the texture.

Normals.  Textures containing normals can use GL_RGB10_A2_SNORM, which is the texture equivalent of the 10-bit signed normalized format we used for attribute normals. However, this can be made more precise if the normals are for a tangent-space normal map. Since tangent-space normals always have a positive Z coordinate, and since the normals are normalized, the Z value can be computed from the other two. So you only need to store two values; GL_RG16_SNORM is sufficient for these needs. To compute the third value, do this:

// Fetch the two stored components of the tangent-space normal.
vec2 norm2d = texture(tangentBumpTex, texCoord).xy;
// Reconstruct the positive Z component from X and Y.
vec3 tanSpaceNormal = vec3(norm2d, sqrt(1.0 - dot(norm2d, norm2d)));

Obviously this costs some performance, so it's a question of how much precision you actually need. On the plus side, using this method means that you will not have to normalize the tangent-space normal fetched from the texture.
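The same math can be checked outside of a shader. Below is a small Python sketch (the function name reconstruct_normal is made up for illustration) showing that Z can be rebuilt from the stored X and Y of a unit-length normal:

```python
import math

def reconstruct_normal(x, y):
    # Rebuild a tangent-space normal from its stored X and Y components,
    # assuming the original normal was unit-length with a positive Z.
    z_squared = 1.0 - (x * x + y * y)
    # The clamp guards against slight negative values from rounding.
    return (x, y, math.sqrt(max(z_squared, 0.0)))

# A unit normal tilted away from +Z: (0.6, 0.0, 0.8).
rebuilt = reconstruct_normal(0.6, 0.0)
print(rebuilt)
```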

The GL_RG16_SNORM format can be made even smaller with texture compression. The GL_COMPRESSED_SIGNED_RG_RGTC2 compressed texture format is a 2-channel signed normalized format. It takes up only 8 bits per pixel.

Floating-point Intensity.  There are two unorthodox floating-point texture formats, both of which have important uses. The GL_R11F_G11F_B10F format is potentially a good format to use for HDR render targets. As the name suggests, it takes up only 32 bits per pixel. The downside is a relative loss of precision compared to GL_RGB16F (as well as the complete loss of a destination alpha). The two formats can store values of approximately the same magnitude, but the smaller format loses some precision. This may or may not impact the overall visual quality of the scene; it should be fairly simple to test which is acceptable.
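To get a feel for the precision difference, one can compare mantissa sizes directly: the 11-bit floats in GL_R11F_G11F_B10F carry 6 mantissa bits (5 for the 10-bit blue channel), while the 16-bit halfs in GL_RGB16F carry 10. The Python sketch below is a simplification that truncates a 32-bit float's mantissa, ignoring exponent-range and rounding-mode differences:

```python
import struct

def truncate_mantissa(x, kept_bits):
    # Reinterpret x as a 32-bit float and keep only the top `kept_bits`
    # of its 23-bit mantissa, discarding the rest (truncation).
    u = struct.unpack('<I', struct.pack('<f', x))[0]
    u &= ~((1 << (23 - kept_bits)) - 1)
    return struct.unpack('<f', struct.pack('<I', u))[0]

value = 3.14159
print(truncate_mantissa(value, 6))   # roughly 11-bit-float precision
print(truncate_mantissa(value, 10))  # roughly half-float precision
```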

The GL_RGB9_E5 format is used for input floating-point textures. If you have a texture that represents light intensity in HDR situations, this format can be quite handy. The way it works is that each of the RGB values gets a 9-bit mantissa, but all three share a single 5-bit exponent. Because of how floating-point numbers work, this means the three values must be relatively close to one another in magnitude. They do not have to be that close; there is still some leeway. Values that are too small relative to the largest component become zero. This is oftentimes an acceptable tradeoff, depending on the particular magnitudes in question.

This format is useful for textures that are generated offline by tools. You cannot render to a texture in this format.

Colors.  Storing colors that are clamped to [0, 1] can be done with good precision with GL_RGBA8 or GL_SRGB8_ALPHA8, as needed. However, compressed texture formats are also available. The S3TC formats are good choices if the compression artifacts are not too noticeable. There are sRGB versions of the S3TC formats as well.

The differences among the various S3TC formats come down to how much alpha you need. The choices are as follows:


GL_COMPRESSED_RGB_S3TC_DXT1_EXT.  No alpha.

GL_COMPRESSED_RGBA_S3TC_DXT1_EXT.  Binary alpha: either zero or one for each texel. The RGB color for any texel with a zero alpha will also be zero.

GL_COMPRESSED_RGBA_S3TC_DXT3_EXT.  4 bits of alpha per pixel.

GL_COMPRESSED_RGBA_S3TC_DXT5_EXT.  Alpha is compressed in an S3TC-style block, much like RG texture compression.

If an image needs a varying alpha, the main choice is between DXT3 and DXT5. DXT5 has the potential for better results, but if the alpha does not compress well with the S3TC algorithm, the results can look worse than DXT3's.
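The memory savings are easy to quantify. S3TC works on 4x4 texel blocks: DXT1 spends 8 bytes per block (4 bits per texel), while DXT3 and DXT5 spend 16 bytes per block (8 bits per texel). A quick Python sketch (the helper name s3tc_size is made up for illustration):

```python
def s3tc_size(width, height, bytes_per_block):
    # S3TC compresses 4x4 blocks; a partial block still takes a full block.
    blocks_w = (width + 3) // 4
    blocks_h = (height + 3) // 4
    return blocks_w * blocks_h * bytes_per_block

# A 512x512 texture:
print(s3tc_size(512, 512, 8))    # DXT1: 131072 bytes
print(s3tc_size(512, 512, 16))   # DXT3/DXT5: 262144 bytes
print(512 * 512 * 4)             # uncompressed RGBA8: 1048576 bytes
```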

Use Mipmaps Often

Mipmapping improves performance when textures are mapped to regions that are larger in texel space than in window space; that is, when texture minification happens. Mipmapping helps because it improves the locality of texture accesses: at the appropriate mipmap level, neighboring fragments fetch texels that are near one another. Texture hardware is optimized for accessing regions of textures, so improving the locality of texture data will help performance.

How much this matters depends on how the texture is mapped to the surface. Static mapping with explicit texture coordinates, or with linear computation based on surface properties, can use mipmapping to improve locality of texture access. For more unusual mappings, or for pure lookup tables, mipmapping may not help locality at all.

Ultimately, mipmaps are more likely to help performance when the texture in question represents some characteristic of a surface, and is therefore mapped directly to that surface. So diffuse textures, normal maps, specular maps, and other surface characteristics are all very likely to gain some performance from using mipmaps. Projective lights are less likely to gain from this, as it depends on the geometry that they are projected onto.
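As a point of reference, the full mipmap chain adds only about one-third more texture memory, so the locality benefit comes fairly cheap. A Python sketch (the helper name mip_chain_texels is made up for illustration):

```python
def mip_chain_texels(width, height):
    # Sum the texel counts of every mipmap level, halving each
    # dimension (rounding down, clamped to 1) until we reach 1x1.
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            return total
        width = max(1, width // 2)
        height = max(1, height // 2)

base = 512 * 512
full_chain = mip_chain_texels(512, 512)
print(full_chain / base)  # ~1.333: the chain costs about 4/3 of the base level
```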
