Vertex and Pixel Shader Primer

This is intended for those artists and students who have heard the terms Pixel and Vertex Shader, but aren’t really sure what they do or how they work.

Now this is very high level, and doesn’t get into the specifics too deeply, just because that’s an abyss that definitely looks into you as you look into it.

The idea is to just get the concepts across, rather than go too deep.

So let’s get started.

A vertex shader and/or a pixel shader (sometimes called a fragment shader) is a very small, self-contained program that runs on the actual graphics card – using the graphics card’s GPU and not the main device’s CPU – and that manipulates vertices and then, at a lower level, the actual pixels being rendered.

Complicated, eh? Yeah, that doesn’t really explain it very well. So, assuming you have _some_ degree of understanding of how 3D models get transformed: what you are doing in the vertex shader is the 3D math that takes a 3D model’s vertices and applies all the complex math of moving it in space, rotating it, and doing other things, like animation, to each vertex of the model.

So, for example, if you have a 3D model of a cube, it has eight vertices, one for each corner.

On the graphics card, it’s going to run the vertex shader 8 times – once for each vertex. You have to feed the vertex program stuff like the transformation matrix and the 3D-to-2D projection matrix, and then the vertex shader itself does the work for you – all on the graphics card, and not taking up precious main CPU resources.
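
To make that concrete, here is a minimal sketch of what such a vertex shader might look like in GLSL – the names (a_position, u_modelView, u_projection) are purely illustrative, not anything OpenGL mandates:

    #version 330 core

    // Per-vertex input: the model's vertex position in object space.
    layout(location = 0) in vec3 a_position;

    // Fed in from the main program for each render call:
    uniform mat4 u_modelView;   // moves and rotates the model into camera space
    uniform mat4 u_projection;  // the 3D-to-2D projection matrix

    void main()
    {
        // All the "complex 3D math": transform this one vertex into clip space.
        gl_Position = u_projection * u_modelView * vec4(a_position, 1.0);
    }

For the cube above, this little program runs eight times, once per corner.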

Once each vertex has been transformed into a screen coordinate set, the graphics card now knows exactly which pixels on screen are being drawn. It knows that it’s drawing a flat triangle between points x1,y1, x2,y2 and x3,y3. So from that it does what is called a raster scan – i.e. it walks each screen line the triangle covers, usually from left to right, visiting every pixel inside that triangle.

And here’s where it gets clever – for each pixel it’s rendering, it calls the pixel shader. At the vertex level, you are doing 3D math on vertices that make up a model. At the pixel shader level you are doing color math on individual pixels.

So, the GPU does the vertex shaders first, to transform all the 3D vertices into final 2D screen positions – discarding polygons that are back-facing once the 3D model has been transformed – and then it does the pixel shaders, one for each pixel it’s rendering on this specific render call.

As an aside, shaders are generally written in a small C-like language (GLSL in OpenGL, HLSL in Direct3D), and they do have some helpful functions built in you can call – like dot product, cross product, sin, cos, the arc functions and other helpful math features. But generally, you write the code to do the math yourself.
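
As a hedged example, here is a small GLSL pixel shader that leans on a few of those built-ins (normalize, dot, max) to do a classic one-light diffuse calculation – the variable and uniform names are just placeholders:

    #version 330 core

    // Interpolated inputs from the vertex shader:
    in vec3 v_normal;        // surface normal at this pixel
    in vec3 v_worldPos;      // world-space position of this pixel

    // Variables fed in from the main program:
    uniform vec3 u_lightPos;     // one light's position
    uniform vec3 u_baseColor;    // the material's base color

    out vec4 fragColor;

    void main()
    {
        // normalize(), dot() and max() come built in; the lighting math is yours.
        vec3  n       = normalize(v_normal);
        vec3  toLight = normalize(u_lightPos - v_worldPos);
        float diffuse = max(dot(n, toLight), 0.0);   // classic Lambert term
        fragColor = vec4(u_baseColor * diffuse, 1.0);
    }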

Generally you package vertex and pixel shaders up together – in OpenGL, for example, you feed OpenGL the ASCII source code for each shader type, let the OpenGL driver actually compile it to machine language the GPU will understand, then you tie the two together inside the OpenGL driver, so that when you say “Please render this object” and feed it the vertex array and texture coordinates, the GPU knows “I should use this vertex shader and this pixel shader to attempt to draw this thing”.
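
In rough C, that hand-over to the OpenGL driver might look something like this – a sketch only, with all error checking omitted, and vsSource / fsSource assumed to already hold the ASCII shader text:

    #include <GL/gl.h>   /* or via your extension loader of choice, e.g. glad or GLEW */

    /* Hand the driver the ASCII source for each shader type, let it compile
     * both for the GPU, then tie them together into one program object. */
    GLuint buildProgram(const char *vsSource, const char *fsSource)
    {
        GLuint vs = glCreateShader(GL_VERTEX_SHADER);
        glShaderSource(vs, 1, &vsSource, NULL);
        glCompileShader(vs);                      /* driver compiles it for the GPU */

        GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
        glShaderSource(fs, 1, &fsSource, NULL);
        glCompileShader(fs);

        GLuint program = glCreateProgram();       /* the vertex + pixel shader pairing */
        glAttachShader(program, vs);
        glAttachShader(program, fs);
        glLinkProgram(program);

        /* Once linked, the individual shader objects are no longer needed. */
        glDeleteShader(vs);
        glDeleteShader(fs);

        return program;                           /* select it later with glUseProgram() */
    }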

So, basically, for each object you want to render – each combination of texture, vertices, and stuff-you-want-to-do-to-move-the-object – you have to specify to OpenGL which vertex / pixel shader combination you want it to use on the graphics card to actually render that set of data. You can have lots of these small programs loaded into the graphics card at once; at run time you are just picking which set you want to use for any given render call.

Now, there are some other interesting aspects. For example, you can actually pass in variables to these little graphics card programs – you can set up variable types in the programs themselves, and then, at render time, for each rendered model, send the programs different sets of values. Kind of like how you can use command line arguments to send data to a command line program.
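
In OpenGL those per-render variables are called uniforms. A sketch of what sending them from the C side might look like, reusing the illustrative uniform names from the earlier snippets:

    /* Assumes the same OpenGL headers/loader as the earlier sketch.
     * Pick the shader pair for this model, then hand it this model's own
     * variables ("uniforms") before drawing.  The matrices are assumed to
     * be 16 floats in column-major order. */
    void drawModelWithUniforms(GLuint program,
                               const float *modelView,
                               const float *projection,
                               float lightX, float lightY, float lightZ)
    {
        glUseProgram(program);   /* "use this vertex shader and this pixel shader" */

        glUniformMatrix4fv(glGetUniformLocation(program, "u_modelView"),
                           1, GL_FALSE, modelView);
        glUniformMatrix4fv(glGetUniformLocation(program, "u_projection"),
                           1, GL_FALSE, projection);
        glUniform3f(glGetUniformLocation(program, "u_lightPos"),
                    lightX, lightY, lightZ);

        /* ...then issue the draw call, e.g. glDrawArrays() or glDrawElements(). */
    }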

So, you could send in a different set of lights for each model being rendered; in fact, most systems already do this. Why? To explain that, we need to go into what pixel shaders can do in a little more detail.

A pixel shader defines the final color value of each pixel being rendered for the current model being displayed. So the pixel shader is called once for every pixel rendered in that model.

Now these things aren’t free. The GPU is now doing logic processing for every pixel rendered. The more logic you want to do – for example, the more lights you want falling on the pixel being rendered – the more time each pixel takes to render. And the GPU only has so much time to go around. Oh, it has a hell of a lot of time; GPUs are VERY quick – but the more pixels you want to render, and the higher the resolution you want to render at, the more time it takes. You can only paint the screen so many times before you run out of GPU time, because each pixel is taking a while to process.

It’s for this reason that lots of systems have different types of lights affecting different parts of the scene – you might have expensive pixel shaders for character models, and far less expensive ones for the environment. Even then, the characters – when being rendered – will often only take the 4 or 6 closest lights as program variables, because a) doing more than 6 lights is cost-prohibitive in GPU time, and b) there’s a limit to how much data you can actually send a pixel shader from the main program per call.
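
To show where that per-pixel cost comes from, here is a hedged sketch of a pixel shader that takes a fixed budget of four lights as program variables and loops over them – every line of that loop runs for every single pixel the model covers:

    #version 330 core
    #define NUM_LIGHTS 4                    // the "4 closest lights" budget

    in vec3 v_normal;
    in vec3 v_worldPos;

    uniform vec3 u_lightPos[NUM_LIGHTS];    // sent per model, per render call
    uniform vec3 u_lightColor[NUM_LIGHTS];
    uniform vec3 u_baseColor;

    out vec4 fragColor;

    void main()
    {
        vec3 n     = normalize(v_normal);
        vec3 total = vec3(0.0);

        // This loop runs in full for every single pixel the model covers,
        // which is exactly why adding more lights gets expensive so quickly.
        for (int i = 0; i < NUM_LIGHTS; ++i)
        {
            vec3  toLight = normalize(u_lightPos[i] - v_worldPos);
            float diffuse = max(dot(n, toLight), 0.0);
            total += u_lightColor[i] * diffuse;
        }

        fragColor = vec4(u_baseColor * total, 1.0);
    }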

Interestingly, the pixel shader is not responsible for certain hardware-type operations – alpha blending, for example, is still handled at the purely hardware level. You do not, in a pixel shader, sample the screen and then mix that into the resulting pixel color value to make alpha blending work. It’s still purely hardware that does that.
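
For what it’s worth, in OpenGL that blending is just fixed hardware state you switch on from the main program – no shader code involved. A minimal sketch:

    /* Alpha blending is hardware state switched on from the main program,
     * not something written inside the pixel shader: the hardware mixes the
     * shader's output color with whatever is already in the framebuffer. */
    void enableAlphaBlending(void)
    {
        glEnable(GL_BLEND);
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    }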

The pixel shader can see textures, though – the simplest pixel shader just samples the texture passed in with the render call, at the appropriate spot for the pixel being rendered (a spot fed into the pixel shader by the hardware in the graphics card), and then outputs that sample as the shader’s result, which is the value passed on to the hardware to be written to the screen.
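
In GLSL, that simplest-possible pixel shader really is only a few lines – roughly this, where v_texCoord is the spot in the texture handed over by the hardware for the current pixel (the names are just illustrative):

    #version 330 core

    in vec2 v_texCoord;            // the spot in the texture for this pixel,
                                   // interpolated and handed over by the hardware
    uniform sampler2D u_texture;   // the texture passed in with the render call
    out vec4 fragColor;

    void main()
    {
        // Sample the texture at this pixel's spot and output it unchanged;
        // that value is what the hardware writes to the screen.
        fragColor = texture(u_texture, v_texCoord);
    }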

Another interesting fact is that pixel shaders – by and large, and certainly in OpenGL – cannot see the screen. They can only write to the screen, not read from it.

So how do you do clever effects like screen ripples or heat haze, where you are not rendering new data to the screen, just affecting what is already there?

Well, in OpenGL, the solution is to never actually render to the screen in the first place. You render to a texture instead; that screen texture can then be passed in as a parameter to a pixel shader, so the screen is effectively writing to itself. Then, as a very last step, the entire screen texture is actually rendered to the real screen, so you can see the results.
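
A rough sketch of that setup in C, using an OpenGL framebuffer object – the details (sizes, filtering, a depth attachment) will vary, so treat this as the shape of the thing rather than a definitive recipe:

    /* Assumes the same OpenGL headers/loader as the earlier sketches.
     * Create an empty "screen" texture plus a framebuffer object aimed at it,
     * so ordinary rendering can be pointed at the texture instead of the screen. */
    GLuint createScreenTexture(int width, int height, GLuint *fboOut)
    {
        GLuint screenTex, fbo;

        glGenTextures(1, &screenTex);
        glBindTexture(GL_TEXTURE_2D, screenTex);
        /* The final NULL means "no image data" - OpenGL just reserves the space. */
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, NULL);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, screenTex, 0);

        /* While fbo is bound, everything drawn lands in screenTex; bind
         * framebuffer 0 again to render to the real screen. */
        *fboOut = fbo;
        return screenTex;
    }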

So to do a small ripple effect, you’d start off by asking OpenGL for a new texture, but in a special way, so OpenGL knows you aren’t going to be feeding it texture data – it just creates the space for that texture internally, without trying to copy a loaded texture into it. Every time you render something, you point OpenGL at this texture as the destination rather than the screen, and render normally, just as you would if you were rendering to the screen. Then, when you are done, you render another quad over where you want the ripple to happen, pass in the screen texture as the source, and have the pixel shader do some cleverness about where it reads from that original texture – thereby writing over the screen texture with itself as the source.
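
And the “cleverness about where it’s reading from” can be as little as nudging the texture coordinate before the lookup. A hedged sketch of such a ripple pixel shader, where u_screenTex is the screen texture just described and u_time is a running clock passed in as a program variable:

    #version 330 core

    in vec2 v_texCoord;             // where this pixel of the quad sits on the screen texture (0..1)
    uniform sampler2D u_screenTex;  // the texture the whole scene was just rendered into
    uniform float u_time;           // a running clock, sent in as a program variable

    out vec4 fragColor;

    void main()
    {
        // Instead of reading the screen texture at this pixel's own spot,
        // read it slightly to one side, with the offset waving over time.
        vec2 offset = vec2(sin(v_texCoord.y * 40.0 + u_time) * 0.01, 0.0);
        fragColor = texture(u_screenTex, v_texCoord + offset);
    }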

There are many clever things you can do with shaders – mix multiple textures together, pass in lights so you can calculate how light or dark a particular pixel should be – even generate color values on the fly without using textures at all.
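
Mixing two textures, for instance, is barely more code than sampling one – a sketch, with the blend amount passed in as a program variable (all the names here are illustrative):

    #version 330 core

    in vec2 v_texCoord;
    uniform sampler2D u_texA;
    uniform sampler2D u_texB;
    uniform float u_blend;        // 0.0 = all of A, 1.0 = all of B

    out vec4 fragColor;

    void main()
    {
        // Sample both textures at this pixel and blend them with the built-in mix().
        fragColor = mix(texture(u_texA, v_texCoord),
                        texture(u_texB, v_texCoord),
                        u_blend);
    }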

The math can often get quite intense – lots of stuff has to be worked out per pixel in the shader, and quite often what you have to work from in the first place is not a lot, so everything has to be regenerated per pixel. But the results can often be very impressive – and at no cost to the main CPU, which can then go off and do other, more important things instead.

Hope this is helpful.
