I’m a POV-Ray fanatic, using it for everything from art to drawing graphs. Those years of using POV have given me a lot of skills with procedurally generated textures & shapes that I seldom get a chance to use in my day job. So, I figured I’d leverage those abilities into my hobby programming projects.
The first step, of course, is implementing Perlin noise.
I started by reading this splendid paper on “Simplex Noise”, Perlin’s faster version of his classic noise. But while reading it, it occurred to me that it might be possible to do Perlin noise using a single texture lookup, rather than the 8 samples it normally uses (for 3D noise), or the 4 samples for simplex.
Note that I haven’t done much research into efficient implementations of Perlin noise, so it’s quite possible I’ve rediscovered someone else’s technique. If someone’s seen this before please let me know in the comments, and I’ll give credit.
The heart of the Perlin noise algorithm is a grid, with a random gradient assigned to each grid point. These gradients are extrapolated to the position of the sample point, and blended together to create a smooth random noise value. For simplicity’s sake I’m only going to write the maths for one of the dimensions:
output = dot(grad0,uvw)*(1-f(u)) + dot(grad1,uvw-<1,0,0>)*f(u)
Where uvw is the position within the grid square. If we have unit grid spacing, and an input position xyz, then uvw = frac(xyz). grad0 and grad1 are the random gradient vectors for the grid points at u=0 and u=1. We extrapolate each gradient so that it has a value of 0 at its own grid point, by taking the dot product of the gradient with the position relative to that grid point… I’ve explained that badly, but it’s important to the trickery later on…
Now, look at the f(u) terms. To people who write shaders, this is obviously a lerp(). In fact, in 3D, it’s a trilinear interpolation, using <f(u),f(v),f(w)> to blend between the gradients of the surrounding 8 grid points.
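To make the one-dimensional formula concrete, here is a minimal Python sketch of the blend along u. The names fade, dot and blend_along_u are made up for illustration, and f is assumed to be the quintic fade curve that appears in the shader further down.

```python
def fade(t):
    # Quintic fade curve 6t^5 - 15t^4 + 10t^3 (assumed to be the f in the formula).
    return t * t * t * (t * (t * 6 - 15) + 10)


def dot(a, b):
    # Plain 3-component dot product.
    return sum(x * y for x, y in zip(a, b))


def blend_along_u(grad0, grad1, uvw):
    """Blend the two extrapolated gradients along the u axis only."""
    u, v, w = uvw
    from_grad0 = dot(grad0, (u, v, w))      # position relative to the u=0 grid point
    from_grad1 = dot(grad1, (u - 1, v, w))  # position relative to the u=1 grid point
    f = fade(u)
    return from_grad0 * (1 - f) + from_grad1 * f
```

With v = w = 0 the result is zero at both grid points, which is the property the rest of the derivation relies on.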
So, the question I asked myself was: “Can I make the texture hardware do that trilinear blend for me?” The problem is the uvw term inside the dots, inside the lerp. We have to pull it out if we’re going to get a value into a texture.
First, we need to know a useful property of dots and lerps – they’re distributive. i.e.:
dot(a+b,c) = dot(a,c) + dot(b,c)
lerp(a+b,c+d,e) = lerp(a,c,e) + lerp(b,d,e)
And, because a lerp is just a multiply and add, we can do this:
lerp(dot(a,b),dot(a,c),d) = dot(a,lerp(b,c,d))
I shan’t waste space here with the proofs; I’d guess that if you’re following all of this you can probably prove them for yourself.
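For anyone who would rather see numbers than a proof, the three identities can be spot-checked on random inputs. This is only a sanity check, not a proof, and the helper names (vadd, vlerp, check_identities) are invented for the sketch.

```python
import math
import random


def lerp(a, b, t):
    return a + (b - a) * t


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vadd(a, b):
    return tuple(x + y for x, y in zip(a, b))


def vlerp(a, b, t):
    # Component-wise lerp of two vectors.
    return tuple(lerp(x, y, t) for x, y in zip(a, b))


def check_identities(trials=100, seed=1):
    """Return True if all three identities hold on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (tuple(rng.uniform(-5, 5) for _ in range(3)) for _ in range(3))
        s, t, u, v = (rng.uniform(-5, 5) for _ in range(4))
        e = rng.random()
        checks = [
            # dot(a+b,c) = dot(a,c) + dot(b,c)
            (dot(vadd(a, b), c), dot(a, c) + dot(b, c)),
            # lerp(a+b,c+d,e) = lerp(a,c,e) + lerp(b,d,e), on scalars
            (lerp(s + t, u + v, e), lerp(s, u, e) + lerp(t, v, e)),
            # lerp(dot(a,b),dot(a,c),d) = dot(a,lerp(b,c,d))
            (lerp(dot(a, b), dot(a, c), e), dot(a, vlerp(b, c, e))),
        ]
        if not all(math.isclose(x, y, abs_tol=1e-9) for x, y in checks):
            return False
    return True
```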
So we can rearrange our formula like so, substituting xyz-p0 for uvw, where p0 is the position of the next lowest grid point, i.e. p0=floor(xyz), and p1=p0+<1,0,0> is its neighbour along u (so uvw-<1,0,0> becomes xyz-p1):
output = lerp( dot(grad0,xyz-p0), dot(grad1,xyz-p1), f(u) )
output = lerp( dot(grad0,xyz)-dot(grad0,p0), dot(grad1,xyz)-dot(grad1,p1), f(u) )
output = lerp( dot(grad0,xyz), dot(grad1,xyz), f(u) ) - lerp( dot(grad0,p0), dot(grad1,p1), f(u) )
output = dot( xyz, lerp( grad0, grad1, f(u) ) ) - lerp( dot(grad0,p0), dot(grad1,p1), f(u) )
And, of course, p0 & p1 are just the position of the grid points within the array, which we obviously know when we create the array. Meaning we’re only trying to lerp constants, which means we can put those constants in a texture!
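As a quick check of the rearrangement, the sketch below evaluates the first and last forms of the formula for one pair of grid points and confirms they agree. The function names original and rearranged are hypothetical, and f is again assumed to be the quintic fade curve.

```python
def fade(t):
    return t * t * t * (t * (t * 6 - 15) + 10)


def lerp(a, b, t):
    return a + (b - a) * t


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def original(grad0, grad1, p0, xyz):
    # lerp( dot(grad0,xyz-p0), dot(grad1,xyz-p1), f(u) )
    p1 = (p0[0] + 1, p0[1], p0[2])
    f = fade(xyz[0] - p0[0])
    return lerp(dot(grad0, sub(xyz, p0)), dot(grad1, sub(xyz, p1)), f)


def rearranged(grad0, grad1, p0, xyz):
    # dot( xyz, lerp(grad0,grad1,f(u)) ) - lerp( dot(grad0,p0), dot(grad1,p1), f(u) )
    p1 = (p0[0] + 1, p0[1], p0[2])
    f = fade(xyz[0] - p0[0])
    blended_grad = tuple(lerp(a, b, f) for a, b in zip(grad0, grad1))
    blended_const = lerp(dot(grad0, p0), dot(grad1, p1), f)  # depends only on the grid
    return dot(xyz, blended_grad) - blended_const
```

Both blended_grad and blended_const are interpolations of per-grid-point constants, which is what makes them candidates for a texture.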
Create the texture
Gradient value – Perlin suggests picking from 12 vectors pointing at the edges of a cube. i.e. <-1,1,0>, <1,0,1>, etc. The integer values mean our dot(grad,p) values won’t need a floating point storage format. (Perlin originally suggested these values to save on maths operations, since the dot product just becomes adds and subtracts. On modern vector processors that doesn’t really matter.)
Gradient dot position – I’m using an RGBA8 signed normalized texture, with this value stored in the alpha channel, so my texture can’t be bigger than 64 wide: dot(<1,1,0>,<63,63,63>) = 126, so the values span [-126,126], which just fits within the [-127,127] range of a signed 8-bit channel.
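The texture contents described above could be generated along these lines. This is a sketch under assumptions: build_texture and fits_signed_byte are invented names, the gradients are picked with Python's random module rather than any particular hash, and the texels are kept as plain integers in a dict instead of being uploaded anywhere.

```python
import random

# The 12 gradients pointing at the edges of a cube.
GRADIENTS = [(1, 1, 0), (-1, 1, 0), (1, -1, 0), (-1, -1, 0),
             (1, 0, 1), (-1, 0, 1), (1, 0, -1), (-1, 0, -1),
             (0, 1, 1), (0, -1, 1), (0, 1, -1), (0, -1, -1)]


def build_texture(size=64, seed=0):
    """Return {(x, y, z): (gx, gy, gz, dot(grad, position))} for a size^3 grid."""
    rng = random.Random(seed)
    texels = {}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                gx, gy, gz = rng.choice(GRADIENTS)
                alpha = gx * x + gy * y + gz * z  # dot(grad, p), always an integer
                texels[(x, y, z)] = (gx, gy, gz, alpha)
    return texels


def fits_signed_byte(texels):
    # Every channel must be representable in a signed normalized 8-bit format.
    return all(-127 <= c <= 127 for texel in texels.values() for c in texel)
```

Note that a real signed normalized texture hands the shader these integers divided by 127, so the sample would presumably need scaling back up; the sketch sidesteps that by keeping the raw integers.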
The Fast Perlin Noise Shader
Here it is, the heart of the algorithm. It’s quite simple:
uvw = frac(xyz);
p0 = floor(xyz);
f = 6*(uvw^5) - 15*(uvw^4) + 10*(uvw^3);
p = p0 + f;
sample = readTexture( p );
return dot(xyz,sample.xyz) - sample.w;
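The shader can be emulated on the CPU to see that the single lookup matches a conventional 8-corner evaluation. The sketch below is not shader code: read_texture is a hand-written stand-in for a trilinear fetch that addresses texels by integer coordinates (ignoring the half-texel offset and coordinate normalisation real hardware uses) and clamps at the edge, and all function names are assumptions.

```python
import math
import random

GRADIENTS = [(1, 1, 0), (-1, 1, 0), (1, -1, 0), (-1, -1, 0),
             (1, 0, 1), (-1, 0, 1), (1, 0, -1), (-1, 0, -1),
             (0, 1, 1), (0, -1, 1), (0, 1, -1), (0, -1, -1)]


def fade(t):
    return t * t * t * (t * (t * 6 - 15) + 10)


def build_texture(size, seed=0):
    # Each texel holds (gradient.xyz, dot(gradient, texel position)).
    rng = random.Random(seed)
    texels = {}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                g = rng.choice(GRADIENTS)
                texels[(x, y, z)] = g + (g[0] * x + g[1] * y + g[2] * z,)
    return texels


def corner_weights(t):
    """Yield ((dx, dy, dz), weight) for the 8 corners of a trilinear blend."""
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                wx = t[0] if dx else 1 - t[0]
                wy = t[1] if dy else 1 - t[1]
                wz = t[2] if dz else 1 - t[2]
                yield (dx, dy, dz), wx * wy * wz


def read_texture(texels, size, p):
    # Emulated trilinear fetch with clamp-to-edge addressing.
    base = [math.floor(c) for c in p]
    t = [c - b for c, b in zip(p, base)]
    out = [0.0] * 4
    for offset, weight in corner_weights(t):
        key = tuple(min(b + d, size - 1) for b, d in zip(base, offset))
        for i, channel in enumerate(texels[key]):
            out[i] += weight * channel
    return out


def noise_single_lookup(texels, size, xyz):
    # Mirrors the shader: one fetch at p0 + f, then a dot and a subtract.
    p0 = [math.floor(c) for c in xyz]
    f = [fade(c - b) for c, b in zip(xyz, p0)]
    s = read_texture(texels, size, [b + t for b, t in zip(p0, f)])
    return xyz[0] * s[0] + xyz[1] * s[1] + xyz[2] * s[2] - s[3]


def noise_conventional(texels, size, xyz):
    # Reference: blend dot(grad, xyz - corner) over the 8 surrounding corners.
    p0 = [math.floor(c) for c in xyz]
    f = [fade(c - b) for c, b in zip(xyz, p0)]
    total = 0.0
    for offset, weight in corner_weights(f):
        corner = tuple(b + d for b, d in zip(p0, offset))
        g = texels[corner]
        total += weight * sum(g[i] * (xyz[i] - corner[i]) for i in range(3))
    return total
```

Both functions expect xyz inside [0, size-1) on every axis; wrapping is a separate concern.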
Tiling the pattern
There’s still one big problem with this optimised trickery: when the texture wraps it will jump from dot(grad63,p63) to dot(grad0,p0), and the gradients will break.
I scratched my head for a while, but couldn’t come up with any simpler solution than just manually tiling, by making the last value the same as the first, on every row, column, etc. Then make sure you wrap the input values at 63.
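One way to read the manual tiling is sketched below. It assumes that "the same as the first" applies to the gradient only, while the stored dot(grad,p) is still computed from the last texel's own position; the text does not spell that out, so treat it as an interpretation. A small period of 7 stands in for 63 to keep the example quick, and every name here is hypothetical.

```python
import math
import random

GRADIENTS = [(1, 1, 0), (-1, 1, 0), (1, -1, 0), (-1, -1, 0),
             (1, 0, 1), (-1, 0, 1), (1, 0, -1), (-1, 0, -1),
             (0, 1, 1), (0, -1, 1), (0, 1, -1), (0, -1, -1)]


def fade(t):
    return t * t * t * (t * (t * 6 - 15) + 10)


def build_tiling_texture(period, seed=0):
    """Texture of width period+1 whose last texel repeats the first gradient."""
    rng = random.Random(seed)
    grads = {(x, y, z): rng.choice(GRADIENTS)
             for x in range(period) for y in range(period) for z in range(period)}
    size = period + 1
    texels = {}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                g = grads[(x % period, y % period, z % period)]
                # Assumption: alpha uses the texel's real position, not the wrapped one.
                texels[(x, y, z)] = g + (g[0] * x + g[1] * y + g[2] * z,)
    return texels, size


def read_texture(texels, size, p):
    # Emulated trilinear fetch with clamp-to-edge addressing.
    base = [math.floor(c) for c in p]
    t = [c - b for c, b in zip(p, base)]
    out = [0.0] * 4
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((t[0] if dx else 1 - t[0]) *
                          (t[1] if dy else 1 - t[1]) *
                          (t[2] if dz else 1 - t[2]))
                key = tuple(min(b + d, size - 1)
                            for b, d in zip(base, (dx, dy, dz)))
                for i, channel in enumerate(texels[key]):
                    out[i] += weight * channel
    return out


def tiled_noise(texels, size, xyz):
    period = size - 1
    xyz = [c % period for c in xyz]  # wrap the input at the period (63 in the post)
    p0 = [math.floor(c) for c in xyz]
    f = [fade(c - b) for c, b in zip(xyz, p0)]
    s = read_texture(texels, size, [b + t for b, t in zip(p0, f)])
    return xyz[0] * s[0] + xyz[1] * s[1] + xyz[2] * s[2] - s[3]
```

Under that reading the noise stays continuous across the seam, since the cell just before the wrap blends towards the same gradient that the first cell starts from.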
I also wrote a conventional Perlin implementation, so I could do a comparison. Other than minor precision issues (probably caused by the texture interpolators) there was no difference.
So… yeah. Perlin noise is now fast enough that I can start building a world with it… Wooh!