In my last blog post, I demonstrated my ray tracer for the BBC Micro:Bit. In that post, I talked a little about what a ray tracer is, but didn’t discuss how you actually implement one. If you don’t know what a ray tracer is, make sure you read that post first.

This post discusses a ray tracer in detail, talking you through the computation step-by-step. It’s not the most advanced ray tracer - in fact it’s about as simple as they come. It fires one ray per pixel, and doesn’t support reflection or refraction. However, you should be able to adapt it to your needs once you understand the maths.

I deliberately haven’t included any code samples. Instead, I explain in words, maths, and pictures. That probably seems scary, but trust me when I say that ray tracing just isn’t that complicated. If I included code samples, you would probably just copy them and not really understand what was going on.

By the end of this post, you should have an intuitive understanding of how ray tracers work. You should be able to write the code yourself, and more advanced readers will be able to extend their implementation with extra features.

Step 1: Define your Scene

Ray tracers simulate light rays to render a 3D scene. Before we can start doing that, we need to define the scene that we want to render.

In our simple implementation, the scene contains one camera and many triangles. We only use triangles in our scene because they are the simplest shape to render. We can convert more complex shapes into triangles in a process called triangulation

Each triangle t is a set of 3 coordinates {t1,t2,t3}.

One triangle is displayed

Each coordinate is a 3D coordinate {x,y,z}.

The X axis runs left-right. The Y axis runs up-down. The Z axis runs in-out. The 3 coordinates of the triangle's corners are {-1, 0, 2}, {0, -1, 3}, and {1, 0, 2}

We add four triangles in a pyramid shape and the camera. The camera’s position is also a 3D coordinate. It dictates where we are looking in the scene.

A camera is on the right-hand side of the screen, pointing at the triangle. More triangles have been added to create a square-based pyramid

The camera is special because it has both a position and a direction. This allows us to turn the camera and look around the scene. The camera’s rotation is given as θy (yaw) and θp (pitch). Yaw handles the left-right rotation, while the up-down rotation is determined by the pitch. Together, these let the camera look in any direction.

The yaw and pitch are annotated on an arrow showing the direction the camera is facing. Yaw is rotation about the Y axis, while Pitch is rotation about the X axis.

Step 2: Calculate the Triangle Planes

Triangles are flat and two-dimensional. We can imagine a triangle as a small section of an infinite plane.

The plane is shown as an infinite grid that the triangle is sitting on

We describe a plane as ˉp={a,b,c,k}. Any values of {x,y,z} that satisfy aX+bY+cZ+k=0 are on the plane, and vice-versa.

The a, b, and c components of the plane decide its orientation. We can calculate them by finding the angle of the plane’s normal line. The normal line is the line that is orthogonal (at a right angle) to the plane.

Looking at the plane from the side, we see the normal line coming out of the triangle at a right angle

To get the angle of the normal, we will use the cross-product of two vectors. The cross-product is an algebraic function which, given two vectors, outputs a new vector which is orthogonal to both.

This is a visualisation of the cross product

To find the plane’s normal line we need two vectors on the plane that cross. We can pick any two sides of the triangle.

Another image of the triangle from before

I will use the two lines crossing at t1. We can calculate the angle of those two lines by looking at the difference in coordinates from one end to the other:

ˉa=t2t1ˉb=t3t1

For anyone that’s new to vectors, subtracting them works like this:

ˉaˉb={ˉaxˉbxˉayˉbyˉazˉbz}

The bottom side of the triangle is {2, 0, 0}, and the left side is {1, -1, 1}

Now that we have our two lines ˉa and ˉb, we can calculate the normal line as ˉa×ˉb. This is ×, the cross product operator.

I’m not new to vectors, but I certainly can’t remember how to do a cross product. For anyone in the same camp, here’s how it works:

ˉa×ˉb={ˉayˉbzˉazˉbyˉazˉbxˉaxˉbzˉaxˉbyˉayˉbx}

If you want a more in-depth explanation of how it works, try this ‘Math is Fun’ page. I’m not ashamed to say that I spent a long time on that site. It may be for children, but it sure is helpful!

The normal line comes out of the bottom-right corner with an angle of {0, -2, -2}

Using the cross product, we now have the angle of the normal line ˉn={0,2,2}. The values of ˉn map to the {a,b,c} components of the plane.

To calculate the final component, k, we need to go back to the plane formula:

aX+bY+cZ+k=0

We can substitute in our values from ˉn:

ˉnxˉsx+ˉnyˉsy+ˉnzˉsz+k=0 ˉnˉs+k=0 k=ˉnˉs

Here, ˉs is simply any point on the plane. The last two lines are using the dot product, ˉaˉb for conciseness. It’s defined as:

ˉaˉb={ˉaxˉbxˉayˉbyˉazˉbz}

Now we know that k=ˉnˉs, we can calculate it by substituting the value of our normal line ˉn and any point on the plane ˉs. Pick your favourite corner of the triangle and use that.

The same normal line as before is shown on the plane

In our case, this becomes:

k=(01+20+22)k=22k=4

Meaning the overall vector for the plane is:

ˉp={ˉnxˉnyˉnzˉnˉc}={0224}

Referring back to the original formula for the plane, we can substitute in our values to get 0x2y2z+4=0 or put simply y+z=2. That’s it, we have calculated the formula for the plane!

Step 3: Calculate the Ray Lines

It’s time to start simulating the rays. The vector representation of a line, including both position and angle, is:

λˉm+ˉs

ˉs is the origin point of the ray. In our case, it is always the same as the camera coordinates. m is the angle of the ray. It specifies how the x, y, and z coordinates change as we move along the line. The λ parameter specifies how far along the line we are.

The line starts at S and progresses by M each time lambda increases

You were probably taught a similar formula for 2D graphs in school, y=mx+c. That is equivalent to saying:

{xy}=λ{1m}+ˉc

In other words, for each 1 that x increases, y increases by m. You can move along the line using λ. It is simple to see how this gets extended to work in 3D.

Finding the origin of the ray, ˉs, is easy. All rays start at the camera position ˉc, so we just use that.

Calculating ˉm, the line’s angle, is more complex. This is the ratio of how each coordinate changes as we move along the line.

If ˉm was {1,2,3} then for each increase in x, we would see twice the increase in y and three times the increase in z. Since ˉm is just a ratio, it doesn’t matter whether we say {1,2,3} or {100,200,300}.

We can find ˉm by looking at how the coordinates change along a known portion of the line. We will look at the portion of the line from the camera to the view plane. The view plane is a grid that sits 1m in front of the camera. Each cell in the grid is one pixel that we want to paint on the screen.

A 5x5 grid sits in front of the camera

For now, we will just consider the simplest case. The camera has no rotation, and is just looking straight ahead (in the z direction). We are calculating the angle of the center ray, which goes through the centermost pixel in the grid.

In that case, the change in coordinates between the camera’s position and the center of the view plane is simply ˉm={0,0,1}. In this example, we will treat the view plane as having a fixed width of 0.5m. Since the screen is 5×5 pixels, each pixel has a width and height of 0.1m.

For every pixel right of the center, we must add 0.1m to ˉmx. That means that for the pixel {px,py} (2,2) being the center, ˉm is simply:

ˉm={0.1(px2)0.1(py2)1}

An arrow goes from the camera's origin to the bottom-right pixel of the view plane

However, this ignores the direction of the camera. Thankfully, rotating vectors is quite simple. We can simply rotate ˉm to match the rotation of the camera.

To rotate a vector about the y axis (yaw), we do:

roty(ˉv,θ)={ˉvxcosθ+ˉvzsinθˉvyˉvxsinθ+ˉvzcosθ}

And about the x axis (pitch), we do:

rotx(ˉv,θ)={ˉvxˉvycosθˉvzsinθˉvysinθˉvzcosθ}

Therefore, given yaw θy and pitch θp:

ˉmθpθy=rotx(roty({0.1(px2)0.1(py2)1},θy),θp)

Now that we know both ˉm and ˉs, we can calculate the coordinates of any point along the ray’s path.

Step 4: Intersect Rays with Planes

We’re finally ready to simulate our light rays. We know where our light rays start, where they’re going, and everything they might hit. All that’s left to do is to actually simulate the path and find out what each ray hits.

More precisely, we need to find the intersection point of each pairwise ray/plane combo. To be even more precise, for each pairwise combo of ray ˉr=λˉm+ˉs and plane ˉp={a,b,c,k}, what is the intersection point ˉi?

The ray intersects with the plane to the right of the triangle

Obviously, ˉi must be on both ˉr and ˉp. We can substitute ˉi into the formula for ˉr to get:

ˉi=λˉm+ˉsˉi={λˉmx+ˉsxλˉmy+ˉsyλˉmz+ˉsz}

And we can do the same for ˉp to get:

aˉix+bˉiy+cˉiz+k=0

Then we substitute in the values of ˉi from above:

a(λˉmx+ˉsx)+b(λˉmy+ˉsy)+c(λˉmz+ˉsz)+k=0

Which can then be rearranged to solve for λ:

aλˉmx+aˉsx+bλˉmy+bˉsy+cλˉmz+cˉsz+k=0 aλˉmx+bλˉmy+cλˉmz=aˉsxbˉsycˉszk λ(aˉmx+bˉmy+cˉmz)=aˉsxbˉsycˉszk λ=aˉsx+bˉsy+cˉsz+kaˉmx+bˉmy+cˉmz λ={a,b,c}ˉs+k{a,b,c}ˉm

Now that we know λ, we can substitute it back into the definition above to calculate the coordinates of ˉi.

Step 5: Filter Intersection Points

Not all intersection points are actually valid. All we know about ˉi is that it is on the same plane as a triangle. However, we only care about the ones that are actually inside the triangle.

Before performing any other checks, note that λ0. If your λ value is negative, you can stop early. That plane cannot get hit by the ray, as the intersection point is behind the camera.

Then, we can do a quick sanity check, making sure that ˉi is inside the bounding box of the triangle. If ˉix<minx(t1,t2,t3) then ˉi cannot be inside the triangle as it is too far left. Repeat this for each combination of min or max and x, y, or z.

The intersection point is inside the bounding box

We do that first because it’s much faster than the real check and we can often exit early. If ˉi is inside the bounding box, we need to do a full check, which works like this:

The outline of a triangle is showing, with the intersection point shown to the right

t has 3 sides. We know that ˉi is inside the triangle if:

  1. ˉi and t1 are on the same side of t2t3
  2. ˉi and t2 are on the same side of t1t3
  3. ˉi and t3 are on the same side of t1t2

Intuitively, that makes sense.

One corner of the triangle and the intersection point are shown on opposite sides of a side of the triangle

To check whether ˉi and t1 are on the same side of the line t2t3, we do:

ˉv=t2t3ˉa=ˉv×(ˉit3)ˉb=ˉv×(t1t3)c=ˉaˉbc>0

This uses both the cross product and the dot product. The cross product has a special property that ˉa×ˉb is the same as ˉb×ˉa but in the opposite direction.

This goes further, and says that the direction of ˉa depends on which side of ˉv that ˉi is on. If ˉi and t1 are on the same side of ˉv then the angle between them will be <90. If they are on different sides, the angle will be >90.

When the dot product is positive, the angle between the two vectors is <90, meaning they must have been on the same side.

The A and B vectors are shown in almost opposite directions

if c0 then ˉi and t1 are on the same side. We repeat that for each of the three sides of the triangle. Any intersection points that are not in the triangle can be discarded.

Step 6: Rasterise the Rays

Each ray can only intersect with one triangle, so we need to find the triangle that gets hit first. To calculate the distance that the ray travelled, we can use Pythagoras’ theorem in 3D:

d=(ˉixˉsx)2+(ˉiyˉsy)2+(ˉizˉsz)2

For our simple implementation, we will set each pixel’s brightness to 1d, meaning pixels that are closer to the camera are brighter. Any rays that don’t intersect result in a black pixel.

Each pixel of the view plane is coloured based on whether they hit the pyramid

Conclusion

That’s it! You now know everything you need to, and can write your own ray tracer. If you want to know more about 3D rendering, make sure you have a look at these:

If you are still desperate for some example code, you can look at the code from my last post. Either way, you really should give it a try yourself first.

Once you’ve finished your basic ray tracer, here’s some extra features you could add, from easiest to hardest:

  • Colour each triangle a different colour
  • Send multiple rays per pixel and average the result
  • Change the shape of the view plane to a dome
  • Optimise your code, can it run at 60fps?
  • Allow some triangles to behave like mirrors
  • Add transparent materials and simulate refraction
  • Add light sources to your scene, and render a shadow
  • Add a black hole to the scene and simulate gravitational lensing

If you do write your own ray tracer, please show me on twitter (especially if you implement gravitational lensing, you absolute madlad)