In my last blog post, I demonstrated my ray tracer for the BBC Micro:Bit. In that post, I talked a little about what a ray tracer is, but didn’t discuss how you actually implement one. If you don’t know what a ray tracer is, make sure you read that post first.
This post discusses a ray tracer in detail, talking you through the computation step-by-step. It’s not the most advanced ray tracer - in fact it’s about as simple as they come. It fires one ray per pixel, and doesn’t support reflection or refraction. However, you should be able to adapt it to your needs once you understand the maths.
I deliberately haven’t included any code samples. Instead, I explain in words, maths, and pictures. That probably seems scary, but trust me when I say that ray tracing just isn’t that complicated. If I included code samples, you would probably just copy them and not really understand what was going on.
By the end of this post, you should have an intuitive understanding of how ray tracers work. You should be able to write the code yourself, and more advanced readers will be able to extend their implementation with extra features.
Step 1: Define your Scene
Ray tracers simulate light rays to render a 3D scene. Before we can start doing that, we need to define the scene that we want to render.
In our simple implementation, the scene contains one camera and many triangles. We only use triangles in our scene because they are the simplest shape to render. We can convert more complex shapes into triangles in a process called triangulation
Each triangle t is a set of 3 coordinates {t1,t2,t3}.
Each coordinate is a 3D coordinate {x,y,z}.
We add four triangles in a pyramid shape and the camera. The camera’s position is also a 3D coordinate. It dictates where we are looking in the scene.
The camera is special because it has both a position and a direction. This allows us to turn the camera and look around the scene. The camera’s rotation is given as θy (yaw) and θp (pitch). Yaw handles the left-right rotation, while the up-down rotation is determined by the pitch. Together, these let the camera look in any direction.
Step 2: Calculate the Triangle Planes
Triangles are flat and two-dimensional. We can imagine a triangle as a small section of an infinite plane.
We describe a plane as ˉp={a,b,c,k}. Any values of {x,y,z} that satisfy aX+bY+cZ+k=0 are on the plane, and vice-versa.
The a, b, and c components of the plane decide its orientation. We can calculate them by finding the angle of the plane’s normal line. The normal line is the line that is orthogonal (at a right angle) to the plane.
To get the angle of the normal, we will use the cross-product of two vectors. The cross-product is an algebraic function which, given two vectors, outputs a new vector which is orthogonal to both.
To find the plane’s normal line we need two vectors on the plane that cross. We can pick any two sides of the triangle.
I will use the two lines crossing at t1. We can calculate the angle of those two lines by looking at the difference in coordinates from one end to the other:
ˉa=t2−t1ˉb=t3−t1For anyone that’s new to vectors, subtracting them works like this:
ˉa−ˉb={ˉax−ˉbxˉay−ˉbyˉaz−ˉbz}Now that we have our two lines ˉa and ˉb, we can calculate the normal line as ˉa×ˉb. This is ×, the cross product operator.
I’m not new to vectors, but I certainly can’t remember how to do a cross product. For anyone in the same camp, here’s how it works:
ˉa×ˉb={ˉay⋅ˉbz−ˉaz⋅ˉbyˉaz⋅ˉbx−ˉax⋅ˉbzˉax⋅ˉby−ˉay⋅ˉbx}If you want a more in-depth explanation of how it works, try this ‘Math is Fun’ page. I’m not ashamed to say that I spent a long time on that site. It may be for children, but it sure is helpful!
Using the cross product, we now have the angle of the normal line ˉn={0,−2,−2}. The values of ˉn map to the {a,b,c} components of the plane.
To calculate the final component, k, we need to go back to the plane formula:
a⋅X+b⋅Y+c⋅Z+k=0We can substitute in our values from ˉn:
ˉnx⋅ˉsx+ˉny⋅ˉsy+ˉnz⋅ˉsz+k=0 ˉn⋅ˉs+k=0 k=−ˉn⋅ˉsHere, ˉs is simply any point on the plane. The last two lines are using the dot product, ˉa⋅ˉb for conciseness. It’s defined as:
ˉa⋅ˉb={ˉax⋅ˉbxˉay⋅ˉbyˉaz⋅ˉbz}Now we know that k=−ˉn⋅ˉs, we can calculate it by substituting the value of our normal line ˉn and any point on the plane ˉs. Pick your favourite corner of the triangle and use that.
In our case, this becomes:
k=−(0⋅−1+−2⋅0+−2⋅2)k=2⋅2k=4Meaning the overall vector for the plane is:
ˉp={ˉnxˉnyˉnz−ˉn⋅ˉc}={0−2−24}Referring back to the original formula for the plane, we can substitute in our values to get 0x−2y−2z+4=0 or put simply y+z=−2. That’s it, we have calculated the formula for the plane!
Step 3: Calculate the Ray Lines
It’s time to start simulating the rays. The vector representation of a line, including both position and angle, is:
λˉm+ˉsˉs is the origin point of the ray. In our case, it is always the same as the camera coordinates. m is the angle of the ray. It specifies how the x, y, and z coordinates change as we move along the line. The λ parameter specifies how far along the line we are.
You were probably taught a similar formula for 2D graphs in school, y=mx+c. That is equivalent to saying:
{xy}=λ⋅{1m}+ˉcIn other words, for each 1 that x increases, y increases by m. You can move along the line using λ. It is simple to see how this gets extended to work in 3D.
Finding the origin of the ray, ˉs, is easy. All rays start at the camera position ˉc, so we just use that.
Calculating ˉm, the line’s angle, is more complex. This is the ratio of how each coordinate changes as we move along the line.
If ˉm was {1,2,3} then for each increase in x, we would see twice the increase in y and three times the increase in z. Since ˉm is just a ratio, it doesn’t matter whether we say {1,2,3} or {100,200,300}.
We can find ˉm by looking at how the coordinates change along a known portion of the line. We will look at the portion of the line from the camera to the view plane. The view plane is a grid that sits 1m in front of the camera. Each cell in the grid is one pixel that we want to paint on the screen.
For now, we will just consider the simplest case. The camera has no rotation, and is just looking straight ahead (in the z direction). We are calculating the angle of the center ray, which goes through the centermost pixel in the grid.
In that case, the change in coordinates between the camera’s position and the center of the view plane is simply ˉm={0,0,1}. In this example, we will treat the view plane as having a fixed width of 0.5m. Since the screen is 5×5 pixels, each pixel has a width and height of 0.1m.
For every pixel right of the center, we must add 0.1m to ˉmx. That means that for the pixel {px,py} (2,2) being the center, ˉm is simply:
ˉm={0.1⋅(px−2)0.1⋅(py−2)1}However, this ignores the direction of the camera. Thankfully, rotating vectors is quite simple. We can simply rotate ˉm to match the rotation of the camera.
To rotate a vector about the y axis (yaw), we do:
roty(ˉv,θ)={ˉvx⋅cosθ+ˉvz⋅sinθˉvy−ˉvx⋅sinθ+ˉvz⋅cosθ}And about the x axis (pitch), we do:
rotx(ˉv,θ)={ˉvxˉvy⋅cosθ−ˉvz⋅sinθˉvy⋅sinθ−ˉvz⋅cosθ}Therefore, given yaw θy and pitch θp:
ˉmθpθy=rotx(roty({0.1⋅(px−2)0.1⋅(py−2)1},θy),θp)Now that we know both ˉm and ˉs, we can calculate the coordinates of any point along the ray’s path.
Step 4: Intersect Rays with Planes
We’re finally ready to simulate our light rays. We know where our light rays start, where they’re going, and everything they might hit. All that’s left to do is to actually simulate the path and find out what each ray hits.
More precisely, we need to find the intersection point of each pairwise ray/plane combo. To be even more precise, for each pairwise combo of ray ˉr=λˉm+ˉs and plane ˉp={a,b,c,k}, what is the intersection point ˉi?
Obviously, ˉi must be on both ˉr and ˉp. We can substitute ˉi into the formula for ˉr to get:
ˉi=λˉm+ˉsˉi={λ⋅ˉmx+ˉsxλ⋅ˉmy+ˉsyλ⋅ˉmz+ˉsz}And we can do the same for ˉp to get:
a⋅ˉix+b⋅ˉiy+c⋅ˉiz+k=0Then we substitute in the values of ˉi from above:
a⋅(λˉmx+ˉsx)+b⋅(λˉmy+ˉsy)+c⋅(λˉmz+ˉsz)+k=0Which can then be rearranged to solve for λ:
aλˉmx+aˉsx+bλˉmy+bˉsy+cλˉmz+cˉsz+k=0 aλˉmx+bλˉmy+cλˉmz=−aˉsx−bˉsy−cˉsz−k λ(aˉmx+bˉmy+cˉmz)=−aˉsx−bˉsy−cˉsz−k λ=−aˉsx+bˉsy+cˉsz+kaˉmx+bˉmy+cˉmz λ=−{a,b,c}⋅ˉs+k{a,b,c}⋅ˉmNow that we know λ, we can substitute it back into the definition above to calculate the coordinates of ˉi.
Step 5: Filter Intersection Points
Not all intersection points are actually valid. All we know about ˉi is that it is on the same plane as a triangle. However, we only care about the ones that are actually inside the triangle.
Before performing any other checks, note that λ≥0. If your λ value is negative, you can stop early. That plane cannot get hit by the ray, as the intersection point is behind the camera.
Then, we can do a quick sanity check, making sure that ˉi is inside the bounding box of the triangle. If ˉix<minx(t1,t2,t3) then ˉi cannot be inside the triangle as it is too far left. Repeat this for each combination of min or max and x, y, or z.
We do that first because it’s much faster than the real check and we can often exit early. If ˉi is inside the bounding box, we need to do a full check, which works like this:
t has 3 sides. We know that ˉi is inside the triangle if:
- ˉi and t1 are on the same side of t2→t3
- ˉi and t2 are on the same side of t1→t3
- ˉi and t3 are on the same side of t1→t2
Intuitively, that makes sense.
To check whether ˉi and t1 are on the same side of the line t2→t3, we do:
ˉv=t2−t3ˉa=ˉv×(ˉi−t3)ˉb=ˉv×(t1−t3)c=ˉa⋅ˉbc>0This uses both the cross product and the dot product. The cross product has a special property that ˉa×ˉb is the same as ˉb×ˉa but in the opposite direction.
This goes further, and says that the direction of ˉa depends on which side of ˉv that ˉi is on. If ˉi and t1 are on the same side of ˉv then the angle between them will be <90∘. If they are on different sides, the angle will be >90∘.
When the dot product is positive, the angle between the two vectors is <90∘, meaning they must have been on the same side.
if c≥0 then ˉi and t1 are on the same side. We repeat that for each of the three sides of the triangle. Any intersection points that are not in the triangle can be discarded.
Step 6: Rasterise the Rays
Each ray can only intersect with one triangle, so we need to find the triangle that gets hit first. To calculate the distance that the ray travelled, we can use Pythagoras’ theorem in 3D:
d=√(ˉix−ˉsx)2+(ˉiy−ˉsy)2+(ˉiz−ˉsz)2For our simple implementation, we will set each pixel’s brightness to 1d, meaning pixels that are closer to the camera are brighter. Any rays that don’t intersect result in a black pixel.
Conclusion
That’s it! You now know everything you need to, and can write your own ray tracer. If you want to know more about 3D rendering, make sure you have a look at these:
If you are still desperate for some example code, you can look at the code from my last post. Either way, you really should give it a try yourself first.
Once you’ve finished your basic ray tracer, here’s some extra features you could add, from easiest to hardest:
- Colour each triangle a different colour
- Send multiple rays per pixel and average the result
- Change the shape of the view plane to a dome
- Optimise your code, can it run at 60fps?
- Allow some triangles to behave like mirrors
- Add transparent materials and simulate refraction
- Add light sources to your scene, and render a shadow
- Add a black hole to the scene and simulate gravitational lensing
If you do write your own ray tracer, please show me on twitter (especially if you implement gravitational lensing, you absolute madlad)