CUDA Render Engine

A GPU-accelerated 3D rendering engine built on CUDA with support for implicit geometry, OBJ files, HDRI lighting, KD-Trees, and BRDF materials.

engineering / research · 2020

Cover render

This is a passion project in which I set out to implement different rendering algorithms and techniques in a 3D rendering engine running on CUDA. It is still a work in progress, but as of right now it supports implicit geometry, OBJ files, HDRI light domes, anti-aliasing, a KD-Tree implementation, and BRDF materials. Below you can find brief explanations of the implementation process, along with some numbers on run times and settings. Note that these tests were run on Windows 10 with an NVIDIA GeForce RTX 2080 and an AMD Ryzen Threadripper 1950X 16-Core Processor (32 CPUs), ~3.4GHz. There are plenty of links to the research papers and resources that I used when putting this project together, so if you feel like nerding out, I would recommend checking those out as well!

>> runtime: 11 seconds
>> thread blocks: 32x32 
>> number of samples: 512

The first step was adding implicit equations for primitives. For example, using the equation below we can see if the ray that is shot from the camera intersects the sphere:

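t²·b·b + 2t·b·(A−C) + (A−C)·(A−C) − r² = 0

r(θ,ϕ) = (cosθ sinϕ, cosϕ, sinθ sinϕ)

In the first equation, A is the ray origin, b is the ray direction, C is the sphere center, r is the sphere radius, and t is the distance along the ray. The second expression parameterizes points on the unit sphere by the angles θ and ϕ.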
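As a rough illustration of how the quadratic above turns into an intersection test, here is a minimal sketch. It is not the engine's actual code: the names `Vec3`, `hitSphere`, and `tMin` are made up for this example, and the `HD` macro is just one way to let the same function build as plain C++ or as a CUDA `__host__ __device__` function. It assumes the ray direction `b` is non-zero.

```cpp
#include <cmath>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical minimal vector type for this sketch.
struct Vec3 { float x, y, z; };

HD inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
HD inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Solves t^2 (b.b) + 2t b.(A-C) + (A-C).(A-C) - r^2 = 0 and keeps the
// nearest root with t > tMin. Returns true and writes *tHit on a hit.
HD inline bool hitSphere(Vec3 A, Vec3 b, Vec3 C, float r, float tMin, float* tHit) {
    Vec3 oc = sub(A, C);
    float a = dot(b, b);
    float halfB = dot(b, oc);            // half of the linear coefficient
    float c = dot(oc, oc) - r * r;
    float disc = halfB * halfB - a * c;  // quarter of the usual discriminant
    if (disc < 0.0f) return false;       // no real roots: the ray misses
    float s = std::sqrt(disc);
    float t = (-halfB - s) / a;          // nearer root first
    if (t <= tMin) t = (-halfB + s) / a; // fall back to the farther root
    if (t <= tMin) return false;         // sphere is behind the ray
    *tHit = t;
    return true;
}
```

For a ray starting at the origin and looking down −z at a unit sphere centered at (0, 0, −5), this reports a hit at t = 4, which is the front surface of the sphere.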

Similar logic was used to render the torus, cylinder, and box primitives. All of these have a material attribute, initially just Lambertian, that can be mapped to a color or a texture map.

Although not used in this implementation, since they are written for GLSL, here are some fascinating implicit equations for more unusual primitives by the amazing Inigo Quilez.

Primitives render


The first iteration of the project simply sampled from a white light dome, as shown above. To give the scene a bit more life, I implemented an HDRI map by sampling an input texture as the light source.

n = Normalize(P_dome_surface_point)

u = atan2(nx, nz) / 2π + 0.5

v = ny · 0.5 + 0.5
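The mapping above can be sketched in a few lines. This is only an illustration under my own naming (`Vec3`, `UV`, and `domeUV` are hypothetical and not taken from the engine), and it assumes the dome point is not the zero vector.

```cpp
#include <cmath>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical minimal types for this sketch.
struct Vec3 { float x, y, z; };
struct UV { float u, v; };

// Maps a point on the light dome to texture coordinates in [0, 1]:
//   n = normalize(p), u = atan2(n.x, n.z) / (2*pi) + 0.5, v = n.y * 0.5 + 0.5
HD inline UV domeUV(Vec3 p) {
    const float kPi = 3.14159265358979f;
    float len = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    Vec3 n = {p.x / len, p.y / len, p.z / len};
    UV uv;
    uv.u = std::atan2(n.x, n.z) / (2.0f * kPi) + 0.5f;
    uv.v = n.y * 0.5f + 0.5f;
    return uv;
}
```

A point straight ahead on +z lands in the middle of the texture (u = 0.5, v = 0.5), while the top of the dome maps to v = 1. The resulting (u, v) pair would then be scaled by the HDRI's width and height to pick a texel.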

HDRI lit scene

>> runtime: 1 minute & 12 seconds 
>> thread blocks: 32x32
>> number of samples: 512

>> runtime: 4 minutes & 12 seconds
>> thread blocks: 32x32
>> number of samples: 2048

D_GGX(h,α) = α² / (π((n·h)²(α²−1)+1)²)
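For reference, the GGX distribution term is short enough to write out directly. The sketch below is an assumption-laden illustration rather than the engine's code: `dGGX` is a made-up name, and it takes α as given, whereas a renderer may derive α from a user-facing roughness value (for example by squaring it).

```cpp
// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// GGX normal distribution: D = alpha^2 / (pi * ((n.h)^2 * (alpha^2 - 1) + 1)^2)
// nDotH is the cosine between the surface normal n and the half vector h.
HD inline float dGGX(float nDotH, float alpha) {
    const float kPi = 3.14159265358979f;
    float a2 = alpha * alpha;
    float d = nDotH * nDotH * (a2 - 1.0f) + 1.0f;
    return a2 / (kPi * d * d);
}
```

With α = 1 the denominator collapses and D is a constant 1/π in every direction. Smaller α values concentrate the distribution around n·h = 1, which is what gives a tighter highlight.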

Box: 10% reflective roughness, not refractive. Sphere: 60% reflective roughness, not refractive. Cylinder: 50% reflective roughness, fully refractive. Torus: 0% reflective roughness, not refractive.

HDRI with BRDF materials


With the addition of another primitive, a triangle intersected using the Möller–Trumbore algorithm, I was able to load any OBJ with an OBJ file parser that was written for this project. The key at this point is to make sure that these triangles are stored in an efficient spatial partitioning structure, so that each ray is only tested against nearby triangles. I opted to use a K-D tree implementation.
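To make the triangle test concrete, here is a minimal sketch of the Möller–Trumbore algorithm. It is my own illustrative version, not the project's source: `Vec3`, `hitTriangle`, and the epsilon value are assumptions, and it does no backface culling.

```cpp
#include <cmath>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical minimal vector type for this sketch.
struct Vec3 { float x, y, z; };

HD inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
HD inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
HD inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore ray/triangle test. Returns true and writes the ray
// parameter to *tHit when the ray orig + t*dir hits triangle (v0, v1, v2).
HD inline bool hitTriangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float* tHit) {
    const float kEps = 1e-7f;                 // assumed tolerance
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < kEps) return false;  // ray parallel to the triangle plane
    float invDet = 1.0f / det;
    Vec3 s = sub(orig, v0);
    float u = dot(s, p) * invDet;             // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * invDet;           // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float t = dot(e2, q) * invDet;
    if (t <= kEps) return false;              // intersection is behind the origin
    *tHit = t;
    return true;
}
```

In a K-D tree traversal, a function like this would only be called for the triangles stored in the leaf nodes that the ray actually passes through, which is where the savings over testing all polygons come from.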

Skull OBJ render with gold

By adding a specular BRDF with a GGX distribution, I was able to emulate the look of gold. Reflective weight at 50%, roughness at 12%, 256 samples, and Fresnel color set to a warm yellow.

F_Schlick(v,h,f0,f90) = f0 + (f90−f0)(1−v·h)⁵
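The Schlick approximation above is a one-liner in practice. The sketch below is illustrative only: `schlick` is a made-up name, it works on a single channel, and the clamp on v·h is my own addition for robustness. For a colored Fresnel term such as the warm yellow used for gold, I am assuming it would be evaluated once per RGB channel with a different f0 for each.

```cpp
// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Schlick Fresnel: F = f0 + (f90 - f0) * (1 - v.h)^5
// f0 is the reflectance at normal incidence, f90 at grazing angles.
HD inline float schlick(float vDotH, float f0, float f90) {
    // Clamp the cosine to [0, 1] so the power term stays well behaved.
    float c = vDotH < 0.0f ? 0.0f : (vDotH > 1.0f ? 1.0f : vDotH);
    float m = 1.0f - c;
    float m2 = m * m;
    return f0 + (f90 - f0) * m2 * m2 * m;
}
```

Looking straight along the half vector (v·h = 1) returns f0, and at grazing angles (v·h = 0) the result rises to f90.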

Number of polygons: 287,225
Geometry load time: 1 minute 23 seconds