NervLand DevLog #26: Physically Based Rendering (PBR) initial implementation

Hey hey my dear friend! Welcome back for another episode on our NervLand world 😊! Last time, we introduced support for development GUIs with IMGUI in our engine, and now we are going to put that to use in this session while experimenting with a physically based rendering implementation 👍! From what I have read so far, writing a very minimal version of PBR should actually be no big deal: it's just a few custom shader functions in the end 🤔. So hopefully, not too much trouble ahead of us here? Well, in theory at least… Let's see how this goes now!


Actually, before jumping into PBR, there are a couple of fixes I need to implement urgently:

First, I need to switch my Matrix 4D class from row-major storage to column-major storage to get rid of all the transpositions I currently need whenever I send matrix data from the CPU side to the WebGPU side.

This turned out to be fairly easy to change in the end: all I really had to do was to introduce a macro to access the matrix element at position “row, col”, as follows:

#define MAT_RC(row, col) _mat[(col)][(row)]

#define SET_ROW(row, v1, v2, v3, v4)                                           \
    MAT_RC((row), 0) = (v1);                                                   \
    MAT_RC((row), 1) = (v2);                                                   \
    MAT_RC((row), 2) = (v3);                                                   \
    MAT_RC((row), 3) = (v4);

#define INNER_PRODUCT(a, b, r, c)                                              \
    ((a).MAT_RC(r, 0) * (b).MAT_RC(0, c)) +                                    \
        ((a).MAT_RC(r, 1) * (b).MAT_RC(1, c)) +                                \
        ((a).MAT_RC(r, 2) * (b).MAT_RC(2, c)) +                                \
        ((a).MAT_RC(r, 3) * (b).MAT_RC(3, c))

And then I simply refactored all the code of the mat4 class to use this macro instead of accessing the internal _mat member array directly.

Then I removed all the calls to mat4::transpose() and fixed the various WGPU examples accordingly, and everything seems to still work just fine 👍!
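To make the storage change a bit more concrete, here is a minimal standalone sketch of the same idea. The Mat4Sketch type and its helpers are hypothetical names for illustration (not the actual NervLand mat4 class): the accessor plays the role of MAT_RC, and the raw data can be handed to the GPU as-is because each column is contiguous in memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal matrix type: _mat[col][row] keeps each column
// contiguous in memory, which is the layout WGSL expects for mat4x4<f32>.
struct Mat4Sketch {
    float _mat[4][4] = {};

    // Same indexing trick as the MAT_RC macro, written as accessors.
    float& at(std::size_t row, std::size_t col) { return _mat[col][row]; }
    float at(std::size_t row, std::size_t col) const { return _mat[col][row]; }

    // Raw pointer that can be copied to a GPU buffer without transposing.
    const float* data() const { return &_mat[0][0]; }
};

// Build a translation matrix using (row, col) semantics only.
inline Mat4Sketch make_translation(float x, float y, float z) {
    Mat4Sketch m;
    for (std::size_t i = 0; i < 4; ++i) {
        m.at(i, i) = 1.0F;
    }
    m.at(0, 3) = x;
    m.at(1, 3) = y;
    m.at(2, 3) = z;
    return m;
}

// Standard matrix product, again written purely in terms of (row, col).
inline Mat4Sketch multiply(const Mat4Sketch& a, const Mat4Sketch& b) {
    Mat4Sketch r;
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            for (std::size_t k = 0; k < 4; ++k) {
                r.at(row, col) += a.at(row, k) * b.at(k, col);
            }
        }
    }
    return r;
}
```

With this layout the translation ends up at flat indices 12, 13 and 14, which is where a column-major consumer expects the fourth column to be.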

Another minor change, this time in the IMGUI event handler, where the quit event should never be reported as handled:

    // We should also install the SDL event handler here:
    auto* wman = (SDLWindowManager*)SDLWindowManager::instance();
    wman->add_event_handler([](SDL_Event& evt) {
        ImGuiIO& io = ImGui::GetIO();

        auto etype = SDLWindowManager::get_event_type(evt);

        // Quit event should never be considered handled:
        if (etype == SDLWindowManager::EVT_QUIT)
            return false;

        return io.WantCaptureMouse || io.WantCaptureKeyboard;
    });

I started with a simple extension in the SDLAppContext to trigger the reconstruction of the swapchain on resize event:

auto SDLAppContext::initialize() -> bool {

    // create the window:
    auto* wman = SDLWindowManager::instance();

    Window::Traits traits("wgpu_sdl", width, height);
    traits.title = "WGPU SDL Window";
    traits.resizable = true;
    window = (SDLWindow*)wman->create_window(traits);
    // Note: the generated window cannot be null (or create_window will throw an
    // error)

    // Get the WGPUEngine instance:
    auto& app = NervApp::instance();
    eng = app.get_or_create_component<WGPUEngine>();

    logDEBUG("Creating wgpu surface and swapchain...");

    window->add_resize_handler([this](int w, int h) {
        build_swapchain(w, h);
        return false;
    });

    eng->create_swapchain(0, window.get(), width, height);

    logDEBUG("Init completed.");
    return true;
}

void SDLAppContext::build_swapchain(int w, int h) {
    if (w != width or h != height) {
        logDEBUG("SDLApp: rebuilding swapchain with size {}x{}", w, h);
        width = w;
        height = h;

        auto surf = eng->get_surface(0);

        // We must first remove the previous swapchain:

        // Create the new swapchain:
        eng->create_swapchain(0, surf, width, height);
    }
}

Yet I get a validation error now when processing the render passes (after I resize the window once):

2023-07-30 14:23:38.319071 [DEBUG] SDLApp: rebuilding swapchain with size 821x600
2023-07-30 14:23:38.322036 [DEBUG] WGPUCamera: window resized to 821x600
2023-07-30 14:23:38.322801 [ERROR] Dawn: Validation error: Attachment [TextureView] size (width: 800, height: 600) does not match the size of the other attachments (width: 821, height: 600).
 - While validating depthStencilAttachment.
 - While encoding [CommandEncoder].BeginRenderPass([RenderPassDescriptor]).

2023-07-30 14:23:38.322816 [ERROR] Dawn: Validation error: [Invalid CommandBuffer] is invalid.
 - While calling [Queue].Submit([[Invalid CommandBuffer]])

⇒ What we have here is in fact not an issue with the resized swapchain color attachment, but rather that we did not resize the depth-stencil texture to match the new swapchain size. Let's see how we could do that…

Alright, I added a dedicated method to support dynamically changing the depth texture in a render pass, and now I can increase the size of the window without errors:

auto WGPURenderPass::add_swapchain_depth_attachment(
    U64 idx, wgpu::LoadOp depth_load_op, wgpu::StoreOp depth_store_op,
    F32 depth_clear_value, bool depth_read_only, wgpu::LoadOp stencil_load_op,
    wgpu::StoreOp stencil_store_op, U32 stencil_clear_value,
    bool stencil_read_only) -> WGPURenderPass& {
    add_depth_attachment(nullptr, depth_load_op, depth_store_op,
                         depth_clear_value, depth_read_only, stencil_load_op,
                         stencil_store_op, stencil_clear_value,
                         stencil_read_only);

    auto* eng = WGPUEngine::instance();

    // Update the target depth texture format:
    _depthFormat = wgpu::TextureFormat::Depth24PlusStencil8;
        [idx, eng, this](wgpu::RenderPassDepthStencilAttachment& attachment,
                         wgpu::Texture& depthTex) {
            U32 width = 0;
            U32 height = 0;

            eng->get_swapchain_size(width, height, idx);

            if (depthTex == nullptr || depthTex.GetWidth() != width ||
                depthTex.GetHeight() != height) {
                logDEBUG("Resizing depth texture to {}x{}", width, height);
                depthTex =
                    eng->create_depth_texture(width, height, _depthFormat);
            }

            attachment.view = depthTex.CreateView();

    return *this;
}
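The core of that method is a "recreate only when the size changed" check. Here is a small standalone sketch of that pattern, using a fake texture type instead of wgpu::Texture so it can run without a GPU. FakeDepthTexture and DepthTargetSketch are made-up names for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GPU depth texture; only the size matters here.
struct FakeDepthTexture {
    std::uint32_t width{0};
    std::uint32_t height{0};
};

// Keeps a depth texture in sync with the swapchain size, recreating it lazily.
class DepthTargetSketch {
  public:
    // Returns the texture to attach for a frame rendered at the given size.
    const FakeDepthTexture& acquire(std::uint32_t width, std::uint32_t height) {
        if (!_valid || _tex.width != width || _tex.height != height) {
            // This assignment stands in for a real depth texture creation.
            _tex = FakeDepthTexture{width, height};
            _valid = true;
            ++_numCreations;
        }
        return _tex;
    }

    // Number of times the texture had to be (re)created so far.
    int num_creations() const { return _numCreations; }

  private:
    FakeDepthTexture _tex;
    bool _valid{false};
    int _numCreations{0};
};
```

Calling acquire() every frame is cheap as long as the size is stable: the texture is only rebuilt on the first frame and after an actual resize, such as the 800x600 to 821x600 change from the log above.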

But this is still not working when reducing the size of the window, due to the fixed viewport size, so let's fix this part now:

2023-07-30 15:35:47.766129 [ERROR] Dawn: Validation error: Viewport bounds (x: 0.000000, y: 0.000000, width: 800.000000, height: 600.000000) are not contained in the render target dimensions (903 x 594).
 - While encoding [RenderPassEncoder].SetViewport(0.000000, 0.000000, 800.000000, 600.000000, 0.000000, 1.000000).

⇒ Now I'm automatically resizing the viewport when the swapchain is resized too:

auto WGPURenderPass::add_swapchain_color_attachment(U64 idx,
                                                    wgpu::Color clear_color,
                                                    wgpu::LoadOp load_op,
                                                    wgpu::StoreOp store_op)
    -> WGPURenderPass& {
    auto* eng = WGPUEngine::instance();
    return add_color_attachment(
        [eng, idx, this](RenderPassColorAttachment& att) {
            att.view = eng->get_swapchain(idx).GetCurrentTextureView();

            // For now, we always assume dynamic viewport if we are rendering
            // towards a swapchain:
            U32 width = 0;
            U32 height = 0;
            // Query the size of the swapchain we are actually rendering to:
            eng->get_swapchain_size(width, height, idx);
            _viewport.set(0.0F, 0.0F, (F32)width, (F32)height);
            _scissorRect.set(0.0F, 0.0F, (F32)width, (F32)height);
        },
        nullptr, clear_color, nullptr, load_op, store_op);
}

And this works great!
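For reference, the rule behind the validation error above can be written down in a few lines. This is only a sketch of the containment check as the error message describes it, with hypothetical names (ViewportSketch, viewport_fits, full_viewport), not Dawn's actual validation code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical viewport description (origin and size, in pixels).
struct ViewportSketch {
    float x;
    float y;
    float width;
    float height;
};

// True when the viewport rectangle is fully inside the render target.
inline bool viewport_fits(const ViewportSketch& vp, std::uint32_t target_width,
                          std::uint32_t target_height) {
    return vp.x >= 0.0F && vp.y >= 0.0F &&
           vp.x + vp.width <= static_cast<float>(target_width) &&
           vp.y + vp.height <= static_cast<float>(target_height);
}

// Viewport covering the whole render target, as used after a resize.
inline ViewportSketch full_viewport(std::uint32_t target_width,
                                    std::uint32_t target_height) {
    return {0.0F, 0.0F, static_cast<float>(target_width),
            static_cast<float>(target_height)};
}
```

With the numbers from the log, the stale 800x600 viewport does not fit in the 903x594 target because 600 is larger than 594, even though the window got wider. Growing the window never triggers the error, which is why only shrinking was failing.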

Or, well… almost 😆 In the case of IMGUI, we also need to make sure the resize events are never swallowed when the focus is on the GUI, so we update the global event handler accordingly:

    auto* wman = (SDLWindowManager*)SDLWindowManager::instance();
    wman->add_event_handler([](SDL_Event& evt) {
        auto etype = SDLWindowManager::get_event_type(evt);

        // Quit and resize events should never be considered handled:
        switch (etype) {
        case SDLWindowManager::EVT_QUIT:
        case SDLWindowManager::EVT_SIZE_CHANGED:
        case SDLWindowManager::EVT_RESIZED:
            return false;
        }

        ImGuiIO& io = ImGui::GetIO();

        return io.WantCaptureMouse || io.WantCaptureKeyboard;
    });
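Here is a tiny standalone model of why this matters. I'm assuming here that handlers are called in registration order and that returning true stops the propagation. That is my reading of the handler above, and the EventDispatcherSketch class below is a made-up illustration, not the real SDLWindowManager.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical event categories, loosely mirroring the ones used above.
enum class EventType { Quit, Resized, SizeChanged, MouseMotion, KeyDown };

// Minimal dispatcher: a handler returning true consumes the event.
class EventDispatcherSketch {
  public:
    using Handler = std::function<bool(EventType)>;

    void add_event_handler(Handler handler) {
        _handlers.push_back(std::move(handler));
    }

    // Returns true when one of the handlers consumed the event.
    bool dispatch(EventType evt) const {
        for (const auto& handler : _handlers) {
            if (handler(evt)) {
                return true;
            }
        }
        return false;
    }

  private:
    std::vector<Handler> _handlers;
};

// GUI-style filter: quit and resize events always pass through, everything
// else is consumed only while the GUI wants the input.
inline bool gui_filter(EventType evt, bool gui_wants_input) {
    switch (evt) {
    case EventType::Quit:
    case EventType::Resized:
    case EventType::SizeChanged:
        return false;
    default:
        return gui_wants_input;
    }
}
```

If the GUI filter is registered first, a mouse event over a widget stops there, while a resize still reaches the handlers that rebuild the swapchain.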

Now using the FreeFlyCameraController by default in the WGPUCamera constructor:

    if (window != nullptr) {
        I32 width = 0;
        I32 height = 0;
        window->get_size(&width, &height);
        aspect = (F32)width / (F32)height;

        _udata.window_width = width;
        _udata.window_height = height;

        // Also assign a controller here:
        // _cameraCtrl = create_ref_object<TargetCameraController>(
        //     _camera.get(), window, Vec3f(0.0, 0.0, 0.0),
        //     Vec3f(0.0, 0.0, -infos.viewDistance));
        _cameraCtrl = create_ref_object<FreeFlyCameraController>(
            _camera.get(), window, Vec3d(0.0, 0.0, -infos.viewDistance),

        // Note: might need to track the handler created below ?
        window->add_resize_handler([this](int w, int h) {
            logDEBUG("WGPUCamera: window resized to {}x{}", w, h);
            _udata.window_width = w;
            _udata.window_height = h;
            return false;
        });
    }

Eventually I should really make it configurable to select the desired type of controller 😅, but that will be for another day.

Side note: I also added the handling of a press of the space bar to reset the position of the camera since, for now, I can still sometimes lose track of the only object I have at the center of my scene.

Started implementation of the pbr_basic.wgsl shader.

Vertex shader is pretty simple, just need to forward the vertex world position and normal to the fragment shader.

Also had to retrieve the camera world position: so the CameraProvider will now also write the inverse view matrix to be able to easily retrieve that.
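The reason the inverse view matrix helps: for a view matrix made of a rotation and a translation only, the translation column of its inverse is exactly the camera position in world space. Below is a small CPU-side sketch of that, assuming a column-major layout. The Mat4/Vec3 aliases and the helper names are placeholders for illustration, not the engine's types.

```cpp
#include <array>
#include <cassert>

// Column-major 4x4 matrix: element (row, col) lives at index col * 4 + row.
using Mat4 = std::array<float, 16>;
using Vec3 = std::array<float, 3>;

inline float& elem(Mat4& m, int row, int col) { return m[col * 4 + row]; }
inline float elem(const Mat4& m, int row, int col) { return m[col * 4 + row]; }

// Inverse of a rigid transform [R | t] (rotation + translation only):
// the result is [R^T | -R^T * t].
inline Mat4 rigid_inverse(const Mat4& m) {
    Mat4 inv{};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            elem(inv, r, c) = elem(m, c, r);
        }
    }
    for (int r = 0; r < 3; ++r) {
        float v = 0.0F;
        for (int k = 0; k < 3; ++k) {
            v -= elem(inv, r, k) * elem(m, k, 3);
        }
        elem(inv, r, 3) = v;
    }
    elem(inv, 3, 3) = 1.0F;
    return inv;
}

// The camera world position is the translation column of the inverse view.
inline Vec3 camera_position(const Mat4& inverse_view) {
    return {elem(inverse_view, 0, 3), elem(inverse_view, 1, 3),
            elem(inverse_view, 2, 3)};
}
```

In the shader this boils down to reading the xyz part of the fourth column of the inverse view matrix, so no extra uniform is needed for the camera position.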

Next introduced the initial PBRMaterial structure:

struct PBRMaterial {
    F32 roughness{0.0F};
};

And sending an instance of such a material to the GPU with makeFragUBO(material).

Next, we introduced the SceneUBO structure to provide a “lighting context” for now (could be extended with more scene related content eventually):

struct SceneUBO {
    std::array<Vec4f, 4> lightPositions{};
    std::array<Vec4f, 4> lightColors{};
    U32 numLights{0};
};

Also sending an instance of this to the GPU with makeFragUBO(sceneData).
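One thing worth keeping in mind with a structure like this: on the WGSL side, a uniform struct containing vec4<f32> members is aligned to 16 bytes, so its size gets rounded up to 144 bytes here, while a plain C++ struct with the same members would typically be 132 bytes. The sketch below shows one way to keep both sides in sync. It assumes Vec4f is four packed floats, and Vec4fSketch / SceneUBOSketch are stand-in names, so the real engine types may already handle this differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the engine's Vec4f: four packed floats.
struct Vec4fSketch {
    float x, y, z, w;
};

// alignas(16) makes the C++ struct follow the 16-byte alignment that WGSL
// gives to a struct containing vec4<f32> members, so sizeof() is rounded up
// to a multiple of 16 just like on the shader side.
struct alignas(16) SceneUBOSketch {
    std::array<Vec4fSketch, 4> lightPositions{}; // offset 0, 64 bytes
    std::array<Vec4fSketch, 4> lightColors{};    // offset 64, 64 bytes
    std::uint32_t numLights{0};                  // offset 128, 4 bytes
    // 12 bytes of implicit tail padding follow.
};

static_assert(offsetof(SceneUBOSketch, lightColors) == 64, "unexpected offset");
static_assert(offsetof(SceneUBOSketch, numLights) == 128, "unexpected offset");
static_assert(sizeof(SceneUBOSketch) == 144, "size must match the WGSL struct");
```

Having the static_assert lines next to the struct means a mismatch shows up at compile time instead of as a binding size validation error at runtime.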

Implemented the required PBR helper functions in WGSL, and finally, here is the first visual result I got (which looks correct at least):
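As a CPU-side reference for those helper functions, here is a sketch of the usual Cook-Torrance terms (GGX distribution, Schlick-GGX/Smith geometry and Schlick Fresnel). I'm assuming this common formulation here; the actual pbr_basic.wgsl code is not reproduced in this post and may differ in details, and all function names below are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

constexpr float kPi = 3.14159265358979323846F;

// GGX / Trowbridge-Reitz normal distribution term D.
inline float d_ggx(float n_dot_h, float roughness) {
    float alpha = roughness * roughness;
    float alpha2 = alpha * alpha;
    float denom = n_dot_h * n_dot_h * (alpha2 - 1.0F) + 1.0F;
    return alpha2 / (kPi * denom * denom);
}

// Schlick-GGX geometry term for one direction (k remapped for direct light).
inline float g_schlick_ggx(float n_dot_x, float roughness) {
    float r = roughness + 1.0F;
    float k = (r * r) / 8.0F;
    return n_dot_x / (n_dot_x * (1.0F - k) + k);
}

// Smith geometry term G: product of the view and light shadowing terms.
inline float g_smith(float n_dot_v, float n_dot_l, float roughness) {
    return g_schlick_ggx(n_dot_v, roughness) *
           g_schlick_ggx(n_dot_l, roughness);
}

// Schlick approximation of the Fresnel term F for a scalar reflectance f0.
inline float f_schlick(float cos_theta, float f0) {
    float c = std::min(std::max(1.0F - cos_theta, 0.0F), 1.0F);
    return f0 + (1.0F - f0) * std::pow(c, 5.0F);
}

// Specular part of the Cook-Torrance BRDF: D * G * F / (4 * NdotL * NdotV).
inline float specular_brdf(float n_dot_l, float n_dot_v, float n_dot_h,
                           float v_dot_h, float roughness, float f0) {
    float d = d_ggx(n_dot_h, roughness);
    float g = g_smith(n_dot_v, n_dot_l, roughness);
    float f = f_schlick(v_dot_h, f0);
    // Small epsilon avoids a division by zero at grazing angles.
    return d * g * f / std::max(4.0F * n_dot_l * n_dot_v, 1e-4F);
}
```

In a real shader the Fresnel term works on an RGB f0 that is blended between a dielectric base value and the albedo using the metallic parameter; the scalar version above just keeps the sketch short.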

Then I activated the test code in the shader to update the roughness:

// 	// Add striped pattern to roughness based on vertex position
	roughness = max(roughness, step(fract(in.worldPos.y * 2.02), 0.5));
// #endif
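To see what that expression does outside of the shader, here is a direct C++ transcription. The wgsl_fract and wgsl_step helpers are hypothetical names that mimic the WGSL built-ins fract() and step().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// WGSL fract(x): fractional part, computed as x - floor(x).
inline float wgsl_fract(float x) { return x - std::floor(x); }

// WGSL step(edge, x): 1.0 when edge <= x, otherwise 0.0.
inline float wgsl_step(float edge, float x) {
    return edge <= x ? 1.0F : 0.0F;
}

// Same expression as in the shader: wherever the fractional part of
// (y * 2.02) is at most 0.5, the roughness is forced to 1.0.
inline float striped_roughness(float roughness, float world_y) {
    return std::max(roughness, wgsl_step(wgsl_fract(world_y * 2.02F), 0.5F));
}
```

So the pattern repeats every 1 / 2.02 (roughly 0.495) world units along Y: half of each band is fully rough, and the other half keeps the material roughness.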

And this will indeed produce horizontal stripes on the next run:

Now, how about trying to control the material settings dynamically with IMGUI 😁?!

Actually, before doing that I think I need to clean up the handling of IMGUI a little by encapsulating it all in a dedicated GUIManager class.

In this process I updated the constructor of the WGPURenderPass to accept a RenderPassDesc argument to be able to generate “standalone render passes” (i.e. not managed by the WGPUEngine itself). I now store a standalone render pass in the GUI manager, which avoids any interference with the “user created” render passes 👍.

All we need now to activate the IMGUI rendering is to specify useImgui = true; in our setup_app() function, which will in turn set up and initialize IMGUI for rendering:

    if (useImgui) {
        auto& gui = eng->get_gui_manager();
        gui.add_builder_func([this]() { return construct_gui(); });
    }

Hmmmm 🤔 okay, then I faced a bit of a problem: I could update the roughness/metallic values in the material structure, but I still need to somehow upload them to the GPU, so I need to keep track of the corresponding buffer. So here is what I have for now:

I get the buffer from the BindEntry:

    BindEntry mat_bind = makeFragUBO(material);
    mat_buf = mat_bind.get_buffer();

And then I can use that every time a change is detected:

    if (changed) {
        logDEBUG("Updated roughness: {:.3f}, metallic: {:.3f}",
                 material.roughness, material.metallic);
        eng->write_buffer(mat_buf, 0, &material, sizeof(material));
    }

But I don't like this solution very much… I don't think I should have to get the buffer from the BindEntry, so let's think a bit more about this.

Ohh, and by the way, here is what the gui currently looks like (minimal, but working fine 😉):

To solve the problem I reported just above, I decided to introduce a new template class, the DynamicBuffer class:

template <typename T> class DynamicBuffer : public RefObject {

    explicit DynamicBuffer(wgpu::BufferUsage usage) : _usage(usage) {

    explicit DynamicBuffer(const T& rhs, wgpu::BufferUsage usage)
        : _usage(usage), _data(rhs) {

    explicit DynamicBuffer(T&& rhs, wgpu::BufferUsage usage)
        : _usage(usage), _data(std::move(rhs)) {

    /** Get a reference to the CPU-side data */
    auto data() -> T& { return _data; }

    // Get the buffer:
    auto buffer() -> wgpu::Buffer { return _buffer; }

    /** Update the buffer */
    void update() {
        auto* eng = WGPUEngine::instance();
        eng->write_buffer(_buffer, 0, &_data, sizeof(T));

    /** Create the buffer */
    void create_buffer() {
        auto* eng = WGPUEngine::instance();
        _buffer = eng->create_buffer(_data, _usage);

    /** Buffer usage */
    wgpu::BufferUsage _usage;

    /** The data object*/
    T _data;

    /** The WGPU buffer */
    wgpu::Buffer _buffer;

⇒ So now I can use this class to keep the link between a CPU structure and the corresponding GPU buffer which may need to be updated dynamically.

Adding this in our PBR basic sample code… OK!
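To illustrate the idea without any GPU involved, here is a self-contained sketch of the same pattern where the WGPU buffer is replaced by a plain byte vector. FakeGpuBuffer, DynamicBufferSketch and MaterialSketch are made-up names, and creating the buffer in the constructor is my assumption, not necessarily what the real class does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a GPU buffer plus a queue write operation.
struct FakeGpuBuffer {
    std::vector<unsigned char> bytes;

    void write(std::size_t offset, const void* src, std::size_t size) {
        assert(offset + size <= bytes.size());
        std::memcpy(bytes.data() + offset, src, size);
    }
};

// Keeps a CPU-side value and its "GPU" copy together, so the caller never
// has to track the buffer separately.
template <typename T> class DynamicBufferSketch {
  public:
    explicit DynamicBufferSketch(const T& initial) : _data(initial) {
        _buffer.bytes.resize(sizeof(T));
        update();
    }

    // Mutable access to the CPU-side copy.
    T& data() { return _data; }

    // Read-only access to the buffer holding the uploaded copy.
    const FakeGpuBuffer& buffer() const { return _buffer; }

    // Upload the current CPU-side value into the buffer.
    void update() { _buffer.write(0, &_data, sizeof(T)); }

  private:
    T _data;
    FakeGpuBuffer _buffer;
};

// Hypothetical material layout used for the example.
struct MaterialSketch {
    float roughness;
    float metallic;
};
```

The usage then becomes: change something through data(), call update() once, and the GPU-side copy follows. The GUI code no longer needs to know which buffer backs the material.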

In the process, added a nice mechanism to convert a DynamicBuffer into a BindEntry:

    auto as_ubo(Bool frag = false) -> BindEntry {
        BindEntry entry;
        entry.type = BIND_ENTRY_BUFFER;
        entry.buffer_usage =
            wgpu::BufferUsage::CopyDst | wgpu::BufferUsage::Uniform;
        entry.stage =
            frag ? wgpu::ShaderStage::Fragment : wgpu::ShaderStage::Vertex;
        entry.buffer = _buffer;
        entry.buffer_size = sizeof(T);
        entry.buffer_offset = 0;
        return entry;
    }

Added support to show/hide the roughness debug stripes: OK

Also added support to dynamically update the positions of the lights:

    if (!paused) {
        auto ctime = SystemTime::get_current_time();
        auto sint = (F32)sin(toRad(ctime * 0.1)) * 20;
        auto cost = (F32)cos(toRad(ctime * 0.1)) * 20;
        scene.lightPositions[0].x() = sint;
        scene.lightPositions[0].z() = cost;
        scene.lightPositions[1].x() = cost;
        scene.lightPositions[1].y() = sint;

        scn_changed = true;
    }

    if (scn_changed) {
        // logDEBUG("Updated roughness: {:.3f}, metallic: {:.3f}",
        scene.flags = show_strips ? 1 : 0;
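For the light animation above, the two terms simply describe a circle of radius 20. Here is a small standalone check of that, assuming toRad() converts degrees to radians (LightOrbit, to_rad and orbit_terms are illustrative names, and the time unit is whatever get_current_time() returns):

```cpp
#include <cassert>
#include <cmath>

// The two animated coordinates used for the light positions.
struct LightOrbit {
    float sint;
    float cost;
};

// Assumed behavior of toRad(): degrees to radians.
inline double to_rad(double degrees) {
    return degrees * 3.14159265358979323846 / 180.0;
}

// Same computation as in the update loop: the angle grows by 0.1 degree per
// time unit, and both terms are scaled to a radius of 20.
inline LightOrbit orbit_terms(double ctime) {
    double angle = to_rad(ctime * 0.1);
    return {(float)std::sin(angle) * 20, (float)std::cos(angle) * 20};
}
```

Under that assumption a full revolution takes 3600 time units, and the first light moves in the XZ plane while the second one moves in the XY plane, since the same two terms are written to different components.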

  • blog/2023/0803_nvl_dev26_initial_pbr_impl.txt
  • Last modified: 2023/08/07 11:12