blog:2023:0807_nvl_dev27_pbr_ibl_impl

NervLand DevLog #27: PBR Image based lighting

In this session we will add “image based lighting” from an HDR environment cubemap to the PBR equation, using the surrounding environment as the light source. This should give the scene an even more realistic look, as the light contribution used by the materials is now controlled by the environment. In the process, we will also need to generate the BRDF 2D LUT, plus the irradiance and prefiltered cubemaps, from the environment map. Let's get started!

Adding show_tex.wgsl program and node to display RTTs:

#include "base_utils"

@group(0) @binding(1) var ourSampler: sampler;
@group(0) @binding(2) var ourTexture: texture_2d<f32>;

struct VertexOutput {
    @builtin(position) Position : vec4<f32>,
    @location(0) coords : vec2<f32>,
}

@vertex
fn main(@builtin(vertex_index) vIdx: u32) -> VertexOutput {

    var pos = vec4(0.0);
    var vertex_idx = vIdx%3;

    // Two ways to construct the points:
    // switch vertex_idx {
    //     case 0: { pos = vec4(-1.0, -1.0, 1.0, 1.0); }
    //     case 1: { pos = vec4( 3.0, -1.0, 1.0, 1.0); }
    //     default: { pos = vec4(-1.0,  3.0, 1.0, 1.0); }
    // }
    
    var frag_pos = vec2(f32((vertex_idx << 1) & 2), f32(vertex_idx & 2));
    // pos = vec4(frag_pos * 2.0f + -1.0f, FAR_Z_PLANE, 1.0f);
    pos = vec4(frag_pos * 0.5 + -1.0f, NEAR_Z_PLANE, 1.0f);

    var output : VertexOutput;
    output.coords = frag_pos;
    output.Position = pos;

    return output;
};

@fragment
fn main_fs(@location(0) coords: vec2<f32>) -> @location(0) vec4<f32> {

    // Sample the texture:
    // var color2 = textureSample(ourTexture, ourSampler, coords);
    // cf. https://www.w3.org/TR/WGSL/#texturesamplelevel
    // var color2 = textureSampleLevel(ourTexture, ourSampler, coords, 0);
    // return vec4(color2.rgb, 1.0);
    var alpha = 0.0;
    if(coords.x<=1.0 && coords.y<=1) {
        alpha = 1.0;
    }
    return vec4(coords, 0.0, alpha);
}

Considering 2 different versions of G_SchlicksmithGGX:

// Geometric Shadowing function
fn G_SchlicksmithGGX(dotNL: f32, dotNV: f32, roughness: f32) -> f32
{
	var k: f32 = (roughness * roughness) / 2.0;
	var GL: f32 = dotNL / (dotNL * (1.0 - k) + k);
	var GV: f32 = dotNV / (dotNV * (1.0 - k) + k);
	return GL * GV;
}

fn G_SchlicksmithGGX_v1(dotNL: f32, dotNV: f32, roughness: f32) -> f32
{
	var r: f32 = (roughness + 1.0);
	var k: f32 = (r*r) / 8.0;
	var GL: f32 = dotNL / (dotNL * (1.0 - k) + k);
	var GV: f32 = dotNV / (dotNV * (1.0 - k) + k);
	return GL * GV;
}

⇒ Not really a big difference, but the second version looks slightly better to me (?)

Lesson learned: reading this page, I eventually realized that the two functions above are meant for different cases: one for the “direct” PBR computation and the other for the IBL version, so I need to keep both of them ;-)
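For reference, the two variants only differ in how the roughness is remapped to k. As far as I understand the usual convention (the one described in the learnopengl.com PBR chapters), k = (r+1)²/8 is the remapping for direct (analytic) lights, while k = r²/2 is the one for IBL. Below is a minimal C++ sketch to compare them on the CPU side; the function names are mine and are not part of the NervLand code:

```cpp
#include <cassert>

// Schlick-GGX term for a single direction (light or view).
inline float g_schlick_ggx(float dotN, float k) {
    return dotN / (dotN * (1.0F - k) + k);
}

// Remapping usually used for direct (analytic) lights: k = (r + 1)^2 / 8
inline float k_direct(float roughness) {
    float r = roughness + 1.0F;
    return (r * r) / 8.0F;
}

// Remapping usually used for image based lighting: k = r^2 / 2
inline float k_ibl(float roughness) {
    return (roughness * roughness) / 2.0F;
}

// Smith method: product of the light term and the view term.
inline float g_smith(float dotNL, float dotNV, float k) {
    assert(k >= 0.0F && k <= 1.0F);
    return g_schlick_ggx(dotNL, k) * g_schlick_ggx(dotNV, k);
}
```

Both remappings give the same k at roughness 1.0 and differ the most on smooth surfaces, which would be consistent with the small visual difference I noticed above.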

Need to get this change in the code to work (removing the depth attachment, and setting the color attachment format to rg32f):

    auto& rttpass =
        eng->create_rtt_pass({.width = BRDF_LUT_DIM,
                              .height = BRDF_LUT_DIM,
                              //   .with_depth = true,
                              .color_clear_value = {1.0F, 0.0F, 0.0F, 1.0F},
                              .format = TextureFormat::RG32Float});

Initially, this was producing the following validation error:

2023-08-03 17:02:36.606859 [ERROR] Dawn: Validation error: Attachment state of [RenderPipeline] is not compatible with [RenderBundleEncoder].
[RenderBundleEncoder] expects an attachment state of { colorFormats: [TextureFormat::BGRA8Unorm], sampleCount: 1 }.
[RenderPipeline] has an attachment state of { colorFormats: [TextureFormat::BGRA8Unorm], depthStencilFormat: TextureFormat::Depth24PlusStencil8, sampleCount: 1 }.
 - While encoding [RenderBundleEncoder].SetPipeline([RenderPipeline]).
 - While calling [RenderBundleEncoder].Finish([null]).

Lesson 1: f32 textures are not filterable in WebGPU for now (see https://doc.babylonjs.com/setup/support/webGPU/webGPULimitations).

Lesson 2: Another important note: when targeting an f32 format texture, we cannot enable blending in the render pipeline, as this leads to a validation error:

2023-08-03 20:32:20.150613 [ERROR] Dawn: Validation error: Blending is enabled but color format (TextureFormat::RG32Float) is not blendable.
 - While validating targets[0].
 - While validating fragment state.
 - While calling [Device].CreateRenderPipeline([RenderPipelineDescriptor]).

2023-08-03 20:32:20.150638 [ERROR] Dawn: Validation error: [Invalid RenderPipeline] is invalid.
 - While encoding [RenderBundleEncoder].SetPipeline([Invalid RenderPipeline]).
 - While calling [RenderBundleEncoder].Finish([null]).

Lesson 3: What I also just learned is that you cannot just disable blending by setting the source and destination factors like this:

    auto srcFactor = enabled ? BlendFactor::SrcAlpha : BlendFactor::One;
    auto dstFactor =
        enabled ? BlendFactor::OneMinusSrcAlpha : BlendFactor::Zero;

Instead, we have to effectively connect/disconnect the blend states from the target description with this:

    _colorTargets[index].blend = enabled ? &state : nullptr;

Ohhh… And I just realized that the reference implementation still uses the rgba8unorm format to compute the BRDF LUT 😅, so I've been fighting with f32 target texture support for nothing here 🤣 [Well, not really for nothing of course: this will eventually be useful, I'm sure of it.]:

// Generate a BRDF integration map used as a look-up-table (stores roughness /
// NdotV)
static void generate_brdf_lut(wgpu_context_t* wgpu_context)
{
  const WGPUTextureFormat format = WGPUTextureFormat_RGBA8Unorm;
  const int32_t dim              = (int32_t)BRDF_LUT_DIM;

  // Texture dimensions
  WGPUExtent3D texture_extent = {
    .width              = dim,
    .height             = dim,
    .depthOrArrayLayers = 1,
  };

Note: When building a texture view on a cubemap texture (or any other textures really) I'm wondering what would be the effect of leaving the mipLevelCount entry to WGPU_MIP_LEVEL_COUNT_UNDEFINED 🤔 ⇒ I should test that eventually, but for now, just retrieving that value from the texture itself instead:

                    TextureViewDescriptor desc = {
                        .dimension = entry.cubemap
                                         ? TextureViewDimension::Cube
                                         : TextureViewDimension::Undefined,
                        .mipLevelCount = entry.texture.GetMipLevelCount(),
                        .arrayLayerCount =
                            entry.texture.GetDepthOrArrayLayers(),
                    };
                    auto view = entry.texture.CreateView(&desc);
                    set_texture(location++, view);
                    break;

The source implementation in C uses an intermediate “offscreen” rendering cubemap target:

  // Framebuffer for offscreen rendering
  struct {
    WGPUTexture texture;
    WGPUTextureView texture_views[6 * (uint32_t)IRRADIANCE_CUBE_NUM_MIPS];
  } offscreen;

⇒ I'm wondering if I really need that and could not render to the irradiance cubemap directly instead 🤔?

Added support to build a render pass on different layers/mipLevel of an existing 3D texture:

            auto& rttpass = eng->create_rtt_pass(
                {.width = IRRADIANCE_CUBE_DIM / factor,
                 .height = IRRADIANCE_CUBE_DIM / factor,
                 //  .color_clear_value = {0.0F, 0.0F, 0.0F, 1.0F},
                 .color_clear_value = {(F32)lIdx / (F32)(array_layer_count - 1),
                                       (F32)i / (num_mips - 1), 0.0F, 1.0F},
                 .format = format,
                 .with_blend = false,
                 .target_texture = irradiance_tex,
                 .array_layer = lIdx,
                 .mip_level = i});

Lesson: When setting up a render pass this way (e.g. using an existing texture), I must ensure that the width/height specified for the render pass match the actual dimensions of the color target attachment (taking the mip level into account).
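To make the “taking the mip level into account” part concrete, here is a small sketch of how the attachment size for a given mip level can be computed. The helper names are hypothetical (they are not NervLand functions), and the base size of 64 used in the usage note is only an example value:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Size in texels of a given mip level, for a texture whose level 0 is
// `base_dim` texels wide: each level halves the previous one, down to 1.
inline std::uint32_t mip_dim(std::uint32_t base_dim, std::uint32_t mip_level) {
    assert(base_dim > 0 && mip_level < 32);
    return std::max<std::uint32_t>(1U, base_dim >> mip_level);
}

// Number of levels in a full mip chain: floor(log2(base_dim)) + 1.
inline std::uint32_t full_mip_count(std::uint32_t base_dim) {
    assert(base_dim > 0);
    std::uint32_t count = 0;
    while (base_dim > 0) {
        ++count;
        base_dim >>= 1;
    }
    return count;
}
```

With a 64x64 cubemap face, for instance, mip_dim(64, 3) gives an 8x8 attachment, and full_mip_count(64) gives 7 levels per face.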

Reading more on this page to try to understand the idea here: https://learnopengl.com/PBR/IBL/Diffuse-irradiance

Initial tests done with rendering the initial cubemap into irradiance cubemap, then showing the irradiance cubemap with:

    rpass1.render_node(0, "tests/skybox", DrawCall{3}, "camera",
                       BindLinearSampler, texCube(irradianceTex));

At first I was just using the projection matrix as mvp (effectively only displaying the forward face (+Z)):

Next fixing the mvp matrices. OK!

Note: to display a specific mip level we can use the WGSL function call:

var col = textureSampleLevel(ourTexture, ourSampler, uvVec, 0);

And the final irradiance map will show as this:

The current implementation is as follows:

    for (U32 lIdx = 0; lIdx < array_layer_count; ++lIdx) {

        for (U32 i = 0; i < num_mips; ++i) {

            // Create a render pass on that view:
            // Note: the width/height parameters below should really match the
            // actual color attachment view size, otherwise, bad things seem to
            // happen
            auto& rttpass = eng->create_rtt_pass(
                {//  .color_clear_value = {0.0F, 0.0F, 0.0F, 1.0F},
                 .color_clear_value = {(F32)lIdx / (F32)(array_layer_count - 1),
                                       (F32)i / (num_mips - 1), 0.0F, 1.0F},
                 .format = format,
                 .with_blend = false,
                 .target_texture = irradiance_tex,
                 .array_layer = lIdx,
                 .mip_level = i});
            // const char* tname = "tests/cmaps/yokohama_rgba";
            const char* tname = "tests/cmaps/bridge2";
            rttpass.render_node(std_vfmt_ipnu, "base/gen_irradiance_cube",
                                "tests/cube", makeUBO(mvps[lIdx]),
                                BindLinearSampler, texCube(tname));
        }
    }

⇒ So we are building one dedicated render node for each pass ⇒ one dedicated render pipeline per pass, and we have 6*7=42 passes to generate the irradiance map with all mip levels 😲!

What I'm thinking is that we could (I think 🤔?) instead create a single RenderNode (so a single pipeline) and a single BindGroup along the way, and share them between all passes. Let's try that! (⇒ In fact, this is how it is done in the original implementation from Samuel's webgpu-native-examples repository, so this should definitely work 😉)

Now, remember the DynamicBuffer class we introduced in the last session? I now think I could also extend it to simplify dynamic offsets handling too 😄! Let's think about it for a moment… ⇒ Done 👍!

And now, here is the updated code for the irradiance generation, using a single RenderNode in the process:

    // Prepare the mvp  matrices for each face:
    Mat4f proj = Mat4f::perspective(toRad(90.0), 1.0, 0.01, 100.0);

    UniformBuffer<Mat4f, 6> mvpU{{
        // Positive X:
        proj * Mat4f::rotate(toRad(-90.0F), {0.0F, 1.0F, 0.0F}),
        // Negative X:
        proj * Mat4f::rotate(toRad(90.0F), {0.0F, 1.0F, 0.0F}),
        // Positive Y:
        proj * Mat4f::rotate(toRad(90.0F), {1.0F, 0.0F, 0.0F}),
        // Negative Y:
        proj * Mat4f::rotate(toRad(-90.0F), {1.0F, 0.0F, 0.0F}),
        // Positive Z:
        proj,
        // Negative Z:
        proj * Mat4f::rotate(toRad(180.0F), {0.0F, 1.0F, 0.0F}),
    }};

    const char* tname = "tests/cmaps/bridge2";
    //  const char* tname = "tests/cmaps/yokohama_rgba";

    // Prepare a shared RenderNode:
    // In this render node we define a single drawable and we will update its
    // dynamic offsets for each pass.
    WGPURenderNode rnode;
    rnode.reset(std_vfmt_ipnu, "base/gen_irradiance_cube")
        .enable_depth(false)
        .set_blend_state({.enabled = false})
        .add("tests/cube", mvpU.as_ubo(), BindLinearSampler, texCube(tname));

    // We need to render to each layer (count=6) and each mipmap level of this
    // irradiance texture:
    for (U32 lIdx = 0; lIdx < array_layer_count; ++lIdx) {

        for (U32 i = 0; i < num_mips; ++i) {

            // Create a render pass on that view:
            // Note: the width/height parameters below should really match the
            // actual color attachment view size, otherwise, bad things seem to
            // happen
            auto& rttpass = eng->create_rtt_pass(
                {//  .color_clear_value = {0.0F, 0.0F, 0.0F, 1.0F},
                 .color_clear_value = {(F32)lIdx / (F32)(array_layer_count - 1),
                                       (F32)i / (num_mips - 1), 0.0F, 1.0F},
                 .format = format,
                 .with_blend = false,
                 .target_texture = irradiance_tex,
                 .array_layer = lIdx,
                 .mip_level = i});

            // Dyn offsets are for the binding group 0, of drawable 0, and we
            // only have 1 buffer with such offsets below:
            rnode.set_dyn_offsets(0, {{0, {mvpU.get_dyn_offset(lIdx)}}});
            rnode.add_bundle_to(rttpass);

            // rttpass.render_node(std_vfmt_ipnu, "base/gen_irradiance_cube",
            //                     "tests/cube", makeUBO(mvps[lIdx]),
            //                     BindLinearSampler, texCube(tname));
        }
    }
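A quick note on the dynamic offsets used above: I'm not showing the actual buffer code here, but the idea is that each element of the uniform buffer must start at an offset which is a multiple of the device's minUniformBufferOffsetAlignment limit (256 bytes by default in WebGPU). The sketch below only illustrates that computation; align_up and dyn_offset are hypothetical helpers, not the real get_dyn_offset() implementation:

```cpp
#include <cassert>
#include <cstdint>

// Default value of the WebGPU limit `minUniformBufferOffsetAlignment`.
// Real code should query the device limits instead of hardcoding it.
constexpr std::uint64_t kMinUniformOffsetAlignment = 256;

// Round `size` up to the next multiple of `alignment` (a power of two).
constexpr std::uint64_t align_up(std::uint64_t size, std::uint64_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Byte offset of element `index` when each element gets one aligned slot.
inline std::uint32_t dyn_offset(
    std::uint32_t index, std::uint64_t elem_size,
    std::uint64_t alignment = kMinUniformOffsetAlignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<std::uint32_t>(index * align_up(elem_size, alignment));
}
```

So for six 4x4 float matrices (64 bytes each), the slots would sit at offsets 0, 256, 512 and so on, which means the buffer is larger than 6*64 bytes but a single bind group can address every face.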

Final note: In the end, I'm not using an “offscreen” intermediate texture as in our reference C implementation, and this still seems to be OK, as I could display the different mip levels and clearly notice the decrease of the resolution, so I think we should be good on this point.

Hmmm 🤔 Just noticed something is wrong in my optimization above, as I seem to still be rebuilding the pipeline for each render pass:

2023-08-05 12:58:18.249693 [DEBUG] Building pipeline...
2023-08-05 12:58:18.267450 [DEBUG] Rebuilding rendernode pipeline.
2023-08-05 12:58:18.267470 [DEBUG] Building pipeline...
2023-08-05 12:58:18.270070 [DEBUG] Rebuilding rendernode pipeline.
2023-08-05 12:58:18.270082 [DEBUG] Building pipeline...
2023-08-05 12:58:18.272425 [DEBUG] Rebuilding rendernode pipeline.

⇒ That should not be needed. Let's clarify what's happening.

Arrff, yes, that's due to the fact that I'm rebuilding the pipeline in case there is a chance that the shader preprocessor definitions changed:

    if ( (vertDefs != nullptr || fragDefs != nullptr) && _pipeline != nullptr) {
        logDEBUG("Rebuilding rendernode pipeline.");
        _pipeline = nullptr;
    }

But in fact I should really check here if the content of the definitions changed since the last pipeline build.
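One possible way to do that check is sketched below: keep a copy of the definitions used for the last build and only rebuild when the content differs. This is only an illustration under my own assumptions: the Definitions type (a list of strings) and the PipelineDefsCache helper are hypothetical, and the real type of vertDefs/fragDefs may be different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed representation of the preprocessor definitions.
using Definitions = std::vector<std::string>;

// Hypothetical helper: remembers the definitions used for the last build.
struct PipelineDefsCache {
    bool built = false;
    Definitions vertDefs;
    Definitions fragDefs;

    // A null pointer is treated as "no definitions".
    static Definitions copy_or_empty(const Definitions* defs) {
        return defs != nullptr ? *defs : Definitions{};
    }

    // True if no pipeline was built yet, or if the content changed.
    bool needs_rebuild(const Definitions* vert, const Definitions* frag) const {
        return !built || copy_or_empty(vert) != vertDefs ||
               copy_or_empty(frag) != fragDefs;
    }

    // To be called right after (re)building the pipeline.
    void mark_built(const Definitions* vert, const Definitions* frag) {
        vertDefs = copy_or_empty(vert);
        fragDefs = copy_or_empty(frag);
        built = true;
        assert(!needs_rebuild(vert, frag));
    }
};
```

Comparing the full content costs a few string comparisons per call, but it avoids the false positives of the pointer-only test, where passing the same definitions again still triggers a rebuild.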

Added the initial shader file and processing function for the prefiltered environment map.

Introduced additional buffer to pass the roughness/numSamples data:

    struct PassData {
        F32 roughness{0.0F};
        U32 numSamples{32};
        // Padding
        U32 pad[2]{0};
    };

    UniformBuffer<PassData, PREFILTERED_CUBE_NUM_MIPS> pdata;
    for (U32 m = 0; m < num_mips; ++m) {
        pdata.data(m).roughness = (F32)m / (F32)(num_mips - 1);
    }

OK, I could successfully render the prefiltered environment map, displaying it the same way as the irradiance texture:

    rpass1.render_node(0, "tests/skybox", DrawCall{3}, "camera",
                       BindLinearSampler, texCube(filteredEnvTex));

Added exposure and gamma in SceneUBO struct:

struct SceneUBO {
    Vec4f lightPositions[4]{};
    Vec4f lightColors[4]{};
    U32 numLights{0};
    F32 exposure{4.5F};
    F32 gamma{2.2F};
    U32 flags{0};
};

And I eventually got my first PBR IBL rendering working 🥳!

Yet, looking carefully at the rendering above while dynamically changing the roughness, I noticed one strange thing (when using the pre-filtered envmap?): the resolution seems pretty low/incorrect. I think this could be related to not having the mipmaps for the original cubemap, so let's check that.

Hmmm 🤔… I've been testing on the prefilter shader, and there is something going wrong there 😵‍💫. I've been testing with this code:

			// Filtering based on https://placeholderart.wordpress.com/2015/07/28/implementation-notes-runtime-environment-map-filtering-for-image-based-lighting/

			var dotNH: f32 = clamp(dot(N, H), 0.0, 1.0);
			var dotVH: f32 = clamp(dot(V, H), 0.0, 1.0);

			// Probability Distribution Function
			var pdf: f32 = D_GGX(dotNH, roughness) * dotNH / (4.0 * dotVH) + 0.0001;
			// Solid angle of current sample
			var omegaS: f32 = 1.0 / (f32(pdata.numSamples) * pdf);
			// Solid angle of 1 pixel across all cube faces
			var omegaP: f32 = 4.0 * PI / (6.0 * envMapDim * envMapDim);
			// Biased (+1.0) mip level for better result
			// var mipLevel: f32 = roughness == 0.0 ? 0.0 : max(0.5 * log2(omegaS / omegaP) + 1.0, 0.0f);
			var mipLevel: f32 = select(max(0.5 * log2(omegaS / omegaP) + 1.0, 0.0f), 0.0f, roughness == 0.0);
			
			// color += textureLod(samplerCube(textureEnv, samplerEnv), L, mipLevel).rgb * dotNL;
			// color += textureSampleLevel(ourTexture, ourSampler, L, 0).xyz * dotNL;
			if(mipLevel > 0.1) {
				color += vec3(1.0,0.0,0.0) * dotNL;
				totalWeight += dotNL;
			}
			else {
				// color += textureSampleLevel(ourTexture, ourSampler, L, mipLevel).xyz * dotNL;
				color += vec3(0.0,0.0,0.0) * dotNL;
				totalWeight += dotNL;
			}

And all I get with this is a completely black filtered envmap 😲 which means the mipLevel value is wrong. In fact, I also tested with if(mipLevel>0.0) and I still get no red output, which means that the mipLevel is always 0.

Checking further it seems actually my roughness parameter is always 0 itself.

Arrggghh 🙄 Stupid me: I had forgotten to call update() after changing the roughness parameters on the CPU side:

    struct PassData {
        F32 roughness{0.0F};
        U32 numSamples{1024};
        // Padding
        U32 pad[2]{0};
    };

    UniformBuffer<PassData, PREFILTERED_CUBE_NUM_MIPS> pdata;
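    for (U32 m = 0; m < num_mips; ++m) {
        pdata.data(m).roughness = (F32)m / (F32)(num_mips - 1);
    }
    // Push the modified CPU-side values to the GPU buffer
    // (this is the call I had forgotten):
    pdata.update();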

Alright! It looks much better now that this is fixed:

But I still think there is a final step required that I'm missing here as I'm still not providing mipmaps for the original cubemap, and the prefiltering shader will try to access those mipmaps on this line:

color += textureSampleLevel(ourTexture, ourSampler, L, mipLevel).xyz * dotNL;
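To get a feeling for which mip levels that line will actually request, the mip level selection from the shader above can be reproduced on the CPU. This is a sketch only: prefilter_mip_level is a hypothetical helper mirroring the WGSL code, and the values used in the usage note (pdf of 0.1, 32 samples, a 512 texel wide cubemap face) are arbitrary examples, not values measured in my scene.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

constexpr float kPi = 3.14159265358979F;

// Mip level used for one importance sample in the prefilter pass.
// `pdf` is the value of the probability distribution function for the
// sample, `envMapDim` the width in texels of one face of the source cubemap.
inline float prefilter_mip_level(float roughness, float pdf,
                                 unsigned numSamples, float envMapDim) {
    assert(numSamples > 0 && envMapDim > 0.0F && pdf > 0.0F);
    if (roughness == 0.0F) {
        return 0.0F;  // Perfect mirror: always read the base level.
    }
    // Solid angle covered by the current sample
    float omegaS = 1.0F / (static_cast<float>(numSamples) * pdf);
    // Solid angle of 1 pixel across all cube faces
    float omegaP = 4.0F * kPi / (6.0F * envMapDim * envMapDim);
    // Biased (+1.0) mip level, clamped to the base level
    return std::max(0.5F * std::log2(omegaS / omegaP) + 1.0F, 0.0F);
}
```

With the example values, prefilter_mip_level(0.5F, 0.1F, 32, 512.0F) lands well above level 0, so the sampled cubemap really needs a valid mip chain for the prefiltering to work as intended.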

OK! Now just fixed the mipmap generation for cubemaps too 🥳! And the results seem as good as I could expect them to be now: https://twitter.com/magik_engineer/status/1688131749656604672?s=20

  • Last modified: 2023/08/10 10:34