blog:2024:0118_precomputed_atm_unit_tests

NervLand: Precomputed Atmospheric Scattering Unit tests

End of last year I completed the initial port of the Precomputed Atmospheric Scattering algorithm in my WebGPU engine. But considering the complexity of this algorithm, before I even try to use it in a scene I feel I should really build some robust unit tests to validated all the computation steps. So this will be our goal here.

Youtube video for this article available at:

Reconsidering the differences we get on WGSL/C++ sides in the GetLayerDensity() function, and now trying to use an approximate function to replace the exp() builtin function:

static auto GetLayerDensity(const DensityProfileLayer& layer, F32 altitude)
    -> F32 {
    // F32 density = layer.exp_term * exp(layer.exp_scale * altitude) +
    //               layer.linear_term * altitude + layer.constant_term;
    F32 density = layer.exp_term * approx_exp(layer.exp_scale * altitude) +
                  layer.linear_term * altitude + layer.constant_term;

    return clamp(density, 0.0F, 1.0F);
}

And we use:

/* Polynomial approximation for exp(x) */
inline auto approx_exp(F32 x) -> F32 {
    const F32 c1 = 1.0F;
    const F32 c2 = 0.5F;
    const F32 c3 = 1.0F / 6.0F;
    const F32 c4 = 1.0F / 24.0F;

    F32 result =
        1.0F + c1 * x + c2 * x * x + c3 * x * x * x + c4 * x * x * x * x;

    return result;
}

(Also added the same changes in the WGSL code of course)

But this didn't help, we still observe a large difference for very small values:

2024-01-06 00:02:04.131762 [NOTE] Generating CPU transmittance data...
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(231): error: in "atmospheremodel/test_transmittance_computation": difference{0.158059} between val{1.61398049e-13} and transmittance.z(){1.39369419e-13} exceeds 0.100000001
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(231): error: in "atmospheremodel/test_transmittance_computation": difference{0.175733} between val{1.12435594e-14} and transmittance.z(){9.56302484e-15} exceeds 0.100000001
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(231): error: in "atmospheremodel/test_transmittance_computation": difference{0.103269} between val{1.32476813e-16} and transmittance.z(){1.20076635e-16} exceeds 0.100000001

The computations done in the AtmosphereModel will use only a few ComputePasses and perform multiple computation steps in those passes. Thus, if we want to examinate the content written in a given target texture, we must do so before that content is updated again, which means that we need to request the copy of those intermediate results into additional debug textures that we can analyze later on.

To achieve this, I'm now creating the internal helper class ComputeInspector which I may use to inject additional computation step when recording the full computation passes:

    /* Compute inspector used to inject debug operations in the compute passes
     */
    class ComputeInspector : public RefObject {
      public:
        virtual void onStepAdded(PrecomputeStep step, WGPUComputePass& cpass,
                                 const ComputeContext& ctx,
                                 I32 scattering_order) = 0;
    };

This class will then be notify after each precomputation step is registered, using a list of precomutation step ids:

    enum PrecomputeStep {
        STEP_TRANSMITTANCE,
        STEP_FINAL_TRANSMITTANCE,
        STEP_DIRECT_IRRADIANCE,
        STEP_SINGLE_SCATTERING,
        STEP_SCATTERING_DENSITY,
        STEP_COPY_PREV_IRRADIANCE,
        STEP_INDIRECT_IRRADIANCE,
        STEP_COPY_PREV_SCATTERING,
        STEP_MULTIPLE_SCATTERING,
    };

For instance, here are part of the updates in the Precompute() method:

        for (U32 lIdx = 2; lIdx < ctx.num_scattering_orders; ++lIdx) {
            cnode->add({ngrpsx0, ngrpsy0}, {bgrp0},
                       {{0, {scatteringOrderU.get_dyn_offset(lIdx - 2)}}});
            cnode2->add({ngrpsx1, ngrpsy1}, {bgrp2},
                        {{0, {scatteringOrderU.get_dyn_offset(lIdx - 2)}}});

            // Add the compute steps in the compute pass interleaved:
            cpass->add_compute_step(*cnode, lIdx - 2);
            onStepAdded(STEP_SCATTERING_DENSITY, *cpass, ctx, (I32)lIdx);

            // Copy prev irradiance:
            cpass->add_compute_step(*cnode1, 0);
            onStepAdded(STEP_COPY_PREV_IRRADIANCE, *cpass, ctx, (I32)lIdx);
            cpass->add_compute_step(*cnode2, lIdx - 2);
            onStepAdded(STEP_INDIRECT_IRRADIANCE, *cpass, ctx, (I32)lIdx);

            // Copy prev scattering:
            cpass->add_compute_step(*cnode3, 0);
            onStepAdded(STEP_COPY_PREV_SCATTERING, *cpass, ctx, (I32)lIdx);
            cpass->add_compute_step(*cnode4, 0);
            onStepAdded(STEP_MULTIPLE_SCATTERING, *cpass, ctx, (I32)lIdx);
        }

To perform the texture copy operations in our unit tests, my idea is to simply add an helper function on the WGPUComputePass to request a texture copy compute step. And to achieve this, I think the optimal way would be to keep a list of the required WGPUComputeNodes in the engine itself and inject the corresponding compute step in the given pass as need. Thus, I'm now storing such a list of compute nodes in the WGPUEngine class, as follow:

    /* Custom key for CopyTexture nodes */
    struct CopyNodeKey {
        wgpu::TextureDimension dimemsion;
        wgpu::TextureFormat format;

        auto operator==(const CopyNodeKey& other) const -> bool {
            return std::tie(dimemsion, format) ==
                   std::tie(other.dimemsion, other.format);
        }
    };

    struct CopyNodeKeyHash {
        auto operator()(const CopyNodeKey& key) const -> std::size_t {
            // Use std::hash for each integer and combine the hash values
            return std::hash<int>()((int)key.dimemsion) ^
                   (std::hash<int>()((int)key.format) << 1);
        }
    };

    using CopyTextureNodeMap =
        UnorderedMap<CopyNodeKey, RefPtr<WGPUComputeNode>, DefaultPoolAllocator,
                     CopyNodeKeyHash>;

    /** texture copy compute nodes */
    CopyTextureNodeMap _copyTexNodes;

Yet, quickly after that, while working on the WGPUComputePass::copy_texture() method I realized that it was probably a better idea to create a new compute node everytime we need one, as we are going to add a compute step in it and this could lead to synchronization issues in a multithreaded context.

Thus now replacing the get_copy_texture_node() method with a create_copy_texture_node() method.

Next step was to write the copy_texture shader to be able to use it in the WGPUComputePass::copy_texture() method:

// This shader is used to copy either 2D or 3D textures from a source to a destination:

#ifdef NV_COPY_2D
@group(0) @binding(0) var sourceTex : texture_2d<f32>;
@group(0) @binding(1) var destTex : texture_storage_2d<${OUTPUT_FMT()},write>;
#endif

#ifdef NV_COPY_3D
@group(0) @binding(0) var sourceTex : texture_3d<f32>;
@group(0) @binding(1) var destTex : texture_storage_3d<${OUTPUT_FMT()},write>;
#endif

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) id : vec3<u32>)
{
    // Get the size of the source texture:
#ifdef NV_COPY_2D
    let tsize: vec2<u32> = textureDimensions(sourceTex);
#endif

#ifdef NV_COPY_3D
    let tsize: vec3<u32> = textureDimensions(sourceTex);
#endif

    // Check for out of bounds:
    if (id.x >= tsize.x || id.y >= tsize.y) {
        // don't do anything :
        return;
    }

#ifdef NV_COPY_2D
    // Copy a single pixel: 
    var val: vec4f = textureLoad(sourceTex, id.xy, 0);
    textureStore(destTex, id.xy, val);
#endif

#ifdef NV_COPY_3D
    // Copy multiple pixels:
    for(var layer: u32 = 0; layer < tsize.z; layer++) {
        var uvw = vec3u(id.xy, layer);
        var val: vec4f = textureLoad(sourceTex, uvw, 0);
        textureStore(destTex, uvw, val);
    }
#endif
}

I just using a single shader file for 2D/3D textures copy, that's more convinient.

Eventually I will need to download texture data at runtime, so I will need that process to be optimized and performed async, so let's just update that right now.

⇒ Hmmm… it seems I have something going wrong in my updated version, time to go for a debugging session. OK fixed.

I can now use this async call to get the texture data:

    // BOOST_CHECK_EQUAL(eng->debug_read_texture_data(transmittance_tex, data),
    //                   true);
    eng->read_texture_data(
        transmittance_tex,
        [](const U8* /*data*/, U64 /*bytesPerRow*/, U64 /*stride*/,
           U64 /*height*/) { logNOTE("I'm in the buffer reader callback!"); },
        &data);
    eng->wait_idle();

So from there I can remove the method WGPUEngine::debug_read_texture_data()

I then added the helper methood readTextureData in our custom Inspector class:

auto Inspector::readTextureData(const String& tname) -> TextureData {
    auto* eng = WGPUEngine::instance();
    auto it = _textures.find(tname);
    NVCHK(it != _textures.end(), "Invalid texture for {}", tname);

    TextureData tdata;

    U64 texBytesPerRow = eng->read_texture_data_sync(it->second, tdata.data);
    tdata.height = it->second.GetHeight();
    tdata.width = it->second.GetWidth();
    tdata.pixelSize = texBytesPerRow / tdata.width;

    return tdata;
}

In our unit test, we then updated the storage of the texture copy, in custom map indexed by the precompute_iteration index, the step and the scattering_order:

class Inspector : public AtmosphereModel::ComputeInspector {
  public:
    void on_step_added(nv::AtmosphereModel::PrecomputeStep step,
                       WGPUComputePass& cpass,
                       const nv::AtmosphereModel::ComputeContext& ctx,
                       I32 scattering_order) override;

    auto read_texture_data(U32 idx, AtmosphereModel::PrecomputeStep step,
                           I32 scattering_order = -1) -> TextureData;

    auto get_num_iterations() const -> U32 { return _numIterations; }

  protected:
    struct TexDataKey {
        U32 iteration;
        AtmosphereModel::PrecomputeStep step;
        I32 scattering_order;

        auto operator==(const TexDataKey& other) const -> bool {
            return std::tie(iteration, step, scattering_order) ==
                   std::tie(other.iteration, other.step, scattering_order);
        }
    };

    struct TexDataKeyHash {
        auto operator()(const TexDataKey& key) const -> std::size_t {
            // Use std::hash for each integer and combine the hash values
            return std::hash<int>()((int)key.iteration) ^
                   (std::hash<int>()((int)key.step) << 1) ^
                   (std::hash<int>()((int)key.scattering_order) << 2);
        }
    };

    using TexDataMap = UnorderedMap<TexDataKey, wgpu::Texture,
                                    DefaultPoolAllocator, TexDataKeyHash>;

    /** texture copy compute nodes */
    TexDataMap _texData;

    U32 _numIterations{0};
};

Then updated our test code to check also all the initial transmittance textures:

    {
        logNOTE("Checking final transmittance data...");
        timer.start();
        auto transmittance_data = inspector->read_texture_data(
            niters - 1, AtmosphereModel::STEP_FINAL_TRANSMITTANCE);

        AtmosphereParameters ATMO =
            model->get_parameters(transmittance_data.lambdas);
        check_transmittance_data(transmittance_data, ATMO);
        elapsed = timer.stop();
        logNOTE("Final transmittance data checked in {} secs.", elapsed);
    }

    // Now for each iteration, we should check multiple steps:
    for (U32 i = 0; i < niters; ++i) {
        // Check the initial transmittance:
        {
            logNOTE("Checking transmittance data on iteration {}", i);
            timer.start();
            auto transmittance_data = inspector->read_texture_data(
                i, AtmosphereModel::STEP_TRANSMITTANCE);

            AtmosphereParameters ATMO =
                model->get_parameters(transmittance_data.lambdas);
            check_transmittance_data(transmittance_data, ATMO);
            elapsed = timer.stop();
            logNOTE("Transmittance data checked in {} secs.", elapsed);
        }
    }

And this produced some tolerance errors:

2024-01-07 15:11:30.642860 [NOTE] Final transmittance data checked in 0.8599714000010863 secs.
2024-01-07 15:11:30.643429 [NOTE] Checking transmittance data on iteration 0
2024-01-07 15:11:30.666096 [NOTE] Full data size is: 262144 bytes
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.298393} between val{1.49441518e-11} and transmittance.x(){1.15097333e-11} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(421): error: in "atmospheremodel/test_atm_computations": difference{0.214231} between val{1.10447198e-08} and transmittance.y(){9.09605813e-09} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.333757} between val{1.26689756e-13} and transmittance.x(){9.49871353e-14} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(421): error: in "atmospheremodel/test_atm_computations": difference{0.23879} between val{3.38483908e-10} and transmittance.y(){2.73237571e-10} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.208635} between val{2.63333279e-16} and transmittance.x(){2.17876568e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.219826} between val{2.34897906e-16} and transmittance.x(){1.925668e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.224055} between val{2.31974716e-16} and transmittance.x(){1.89513365e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.228572} between val{2.27903691e-16} and transmittance.x(){1.85502902e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.238004} between val{2.23281564e-16} and transmittance.x(){1.80356118e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.242368} between val{2.22748489e-16} and transmittance.x(){1.79293542e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.247469} between val{2.22805638e-16} and transmittance.x(){1.78606122e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.249812} between val{2.22911702e-16} and transmittance.x(){1.78356247e-16} exceeds 0.200000003
D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.251829} between val{2.23194796e-16} and transmittance.x(){1.78295022e-16} exceeds 0.200000003
2024-01-07 15:11:31.492312 [NOTE] Transmittance data checked in 0.8488814000738785 secs.
2024-01-07 15:11:31.492932 [NOTE] Checking transmittance data on iteration 1
2024-01-07 15:11:31.516769 [NOTE] Full data size is: 262144 bytes
2024-01-07 15:11:32.347726 [NOTE] Transmittance data checked in 0.854789900011383 secs.
2024-01-07 15:11:32.348298 [NOTE] Checking transmittance data on iteration 2
2024-01-07 15:11:32.370934 [NOTE] Full data size is: 262144 bytes
2024-01-07 15:11:33.183326 [NOTE] Transmittance data checked in 0.8350267000496387 secs.
2024-01-07 15:11:33.183894 [NOTE] Checking transmittance data on iteration 3
2024-01-07 15:11:33.205939 [NOTE] Full data size is: 262144 bytes
2024-01-07 15:11:34.015230 [NOTE] Transmittance data checked in 0.8313355999998748 secs.
2024-01-07 15:11:34.015799 [NOTE] Checking transmittance data on iteration 4
2024-01-07 15:11:34.041526 [NOTE] Full data size is: 262144 bytes
2024-01-07 15:11:34.860446 [NOTE] Transmittance data checked in 0.8446464999578893 secs.

⇒ So, increasing the tolerance value from 0.2 to 0.3 now.

Arrghh… even with that change I still have an error left on one element:

D:/Projects/NervLand/tests/test_nvGPU/atmospheremodel_spec.cpp(418): error: in "atmospheremodel/test_atm_computations": difference{0.333757} between val{1.26689756e-13} and transmittance.x(){9.49871353e-14} exceeds 0.300000012

So, that's not good 😑. And now I think I should try to find another way to proceed to fix those mismatches. All the mismatches are detected for very small transmittance values, so maybe those values are not even meaningful in the end ? So let's check the statistics for the transmittance first.

First I added a few helper functions to compute some base statistics:

template <typename T> inline auto mean(T* ptr, U64 num) -> T {

    if (num == 0) {
        return 0.0;
    }

    T meanVal = 0.0;

    for (U64 i = 0; i < num; ++i) {
        meanVal += (*ptr++);
    }

    return meanVal / (T)num;
}

template <typename T> inline auto mean(T* ptr, U64 num, T& mini, T& maxi) -> T {

    if (num == 0) {
        return 0.0;
    }

    T meanVal = 0.0;
    mini = ptr[0];
    maxi = ptr[0];

    for (U64 i = 0; i < num; ++i) {
        auto val = (*ptr++);
        if (val < mini) {
            mini = val;
        }
        if (val > maxi) {
            maxi = val;
        }

        meanVal += val;
    }

    return meanVal / (T)num;
}

template <typename T> inline auto stddev(T* ptr, U64 num) -> T {

    if (num == 0) {
        return 0.0;
    }

    T meanVal = mean(ptr, num);
    T stddevVal = 0.0;

    for (U64 i = 0; i < num; ++i) {
        T diff = (*ptr++) - meanVal;
        stddevVal += diff * diff;
    }

    return std::sqrt(stddevVal / (T)num);
}

template <typename T>
inline auto stddev(T* ptr, U64 num, T& meanVal, T& mini, T& maxi) -> T {

    if (num == 0) {
        return 0.0;
    }

    meanVal = mean(ptr, num, mini, maxi);
    T stddevVal = 0.0;

    for (U64 i = 0; i < num; ++i) {
        T diff = (*ptr++) - meanVal;
        stddevVal += diff * diff;
    }

    return std::sqrt(stddevVal / (T)num);
}

template <typename T>
inline auto percentile(const T* ptr, U64 num, T percentile) -> T {

    if (num == 0) {
        return 0.0;
    }

    // Ensure percentile is within [0, 100]
    percentile = std::max(0.0, std::min(100.0, percentile));

    // Create a copy of the array:
    Vector<T> sortedArray(ptr, ptr + num);

    // Sort the copied array
    std::sort(sortedArray.begin(), sortedArray.end());

    // Calculate the index corresponding to the given percentile
    F64 index = (percentile / 100.0) * (num - 1);

    // Compute the lower and upper indices for interpolation
    U64 lowerIndex = static_cast<U64>(std::floor(index));
    U64 upperIndex = static_cast<U64>(std::ceil(index));

    // Interpolate between the values at lower and upper indices
    T lowerValue = sortedArray[lowerIndex];
    T upperValue = sortedArray[upperIndex];
    F64 fraction = index - (F64)lowerIndex;

    return static_cast<T>((1.0 - fraction) * lowerValue +
                          fraction * upperValue);
}

I then added this code in our check_transmittance_data() function:

    F32 mini = 0.0F;
    F32 maxi = 0.0F;
    F32 mean = 0.0F;
    F32 dev = stddev(data_ptr, (U64)width * height * 4, mean, mini, maxi);
    F32 pval = percentile(data_ptr, (U64)width * height * 4, 0.01F);
    logNOTE("Transmittance statistics: mini={}, maxi={}, mean={}, dev={}, "
            "3sig_range=[{}, {}], 0.01% threshold={}",
            mini, maxi, mean, dev, mean - 3 * dev, mean + 3 * dev, pval);

And this produced outputs all similar to this one:

2024-01-07 16:22:22.506342 [NOTE] Transmittance statistics: mini=0, maxi=1, mean=0.3475563, dev=0.35492176, 3sig_range=[-0.7172089, 1.4123216], 0.01% threshold=0

So from there, it seems to me that very small transmittance value such as 1e-9 or lower could certainly be approximated to be zero, no 🤔?

⇒ let's try that to see if this will improve the GPU/CPU match.

Alright, if I set the lower transmittance threshold to a value of 1e-8, then I can reduce the tolerance value to 0.1 (but I cannot got much further that this for now, as I get other mismatch with non-small values if I decrease the tolerance more than that):

static auto
ComputeTransmittanceToTopAtmosphereBoundary(const AtmosphereParameters& ATMO,
                                            F32 r, F32 mu) -> Vec3f {
    auto res = exp(-(
        ATMO.rayleigh_scattering * ComputeOpticalLengthToTopAtmosphereBoundary(
                                       ATMO, ATMO.rayleigh_density, r, mu) +
        ATMO.mie_extinction * ComputeOpticalLengthToTopAtmosphereBoundary(
                                  ATMO, ATMO.mie_density, r, mu) +
        ATMO.absorption_extinction *
            ComputeOpticalLengthToTopAtmosphereBoundary(
                ATMO, ATMO.absorption_density, r, mu)));
    F32 thres = 1e-8;

    return componentMaximum(res, {thres, thres, thres});
}

Update: Hmmm, I don't know if I dreamt or something, but it seems that actually the best transmittance tolerance that I can reach is again 0.2, that's strange… 🤔.

Moving forward with the unit tests we will need to be able to sample some texture data on the CPU side. so I extended the TextureData class accordingly:

struct TextureData {
    Vector<U8> data;
    U32 width{};
    U32 height{};
    U32 depth{};
    U32 pixelSize{};
    Vec3d lambdas;
    Mat3d luminance_from_radiance;
    bool blend{};

    [[nodiscard]] auto sample_f32_linear_2d(Vec2f uv) const -> Vec4f;

    [[nodiscard]] auto sampleTexel(U32 x, U32 y) const -> Vec4f;
    [[nodiscard]] auto sampleTexel(Vec2u coords) const -> Vec4f {
        return sampleTexel(coords.x(), coords.y());
    }
};

auto TextureData::sampleTexel(U32 x, U32 y) const -> Vec4f {
    U32 numChannels = pixelSize / 4;

    // Get the start of the pixel:
    F32* ptr = (F32*)data.data();
    ptr += y * width * numChannels + x;
    Vec4f res;
    for (U32 i = 0; i < numChannels; ++i) {
        res[i] = *ptr++;
    }
    return res;
}

static auto fract(F32 x) -> F32 { return x - std::floor(x); }

auto TextureData::sample_f32_linear_2d(Vec2f uv) const -> Vec4f {
    F32 xsize = (F32)width;
    F32 ysize = (F32)height;
    Vec2f psize{1.0F / xsize, 1.0F / ysize};

    // To calculate the coordinates we need to remove half a pixel size to the
    // uv,
    Vec2f uv2 = Vec2f(uv.x() * xsize, uv.y() * ysize) - Vec2f(0.5F, 0.5F);

    // Take the floor of the coords:
    uv2 = uv2.floor();

    I32 xcoord = (I32)uv2.x();
    I32 ycoord = (I32)uv2.y();

    // Calculate the four texel coordinates for bilinear filtering
    Vec2u texel00 = Vec2u(std::max(xcoord, 0), std::max(ycoord, 0));
    Vec2u texel10 =
        Vec2u(std::min(xcoord + 1, (I32)width - 1), std::max(ycoord, 0));
    Vec2u texel01 =
        Vec2u(std::max(xcoord, 0), std::min(ycoord + 1, (I32)height - 1));
    Vec2u texel11 = Vec2u(std::min(xcoord + 1, (I32)width - 1),
                          std::min(ycoord + 1, (I32)height - 1));

    // Sample the four texels
    Vec4f color00 = sampleTexel(texel00);
    Vec4f color10 = sampleTexel(texel10);
    Vec4f color01 = sampleTexel(texel01);
    Vec4f color11 = sampleTexel(texel11);

    // Interpolate the colors using bilinear interpolation
    Vec4f color = mix(mix(color00, color10, fract(uv2.x())),
                      mix(color01, color11, fract(uv2.x())), fract(uv2.y()));

    return color;
}

First attempts are not working very well for the moment 😅… Investigating and debugging.

Hmmm 🤔… I seem to have a problem in my filtering function, when retrieving the values from the textures:

    // Sample the four texels
    // Vec4f color00 = sampleTexel(texel00);
    Vec4f color00 = sampleTexel({0, 0});
    return color00;

The code above doesn't work if I'm checking the texel00 values, but it works when I try random pixel locations, how could that be ? Puff, stupid error of course, I forgot to normalize the frag_coords by the texture dimensions:

    /* Sample the transmittance: */
    return transmittance
        .sample_f32_linear_2d(frag_coord / IRRADIANCE_TEXTURE_SIZE)
        .xyz();

Alright, after fixing this, the unit test implementation for direct irradiance went very smoothly.

At that point I moved the atm functions in a dedicated file atmo_functions.cpp to try to simplify the design.

While working on the single scattering unit tests I faced some additional trouble (as this was a first test with a 3D texture), and when I eventually got the unit test to start working correctly I realized that I was getting numerous very large differences due to the different level of precision and complex computations at play.

Here is the last help I wrote to compare Vec4f from a pixel for instance:

static void check_pixel(const Vec4f& vec1, const Vec4f& vec2, F32 tol,
                        F32 thres, F32 tol2) {

    F32 t0 =
        (std::abs(vec1.x()) < thres || std::abs(vec2.x()) < thres) ? tol2 : tol;
    F32 t1 =
        (std::abs(vec1.y()) < thres || std::abs(vec2.y()) < thres) ? tol2 : tol;
    F32 t2 =
        (std::abs(vec1.z()) < thres || std::abs(vec2.z()) < thres) ? tol2 : tol;
    F32 t3 =
        (std::abs(vec1.w()) < thres || std::abs(vec2.w()) < thres) ? tol2 : tol;

    if (vec1.x() != 0 && vec2.x() != 0) {
        BOOST_CHECK_CLOSE_FRACTION(vec1.x(), vec2.x(), t0);
    }
    if (vec1.y() != 0 && vec2.y() != 0) {
        BOOST_CHECK_CLOSE_FRACTION(vec1.y(), vec2.y(), t1);
    }
    if (vec1.z() != 0 && vec2.z() != 0) {
        BOOST_CHECK_CLOSE_FRACTION(vec1.z(), vec2.z(), t2);
    }
    if (vec1.w() != 0 && vec2.w() != 0) {
        BOOST_CHECK_CLOSE_FRACTION(vec1.w(), vec2.w(), t3);
    }
}

⇒ This is becomming too tricky and too slow, so i think I should try to compute some statistics instead.

So I finally switched to this kind of computation check:

    F32 meanVal = NAN;
    F32 mini = NAN;
    F32 maxi = NAN;
    F32 sdev =
        stddev(channels[0].data(), channels[0].size(), meanVal, mini, maxi);
    logNOTE("delta_rayleigh statistics: mini={}, maxi={}, mean={}, dev={}",
            mini, maxi, meanVal, sdev);

    BOOST_CHECK_SMALL(meanVal, 0.001F);
    BOOST_CHECK_SMALL(sdev, 0.02F);

And this takes a lot of time to do the computation on the CPU side, but it seems I get the expected type of results with that, such as:

2024-01-09 22:11:03.147162 [NOTE] Checking single scattering data on iteration 4
2024-01-09 22:12:11.139468 [NOTE] delta_rayleigh statistics: mini=-0.28173828, maxi=0.064453125, mean=-0.00016569764, dev=0.0050190063
2024-01-09 22:12:11.222059 [NOTE] Single scattering data checked in 68.07548150001094 secs.

And here are the outputs for all the delta rayleigh/mie Vec3 components:

2024-01-09 22:22:46.556132 [NOTE] Checking single scattering data on iteration 1
2024-01-09 22:23:54.425857 [NOTE] Single scattering channel 0 statistics: size=1048576, mini=-1.4091797, maxi=0.890625, mean=-0.00029711888, dev=0.019341089
2024-01-09 22:23:54.430108 [NOTE] Single scattering channel 1 statistics: size=1048576, mini=-1.0595703, maxi=0.6435547, mean=-0.000257808, dev=0.015144418        
2024-01-09 22:23:54.434277 [NOTE] Single scattering channel 2 statistics: size=1048576, mini=-0.8125, maxi=0.4921875, mean=-0.0002252653, dev=0.011956611
2024-01-09 22:23:54.438498 [NOTE] Single scattering channel 3 statistics: size=1048576, mini=-0.008010864, maxi=0.15132141, mean=1.5105066e-05, dev=0.0013182273   
2024-01-09 22:23:54.442898 [NOTE] Single scattering channel 4 statistics: size=1048576, mini=-0.013015747, maxi=0.14520264, mean=1.3490402e-05, dev=0.0013575512   
2024-01-09 22:23:54.446981 [NOTE] Single scattering channel 5 statistics: size=1048576, mini=-0.01940918, maxi=0.14141846, mean=1.0692577e-05, dev=0.0014051737    
2024-01-09 22:23:54.525407 [NOTE] Single scattering data checked in 67.96985949994996 secs.

I then added support to disable the single scattering validation as this is taking a lot of time:

#if 0
        // Check the single scattering:
        logNOTE("Checking single scattering data on iteration {}", i);
        timer.start();
        check_single_scattering(*model, *inspector, ATMO, i);
        elapsed = timer.stop();
        logNOTE("Single scattering data checked in {} secs.", elapsed);
#endif

There are many elements in WGSL that would confirm that the F32 computation precision is not the same as in C++ (cf. https://www.w3.org/TR/WGSL/#floating-point-evaluation)

This was another very large one… to the point where even statistical checking didn't really work with the 32 layers to be checked in the 3D scattering output texture.

So I had to temporarily reduce the SAMPLE_COUNT value to 1 instead of 16 in the function ComputeScatteringDensity and I also had to increase the threshold for the mean and stddev checks significantly as the differences between GPU and CPU are literally exploding:

    I32 idx = 0;
    for (auto& channel : channels) {
        F32 meanVal = NAN;
        F32 mini = NAN;
        F32 maxi = NAN;
        F32 sdev = stddev(channel.data(), channel.size(), meanVal, mini, maxi);
        logNOTE("Scattering density channel {} statistics: size={}, mini={}, "
                "maxi={}, mean={}, dev={}",
                idx, channel.size(), mini, maxi, meanVal, sdev);

        if (idx < 3) {
            BOOST_CHECK_SMALL(meanVal, 0.006F);
            BOOST_CHECK_SMALL(sdev, 0.6F);
        }

        idx++;
    }

⇒ I'm not giving another look into GetOneScattering() as it seems there could really be something going wrong there. But this didn't give anything new :-(

This is another one where the processing takes too long on the CPU side, so I had to reduce the number of samples in ComputeMultipleScattering from 50 to 5 during testing:

static auto ComputeMultipleScattering(const AtmosphereParameters& ATMO,
                                      const TextureData& transmittance,
                                      const TextureData& scattering_density,
                                      F32 r, F32 mu, F32 mu_s, F32 nu,
                                      bool ray_r_mu_intersects_ground)
    -> Vec3f {
    // f32 of intervals for the numerical integration.
    const I32 SAMPLE_COUNT = 5; // Instead of 50, FOR TESTING ONLY
    // The integration step, i.e. the length of each integration interval.
    F32 dx = DistanceToNearestAtmosphereBoundary(ATMO, r, mu,
                                                 ray_r_mu_intersects_ground) /
             F32(SAMPLE_COUNT);

    // Integration loop.
    Vec3f rayleigh_mie_sum{};

Now just added our atmosphere model in the plane_ortho example:

#if 1
    // Add the atmosphere:
    auto traits = AtmosphereModel::get_earth_traits({});

    logNOTE("Initializing AtmosphereModel...");
    auto model = create_ref_object<AtmosphereModel>(traits);
    model->init();

    logNOTE("Adding AtmosphereModel to scene.");
    scene.draw(model->get_render_node());
    logNOTE("Done adding AtmosphereModel to scene.");
#endif

Currently only producing a green/red/yellow screen aligned quad but the model is still initialized properly.

The first thing I would like to change on this is the usage of 6 vertices to draw a screen aligned quad: it's preferable to use only 3 vertices instead, so let's update this.

And this gives us this function in wgsl, which seems to work just fine:

/* Return the normalized UVs fro the vertices assuming we draw a single triangle
to fill the scree */
fn get_screen_triangle_pos(vertex_idx: u32) -> vec2f
{
    switch (vertex_idx%3) {
        case 0: { return vec2(0.0,0.0); }
        case 1: { return vec2(2.0,0.0); }
        default: { return vec2(0.0,2.0); }
    }
}

While trying to implement the functions for atmosphere sampling I got the following error:

[ERROR] Dawn: Validation error: Tint WGSL reader failure: :570:10 error: 'textureSample' must only be called from uniform control flow
  return textureSample(transmittance_texture, linearSampler, uv).xyz;
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:585:9 note: called by 'GetTransmittanceToTopAtmosphereBoundary' from 'GetTransmittance'
        GetTransmittanceToTopAtmosphereBoundary(r, mu) /
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:578:3 note: control flow depends on possibly non-uniform value
  if (ray_r_mu_intersects_ground) {
  ^^

:578:7 note: parameter 'ray_r_mu_intersects_ground' of 'GetTransmittance' may be non-uniform
  if (ray_r_mu_intersects_ground) {
      ^^^^^^^^^^^^^^^^^^^^^^^^^^

:789:74 note: possibly non-uniform value passed here
var shadow_transmittance: vec3f = GetTransmittance(r, mu, shadow_length, ray_r_mu_intersects_ground);
                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^

:749:28 note: parameter 'view_ray' of 'GetSkyRadiance' may be non-uniform
var rmu: f32 = dot(camera, view_ray);
                           ^^^^^^^^

:801:31 note: possibly non-uniform value passed here
return GetSkyRadiance(camera, view_ray, shadow_length, sun_direction, transmittance) *
                              ^^^^^^^^

:801:31 note: parameter 'view_ray' of 'GetSkyLuminance' may be non-uniform
return GetSkyRadiance(camera, view_ray, shadow_length, sun_direction, transmittance) *
                              ^^^^^^^^

:843:47 note: possibly non-uniform value passed here
var radiance: vec3f = GetSkyLuminance(camPos, viewDirection, shadowLength, sunDirection, &transmittance);
                                              ^^^^^^^^^^^^^

:840:28 note: parameter 'in' of 'main_fs' may be non-uniform
var viewDirection: vec3f = in.ray;
                           ^^


 - While validating [ShaderModuleDescriptor]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor]).

Whe after reading this section https://www.w3.org/TR/WGSL/#uniformity and then this section https://www.w3.org/TR/WGSL/#builtin-functions-that-compute-a-derivative I realized I might be able to fix this by using textureSampleLevel instead. ⇒ OK! this did the trick.

So after that I got my first running atmosphere rendering pass, but of course, the display was completely wrong 😭:

But the error on that point was easy to fix: I forgot to use the *kLengthUnitInMeters* factor when providing the camera position:

    // We need the camera position from the center of the planet, which is precisely
    // the ubo.camWorldPos vector:
    var camPos = in.origin/kLengthUnitInMeters;

So now I seem to get the appropriate shape for the earth model, but the atmosphere rendering is still completely red 😭:

* Checked if there could be an issue with the SKY_SPECTRAL_RADIANCE_TO_LUMINANCE vector in this call:

fn GetSkyLuminance(
    camera: vec3f, view_ray: vec3f, shadow_length: f32,
    sun_direction: vec3f, transmittance: ptr<function, vec3f>) -> vec3f {
    return GetSkyRadiance(camera, view_ray, shadow_length, sun_direction, transmittance) *
        SKY_SPECTRAL_RADIANCE_TO_LUMINANCE;
}

But that doesn't seem to be the case as we get the debug outputs:

2024-01-13 11:22:06.004711 [NOTE] Initializing AtmosphereModel...
2024-01-13 11:22:06.004754 [DEBUG] AtmosphereModel: Computed sky_k=Vec3d(   683,    683,    683)
2024-01-13 11:22:06.004838 [DEBUG] AtmosphereModel: Computed sun_k=Vec3d(98242.8, 69954.4,  66475)

* Now trying to build the origin project as reference:

⇒ Just running the batch file platform\windows\download_build_run.bat in a VS prompt worked just fine to build/run the demo app.

I then added the following functions in the reference model.cc file to be able to save the precomputed textures to file:

// Function used to save a texture 2d
// Function to download texture data and save it to a raw file
void SaveTexture2DToRawFile(GLuint texture, int width, int height, const std::string& filename) {
    glBindTexture(GL_TEXTURE_2D, texture);

    // Allocate memory to store the texture data
    std::vector<float> textureData(width * height * 4); // Assuming RGBA format and floating-point values

    // Get the texture data
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, textureData.data());

    // Save the texture data to a raw file
    std::ofstream file(filename, std::ios::binary);
    if (file.is_open()) {
        file.write(reinterpret_cast<const char*>(textureData.data()), textureData.size() * sizeof(float));
        file.close();
        std::cout << "Texture data saved to: " << filename << std::endl;
    } else {
        std::cerr << "Error opening file for writing: " << filename << std::endl;
    }
}

#if 0
// Function to save a 3D texture to a raw file
void SaveTexture3DToRawFile(GLuint texture, int width, int height, int depth, GLenum format, const std::string& filename) {
    glBindTexture(GL_TEXTURE_3D, texture);

    // Allocate memory to store the texture data
    std::vector<float> textureData(width * height * depth * 4); // Assuming RGBA format and floating-point values

    // Get the texture data
    glGetTexImage(GL_TEXTURE_3D, 0, format, GL_FLOAT, textureData.data());

    // Save the texture data to a raw file
    std::ofstream file(filename, std::ios::binary);
    if (file.is_open()) {
        file.write(reinterpret_cast<const char*>(textureData.data()), textureData.size() * sizeof(float));
        file.close();
        std::cout << "3D Texture data saved to: " << filename << std::endl;
    } else {
        std::cerr << "Error opening file for writing: " << filename << std::endl;
    }
}
#else 
// Function to save a 3D texture with GL_RGBA16F internal format to a raw file
void SaveTexture3DToRawFile(GLuint texture, int width, int height, int depth, const std::string& filename) {
    glBindTexture(GL_TEXTURE_3D, texture);

    // Allocate memory to store the texture data
    std::vector<GLhalf> textureData(width * height * depth * 4); // Assuming GL_RGBA16F format and half-precision values

    // Get the texture data
    glGetTexImage(GL_TEXTURE_3D, 0, GL_RGBA, GL_HALF_FLOAT, textureData.data());

    // Save the texture data to a raw file
    std::ofstream file(filename, std::ios::binary);
    if (file.is_open()) {
        file.write(reinterpret_cast<const char*>(textureData.data()), textureData.size() * sizeof(GLhalf));
        file.close();
        std::cout << "3D Texture data saved to: " << filename << std::endl;
    } else {
        std::cerr << "Error opening file for writing: " << filename << std::endl;
    }
}
#endif

And then writing the texture files at the end of the Model::init() call:

  /* Write the texture data to file: */
  SaveTexture2DToRawFile(transmittance_texture_, TRANSMITTANCE_TEXTURE_WIDTH, TRANSMITTANCE_TEXTURE_HEIGHT, "transmittance.raw");
  SaveTexture3DToRawFile(scattering_texture_, SCATTERING_TEXTURE_WIDTH, SCATTERING_TEXTURE_HEIGHT, SCATTERING_TEXTURE_DEPTH, "scattering.raw");
  SaveTexture2DToRawFile(irradiance_texture_, IRRADIANCE_TEXTURE_WIDTH, IRRADIANCE_TEXTURE_HEIGHT, "irradiance.raw");

Now let's display the transmittance texture in jupyter:

For this I create a new jupyter book ${NERVHOME}/books/atmo/atm_initial_test.pynb with this code:

To read the raw 2D texture data in Python and display it using Matplotlib, you can use the following code. Assuming the texture data is stored in a raw file and is in floating-point format (e.g., GL_RGBA32F), you can use the numpy library to read and process the data, and matplotlib for visualization.

python
Copy code
import numpy as np
import matplotlib.pyplot as plt

def load_raw_texture(filename, width, height):
    # Read raw file into numpy array
    with open(filename, 'rb') as file:
        raw_data = np.fromfile(file, dtype=np.float32)

    # Reshape data to 2D texture dimensions
    texture_data = raw_data.reshape((height, width, 4))  # Assuming RGBA format

    return texture_data

def display_texture(texture_data):
    plt.imshow(texture_data, cmap='viridis')  # Adjust the colormap as needed
    plt.title('2D Texture Data')
    plt.colorbar()
    plt.show()

# Example usage
filename = 'transmittance_texture.raw'  # Replace with your raw file
width = TRANSMITTANCE_TEXTURE_WIDTH      # Replace with your texture width
height = TRANSMITTANCE_TEXTURE_HEIGHT    # Replace with your texture height

texture_data = load_raw_texture(filename, width, height)
display_texture(texture_data)

Actually I had to override the alpha channel from the texture data to get it to display properly:

filename = 'transmittance.raw'  # Replace with your raw file
TRANSMITTANCE_TEXTURE_WIDTH = 256
TRANSMITTANCE_TEXTURE_HEIGHT = 64

width = TRANSMITTANCE_TEXTURE_WIDTH      # Replace with your texture width
height = TRANSMITTANCE_TEXTURE_HEIGHT    # Replace with your texture height

texture_data = load_raw_texture(filename, width, height)
texture_data[:,:,3] = 1.0

display_texture(texture_data)

print("Data min: %s", np.amin(texture_data))
print("Data max: %s", np.amax(texture_data))

And then this would produce this result:

Now let's do the same thing with the reference irradiance data.

But for the irradiance data we get a rather black display for the texture, so we have to introoduce a scale factor to get something on screen:

def display_atm_2d_output(filename, width, height, scale=None):
    texture_data = load_raw_texture(filename, width, height)
    texture_data[:,:,3] = 1.0
    
    if scale is not None:
        texture_data = np.clip(texture_data*scale,0.0,1.0)
        
    display_texture(texture_data)

    print("Data min: %s", np.amin(texture_data))
    print("Data max: %s", np.amax(texture_data))

Then we get this:

Now let's try to get the same textures out of our AtmosphereModel class.

Added this method to support texture writting:

void WGPUEngine::write_texture_raw_file(const wgpu::Texture& texture,
                                        const String& filename) {
    // Read the texture data:
    logDEBUG("Writting texture file {}...", filename);
    Vector<U8> texData;
    read_texture_data_sync(texture, texData);

    std::ofstream file(filename.c_str(), std::ios::binary);
    NVCHK(file.is_open(), "Cannot write to file {}", filename);

    file.write((const char*)texData.data(), (U64)texData.size());
    file.close();
    logDEBUG("Texture data saved to file {}.", filename);
}

⇒ Comparison of transmittance data seems OK but not for irradiance data:

Next added a method to display a given layer of a 3D texture:

def display_atm_3d_output(filename, width, height, depth, layer, scale=None):
    texture_data = load_raw_3d_texture(filename, width, height, depth)
    
    dmini = np.amin(texture_data[:,:,:,:3])
    dmaxi = np.amax(texture_data[:,:,:,:3])
    
    print("Data min: ", dmini)
    print("Data max: ", dmaxi)
    texture_data = (texture_data - dmini)/(dmaxi - dmini)
    
    if scale is not None:
        texture_data = np.clip(texture_data*scale,0.0,1.0)

    texture_data[:,:,:,3] = 1.0

    ldata = texture_data[layer,:,:,:]
    display_texture(ldata)

And from that we see that our scattering data is also very different from the reference scattering data (and completely red):

⇒ So now we need to automate the texture data comparision in a new unit test and find the first computation step where we start to get noticeable differences.

Update the code in Model.cc to write the output textures in custom dir:

const std::string data_dir="D:\\Projects\\NervHome\\data\\atmo_tests\\"

First significant difference is on the single scattering pass: I even tried to replace the content of the GetProfileDensity() function with this:

fn GetProfileDensity(profile: DensityProfile, altitude: f32) -> f32{
  return altitude/100.0;
  // if( altitude < profile.layers[0].width ) {
  //   return GetLayerDensity(profile.layers[0], altitude);
  // }

  // return GetLayerDensity(profile.layers[1], altitude);
}

But I'm still observing a significant difference, and some horizontal lines in our WGPU computed results:

Simplifying the code even further, it seems we really have a problem in the GetTransmittance() function itself, as calling this will produce very noticeable visual differences:

OK: So now I think the issue here is with my custom texture sampling function sample_tex_2d_linear but in fact, I now think I could just replace this with textureSampleLevel in our compute shader. so let's try that (Note: this means I also need to provide a sampler in all my bindings groups)

Indeed with more tests this texture sample process was the main issue.

And also the fact that I was thresholding the transmittance data (removed below):

fn ComputeTransmittanceToTopAtmosphereBoundary(r: f32, mu: f32) -> vec3f {
  var res: vec3f = exp(-(
      ATMO.rayleigh_scattering * ComputeOpticalLengthToTopAtmosphereBoundary(ATMO.rayleigh_density, r, mu) 
      + ATMO.mie_extinction * ComputeOpticalLengthToTopAtmosphereBoundary(ATMO.mie_density, r, mu)
      + ATMO.absorption_extinction * ComputeOpticalLengthToTopAtmosphereBoundary(ATMO.absorption_density, r, mu)));
  return res;
  // var thres: f32 = 1e-8;
  // return max(res, vec3f(thres));
}

I could then continue with validation of scattering density without issue for once (all scattering orders were matching very nicely as below)

Same results for the indirect irradiance step, all matching correctly:

But then I switched to luminance usage case (ie. multiple called of Precompute()) and in this case we see that in the WGSL implementation, the scattering texture out of step 2 will progressively drift towards red when accumulated so this is definitely where we have a bug to fix:

After that I finally got a “correct” (eg. proper blue/white colors instead of the red colors we got initially) rendering of the atmosphere, and I just had to reduce a bit the expose value which was not appropriate for the precomputed luminance procesing path:

    var transmittance: vec3f = vec3f(0.0);
    var radiance: vec3f = GetSkyLuminance(camPos, viewDirection, shadowLength, sunDirection, &transmittance);

    var exposure: f32 = 1e-4;
    var white_point: vec3f = vec3f(1.0);

    var color: vec3f = pow(vec3f(1.0) - exp(-radiance / white_point * exposure), vec3f(1.0 / 2.2));

    // var col: vec3f = vec3f(in.coords.x, in.coords.y, 0.0);
    // return vec4f(col, 0.5);
    return vec4f(color, 0.5);

And with that change this is the display that I got:

First let's set the fixed sun direction to match the current sun direction used in the planet rendering itself:

OK: In the planet_ortho.wgsl file we have the definition:

    var lightDir = normalize(vec3f(1.0,1.0,-1.0));
    var light = dot(n, lightDir);
    // if (h >= 0.1) {
    res = vec4f(res.rgb*(light+0.2), 1.0);

Now let's also change th background color to full black. So in planet_ortho.cpp we replace the clear color in the planet render pass:

    scene.add_render_pass({.with_depth = true,
                           .swapchain_idx = 0,
                           .color_clear_value = {0.0, 0.0, 0.0, 1.0}});

And next, let's also add the sun radiance itself. But let's do that properly now: we will need the size of the sun as a uniform value, so we might as well prepare a uniform buffer to provide all the uniforms we will need in the atmosphere model.

OK, I have now introduced the AtmUniforms structure, and with that we can also display the sun radiance in our atm rendering process now.

Next step is to add the ground radiance, but to achieve this, first we need to get access to the depth texture from the planet rendering pass, so we will need another rendering pass in fact.

And this also means we should perform the first render pass into render textures instead of targetting the default swapchain directly.

⇒ basically we will need some kind of Dynamic Resolution Rendering (DRR) mechanism here (cf. https://github.com/gpuweb/gpuweb/issues/2379)

I seem to have a problem with the depth texture generated by the planet rendering now, so I need to look into this first.

Added the following test code to read the depth texture content:

    eng->add_post_render_func([depth_texture]() {
        static int count = 0;
        count++;
        if (count % 120 != 0) {
            return;
        }
        auto* eng = WGPUEngine::instance();
        Vector<U8> tdata;
        eng->read_texture_data_sync(depth_texture, tdata);

        U32 num = tdata.size() / 4;
        // logDEBUG("Number of depth value: {}", num);
        F32* ptr = (F32*)tdata.data();
        F32 dmin = *ptr;
        F32 dmax = *ptr;

        for (U32 i = 0; i < num; ++i) {
            F32 val = *ptr++;
            dmin = val < dmin ? val : dmin;
            dmax = val > dmax ? val : dmax;
        }
        logDEBUG("Depth values range: [{}, {}]", dmin, dmax);
    });

Worked my way to understand how to reverse the computation of the linear depth value from the depth buffer values in the render_atmosphere shader, which was something like that:

    // Convert to linear depth:
    var zNear: f32 = in.zplanes.x;
    var zFar: f32 = in.zplanes.y;

    // (x,y,z,1) in camera space is transformed by the perspective into clip space:
    // A, 0, 0, 0,
    // 0, B, 0, 0,
    // 0, 0, C, D,
    // 0, 0, 1, 0
    // which gives: (A*x, B*y, C*z+D, z)
    // Then the coords are normalized by z, so the depth value is:
    // depth = C+D/z, and that's the value in range [0,1]
    // then:
    // depth-C = D/z => z = D/(depth-C)

    var C: f32 = zFar / (zFar - zNear);
    var D: f32 = -zFar * zNear / (zFar - zNear);

    // var zLinear : f32 =  D / (depth - C);
    // zLinear = (zLinear - zNear)/(zFar - zNear);

    var distance_to_intersection : f32 = D / (depth - C);
    distance_to_intersection /= kLengthUnitInMeters;

So this would produce this kind of linear depth results:

And finally I have a working atmosphere rendering system 👍!:

⇒ Alright, there is still a lot left to do here, but I feel that this is a good point to stop the current session :-) So we will continue on this topic in the next session then, see you there ;-)!

  • blog/2024/0118_precomputed_atm_unit_tests.txt
  • Last modified: 2024/01/18 20:21
  • by 127.0.0.1