NervLand: Drawing our first cube

Okay, so in this session, I want to turn my previous simple triangle display into a 3D cube. To achieve that I would like to start using an index buffer, and also introduce the minimal support required to handle an ortho projection, so that I can play a simple cube animation and see it rotating. This all sounds pretty easy when said like this, but I'm sure I will find some trouble on my path, as usual 😆. So let's get started.

  • First things first, we need a cube, so we need a vertex buffer with 8 vertices in it.
  • While we are at it, let's also provide colors for our vertices.
  • So here is what the first version of our vertex buffer looks like:
        -- Prepare our vertex buffer here, with support to write to it:
        -- We need 8 vertices, and for each vertex, a vec3 position (3 floats)
        -- as well as an RGBA color (4 bytes, i.e. the size of 1 float)
        -- So 4 floats per vertex
        local vbufIdx = self.vkeng:create_vertex_buffer(4 * 4 * 8, nvk.BufferCreateFlags.SEQUENTIAL_HOST_WRITE)
        local vbuf = self.vkeng:get_vertex_buffer(vbufIdx)
        -- Specify the vertex attributes:
        -- This time we build a cube here so we need 8 vertices.
        vbuf:add_rgb32f(0) -- vec3 position at location=0
        vbuf:add_rgba8(1) -- vec4 color at location=1
        -- write the data:
        -- Cube half-size:
        local hs = 10.0
        vbuf:write_float_array {
            -- front plane (with positive z values)
            -hs, -hs, hs, nv.rgba_to_f32(255, 0, 0, 255),
            -hs, hs, hs, nv.rgba_to_f32(255, 0, 127, 255),
            hs, hs, hs, nv.rgba_to_f32(255, 0, 255, 255),
            hs, -hs, hs, nv.rgba_to_f32(255, 127, 0, 255),
            -- back plane (with negative z values)
            -hs, -hs, -hs, nv.rgba_to_f32(0, 0, 0, 255),
            -hs, hs, -hs, nv.rgba_to_f32(0, 0, 127, 255),
            hs, hs, -hs, nv.rgba_to_f32(0, 0, 255, 255),
            hs, -hs, -hs, nv.rgba_to_f32(0, 127, 0, 255),
        }
For the X/Y axes I'm currently following the same conventions as in Vulkan: X goes from left to right of the screen and Y from top to bottom of the screen.
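To make the 16-bytes-per-vertex layout above more concrete, here is a minimal C++ sketch. It assumes that a helper like nv.rgba_to_f32 simply copies four color bytes into the bit pattern of a float, with R in the lowest byte; the real implementation is not shown in this post, so `CubeVertexSketch`, `pack_rgba8_sketch`, `rgba_to_f32_sketch` and `float_bits_sketch` are hypothetical names used for illustration only.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical vertex layout matching the description above:
// a vec3 position followed by one 32-bit slot holding the packed color.
struct CubeVertexSketch {
    float x, y, z;
    float packedColor;
};

static_assert(sizeof(CubeVertexSketch) == 16, "4 floats per vertex");

// Assumed packing order: R in the lowest byte, then G, B and A.
inline std::uint32_t pack_rgba8_sketch(std::uint8_t r, std::uint8_t g,
                                       std::uint8_t b, std::uint8_t a) {
    return std::uint32_t(r) | (std::uint32_t(g) << 8) |
           (std::uint32_t(b) << 16) | (std::uint32_t(a) << 24);
}

// Copy the packed bits into a float without any numeric conversion.
inline float rgba_to_f32_sketch(std::uint8_t r, std::uint8_t g,
                                std::uint8_t b, std::uint8_t a) {
    std::uint32_t bits = pack_rgba8_sketch(r, g, b, a);
    float f = 0.0F;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Read back the raw bits of a float (useful to inspect the packed color).
inline std::uint32_t float_bits_sketch(float f) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

With 8 such vertices the buffer takes 8 * 16 = 128 bytes, which is the 4 * 4 * 8 value passed to create_vertex_buffer above.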
  • Next, let's prepare the data for our index buffer, and I think this should be correct:
        -- Next we need the index buffer data:
        local indices = {
            -- front face, counter-clockwise front face:
            0, 3, 1, 1, 3, 2,
            -- back face:
            4, 7, 5, 5, 7, 6,
            -- left face:
            4, 0, 5, 5, 0, 1,
            -- right face:
            3, 7, 2, 2, 7, 6,
            -- top face:
            0, 4, 3, 4, 7, 3,
            -- bottom face:
            1, 2, 5, 5, 2, 6,
        }
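A quick way to sanity-check an index list like this one is to verify that the six indices of each face all reference corners lying on the expected plane. The following C++ sketch does that for a unit cube (hs = 1) using the same vertex order as the vertex buffer above; `kCornersSketch`, `kIndicesSketch` and `face_lies_on_plane` are hypothetical names, and the check says nothing about winding order.

```cpp
#include <cstdint>

// Cube corners in the same order as the vertex buffer (with hs = 1).
constexpr int kCornersSketch[8][3] = {
    {-1, -1, 1},  {-1, 1, 1},  {1, 1, 1},  {1, -1, 1},   // front plane (z > 0)
    {-1, -1, -1}, {-1, 1, -1}, {1, 1, -1}, {1, -1, -1},  // back plane (z < 0)
};

// Index list copied from the listing above, six indices per face.
constexpr std::uint32_t kIndicesSketch[36] = {
    0, 3, 1, 1, 3, 2,  // front
    4, 7, 5, 5, 7, 6,  // back
    4, 0, 5, 5, 0, 1,  // left
    3, 7, 2, 2, 7, 6,  // right
    0, 4, 3, 4, 7, 3,  // top (negative y, since Y points down)
    1, 2, 5, 5, 2, 6,  // bottom
};

// Returns true if all six indices of the given face reference corners whose
// coordinate on `axis` (0 = x, 1 = y, 2 = z) equals `value`.
inline bool face_lies_on_plane(int face, int axis, int value) {
    for (int i = 0; i < 6; ++i) {
        if (kCornersSketch[kIndicesSketch[face * 6 + i]][axis] != value) {
            return false;
        }
    }
    return true;
}
```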
  • Now, how do we build an Index buffer ? hhhmmm 🤔. OK, sorted out, so now we have this:
        local num_indices = #indices
        local ibufIdx = self.vkeng:create_index_buffer(4 * num_indices, nvk.BufferCreateFlags.SEQUENTIAL_HOST_WRITE)
        local ibuf = self.vkeng:get_index_buffer(ibufIdx)
        -- Load the indices:
  • Next step is to implement the bind_index_buffer method in the CommandBuffer class. OK, now we have a simple function to handle the index binding:
    void VulkanCommandBuffer::bind_index_buffer(VulkanIndexBuffer* ibuf,
                                                U64 startOffset) {
        VkBuffer buf = ibuf->get_buffer()->getVk();
        // Use the absolute start point of this index buffer as a starting point:
        U64 offset = ibuf->get_start_offset() + startOffset;
        // Only using UINT32 indices for now:
        _device->vkCmdBindIndexBuffer(_buffer, buf, offset, VK_INDEX_TYPE_UINT32);
    }
  • Next I think I should reconsider what I need to do in my “CmdBufProvider” implementation. Currently, we have the following simple function:
    void SimpleCmdBuffersProvider::get_buffers(FrameDesc& fdesc,
                                               VkCommandBufferList& buffers) {
        // Write the command buffer:
        U32 idx = fdesc.swapchainImageIndex;
        // Re-record the command buffer as above:
        auto* cbuf = _cbufs[idx].get();
        auto* fbuf = _renderer->get_swapchain_framebuffer(idx);
        // We update our push constants here to contain a time value:
        F32 time = (F32)fdesc.frameTime;
        // logDEBUG("Writing time value: {}", time);
        // We write the time as the z element of the first vec4:
        _pushArr.write_f32(time, 8);
        // Push constants stages :
        U32 width = _renderer->get_swapchain_width();
        U32 height = _renderer->get_swapchain_height();
        // Check if we need to rebuild the pipeline:
        if (_width != width || _height != height) {
            _pipeline = _renderer->get_device()->create_graphics_pipeline(
                _cfg->getVk(), _pipelineCache->getVk());
            _width = width;
            _height = height;
        }
        fbuf->set_clear_color(0, 0.2, 0.2, 0.2, 1.0);
        // Begin rendering into the swapchain framebuffer:
        cbuf->begin_inline_pass(_rpass.get(), fbuf);
        // Bind the graphics pipeline:
        // Bind the vertex buffer:
        cbuf->bind_vertex_buffer(_vbuf.get(), 0);
        // add the push constants
        cbuf->write_push_contants(_playout->getVk(), pstages, 0,
                                  _pushArr.get_size(), _pushArr.get_data());
        // Draw our triangle:
        // End the render pass
        // Finish the command buffer:
        // Add the buffer to the list:
    }
  • ⇒ What I'm thinking now is that I should have the concept of a RenderGraph here: the render graph could be built only once, and then we could request it to record the commands with a record_commands() function for instance.
  • The RenderGraph should have a list of command buffers available to it (in a vector, so we can access them from index 0, 1, …)
  • The RenderGraph should also have a list of framebuffers assigned to it, and then we can select the proper target framebuffer when building our render passes.
  • The RenderGraph should contain a list of RenderPasses
  • Each RenderPass should contain a list of RenderPipelines
  • Each RenderPipeline, may have a list of DrawCalls (if it is a graphics pipeline)
  • And in each DrawCall, we may provide one or more vertex buffers, an index buffer, and additional info on the draw to be performed.
  • ⇒ So quite a few new design elements here, but I think this could definitely be an interesting path to investigate so let's get to it 👍!
  • OK, So I have introduced an initial version of the RenderGraph, RenderPass, RenderPipeline and Drawable classes.
  • Next, what I'm realizing is that I might also need preliminary/common commands, like buffer copies, synchronization barriers, etc in my graph. Hmmmm 🤔… This means that I should take a bit of a different road in fact to build the graph:
  • It would be more appropriate to have a base RenderNode class. Then in our RenderGraph we could add as many nodes as needed: these nodes could be simple commands, or render passes. The RenderPass should then be some kind of RenderGroup, i.e. it could contain additional children. And the same goes for the RenderPipeline: it should be some kind of group too.
  • And when doing this, I realize I might not absolutely need to build the “RenderPipeline” as a group: this could instead be a single node 🤔? Not quite sure yet to what extent this could make sense… But anyway, I feel it's still a better idea to keep it as a group for now.
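To illustrate the node/group idea described above, here is a minimal C++ sketch. It is not the actual NervLand code: `SketchContext`, `SketchNode`, `SketchCommand`, `SketchGroup` and `record_sample_graph` are hypothetical names, and instead of recording Vulkan commands the nodes simply append a label to a list so the traversal order is visible.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical context: a real one would carry the current command buffer.
struct SketchContext {
    std::vector<std::string> recorded;
};

// Base node: anything that can record commands.
struct SketchNode {
    virtual ~SketchNode() = default;
    virtual void record_commands(SketchContext& ctx) = 0;
};

// Leaf node: a single command (push constants, draw call, barrier...).
struct SketchCommand : SketchNode {
    std::string name;
    explicit SketchCommand(std::string n) : name(std::move(n)) {}
    void record_commands(SketchContext& ctx) override {
        ctx.recorded.push_back(name);
    }
};

// Group node: records a "begin", then its children in order, then an "end".
struct SketchGroup : SketchNode {
    std::string name;
    std::vector<std::unique_ptr<SketchNode>> children;
    explicit SketchGroup(std::string n) : name(std::move(n)) {}

    // Create a child node in place and return a non-owning pointer to it.
    template <typename T, typename... Args>
    T* add(Args&&... args) {
        auto node = std::make_unique<T>(std::forward<Args>(args)...);
        T* raw = node.get();
        children.push_back(std::move(node));
        return raw;
    }

    void record_commands(SketchContext& ctx) override {
        ctx.recorded.push_back("begin " + name);
        for (auto& child : children) {
            child->record_commands(ctx);
        }
        ctx.recorded.push_back("end " + name);
    }
};

// Build a tiny graph (render pass > pipeline > two commands) and record it.
inline std::vector<std::string> record_sample_graph() {
    SketchGroup pass("render_pass");
    auto* pipeline = pass.add<SketchGroup>("pipeline");
    pipeline->add<SketchCommand>("push_constants");
    pipeline->add<SketchCommand>("draw");
    SketchContext ctx;
    pass.record_commands(ctx);
    return ctx.recorded;
}
```

Keeping the pipeline as a group means its children (push constants, draw calls) are naturally recorded while that pipeline is bound, which is the main argument for the group version discussed above.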
  • Next I need to build a node corresponding to a “write_push_constant” command. OK, here it is:
    namespace nvk {
    PushConstants::PushConstants(U32 size, VulkanPipelineLayout* playout,
                                 U32 pstages, U32 index)
        : RenderNode(index), _playout(playout), _pstages(pstages),
          _byteArray(size) {
        logDEBUG("Creating PushConstants object.");
    }
    PushConstants::~PushConstants() { logDEBUG("Deleting PushConstants object."); }
    void PushConstants::record_commands(RenderGraphContext& ctx) {
        auto* cbuf = ctx.cbuf;
        cbuf->write_push_contants(_playout->getVk(), _pstages, _offset,
                                  _byteArray.get_size(), _byteArray.get_data());
    }
    } // namespace nvk
  • Another critical part will be the update of an existing render graph: I think the easiest option for the moment is simply to have external update functions which are aware of the “PushConstants” node for instance and can thus update it as needed. Let's see if this could work.
  • So here is the minimal GraphUpdater class I'm providing for this:
    struct GraphUpdater : public nv::RefObject {
        virtual void update(FrameDesc&) = 0;
    };
  • And also added the add_updater() and update() methods in the RenderGraph class.
  • Initially I was thinking about assigning a framebuffer to a given index in the rendergraph and then access that when needed in a given renderpass: but I'm not sure this is worth it anymore: I could just as well just assign the target framebuffer to the renderpass, and use the current swapchain fbuffer by default ?
  • So now in the RenderPass record_commands() method we will use this kind of code:
    void RenderPass::record_commands(RenderGraphContext& ctx) {
        auto* cbuf = ctx.graph->get_cmd_buffer(_cmdBufferIndex);
        // auto* fbuf = ctx.graph->get_framebuffer(_framebufferIndex);
        // Retrieve the framebuffer:
        VulkanFramebuffer* fbuf = _framebuffer.get();
        if (fbuf == nullptr) {
            // Use the current swapchain framebuffer:
            fbuf = ctx.get_current_framebuffer();
        }
        // .. More stuff here.
    }
  • And then we don't really need a list of framebuffers in the RenderGraph class, so removing this.
  • Similar to the framebuffers above, we should not need to have the RenderGraph managing the target command buffers: instead, this should be made another type of RenderGroup.
  • That group will be responsible for beginning/finishing the command buffer recording, and also for selecting the correct command buffer to use if it depends on the current image index.
  • Phew… that was a lot of changes already 😄 But now I have a dedicated CmdBufferSelector node class, where I may add additional commands.
  • Now time to try to use all this new stuff from lua, first, building the RenderGraph…
  • But before we go any further, I need to fix that stupid issue that I have with the bindings generation preventing me from just binding everything in the “nvk” namespace.
  • Currently I get this error in the binding process when I request to bind “^nvk::”:
    2022-12-11 13:06:30.196828 [FATAL] No lua converted available for VkPhysicalDevice
    stack traceback:
            [string "setup.lua"]:204: in function 'THROW'
            [string "setup.lua"]:199: in function 'CHECK'
            [string "reflection.lua.VectorConverter"]:68: in function 'writeGetter'
            [string "bind.FunctionWriter"]:261: in function 'writeFunctionOverload'
            [string "bind.FunctionWriter"]:198: in function 'writeFunctionBinds'
            [string "bind.FunctionWriter"]:62: in function 'writeFunctionBindings'
            [string "bind.ClassWriter"]:223: in function 'writeContent'
            [string "bind.ClassWriter"]:15: in function '__init'
            [string "setup.lua"]:161: in function 'classWriter'
            [string "app.NervBind"]:332: in function 'func'
            [string "reflection.Scope"]:33: in function 'func'
            [string "bind.LunaManager"]:164: in function 'foreachClass'
            [string "app.NervBind"]:314: in function 'writeBindings'
            [string "app.NervBind"]:217: in function 'processModuleConfig'
            [string "app.NervBind"]:264: in function 'run'
            [string "app.AppBase"]:24: in function '__init'
            [string "app.NervBind"]:12: in function '__init'
            [string "setup.lua"]:161: in function 'app'
            [string "run.lua"]:70: in function <[string "run.lua"]:68>
            [C]: in function 'xpcall'
            [string "run.lua"]:73: in main chunk
  • OK now fixed: was basically a problem in the VectorConverter class.
  • First real issue I get with this is when creating the graphics pipeline: passing the pipeline object directly to the constructor will be a problem for me as I still need to be able to reconstruct this pipeline on resize. It would be handy to just let the RenderPipeline automatically rebuild the pipeline when the size changes.
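The "rebuild only when the size changed" idea can be sketched as follows. This mirrors the width/height comparison from the earlier get_buffers() listing, but `LazyPipelineSketch` and `ensure_up_to_date` are hypothetical names, and the sketch only counts rebuilds where real code would recreate the Vulkan pipeline.

```cpp
#include <cstdint>

// Hypothetical stand-in for a pipeline node that rebuilds itself lazily.
struct LazyPipelineSketch {
    std::uint32_t width{0};
    std::uint32_t height{0};
    int rebuildCount{0};

    // Rebuild only if the target size changed since the last call.
    // Returns true when a rebuild happened.
    bool ensure_up_to_date(std::uint32_t w, std::uint32_t h) {
        if (w == width && h == height) {
            return false;
        }
        width = w;
        height = h;
        ++rebuildCount;  // real code would recreate the graphics pipeline here
        return true;
    }
};
```

Calling something like this at the start of each recording pass keeps the resize handling inside the node, so the code building the graph does not need to know about it.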
  • OK: Now I got my first working example with a static triangle:
        -- Build the rendergraph here
        local rgraph = nvk.RenderGraph()
        -- Number of swapchain images:
        local num = self.renderer:get_num_swapchain_images()
        local cbufs = rgraph:add_primary_cmd_buffer_selector(num, 0)
        -- Add a render pass to that cmd buffer selector:
        local rpass = cbufs:add_render_pass(renderpass)
        -- Add the pipeline:
        local pline = rpass:add_graphics_pipeline(cfg, pipelineCache)
        local pushc = pline:add_push_constants(32, pipelineLayout)
        local parr = pushc:get_bytearray()
        parr:write_vec4f(nv.Vec4f(0.5, 0.5, 0.0, 0.0))
        parr:write_vec4f(nv.Vec4f(0.0, 0.0, 0.4, 0.4))
  • Next we should provide a simple updater for the push constants. Here is the simple C++ implementation in the sandbox module:
    namespace nvk {
    struct TimePushConstantsUpdater : public GraphUpdater {
        nv::RefPtr<PushConstants> pushc{nullptr};
        U32 offset{0};
        TimePushConstantsUpdater(PushConstants* pushc, U32 offset)
            : pushc(pushc), offset(offset){};
        void update(FrameDesc& fdesc) override;
    };
    void TimePushConstantsUpdater::update(FrameDesc& fdesc) {
        auto& barr = pushc->get_bytearray();
        // We update our push constants here to contain a time value:
        F32 time = (F32)fdesc.frameTime;
        // We write the time as the z element of the first vec4:
        barr.write_f32(time, 8);
    }
    auto create_time_push_constant_updater(PushConstants* pushc, U32 offset)
        -> GraphUpdater* {
        auto updater =
            nv::create_ref_object<TimePushConstantsUpdater>(pushc, offset);
        return updater.release();
    }
    } // namespace nvk
  • And then we can assign that updater to our render graph:
        -- Apply updater for the push constants:
        local updater = nvk.create_time_push_constant_updater(pushc, 8)
  • ⇒ And this works! Yeepee! With this, we are finally back to where we were before starting the RenderGraph implementation: a rotating triangle on screen, rendered at about 7500fps on Ragnarok ;-).
  • One thing I just noticed though, is that my helper functions in the sandbox module don't get registered in the LLS library file:
    • I have those declarations:
      namespace nvk {
      auto create_simple_cmd_buf_provider(
          VulkanRenderer* renderer, VulkanRenderPass* rpass, VulkanVertexBuffer* vbuf,
          VulkanPipelineLayout* playout, VulkanGraphicsPipelineCreateInfo* cfg,
          VulkanPipelineCache* pcache, const VulkanCommandBufferList& cbufs,
          const nv::ByteArray& pushArr) -> CmdBuffersProvider*;
      auto create_time_push_constant_updater(PushConstants* pushc, U32 offset)
          -> GraphUpdater*;
      } // namespace nvk
    • But I will only get an empty library file:
      nvk = nvk or {}
  • ⇒ I think it would be very good to fix this, so let's have a look.
  • Added the missing function to handle the global functions:
    function Class:writeFunctionDefs(funcs)
        for _, f in ipairs(funcs) do
            local ns = self:getNamespaceStack(f)
            local ns_str = table.concat(ns, ".")
            if not f:isIgnored() and not f:isConstructor() and not f:isDestructor() then
                local sigs = f:getPublicSignatures(true)
                local func_line = ""
                for idx, sig in ipairs(sigs) do
                    -- get the list of arguments:
                    local args = sig:getArguments()
                    local anames = {}
                    -- if idx == 1 then
                    for _, arg in ipairs(args) do
                        -- Ignore the canonical lua state:
                        if not arg:isCanonicalLuaState() then
                            local aname = arg:getName()
                            local opt = arg:hasDefault() and "?" or ""
                            table.insert(anames, aname)
                            local atype = arg:getType():getLuaTypeName()
                            if atype == "luna.LuaFunction" then
                                atype = "function"
                            end
                            -- Write the parameter desc line:
                            self:writeSubLine("---@param ${1}${3} ${2}", aname, atype, opt)
                        end
                    end
                    -- get the return type:
                    local rtype = sig:getReturnType()
                    if rtype ~= nil then
                        local rtname = rtype:getLuaTypeName()
                        -- We don't write void return
                        if rtname ~= "" then
                            self:writeSubLine("---@return ${1}", rtname)
                        end
                    end
                    local fname = f:getLuaName(sig)
                    -- Then we write the actual function proto:
                    local anames = table.concat(anames, ", ")
                    -- This is the version of the function so we should write the function line for it:
                    func_line = table.concat({ "function ", ns_str, ".", fname, "(", anames, ") end" }, "")
                end
            end
        end
    end
  • And now we have our new helper functions properly listed in the library file:
    nvk = nvk or {}
    ---@param renderer nvk.VulkanRenderer
    ---@param rpass nvk.VulkanRenderPass
    ---@param vbuf nvk.VulkanVertexBuffer
    ---@param playout nvk.VulkanPipelineLayout
    ---@param cfg nvk.VulkanGraphicsPipelineCreateInfo
    ---@param pcache nvk.VulkanPipelineCache
    ---@param cbufs nvk.VulkanCommandBufferList
    ---@param pushArr nv.ByteArray
    ---@return nvk.CmdBuffersProvider
    function nvk.create_simple_cmd_buf_provider(renderer, rpass, vbuf, playout, cfg, pcache, cbufs, pushArr) end
    ---@param pushc nvk.PushConstants
    ---@param offset integer
    ---@return nvk.GraphUpdater
    function nvk.create_time_push_constant_updater(pushc, offset) end
  • So, all good on this point!
  • Now it's time to move to a 3D object display, starting with our simple cube of course.
  • But the real issue we have to handle here is on the 3D/4D matrices: we need to ensure that our classes are fully compatible with the vulkan conventions, so let's look into this.
  • ⇒ As usual, let's start with our good old OpenSceneGraph implementation.
  • Hmmm, actually, reconsidering the Matrix implementation design I think I will just start with the quaternions for now lol. OK, I have a first version for this Quat class.
  • Now time to implement our Mat4 class. Yet, when doing this, I realize I could really benefit from a template version of the Vec3 class, 😅 my my… So let's try to provide a template build of Vec2f/Vec2d already.
  • OK: I turned Vec2f/Vec2d classes into a unified Vec2<T> template class. Except that now my luna binding system is confused with this and thus, I lost the support to bind functions such as (in ByteArray here):
    void write_vec2f(const Vec2f& value, U64 idx = U64_MAX);
  • Let's see if I could restore that… Ohh, no, actually, my bad: this part is working fine. But what we lost are the functions from inside the template class itself… I'm wondering if I could restore that automatically 🤔 ?
  • Whaaoo, I'm really excited about the results I finally got, and I feel they would deserve a dedicated post, but I'm too lazy to start a new one, so let's just put that here:
  • As mentioned above I turned the Vec2f/Vec2d classes into a unified template Vec2<T>, and then I was working pretty hard trying to restore the support to still get the bindings generated for those classes now defined as:
    namespace nv {
      using Vec2f = Vec2<float>;
      using Vec2d = Vec2<double>;
    }
  • And in the end this works really well!
  • In the NervLuna system I actually introduced the support to generate ClassTemplate objects at some point, and within those class templates, we will maintain a collection of template parameters. For instance, when we define the following:
    namespace nv {
    template <typename T, typename U>
    struct MyStruct {
      auto do_something(const U& arg0) -> T*;
    };
    }
  • We will internally build an “nv::MyStruct” class template with 2 template parameters: T and U
  • Then, when parsing this class template the binding system will discover functions such as do_something above, and thus find an argument type “const U&” and a return type “T*”. Those will lead to the generation of “virtual type” trees based on the template parameter types, which are internally given the names ${T}@nv::MyStruct and ${U}@nv::MyStruct. From the base virtual types we will construct the chain of types as needed, for instance: ${T}@nv::MyStruct → ${T}*@nv::MyStruct, and ${U}@nv::MyStruct → const ${U}@nv::MyStruct → const ${U}&@nv::MyStruct
  • With those “virtual types” the functions will be registered as usual.
  • Then we could eventually find a type that is an instantiation of this template like:
    using MyType = nv::MyStruct<int,float>;
  • When this real type is discovered the NervLuna system will search for an existing ClassTemplate with that name (i.e. “nv::MyStruct” in this case). Since we are usually already providing a Class object representing MyType (but that class is initially empty: it's just a placeholder), that class object will be passed to the ClassTemplate along with the list of actual types that should be used to instantiate the different methods.
  • And here comes the real magic: the ClassTemplate can then:
    1. Map its template parameters to the provided actual types: (T, U) ↔ (int, float)
    2. Then iterate on each of the available functions it contains and replace each virtual type by its corresponding concrete version: const ${U}&@nv::MyStruct thus becomes const float& and ${T}*@nv::MyStruct becomes int*,
    3. Each of those concrete function prototypes is then dynamically injected into the class object for MyType,
    4. And then we are back to business as usual 😆: each of those concrete functions will be analyzed to figure out if the various types can be bound, and thus if the whole function can be provided in lua.
On the question: “Why do we use the @nv::MyStruct suffix here ?”: this is simply because the template parameter names can be the same for many template classes, and since in our type system we register the types by name we need to separate those “virtual types” into different versions depending on which template class they are coming from.
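The substitution step described in the list above can be illustrated with plain string replacement. This is only a C++ sketch of the idea, not the NervLuna implementation (which is written in Lua and works on type objects rather than raw strings); `resolve_virtual_type` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Turn a "virtual type" name such as "const ${U}&@nv::MyStruct" into a
// concrete type name, given the owning template and its argument mapping.
inline std::string resolve_virtual_type(
    std::string vtype, const std::string& owner,
    const std::map<std::string, std::string>& args) {
    // Drop the "@owner" suffix that disambiguates parameters with the same
    // name coming from different class templates.
    const std::string suffix = "@" + owner;
    if (vtype.size() >= suffix.size() &&
        vtype.compare(vtype.size() - suffix.size(), suffix.size(), suffix) == 0) {
        vtype.erase(vtype.size() - suffix.size());
    }
    // Replace every "${Name}" placeholder with its concrete type.
    for (const auto& entry : args) {
        const std::string placeholder = "${" + entry.first + "}";
        std::size_t pos = 0;
        while ((pos = vtype.find(placeholder, pos)) != std::string::npos) {
            vtype.replace(pos, placeholder.size(), entry.second);
            pos += entry.second.size();
        }
    }
    return vtype;
}
```

With the mapping (T, U) ↔ (int, float), the two virtual types from the do_something example resolve to "int*" and "const float&".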
  • Note: One thing that we are unfortunately losing with this new mechanism is the application of (now templated) global functions on the template classes, for instance we have:
    template <typename T>
    inline auto componentMultiply(const Vec2<T>& lhs, const Vec2<T>& rhs)
        -> Vec2<T> {
        return {lhs[0] * rhs[0], lhs[1] * rhs[1]};
    }
    template <typename T>
    inline auto componentDivide(const Vec2<T>& lhs, const Vec2<T>& rhs) -> Vec2<T> {
        return {lhs[0] / rhs[0], lhs[1] / rhs[1]};
    }
  • Those global template functions are not supported yet in NervLuna (this should not be too tricky to implement, but I would indeed need some kind of TemplateFunction class I guess 🤔 ⇒ not something to worry about just yet anyway).
  • So, with this powerful system in place we can keep on our journey to provide the Matrix class template ;-)
  • I have now added an initial re-implementation of the Mat4 template class in my Core module (for both double and float values). Now, there are a lot of points in this class where I'm not quite sure what I'm doing is valid, so I definitely need unit tests on this now.
  • So, I'm in the process of adding support for unit tests with the Boost Test Framework, and the first thing I'm noticing here is that Boost now uses a macro called CHECK internally too, which seems to be conflicting with my own version of CHECK() that I'm using everywhere in my code 😭: I need to do something about this.
  • ⇒ So I took the radical decision of replacing all instances of CHECK in my code with NVCHK, this was quick and should not be a problem anyway (and now I can compile my unit test properly at least)
  • And then I added a first collection of unit tests on the Mat4d class to validate my Matrix function re-implementation. Now time to get back to the display of our cube.
  • To display our cube object we will need a new “push constant updater”, that will build the appropriate matrix transformation chain at each frame, so let's prepare that.
  • While working on this point I also took the opportunity to implement a LuaGraphUpdater class (not something I will willingly use in production, but this may still help during investigations/debugging):
    namespace nvk {
    struct LuaGraphUpdater : public GraphUpdater {
        luna::LuaFunction func;
        explicit LuaGraphUpdater(lua_State* L, int index) : func(L, index, true){};
        void update(FrameDesc& fdesc) override;
    };
    void LuaGraphUpdater::update(FrameDesc& fdesc) { func(fdesc); }
    auto create_lua_graph_updater(luna::LuaFunction& func)
        -> nv::RefPtr<GraphUpdater> {
        return nv::create_ref_object<LuaGraphUpdater>(func.state, func.index);
    }
    } // namespace nvk
  • Now the problem with this is that when I have a LuaFunction stored in a C++ object it might happen at the end of the program that the lua state is destroyed before the C++ object, and then I get an error if I try to unregister a ref index in the lua function.
  • I think that to handle this properly, all “referenced” lua functions should instead be invalidated when the parent lua state is closed. So let's see how we could do that… OK
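How this was solved is not detailed here, so the following is only one possible way to sketch the "invalidate on close" idea in C++, without any real Lua dependency. `StateSketch` stands in for the lua state wrapper and `FuncRefSketch` for a referenced lua function; both names are hypothetical.

```cpp
#include <unordered_set>

struct StateSketch;

// Stand-in for a referenced lua function held on the C++ side.
struct FuncRefSketch {
    StateSketch* state;
    explicit FuncRefSketch(StateSketch& s);
    ~FuncRefSketch();
    FuncRefSketch(const FuncRefSketch&) = delete;
    FuncRefSketch& operator=(const FuncRefSketch&) = delete;
    // A reference stays valid only while its parent state is alive.
    bool is_valid() const { return state != nullptr; }
};

// Stand-in for the lua state: it tracks every live function reference.
struct StateSketch {
    std::unordered_set<FuncRefSketch*> refs;
    ~StateSketch() {
        // Closing the state invalidates all references still alive, so
        // their destructors will not try to unregister anything later.
        for (auto* ref : refs) {
            ref->state = nullptr;
        }
    }
};

inline FuncRefSketch::FuncRefSketch(StateSketch& s) : state(&s) {
    s.refs.insert(this);
}

inline FuncRefSketch::~FuncRefSketch() {
    // Only unregister if the parent state still exists.
    if (state != nullptr) {
        state->refs.erase(this);
    }
}
```

Either destruction order is then safe: if the reference dies first it unregisters itself, and if the state dies first the reference is simply marked invalid.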
  • Next I prepared a minimal SimpleCubeRotator updater as follows:
    namespace nvk {
    struct SimpleCubeRotator : public GraphUpdater {
        nv::RefPtr<PushConstants> pushc{nullptr};
        U32 offset{0};
        Vec3f axis;
        float speed;
        Mat4f viewProj;
        float startTime{-1.0F};
        // Initializers are listed in member declaration order:
        SimpleCubeRotator(PushConstants* pushc, const Vec3f& axis, float speed,
                          const Mat4f& view, const Mat4f& proj, U32 offset)
            : pushc(pushc), offset(offset), axis(axis), speed(speed),
              viewProj(proj * view.inverse()){};
        void update(FrameDesc& fdesc) override;
    };
    void SimpleCubeRotator::update(FrameDesc& fdesc) {
        auto& barr = pushc->get_bytearray();
        // compute the rotation angle:
        if (startTime < 0.0) {
            startTime = (float)fdesc.frameTime;
        }
        float elapsed = (float)fdesc.frameTime - startTime;
        float angle = toRad(elapsed * speed);
        // Create the rotation matrix:
        auto rot = Mat4f::rotate(angle, axis);
        // Compute the final transform matrix:
        auto mat = viewProj * rot;
        // Write the matrix into the push constant buffer:
        barr.write_mat4f(mat, offset);
    }
    auto create_simple_cube_rotator(PushConstants* pushc, const Vec3f& axis,
                                    float speed, const Mat4f& view,
                                    const Mat4f& proj, U32 offset)
        -> nv::RefPtr<GraphUpdater> {
        return nv::create_ref_object<SimpleCubeRotator>(pushc, axis, speed, view,
                                                        proj, offset);
    }
    } // namespace nvk
  • But one issue I'm just noticing now is that the LLS library doesn't seem to be registering my static functions correctly on template classes like Mat4f or Mat4d so I need to check that part again.
  • Note: In fact the binding itself is incorrect for static functions in my template classes, since I have for instance:
    static auto _bind_look_at_sig1(lua_State* L) -> int {
    	nv::Mat4f* self = Luna< nv::Mat4f >::get(L,1);
    	nv::Vec3f* eye = Luna< nv::Vec3f >::get(L,2,false);
    	nv::Vec3f* center = Luna< nv::Vec3f >::get(L,3,false);
    	nv::Vec3f* up = Luna< nv::Vec3f >::get(L,4,false);
    	nv::Mat4f res = self->look_at(*eye, *center, *up);
    	nv::Mat4f* res_ptr = new nv::Mat4f(res);
    	Luna< nv::Mat4f >::push(L, res_ptr, true);
    	return 1;
    }
  • So this definitely needs fixing before I can use it. We start in the ClassTemplate:instanciateInClass(cl, templateArgs) method, and we simply mark the created function signatures as static where this is applicable:
      for i, ref_sig in ipairs(sigs) do
        local signame = ref_sig:getName()
        logDEBUG("Creating templated signature ", i, ": ", signame)
        local sig = func:createSignature(signame)
        -- (More code here)
      end
  • OK static functions from template classes now bound correctly.
  • In the VulkanApp lua code I then set up that updater:
        -- Create our simple cube rotator:
        local axis = nv.Vec3f(1.0, 1.0, 1.0)
        local speed = 30.0
        local view = nv.Mat4f.look_at(nv.Vec3f(0.0, 0.0, -50.0), nv.Vec3f(0.0, 0.0, 0.0), nv.Vec3f(0.0, -1.0, 0.0))
        local proj = nv.Mat4f.perspective(math.rad(60.0), 1.333, 1.0, 100.0)
        -- local proj = nv.Mat4f.ortho(-30, 30, -30, 30, 0.0, 100.0)
        local updater = nvk.create_simple_cube_rotator(pushc, axis, speed, view, proj, 0)
One thing to keep in mind with the code just above is that when using a perspective projection, we really cannot use a near plane distance of 0.0 as this would mess up the projection matrix completely ;-)
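To see why a zero near plane is a problem, consider the depth value produced by a standard perspective projection with a [0, 1] depth range: for a point at distance z in front of the camera, d(z) = zFar * (z - zNear) / (z * (zFar - zNear)). The small C++ sketch below evaluates that formula; it assumes this textbook mapping and is not taken from the Mat4f::perspective implementation, and `ndc_depth_sketch` is a hypothetical name.

```cpp
// Depth written by a standard perspective projection with a [0, 1] depth
// range, for a point at distance z (> 0) in front of the camera.
inline double ndc_depth_sketch(double z, double zNear, double zFar) {
    return zFar * (z - zNear) / (z * (zFar - zNear));
}
```

With zNear = 1 and zFar = 100 the result goes from 0 at the near plane to 1 at the far plane. With zNear = 0 the expression collapses to zFar * z / (z * zFar) = 1 for every z, so all fragments end up with the same depth. In addition, implementations that derive the frustum extents from the near distance may end up dividing by zero in that case.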
  • And in fact, just before, I request the display of a cube instead of a simple triangle:
        -- self:add_triangle(pline)
  • And also use a specific shader to do the transformation:
        -- self:createGLSLShaderModules("tests/test_pushconstants")
  • And this eventually worked without too much trouble 🥳! So here is my brand new rotating 3D cube in vulkan:

Yet, one thing I eventually realized with this experiment is that in GLSL we should pass our matrices in column-major order if we want to continue post-multiplying by vectors (whereas on the C++ side I'm using a row-major matrix representation), so this will require some fixing.
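The effect can be reproduced on the CPU with a few lines of C++: the same 16 floats give a different M * v result depending on whether they are read as row-major or as column-major (the way GLSL reads a mat4), and transposing the data before upload restores the expected result. `Mat4Sketch`, `Vec4Sketch` and the three helper functions are hypothetical names for this illustration.

```cpp
#include <array>

using Mat4Sketch = std::array<float, 16>;
using Vec4Sketch = std::array<float, 4>;

// M * v where the data is read as row-major: element (r, c) is m[r * 4 + c].
inline Vec4Sketch mul_row_major(const Mat4Sketch& m, const Vec4Sketch& v) {
    Vec4Sketch out{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += m[r * 4 + c] * v[c];
        }
    }
    return out;
}

// M * v where the same memory is read as column-major:
// element (r, c) is m[c * 4 + r].
inline Vec4Sketch mul_column_major(const Mat4Sketch& m, const Vec4Sketch& v) {
    Vec4Sketch out{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += m[c * 4 + r] * v[c];
        }
    }
    return out;
}

// Swap rows and columns of the stored data.
inline Mat4Sketch transposed(const Mat4Sketch& m) {
    Mat4Sketch t{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            t[c * 4 + r] = m[r * 4 + c];
        }
    }
    return t;
}
```

For example, a row-major translation by 5 along X applied to the point (0, 0, 0, 1) gives (5, 0, 0, 1), while reading the same memory as column-major leaves the point unchanged.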
  • To fix the transposition issue mentioned just above, I decided to simply transpose the matrix when writing it into the push constants as follows:
    void ByteArray::write_mat4f(const Mat4f& value, U64 idx, bool transpose) {
        U64 dataSize = sizeof(F32) * 16;
        if (transpose) {
            // transpose the matrix elements while copying:
            if (idx == U64_MAX) {
                idx = _position;
            }
            NVCHK(idx + dataSize <= _data.size(), "Out of range data write.");
            const F32* src = value.ptr();
            F32* ptr = (F32*)get_data(idx);
            for (int i = 0; i < 4; ++i) {
                // Write column i of the source as 4 contiguous values:
                *ptr++ = *(src);
                *ptr++ = *(src + 4);
                *ptr++ = *(src + 8);
                *ptr++ = *(src + 12);
                // Move on to the next source column:
                ++src;
            }
            _position = idx + dataSize;
        } else {
            // Write the matrix data directly:
            write_data((U8*)value.ptr(), dataSize, idx);
        }
    }
  • With that change, I get the same display as before, but I don't need to call transpose() anymore in the shader itself.
  • With the latest changes reported above I think we can consider this post as completed since we finally got our cube displaying. This was a somewhat large one involving significant enhancements/extensions: support for index buffers, support for render graphs and graph updaters, support for matrix computations, support for template class bindings, support for unit tests, etc. But I was lucky and this all went relatively smoothly 😆.
  • Last modified: 2023/01/03 22:08