Introducing NervLuna: the automatic C++/Lua binding generator


In May 2020, I was experimenting with the Vulkan API for GPU/Game programming, and thus implemented a Lua-based application framework to start performing some minimal/simple tests.

I'm somewhat familiar with this approach since I already used it in previous projects, such as my old "Singularity" project, where I wrote the core components in C++ and then used a binding layer to configure/use those components in Lua instead. This proved to be very efficient (once you get everything on rails at least), because developing in a scripting language such as Lua is significantly faster than trying to build everything in C++ (which is really slow to compile).

sol3 macro extensions

So, with those Vulkan experiments, I actually decided I should “upgrade” my Lua binding techniques, and thus moved to the excellent sol3 library.

This was fun :-)… with sol3 writing the bindings “feels easy” and “very elegant”. But, you have to “write some code” of course: you define your table/classes/functions/fields, you write your custom accessors, or other custom extensions for the elements you bind to lua, etc. So you will quickly find that if you have to write a lot of bindings, then there is a lot of boilerplate code that you would write again and again…

That's where I started introducing a MACRO system on top of sol3 to quickly write the bindings I needed. In effect this gave me content such as:

    SOL_BEGIN_CLASS(gp, "Vec", gp_Vec)
    SOL_CALL_CONSTRUCTORS(class_t(),class_t(const gp_Dir &),class_t(const gp_XYZ &),class_t(Real,Real,Real), class_t(const gp_Pnt &P1, const gp_Pnt &P2));
    SOL_OV2_FUNCS(SetCoord, void(const Int, const Real), void(const Real, const Real, const Real));
    SOL_CLASS_FUNC(SetX);
    SOL_CLASS_FUNC(SetY);
    SOL_CLASS_FUNC(SetZ);
    SOL_CLASS_FUNC(SetXYZ);
    SOL_RESOLVE_FUNC(Coord, Real(const Int)const, Coord);
    SOL_CUSTOM_FUNC(Coords) = [](class_t& p) { Real x,y,z; p.Coord(x,y,z); return std::tuple(x,y,z); };
    SOL_CLASS_FUNC(X);
    SOL_CLASS_FUNC(Y);
    SOL_CLASS_FUNC(Z);
    SOL_CLASS_FUNC(XYZ);
    SOL_CLASS_FUNC(IsEqual);
    SOL_END_CLASS()

This might feel a bit cryptic at first sight, but it is in any case very close to the kind of code you would need to write to generate bindings with sol3.
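
The SOL_OV2_FUNCS and SOL_RESOLVE_FUNC lines need explicit signatures because the name of an overloaded member function is ambiguous until a signature picks one overload. Here is a minimal, standard-library-only sketch of that idea. Note that `Vec3`, `resolve` and `demoResolve` are hypothetical names used for illustration only: this is neither sol3's implementation nor the real gp_Vec class.

```cpp
#include <cassert>

// Hypothetical stand-in for a class with overloaded members (not the real gp_Vec).
struct Vec3 {
    double c[3] = {0.0, 0.0, 0.0};

    // Two overloads sharing one name: &Vec3::SetCoord alone is ambiguous.
    void SetCoord(int index, double value) { c[index - 1] = value; }
    void SetCoord(double x, double y, double z) { c[0] = x; c[1] = y; c[2] = z; }

    // Same situation for the getters.
    double Coord(int index) const { return c[index - 1]; }
    void Coord(double& x, double& y, double& z) const { x = c[0]; y = c[1]; z = c[2]; }
};

// Illustrative helper: the explicit signature selects exactly one overload.
template <typename Sig, typename C>
constexpr auto resolve(Sig C::*member) { return member; }

// Usage: pick overloads by signature, then call through the member pointers.
inline double demoResolve() {
    Vec3 v;
    auto set3 = resolve<void(double, double, double)>(&Vec3::SetCoord);
    auto set1 = resolve<void(int, double)>(&Vec3::SetCoord);
    auto get1 = resolve<double(int) const>(&Vec3::Coord);

    (v.*set3)(1.0, 2.0, 3.0);
    assert((v.*get1)(1) == 1.0);
    (v.*set1)(2, 5.0);    // overwrite the second coordinate
    return (v.*get1)(2);  // reads the second coordinate back
}
```

Every overloaded function in a bound class needs this kind of disambiguation, which is a good part of the boilerplate the macros are hiding.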

So this felt pretty satisfying… [for a while…] and I kept going on this way: building bindings for my “core libraries”, OpenCascade (as shown in the example code just above in fact), DirectX12, Vulkan, LLVM, libclang, etc… extending the macro system progressively and getting more and more experience with sol3. But at some point I realized this would not work as a long term solution for me and I really needed to “transcend” this whole “binding generation” question somehow.

In case the sol3 macro system I mention here could be of interest to someone, I'm attaching the list of macro definitions I've implemented at the end of this article in the Appendices section [but from there, you're on your own ;-)]

On the necessity to "move beyond" sol3 bindings

As I said above sol3 is a very nice library to generate lua bindings, but it comes with some significant limitations that were starting to get in my way:

  • It makes extremely heavy use of template meta-programming

This main problem leads to other important problems:

  • It's really slow to compile (even if you use PCHs)
  • It produces very large binaries
  • It uses a lot of RAM during compilation, to the point where the build starts to fail if you try to put “too many bindings” in a single translation unit.

And on top of that, there is another major limitation in this library that you will also find in almost every other lua binding generation library:

  • It's only a framework to help *YOU* write bindings to lua

What you have to understand here is that the sol3 library is only a helper library/framework that you use to write the bindings: you still have to write the binding code yourself. You still have to decide what you can/want to bind to lua from C++, you still have to read the class definitions from some header file, understand their content, and then figure out the list of all the functions that are “valid” for binding and all the “fields” that can be accessed. And for each of them, you have to write the code.

And guess what? “You” are human… [yeah, I know, that part sucks lol…] and thus, you will make errors [or… is this the part that really sucks?] when writing your binding code… then you will try to compile it, and since it uses a lot of template meta-programming as I said just above, it will take a long time, only to report that… you made an error on such and such lines… :-S.

⇒ And believe me, that game can quickly become really frustrating when you plan to “generate the complete bindings for a very large library” and you've been working on the bindings for “only a single class in that library” for 2 days already [I know what I'm talking about, because I've been there :-)]

Back to my old custom solution: sgtLuna

So, as a conclusion from the previous section: sol3 is nice when you only want to generate simple/small binding sets. But it becomes impractical when your goal is to generate bindings for large/complete libraries and frameworks. Generally speaking, “manually generating bindings” for complete frameworks just takes too long. ⇒ So I really needed an automatic solution to handle this.

But in fact, I already had this available from my previous Singularity project: I had written an automatic lua binding generator which I called sgtLuna that I used successfully to generate bindings for the complete OpenSceneGraph framework as well as wxWidgets and other pretty large libraries. [I think I simply forgot *why* I created/used that before lol]

This binding generator was written in lua and was using the doxmlparser from doxygen to parse C++ code, and from there I could generate all the bindings I needed automatically (but with a significant configuration layer to control all the “exceptions/problems” that could occur during the bindings generation). It was inspired by the logic available in LunaWrapper, and that's where the name comes from.

From my memory, this binding generation tool was starting to feel too complex to be usable: there were a lot of special considerations to take into account at every level… But thinking about it, this was still orders of magnitude faster than trying to manually write the bindings anyway, so maybe this really was the appropriate path for me?

NervLuna: next generation lua binding generator

So I decided I should refresh my experience on lua binding generation and got back to my old sgtLuna project, but with the knowledge I have now. And it turns out that lately, working on my C++ JIT Compiler project, I started to play a lot with clang and discovered the libclang module that could be used to easily parse C++ code. And this is exactly what I needed: a robust system that could parse very complex C++ constructs while avoiding a significant part of the low level “string processing” currently in use in the sgtLuna project.

⇒ I've been working on this new project for a few days and it already looks really promising: I could generate the bindings for all the structures defined in the Vulkan API (i.e. more than 500 structs) in a matter of a few seconds, with access to most types of fields in those structs. The generated code also compiles significantly faster than the bindings I was writing so far with sol3, and the binaries are significantly smaller.

Also, for the moment, I could reduce the required configuration for binding generation to a minimum, so this tool is currently very nice to use. Here is, for instance, the config file (i.e. a lua file) I use to configure the Vulkan API binding generation:

-- Configuration file for the vulkan bindings generation.

local path = import "pl.path"
local utils = import "base.utils"
local Set = import "base.Set"

local vulkanIncludeDir = path.abspath(root_path.."../deps/msvc64/vulkan-1.2.135/include")
-- local inputDir = path.abspath(root_path.."../deps/msvc64/vulkan-1.2.135/include")
local inputDir = path.abspath(root_path.."../sources/lua_bindings/wip_Vulkan/interface")
local outputDir = path.abspath(root_path.."../sources/lua_bindings/wip_Vulkan")

local cfg = {}

cfg.moduleName = "Vulkan"
cfg.includePaths = {inputDir}
cfg.clangArgs = {
    "-Wno-pragma-once-outside-header",
    "-I"..vulkanIncludeDir.."/vulkan"
}

-- We only want to process the files in the vulkan folder:
-- cfg.inputPaths = {inputDir.."/vulkan"}

cfg.outputDir = outputDir

cfg.processEntities = function(root)
    -- We should ignore all the global functions:
    local extSuffixes = {"KHR", "EXT", "NVX", "NV", "INTEL", "GOOGLE", "AMD"}
    local ignoredFuncs = Set{"vkNegotiateLoaderLayerInterfaceVersion"}

    root:foreachGlobalFunction(function(f)
        local fname = f:getFullName()

        if ignoredFuncs:hasElement(fname) then
            logDEBUG("Ignoring function ", fname)
            f:setIgnored(true)
            return
        end

        for _,ext in ipairs(extSuffixes) do
            if fname:endsWith(ext) then
                logDEBUG("Ignoring function ", fname)
                f:setIgnored(true)
                break
            end
        end
    end)

    -- We should register the "void" type:
    local cl = root:getOrCreateClass("void")
    cl:setDefined(false)
end

return cfg

In the config file above, for instance, I have a simple “customization” layer (defined with the “processEntities()” function) where I ignore all the extension functions [because I cannot directly link those in the final module]. And I request the reflection layer to consider “void” as a valid “Lua class”. That's a lot easier than what I was doing in sgtLuna before.
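
The filtering rule in processEntities() is plain suffix matching on the function name. For readers more at home in C++, here is a rough equivalent of that rule, as a sketch only: `endsWith`, `shouldIgnore` and `countIgnored` are hypothetical helper names, and the function names in the usage part are just sample input.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Same extension suffix list as in the Lua config above.
constexpr std::array<std::string_view, 7> kExtSuffixes = {
    "KHR", "EXT", "NVX", "NV", "INTEL", "GOOGLE", "AMD"};

// True when 's' ends with 'suffix'.
constexpr bool endsWith(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.substr(s.size() - suffix.size()) == suffix;
}

// Mirrors the Lua callback: explicitly ignored names first, then extension suffixes.
inline bool shouldIgnore(std::string_view fname) {
    if (fname == "vkNegotiateLoaderLayerInterfaceVersion") return true;
    for (std::string_view ext : kExtSuffixes) {
        if (endsWith(fname, ext)) return true;
    }
    return false;
}

// Usage: count how many of a few sample names would be skipped.
inline int countIgnored() {
    constexpr std::array<std::string_view, 3> samples = {
        "vkCreateInstance", "vkCreateSwapchainKHR", "vkCmdDrawMeshTasksNV"};
    int ignored = 0;
    for (std::string_view name : samples) {
        if (shouldIgnore(name)) ++ignored;
    }
    assert(ignored <= static_cast<int>(samples.size()));
    return ignored;
}
```

A suffix test like this is coarse, but it is enough here because the Vulkan extension entry points all carry their vendor tag at the end of the name.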

And then a typical snippet of the code that is auto-generated is as follows:

#include <bind_precomp.h>

namespace nv {

// Function type checkers

// Typecheck for ClassC (1) with signature: void ()
static bool _check_ClassC_sig1(lua_State* L) {
	int luatop = lua_gettop(L);
	if( luatop!=0 ) return false;

	return true;
}

// Typecheck for ClassC (2) with signature: void (int)
static bool _check_ClassC_sig2(lua_State* L) {
	int luatop = lua_gettop(L);
	if( luatop!=1 ) return false;

	if( lua_isnumber(L,1)!=1 ) return false;
	return true;
}

// Typecheck for remove (1) with signature: int (int)
static bool _check_remove_sig1(lua_State* L) {
	int luatop = lua_gettop(L);
	if( luatop!=2 ) return false;

	if( !luna_isInstanceOf(L,1,SID("nvt::ClassC")) ) return false;

	if( lua_isnumber(L,2)!=1 ) return false;
	return true;
}


// Function bindings

// Bind for ClassC constructor (1) with signature: void ()
static nvt::ClassC* _bind_ClassC_sig1(lua_State* L) {
	// When reaching this call, we assume that the type checking is already done.

	return new nvt::ClassC();
}

// Bind for ClassC constructor (2) with signature: void (int)
static nvt::ClassC* _bind_ClassC_sig2(lua_State* L) {
	// When reaching this call, we assume that the type checking is already done.
	int32_t val = (int32_t)lua_tointeger(L,1);

	return new nvt::ClassC(val);
}

// Overall bind for ClassC
static nvt::ClassC* _bind_ClassC(lua_State* L) {
	if(_check_ClassC_sig1(L)) return _bind_ClassC_sig1(L);
	if(_check_ClassC_sig2(L)) return _bind_ClassC_sig2(L);

	luaL_error(L, "Binding error for function ClassC, cannot match any of the 2 signature(s):\n  sig1: void ()\n  sig2: void (int)");
	return nullptr;
}

// Bind for remove (1) with signature: int (int)
static int _bind_remove_sig1(lua_State* L) {
	// When reaching this call, we assume that the type checking is already done.
	nvt::ClassC* self = Luna< nvt::ClassC >::get(L,1);
	ASSERT(self!=nullptr);

	int32_t offset = (int32_t)lua_tointeger(L,2);

	int res = self->remove(offset);
	lua_pushnumber(L,res);

	return 1;
}

// Overall bind for remove
static int _bind_remove(lua_State* L) {
	if(_check_remove_sig1(L)) return _bind_remove_sig1(L);

	luaL_error(L, "Binding error for function remove, cannot match any of the 1 signature(s):\n  sig1: int (int)");
	return 0;
}


// Fields type checkers


// Fields bindings


nvt::ClassC* LunaTraits< nvt::ClassC >::construct(lua_State* L) {
	return _bind_ClassC(L);
}

void LunaTraits< nvt::ClassC >::destruct(nvt::ClassC* obj) {
	delete obj;
}

const char LunaTraits< nvt::ClassC >::className[] = "ClassC";
const char LunaTraits< nvt::ClassC >::fullName[] = "nvt.ClassC";
const char* LunaTraits< nvt::ClassC >::namespaces[] = {"nvt",0};
const char* LunaTraits< nvt::ClassC >::parents[] = {"nvt.ClassB",0};
const StringID LunaTraits< nvt::ClassC >::id = SID("nvt::ClassC");
const StringID LunaTraits< nvt::ClassC >::baseIDs[] = {SID("nvt::ClassB"),SID("nvt::MyClass"),0};

luna_RegType LunaTraits< nvt::ClassC >::methods[] = {
	{"remove", &_bind_remove},
	{0,0}
};

luna_RegType LunaTraits< nvt::ClassC >::fields[] = {
	{0,0}
};

luna_RegEnumType LunaTraits< nvt::ClassC >::enumValues[] = {
	{0,0}
};

}
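
The generated code above follows a “check then bind” pattern: each overload gets a type checker that only inspects the arguments, and an overall bind function tries the checkers in order and reports an error when none of them matches. The sketch below reproduces that pattern without the Lua C API so that it can be compiled on its own. The `Value`, `Args` and `ClassC` types and the function names are hypothetical stand-ins (a vector of `std::variant` plays the role of the Lua stack), and an exception replaces the luaL_error() call.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Stand-in for values on the Lua stack (assumption: numbers and strings only).
using Value = std::variant<double, std::string>;
using Args = std::vector<Value>;

// Hypothetical bound class with two constructors, like nvt::ClassC above.
struct ClassC {
    int32_t value;
    ClassC() : value(0) {}
    explicit ClassC(int32_t v) : value(v) {}
};

// Type checkers: one per signature, with no side effects.
static bool checkSig1(const Args& a) { return a.empty(); }
static bool checkSig2(const Args& a) {
    return a.size() == 1 && std::holds_alternative<double>(a[0]);
}

// Binders: they assume the matching checker has already passed.
static std::unique_ptr<ClassC> bindSig1(const Args&) {
    return std::make_unique<ClassC>();
}
static std::unique_ptr<ClassC> bindSig2(const Args& a) {
    return std::make_unique<ClassC>(static_cast<int32_t>(std::get<double>(a[0])));
}

// Overall bind: the first matching signature wins, otherwise report an error.
static std::unique_ptr<ClassC> bindClassC(const Args& a) {
    if (checkSig1(a)) return bindSig1(a);
    if (checkSig2(a)) return bindSig2(a);
    throw std::invalid_argument(
        "Binding error for function ClassC, cannot match any of the 2 signature(s)");
}
```

Keeping the checkers separate from the binders means the binders never have to validate anything themselves, and the error message can list every candidate signature in one place.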

In NervLuna, each class binding is generated in a different .cpp file and with the integrated usage of PCH files the compilation is pretty fast.

I haven't done any performance comparison with sol3 bindings so far, but since I'm writing raw lua C API function calls everywhere, I'm probably not too far behind, and anyway, this isn't the main concern here: instead, what I really need is a scalable mechanism that can automatically generate very large bindings and I'd say I'm on a good path now ;-).

Conclusion

Of course, I'm only at the very beginning of the project, so there is still a lot to do. The last thing I added was the support for class/struct field read/write access. But then I should also check that I'm not trying to write to readonly fields for instance, I should add support for static fields in classes, then I will have to handle template classes, consider introducing support for class inheritance in lua, I also need to re-visit “inter-modules” dependencies, etc… So yes: still a long way to go :-D

But I will definitely keep working on this new project for a while, and introduce the new features I need progressively, so we will see where this leads us!

Appendices

SOL3 macro system

Here are the macros I've used to generate the lua bindings in my projects with sol3:

#include <sol/sol.hpp>
#include <core_math.h>

namespace nv {

template<typename T>
Vector<T> arrayFromTable(sol::table t)
{
    U32 num = t.size();
    // logDEBUG("Table size is: "<<num);
    Vector<T> res;
    res.resize(num);
    for (U32 i = 0; i < num; ++i)
    {
        res[i] = t[i + 1];
    }

    // Return by value: wrapping this in std::move would only prevent copy elision.
    return res;
}

}

#define SOL_BEGIN_CLASS(space, cname, ctype) { \
    typedef ctype class_t; \
    auto utype = space.new_usertype<class_t>(cname);

#define SOL_CLASS_NO_CONSTRUCTOR() utype["new"] = sol::no_constructor;
#define SOL_CLASS_NO_DESTRUCTOR(cname) utype[sol::meta_function::garbage_collect] = sol::destructor([](class_t &ref) { \
                                     THROW_MSG("Should not call destructor in lua for "<<cname);    \
                                 });

#define SOL_CALL_CONSTRUCTORS(...) utype[sol::call_constructor] = sol::constructors<__VA_ARGS__>();

#define SOL_BEGIN_REF_CLASS(space, cname, ctype) { \
    typedef ctype class_t; \
    auto utype = space.new_usertype<class_t>(cname); \
    SOL_CLASS_NO_CONSTRUCTOR() \
    SOL_CLASS_NO_DESTRUCTOR(cname)

#define SOL_CLASS_BASES(...) utype[sol::base_classes] = sol::bases<__VA_ARGS__>();

#define SOL_CUSTOM_PROP(pname, ...) utype[#pname] = sol::property(__VA_ARGS__);
#define SOL_READ_PROP(pname) utype[#pname] = sol::property([](class_t&obj) { return obj.pname;})
#define SOL_RW_PROP(pname) utype[#pname] = sol::property([](class_t&obj) { return obj.pname;}, [](class_t&obj, decltype(class_t::pname) val) { obj.pname = val; })

#define SOL_CLASS_FUNC(fname) utype[#fname] = &class_t::fname;
#define SOL_FUNC(fname, f) utype[#fname] = f
#define SOL_META_FUNC(mname, f) utype[sol::meta_function::mname] = f
#define SOL_ADD_OP(ret_t,arg_t) SOL_META_FUNC(addition, sol::resolve<ret_t(arg_t) const>(&class_t::operator+))
#define SOL_SUB_OP(ret_t,arg_t) SOL_META_FUNC(subtraction, sol::resolve<ret_t(arg_t) const>(&class_t::operator-))
#define SOL_MULT_OP(ret_t,arg_t) SOL_META_FUNC(multiplication, sol::resolve<ret_t(arg_t) const>(&class_t::operator*))
#define SOL_DIV_OP(ret_t,arg_t) SOL_META_FUNC(division, sol::resolve<ret_t(arg_t) const>(&class_t::operator/))
#define SOL_MINUS_OP(ret_t) SOL_META_FUNC(unary_minus, sol::resolve<ret_t() const>(&class_t::operator-))
#define SOL_MOD_OP(ret_t,arg_t) SOL_META_FUNC(modulus, sol::resolve<ret_t(arg_t) const>(&class_t::operator%))
#define SOL_CUSTOM_FUNC(fname) utype[#fname]
#define SOL_OVERLOAD_FUNCS(fname, ...) utype[#fname] = sol::overload(__VA_ARGS__)
#define SOL_OV2_FUNCS(fname, sig1, sig2) utype[#fname] = sol::overload( \
    sol::resolve<sig1>(&class_t::fname), \
    sol::resolve<sig2>(&class_t::fname))
#define SOL_OV3_FUNCS(fname, sig1, sig2, sig3) utype[#fname] = sol::overload( \
    sol::resolve<sig1>(&class_t::fname), \
    sol::resolve<sig2>(&class_t::fname), \
    sol::resolve<sig3>(&class_t::fname))
#define SOL_OV4_FUNCS(fname, sig1, sig2, sig3, sig4) utype[#fname] = sol::overload( \
    sol::resolve<sig1>(&class_t::fname), \
    sol::resolve<sig2>(&class_t::fname), \
    sol::resolve<sig3>(&class_t::fname), \
    sol::resolve<sig4>(&class_t::fname))
#define SOL_OV5_FUNCS(fname, sig1, sig2, sig3, sig4, sig5) utype[#fname] = sol::overload( \
    sol::resolve<sig1>(&class_t::fname), \
    sol::resolve<sig2>(&class_t::fname), \
    sol::resolve<sig3>(&class_t::fname), \
    sol::resolve<sig4>(&class_t::fname), \
    sol::resolve<sig5>(&class_t::fname))
#define SOL_RESOLVE_FUNC(fname, sig, cfname) utype[#fname] = sol::resolve<sig>(&class_t::cfname)
#define SOL_CALL_FACTORIES(...) utype[sol::call_constructor] = sol::factories(__VA_ARGS__)
#define SOL_DEFAULT_CALL_FACTORY() utype[sol::call_constructor] = sol::factories([] { return nv::createRefObject<class_t>(); })
#define SOL_END_CLASS() }


#define SOL_NO_CONSTRUCTOR "new", sol::no_constructor
#define SOL_NO_DESTRUCTOR(cname) sol::meta_function::garbage_collect, sol::destructor([](cname &ref) { \
                                     THROW_MSG("Should not call destructor in lua for " << #cname);    \
                                 })

#define SOL_NO_CREATE(cname) SOL_NO_CONSTRUCTOR, \
                             SOL_NO_DESTRUCTOR(cname)

#define SOL_REF_CREATE(cname) SOL_NO_CONSTRUCTOR,       \
                              SOL_NO_DESTRUCTOR(cname), \
                              sol::call_constructor, sol::factories([] { return nv::createRefObject<cname>(); })

#define SOL_ENUM(name, value) enumSpace[name] = (int)value

#define SOL_BEGIN_ENUM(space, name) { \
    auto enumSpace = space[name].get_or_create<sol::table>();

#define SOL_END_ENUM() }