
NervLuna: Refactoring of types handling mechanism

After my last session, where I tried to add more folders as input for the nvCore bindings generation, I eventually realized that there was something fundamentally wrong in the way I was handling “types”. This led me to a large refactoring session, and in this article we will review the major changes introduced and the problems fixed in the process.

  • So far, when I was creating a type, say for instance a type to represent “unsigned long long”, I was also supporting a single alias for that type. So if later in the code we had:

namespace engine {

typedef unsigned long long MyUnsignedID; 

typedef unsigned long long U64;
}

Then NervLuna would see that “engine::MyUnsignedID” is a potential alias for “unsigned long long”, but it would discard it immediately because that name has more characters than the default type name (I would still register an internal mapping of “engine::MyUnsignedID” to the actual type so that we do not parse it again). It would then find that “engine::U64” is another potential alias and start using it, because it is a shorter name.

⇒ This was already a first point that I wanted to change: I think that at any time we should keep the links between the actual type and every valid name for that type. So instead of a single name per type, I can now provide multiple names for a type (i.e. adding the names one by one as they are discovered).

And thus, the TypeManager no longer handles a mapping of “typename <-> type” entries, but instead a vector of Types, where each type may provide multiple names for itself.
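To make the change more concrete, here is a minimal C++ sketch of the idea. NervLuna itself is written in Lua, so the `Type` and `TypeManager` classes below, and all of their method names, are illustrative assumptions rather than the actual implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch: a type that keeps every valid name it has been given.
struct Type {
    std::vector<std::string> names;

    bool hasName(const std::string& name) const {
        return std::find(names.begin(), names.end(), name) != names.end();
    }

    void addName(const std::string& name) {
        if (!hasName(name)) names.push_back(name);
    }

    // The default name is the shortest registered one (first one wins on ties).
    const std::string& defaultName() const {
        assert(!names.empty());
        return *std::min_element(names.begin(), names.end(),
            [](const std::string& a, const std::string& b) {
                return a.size() < b.size();
            });
    }
};

// The manager stores a plain list of types instead of a name -> type map,
// and answers lookups by asking each type whether it knows the name.
class TypeManager {
public:
    Type* create(const std::string& name) {
        types_.push_back(std::make_unique<Type>());
        types_.back()->addName(name);
        return types_.back().get();
    }

    Type* findByName(const std::string& name) {
        for (auto& t : types_)
            if (t->hasName(name)) return t.get();
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<Type>> types_;
};
```

With this layout, registering “engine::MyUnsignedID” and then “engine::U64” on the “unsigned long long” type keeps all three names reachable, and only the default name changes.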

  • The next major change introduced on the Type is the base type dependency system: suppose we have a type for “unsigned long long” as above, and now suppose we find a type such as “const engine::MyUnsignedID&”. Before, I would have resolved the type “unsigned long long”, made a copy of it, added the “const” qualifier on that copy, then made a copy of that second type and added a reference on the second copy. Great. But what if that const reference type was discovered before the additional name “engine::U64”, for instance? Then our “const engine::MyUnsignedID” and “const engine::MyUnsignedID&” types would never acknowledge that they can also be called “const engine::U64” and “const engine::U64&” respectively… not so good.

⇒ So now, instead of making a copy when creating a dependent type, I create a new type for “const engine::MyUnsignedID”, assign it the type for “engine::MyUnsignedID” as base type, and just specify that it should add the “const” qualifier on that base type. Then I create another type for “const engine::MyUnsignedID&”, assign it the type for “const engine::MyUnsignedID” as base type, and just specify that it should add a reference on that base type.

As a result, type names can now be constructed dynamically: if I later add the name “engine::U64” to the type “engine::MyUnsignedID”, and I query the type “const engine::MyUnsignedID&” for its default (i.e. shortest) name, it will automatically provide the name “const engine::U64&”!
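The base type mechanism can be sketched in C++ as follows. This is only an illustration under my own assumptions (the `Modifier` enum and the `names()` / `defaultName()` helpers are hypothetical and do not come from the NervLuna sources); the point is that derived names are computed on demand from the base type instead of being copied:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the base type dependency idea.
enum class Modifier { None, Const, Reference, Pointer };

struct Type {
    const Type* base = nullptr;          // type this one is derived from, if any
    Modifier modifier = Modifier::None;  // what to apply on top of the base type
    std::vector<std::string> ownNames;   // names declared directly in the code

    // Every valid name: our own names, plus one derived name per base name.
    std::vector<std::string> names() const {
        std::vector<std::string> result = ownNames;
        if (base) {
            for (const std::string& n : base->names()) {
                switch (modifier) {
                case Modifier::Const:     result.push_back("const " + n); break;
                case Modifier::Reference: result.push_back(n + "&"); break;
                case Modifier::Pointer:   result.push_back(n + "*"); break;
                case Modifier::None:      result.push_back(n); break;
                }
            }
        }
        return result;
    }

    // Shortest name, recomputed on each call so later aliases are picked up.
    std::string defaultName() const {
        const std::vector<std::string> all = names();
        assert(!all.empty());
        std::string best = all.front();
        for (const std::string& n : all)
            if (n.size() < best.size()) best = n;
        return best;
    }
};
```

Because nothing is copied, a name added to the innermost type after the const and reference types were created is still visible through them.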

  • Once we have a “base type” system for the types, we may then realize that there are basically 2 ways to generate a name for a type: it might come from the base type itself, or it might be defined in the code directly. Suppose for instance we have this:

namespace engine {
  
  typedef int MyID;
  typedef MyID* MyIDPtr;
  typedef int* PointerType;

  int test(const MyIDPtr& ptr);
}

From that code, we would generate the base type “MyID”, which has the names “int” and “MyID”, and the type “MyIDPtr”, which uses “MyID” as base type (and adds a pointer to its name).

Then, when resolving the type for “PointerType”, we check whether one of the existing types already has the target name (even if this is not its shortest/default name). So here we search for a type matching the name “int*”, and we will correctly find the type MyIDPtr (because the base type for that type can be named “int” and we just add a pointer on it, so “int*” is a correct name for it :-))

So finally, we do not create a new type for “PointerType”; instead, we just add that name as an alternative name for the type MyIDPtr: awesome, isn't it?! ;-)
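Here is a small C++ sketch of that lookup, restricted to pointer types to keep it short. The `resolveTypedef` helper and the rest of the API are hypothetical names chosen for the illustration, not the real NervLuna interface:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: a type with directly declared names and, for pointer
// types, a base type from which extra names are derived.
struct Type {
    const Type* base = nullptr;         // set only for pointer types here
    std::vector<std::string> ownNames;  // names declared directly in the code

    bool hasName(const std::string& name) const {
        for (const std::string& n : ownNames)
            if (n == name) return true;
        // A pointer type also answers to "<any name of its base type>*".
        if (base && !name.empty() && name.back() == '*')
            return base->hasName(name.substr(0, name.size() - 1));
        return false;
    }
};

class TypeManager {
public:
    Type* add(std::vector<std::string> names, const Type* base = nullptr) {
        types_.push_back(std::make_unique<Type>());
        types_.back()->base = base;
        types_.back()->ownNames = std::move(names);
        return types_.back().get();
    }

    // Handle "typedef <target> <alias>;": reuse a type that already answers
    // to <target>, otherwise register a brand new type with both names.
    Type* resolveTypedef(const std::string& target, const std::string& alias) {
        assert(!target.empty() && !alias.empty());
        for (auto& t : types_) {
            if (t->hasName(target)) {
                t->ownNames.push_back(alias);
                return t.get();
            }
        }
        return add({target, alias});
    }

    std::size_t size() const { return types_.size(); }

private:
    std::vector<std::unique_ptr<Type>> types_;
};
```

Resolving `typedef int* PointerType;` after MyID and MyIDPtr are registered returns the existing MyIDPtr type and leaves the number of registered types unchanged.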

Note: One side effect of this new type naming mechanism is that I had to set up a strict syntax for the names, so now I always remove the unnecessary spaces (and the class/struct/enum keywords) from the names to ensure I can always compare the strings correctly:

function utils.sanitizeTypeName(tname)
  tname = tname:gsub("^class ", "")
  tname = tname:gsub("([^_%w])class ", "%1")
  tname = tname:gsub("^struct ", "")
  tname = tname:gsub("([^_%w])struct ", "%1")
  tname = tname:gsub("^enum ", "")
  tname = tname:gsub("([^_%w])enum ", "%1")
  tname = tname:gsub(",%s+", ",")
  tname = tname:gsub("%s+>", ">")
  tname = tname:gsub("<%s+", "<")
  tname = tname:gsub("%s+%*", "*")
  tname = tname:gsub("%s+&", "&")

  return tname
end</sxhjs>
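For readers more at home in C++, the same normalization can be approximated with `std::regex`. This is a sketch under my own assumptions (Lua patterns and ECMAScript regexes are different languages, so this is a port of the intent, not code taken from NervLuna):

```cpp
#include <initializer_list>
#include <regex>
#include <string>

// Hypothetical C++ counterpart of the Lua helper above.
std::string sanitizeTypeName(std::string name) {
    // Drop the elaborated-type keywords when they start the name or follow
    // a character that cannot be part of an identifier.
    for (const char* keyword : {"class", "struct", "enum"}) {
        const std::regex re("(^|[^_A-Za-z0-9])" + std::string(keyword) + " ");
        name = std::regex_replace(name, re, "$1");
    }

    // Remove the optional whitespace around template and pointer/reference
    // punctuation so that equivalent spellings compare equal as strings.
    name = std::regex_replace(name, std::regex(",\\s+"), ",");
    name = std::regex_replace(name, std::regex("\\s+>"), ">");
    name = std::regex_replace(name, std::regex("<\\s+"), "<");
    name = std::regex_replace(name, std::regex("\\s+\\*"), "*");
    name = std::regex_replace(name, std::regex("\\s+&"), "&");
    return name;
}
```

For instance, a clang spelling such as `std::vector<double, class std::allocator<double> >::iterator` is reduced to `std::vector<double,std::allocator<double>>::iterator`.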

With this new mechanism implemented, I went back to trying to generate the bindings for the nvCore module, and of course, this came with its load of additional errors :-) So let's have a look at these.


===== Incorrect type mapping from fully qualified name =====


  * The first thing I discovered was a case that made no sense at all, when parsing the following code with clang for instance:

<sxh cpp; highlight: []>#include <string>

namespace nvt 
{

typedef std::basic_string<char, std::char_traits<char>> String;

void testConstString(const char* fname, const String &content) {};

}

This will produce a direct typedef mapping as follows:

[Debug] 	      Arg 1 cursor: content, type: const nvt::String&
[Debug] 	      Resolving type const nvt::String&
[Debug] 	      Creating new type object for const nvt::String&
[Debug] 	      Type kind: LValueReference
[Debug] 	      Resolving type const nvt::String
[Debug] 	      Creating new type object for const nvt::String
[Debug] 	      Type kind: Typedef
[Debug] 	      Using base type: 'std::basic_string<char, std::char_traits<char>>' for typedef type const nvt::String
[Debug] 	      Declaration location: W:\Projects\NervSeed\sources\lua_bindings\test1\interface\bind_test4.h:9:57:148
[Debug] 	      level 1: 'nvt::String' of kind 'TypeRef', type: 'nvt::String'
[Debug] 	      Resolving type std::basic_string<char,std::char_traits<char>>
[Debug] 	      Using already resolved type for 'std::basic_string<char,std::char_traits<char>>', type name: nvt::String
[Debug] 	      Adding name const nvt::String to type {
  "nvt::String",
  "std::basic_string<char,std::char_traits<char>>"
}
[Debug] 	      Base class for reference const nvt::String& is: const nvt::String, type name: nvt::String
[Debug] 	      TypeManager: Registering type {
  "const nvt::String&"
}
[Debug] 	      Done with arg 1

⇒ This doesn't make any sense, because here we are directly mapping “std::basic_string<char, std::char_traits<char>>” to “const nvt::String” (so we are considering a const type to be the same thing as its non-const base type in this case).

Actually, I didn't fix that issue immediately: I just figured out it was somehow related to the custom function I'm using to retrieve a fully qualified type name, but it took me quite a long time to find a proper solution to this problem, so we'll get back to this point later.

  • Next, I had to introduce some kind of minimal support to deal with “invalid types” gracefully. Because I could otherwise get errors such as:
    Could not build class for type rapidjson::StaticAssertTest<sizeof(::rapidjson::STATIC_ASSERTION_FAILURE<bool(sizeof(Ch) >= 2)>)>

Clearly, that's not the kind of class I really want to generate a binding for, and here I couldn't successfully parse the expression “sizeof(::rapidjson::STATIC_ASSERTION_FAILURE<bool(sizeof(Ch) >= 2)>)” to extract a boolean value from it. But that doesn't really matter: when facing an invalid type like that, I just need to produce an invalid name, containing for instance "???", and then ignore the classes whose names contain that marker when generating the actual bindings:

function Class:isBindable()
  -- If the class name contains the "???" sequence then it is not bindable:
  local fname = self:getFullName()
  if fname:find("%?%?%?") then
    return false
  end
  
  return not self:isIgnored() and not self:isAnonymous()
end</sxhjs>
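The other half of this mechanism, producing the invalid name in the first place, can be sketched like this in C++. The `ResolvedArg` struct and both function names are assumptions made up for the illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Marker inserted in place of anything that could not be resolved.
const std::string kInvalidName = "???";

// Hypothetical result of resolving one template argument.
struct ResolvedArg {
    bool ok;           // false when the argument could not be evaluated
    std::string name;  // resolved spelling, only meaningful when ok is true
};

// Build "Template<arg0,arg1,...>", substituting the marker for failures so
// that the resulting name is visibly invalid instead of raising an error.
std::string makeTemplateName(const std::string& tpl,
                             const std::vector<ResolvedArg>& args) {
    assert(!tpl.empty());
    std::string name = tpl + "<";
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i > 0) name += ",";
        name += args[i].ok ? args[i].name : kInvalidName;
    }
    return name + ">";
}

// Same test as in the Lua snippet: a name containing the marker is skipped.
bool isBindableName(const std::string& fullName) {
    return fullName.find(kInvalidName) == std::string::npos;
}
```

The marker travels with the name, so any class or type built on top of an unresolved argument is filtered out by the same check without extra bookkeeping.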

===== Invalid type fully qualified names =====

  * Then I was trying to generate bindings for my DataMapArray class and got this:

<code>
[Debug] 	      Creating function nv::DataMapArray::begin
[Debug] 	      Adding signature ArrayType::iterator () to function begin
[Debug] 	      Return type: ArrayType::iterator
[Debug] 	      Resolving type std::vector<RefPtr<DataMap>,allocator<RefPtr<DataMap>>>::iterator
[Debug] 	      Creating new type object for std::vector<RefPtr<DataMap>,allocator<RefPtr<DataMap>>>::iterator
[Debug] 	      Resolving type std::vector<nv::RefPtr<nv::DataMap>,std::allocator<nv::RefPtr<nv::DataMap>>>
[Debug] 	      Creating new type object for std::vector<nv::RefPtr<nv::DataMap>,std::allocator<nv::RefPtr<nv::DataMap>>>
</code>

=> The problem here is the missing namespace specification for "allocator" (and likewise for "RefPtr" and "DataMap") in the first type name.

So I built this minimal test: <sxh cpp; highlight: []>#include <vector>

namespace nvt 
{

class MyVectorNameTest
{
public:
	/** Definition of the array type itself.*/
	typedef std::vector< double > ArrayType;

    typedef ArrayType::iterator IteratorType;

public:

    inline ArrayType::iterator begin() { return _data.begin(); }

protected:

    ArrayType _data;
};

}

And found that this would produce this kind of AST:

[Debug] 	      level 2: 'MyVectorNameTest' of kind 'ClassDecl', type: 'nvt::MyVectorNameTest'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: 'ArrayType' of kind 'TypedefDecl', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 4: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 3: 'IteratorType' of kind 'TypedefDecl', type: 'nvt::MyVectorNameTest::IteratorType'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std::vector<double, class std::allocator<double> >::iterator' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: 'begin' of kind 'CXXMethod', type: 'ArrayType::iterator ()'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std::vector<double, class std::allocator<double> >::iterator' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 5: '' of kind 'ReturnStmt', type: ''
[Debug] 	      level 6: 'begin' of kind 'CallExpr', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 7: 'begin' of kind 'MemberRefExpr', type: '<bound member function type>'
[Debug] 	      level 8: '_data' of kind 'MemberRefExpr', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: '_data' of kind 'FieldDecl', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'

⇒ So it seemed that we had a few TypeRef cursors here that could prove very useful to name our types correctly.

I then added more content to investigate the AST parsing:

#include <vector>

namespace nvt 
{

class MyVectorNameTest
{
public:
	/** Definition of the array type itself.*/
	typedef std::vector< double > ArrayType;
	typedef std::vector< int > IndexType;

    typedef nvt::ArrayType::iterator IteratorType;

public:

    inline ArrayType::iterator begin() { return _data.begin(); }

    ArrayType::iterator begin(const IndexType& idx) { return _data.begin(); }

    ArrayType::iterator begin(float val, const IndexType& idx) { return _data.begin(); }

protected:

    ArrayType _data;
};

}

And this will produce the AST:

[Debug] 	      level 2: 'MyVectorNameTest' of kind 'ClassDecl', type: 'nvt::MyVectorNameTest'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: 'ArrayType' of kind 'TypedefDecl', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 4: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 3: 'IndexType' of kind 'TypedefDecl', type: 'nvt::MyVectorNameTest::IndexType'
[Debug] 	      level 4: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 4: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 3: 'IteratorType' of kind 'TypedefDecl', type: 'nvt::MyVectorNameTest::IteratorType'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: 'begin' of kind 'CXXMethod', type: 'ArrayType::iterator ()'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std::vector<double, class std::allocator<double> >::iterator' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 5: '' of kind 'ReturnStmt', type: ''
[Debug] 	      level 6: 'begin' of kind 'CallExpr', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 7: 'begin' of kind 'MemberRefExpr', type: '<bound member function type>'
[Debug] 	      level 8: '_data' of kind 'MemberRefExpr', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 3: 'begin' of kind 'CXXMethod', type: 'ArrayType::iterator (const nvt::MyVectorNameTest::IndexType &)'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std::vector<double, class std::allocator<double> >::iterator' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 4: 'idx' of kind 'ParmDecl', type: 'const nvt::MyVectorNameTest::IndexType &'
[Debug] 	      level 5: 'nvt::MyVectorNameTest::IndexType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::IndexType'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 5: '' of kind 'ReturnStmt', type: ''
[Debug] 	      level 6: 'begin' of kind 'CallExpr', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 7: 'begin' of kind 'MemberRefExpr', type: '<bound member function type>'
[Debug] 	      level 8: '_data' of kind 'MemberRefExpr', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 3: 'begin' of kind 'CXXMethod', type: 'ArrayType::iterator (float, const nvt::MyVectorNameTest::IndexType &)'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'std::vector<double, class std::allocator<double> >::iterator' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 4: 'val' of kind 'ParmDecl', type: 'float'
[Debug] 	      level 4: 'idx' of kind 'ParmDecl', type: 'const nvt::MyVectorNameTest::IndexType &'
[Debug] 	      level 5: 'nvt::MyVectorNameTest::IndexType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::IndexType'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 5: '' of kind 'ReturnStmt', type: ''
[Debug] 	      level 6: 'begin' of kind 'CallExpr', type: 'std::vector<double, allocator<double>>::iterator'
[Debug] 	      level 7: 'begin' of kind 'MemberRefExpr', type: '<bound member function type>'
[Debug] 	      level 8: '_data' of kind 'MemberRefExpr', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 3: '' of kind 'CXXAccessSpecifier', type: ''
[Debug] 	      level 3: '_data' of kind 'FieldDecl', type: 'nvt::MyVectorNameTest::ArrayType'
[Debug] 	      level 4: 'nvt::MyVectorNameTest::ArrayType' of kind 'TypeRef', type: 'nvt::MyVectorNameTest::ArrayType'

So to me, it really seemed that if we could retrieve the TypeRef(s), then maybe we could name our types “more correctly”?

Trying to study this further I added the following functions in my test class above:

    void testInit(std::initializer_list<std::vector<double>::value_type>) { };
    void testMap(std::initializer_list<std::map<std::string, std::vector<double>::value_type>>) { };
    void testMap2(std::initializer_list<std::map<int, std::vector<double>::value_type>::value_type>) { };
    void testMap3(std::map<std::vector<double>::value_type*, std::vector<double>>::value_type*) { };

    void testVector(const std::vector<double>::value_type& val) { };

And this produced this kind of AST:

[Debug] 	      level 3: 'testInit' of kind 'CXXMethod', type: 'void (std::initializer_list<std::vector<double>::value_type>)'
[Debug] 	      level 4: '' of kind 'ParmDecl', type: 'std::initializer_list<std::vector<double, allocator<double>>::value_type>'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'initializer_list' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::vector<double, class std::allocator<double> >::value_type' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::value_type'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 3: 'testMap' of kind 'CXXMethod', type: 'void (std::initializer_list<std::map<std::string, std::vector<double>::value_type>>)'
[Debug] 	      level 4: '' of kind 'ParmDecl', type: 'std::initializer_list<std::map<std::string, std::vector<double, allocator<double>>::value_type>>'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'initializer_list' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'map' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'std::string' of kind 'TypeRef', type: 'std::string'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::vector<double, class std::allocator<double> >::value_type' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::value_type'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 3: 'testMap2' of kind 'CXXMethod', type: 'void (std::initializer_list<std::map<int, std::vector<double>::value_type>::value_type>)'
[Debug] 	      level 4: '' of kind 'ParmDecl', type: 'std::initializer_list<std::map<int, double, less<int>, allocator<pair<const int, double>>>::value_type>'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'initializer_list' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'map' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::vector<double, class std::allocator<double> >::value_type' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::value_type'
[Debug] 	      level 5: 'std::map<int, double, struct std::less<int>, class std::allocator<struct std::pair<const int, double> > >::value_type' of kind 'TypeRef', type: 'std::map<int, double, less<int>, allocator<pair<const int, double>>>::value_type'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 3: 'testMap3' of kind 'CXXMethod', type: 'void (std::map<std::vector<double>::value_type *, std::vector<double>>::value_type *)'
[Debug] 	      level 4: '' of kind 'ParmDecl', type: 'std::map<double *, vector<double, allocator<double>>, less<double *>, allocator<pair<double *const, vector<double, allocator<double>>>>>::value_type *'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'map' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::vector<double, class std::allocator<double> >::value_type' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::value_type'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::map<double *, class std::vector<double, class std::allocator<double> >, struct std::less<double *>, class std::allocator<struct std::pair<double *const, class std::vector<double, class std::allocator<double> > > > >::value_type' of kind 'TypeRef', type: 'std::map<double *, vector<double, allocator<double>>, less<double *>, allocator<pair<double *const, vector<double, allocator<double>>>>>::value_type'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''
[Debug] 	      level 3: 'testVector' of kind 'CXXMethod', type: 'void (const std::vector<double>::value_type &)'
[Debug] 	      level 4: 'val' of kind 'ParmDecl', type: 'const std::vector<double, allocator<double>>::value_type &'
[Debug] 	      level 5: 'std' of kind 'NamespaceRef', type: ''
[Debug] 	      level 5: 'vector' of kind 'TemplateRef', type: ''
[Debug] 	      level 5: 'std::vector<double, class std::allocator<double> >::value_type' of kind 'TypeRef', type: 'std::vector<double, allocator<double>>::value_type'
[Debug] 	      level 4: '' of kind 'CompoundStmt', type: ''

From there I realized it could become very complex to build correct type names using the TypeRef elements together with the NamespaceRef and TemplateRef elements: those elements are listed in the order they appear on the input code line, but you don't get any “TypeRef” for simple types, you don't get any qualifiers in the type refs, and I'm still not really sure when exactly you get a TemplateRef or a TypeRef. So rebuilding a correct type name from those various inputs might be possible, but I think it could be quite tricky to do. [Come on, there must be a simpler solution, right?]

Then I went back to studying the clang_getCursorFullRefSpelling function, since, when given a TypeRef, that function seems to produce a correct name every time. And here is what is happening in there:

  PrintingPolicy P = getCursorContext(C).getPrintingPolicy();
  P.FullyQualifiedName = 1;

  if (clang_isReference(C.kind)) {
    switch (C.kind) {
    case CXCursor_TypeRef: {
      const TypeDecl *Type = getCursorTypeRef(C).first;
      assert(Type && "Missing type decl");

      return cxstring::createDup(
          getCursorContext(C).getTypeDeclType(Type).getAsString(P));
    }

    // (more stuff here)
  }

  // (More stuff here)

⇒ So why not simply try to use that code for our type “getFullSpelling” custom function ? Trying that…

So here is the updated code I used to retrieve the full name of a given type:

CXString clang_getTypeFullSpelling(CXType CT) {
  if (CT.kind == CXType_Invalid)
    return cxstring::createEmpty();

  CXCursor C = clang_getTypeDeclaration(CT);

  CXTranslationUnit TU = GetTU(CT);
  auto& ctx = cxtu::getASTUnit(TU)->getASTContext();
  PrintingPolicy P = ctx.getPrintingPolicy();
  P.FullyQualifiedName = 1;

  const Decl *D = cxcursor::getCursorDecl(C);
  if (!D) {
    QualType T = GetQualType(CT);
    QualType T2 = clang::TypeName::getFullyQualifiedType(T, ctx);

    std::string fname = clang::TypeName::getFullyQualifiedName(T2, ctx, P);
    DEBUG_MSG("WARN: clang_getTypeFullSpelling: no Typedecl found for "<<fname);
    return cxstring::createDup(fname);
  }

  const TypeDecl *TD = dyn_cast<TypeDecl>(D);
  if(!TD) {
    DEBUG_MSG("ERROR: clang_getTypeFullSpelling: Invalid TypeDecl object for type.");
    return cxstring::createEmpty();
  }

  return cxstring::createDup(cxcursor::getCursorContext(C).getTypeDeclType(TD).getAsString(P));
}

⇒ Basically I introduced a special handling case for when we can retrieve a valid TypeDecl for our current type. And strangely enough, this seems to help a lot! I finally got my ArrayType::iterator detected and used as expected in the following location:

// Bind for begin (1) with signature: ArrayType::iterator ()
static int _bind_begin_sig1(lua_State* L) {
	nv::DataMapArray* self = Luna< nv::DataMapArray >::get(L,1);
	ASSERT(self!=nullptr);


	nv::DataMapArray::ArrayType::iterator res = self->begin();
	nv::DataMapArray::ArrayType::iterator* res_ptr = new nv::DataMapArray::ArrayType::iterator(res);
	Luna< nv::DataMapArray::ArrayType::iterator >::push(L, res_ptr, true);

	return 1;
}

Next issue was with this generated code:

// Bind for DoubleMap constructor (2) with signature: void (std::initializer_list<std::map<int, double>::value_type>)
static nv::DoubleMap* _bind_DoubleMap_sig2(lua_State* L) {
	std::initializer_list<std::map<const int,double>>* vals = Luna< std::initializer_list<std::map<const int,double>> >::get(L,1,false);

	return new nv::DoubleMap(*vals);
}

Here it seems that our type “std::initializer_list<std::map<int, double>::value_type>” is considered to be the same thing as “std::initializer_list<std::map<const int,double>>”, which is not quite true… How could that be?

⇒ I found what was wrong here:

[Debug] 	      Adding signature void (std::initializer_list<std::map<int, double>::value_type>) to function DoubleMap
[Debug] 	      Signature is *not* defined: nv::DoubleMap::DoubleMap [void (std::initializer_list<std::map<int, double>::value_type>)]
[Debug] 	      Return type: void
WARN: clang_getTypeFullSpelling: no Typedecl found for void
[Debug] 	      cursor num arguments: 1
[Debug] 	      Detected DLL imported function signature: nv::DoubleMap::DoubleMap [void (std::initializer_list<std::map<int, double>::value_type>)]
[Debug] 	      Arg 0 cursor: vals, type: std::initializer_list<std::pair<const int,double>>
[Debug] 	      Resolving type std::initializer_list<std::pair<const int,double>>
[Debug] 	      Type kind: Elaborated
[Debug] 	      Detected templated type: std::initializer_list<std::pair<const int,double>> with 1 template arguments
[Debug] 	      Searching template ref in cursor context 'vals'
[Debug] 	      TemplateRef name: initializer_list
[Debug] 	      TemplateRef full name: std::initializer_list
[Warning] 	Cannot retrieve class template from cursor location: D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\include\initializer_list:18:8:437
[Debug] 	      Processing 1 template arguments...
[Debug] 	      Resolving type std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type
[Debug] 	      Type kind: Elaborated
[Debug] 	      Detected templated type: std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type with 2 template arguments
[Debug] 	      Searching template ref in cursor context 'value_type'
[Debug] 	      Search for template class in type declaration cursor at: D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\include\map:92:8:2784
[Debug] 	      Processing 2 template arguments...
WARN: clang_getTypeFullSpelling: no Typedecl found for const int
[Debug] 	      Resolved template argument 0: const int
WARN: clang_getTypeFullSpelling: no Typedecl found for double
[Debug] 	      Resolved template argument 1: double
[Debug] 	      No class template name provided, manually extracting class name from type name: std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type
[Debug] 	      Base class name before resolving: map
[Warning] 	No class template registered for 'map<const int,double>' creating default place holder class in namespace 'std'
[Debug] 	      Creating class std::map<const int,double>
[Debug] 	      Creating concrete class std::map<const int,double> for initial type name: std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type
[Debug] 	      Adding name std::map<const int,double> to type {
  "std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type"
}
[Debug] 	      TypeManager: Registering type {
  "std::map<const int,double>",
  "std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type"
}
[Debug] 	      Resolved template argument 0: std::map<const int,double>
[Debug] 	      Extracting namespace from cltplName: std::initializer_list
[Debug] 	      Base class name before resolving: initializer_list
[Warning] 	No class template registered for 'initializer_list<std::map<const int,double>>' creating default place holder class in namespace 'std'
[Debug] 	      Creating class std::initializer_list<std::map<const int,double>>
[Debug] 	      Creating concrete class std::initializer_list<std::map<const int,double>> for initial type name: std::initializer_list<std::pair<const int,double>>
[Debug] 	      Adding name std::initializer_list<std::map<const int,double>> to type {
  "std::initializer_list<std::pair<const int,double>>"
}
[Debug] 	      TypeManager: Registering type {
  "std::initializer_list<std::map<const int,double>>",
  "std::initializer_list<std::pair<const int,double>>"
}

So:

  • I start with a type which is std::initializer_list<std::map<int, double>::value_type>
  • Then clang will (correctly) detect that this type can be named std::initializer_list<std::pair<const int,double>> (and that is awesome! ;-))
  • From there I extract the template name std::initializer_list
  • And then I'm back to resolving the template parameter std::map<int,double,std::less<int>,std::allocator<std::pair<const int,double>>>::value_type
  • And that one is detected as being a template type with 2 parameters: but this is NOT correct [Or is it?]. Let's check the code at that level.

Okay, so what's happening here is that my function Type:isTemplated() simply checks whether my type has template arguments. And indeed, we do have template arguments here, since my target type above is really an std::pair<const int,double> (in fact, the template parameter types that I eventually find are const int and double, and not int and double as one would expect from the std::map template itself!).
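This equivalence is easy to check in plain C++, independently of NervLuna and clang. The `TemplateArity` helper below is a hypothetical utility written just for this illustration; it counts the type arguments of a class template specialization:

```cpp
#include <initializer_list>
#include <map>
#include <type_traits>
#include <utility>

// std::map<Key, T>::value_type is defined as std::pair<const Key, T>, so the
// nested typedef names a pair specialization, not a map specialization.
static_assert(std::is_same<std::map<int, double>::value_type,
                           std::pair<const int, double>>::value,
              "value_type is a pair with a const key");

static_assert(std::is_same<
                  std::initializer_list<std::map<int, double>::value_type>,
                  std::initializer_list<std::pair<const int, double>>>::value,
              "both spellings name the same initializer_list type");

// Hypothetical helper: number of type arguments of a template specialization
// (0 for anything that is not a specialization of a type-only template).
template <class T>
struct TemplateArity : std::integral_constant<int, 0> {};

template <template <class...> class Tpl, class... Args>
struct TemplateArity<Tpl<Args...>>
    : std::integral_constant<int, static_cast<int>(sizeof...(Args))> {};

// The nested value_type really carries the two arguments of std::pair.
static_assert(TemplateArity<std::map<int, double>::value_type>::value == 2,
              "value_type has the two template arguments of std::pair");
```

So finding 2 template arguments (const int and double) on that type is correct; what goes wrong is only the template name that gets attached to them afterwards.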

So I should really retrieve the correct name of that type at that level somehow. The problem is that in this case, I cannot retrieve a template class name from the cursor context or the type declaration.

Yet, I can retrieve the template arguments correctly, so I'm sure there is a way to do what I need from the C/C++ interface. Let's check that!

So I implemented the following function to get the name of a template class from a given type:

CXString clang_Type_getTemplateFullName(CXType CT) {
  QualType T = GetQualType(CT);
  if (T.isNull())
    return cxstring::createEmpty();

  CXTranslationUnit TU = GetTU(CT);
  auto& ctx = cxtu::getASTUnit(TU)->getASTContext();
  PrintingPolicy P = ctx.getPrintingPolicy();
  P.FullyQualifiedName = 1;

  if (const auto *Specialization = T->getAs<TemplateSpecializationType>())
  {
    auto* tdecl = Specialization->getTemplateName().getAsTemplateDecl();
    if(tdecl) {
      DEBUG_MSG("DEBUG: clang_Type_getTemplateFullName: returning TemplateDecl name: "<<tdecl->getQualifiedNameAsString());
      return cxstring::createDup(tdecl->getQualifiedNameAsString());
    }
    else {
      DEBUG_MSG("WARN: clang_Type_getTemplateFullName: Cannot print TemplateSpecializationType of kind: "<<Specialization->getTemplateName().getKind());
      return cxstring::createEmpty();
    }
  }

  if (const auto *RecordDecl = T->getAsCXXRecordDecl()) {
    const auto *TemplateDecl = dyn_cast<ClassTemplateSpecializationDecl>(RecordDecl);
    if (TemplateDecl) {
      // std::string tname = ctx.getTypeDeclType(TemplateDecl).getAsString(P); // This will give us the **full** template name with arguments.
      std::string tname = TemplateDecl->getQualifiedNameAsString();

      DEBUG_MSG("DEBUG: clang_Type_getTemplateFullName: returning ClassTemplateSpecializationDecl name: "<<tname);
      return cxstring::createDup(tname);
    }
    else {
      DEBUG_MSG("ERROR: clang_Type_getTemplateFullName: Invalid ClassTemplateSpecializationDecl object.");
      return cxstring::createEmpty();
    }
  }

  DEBUG_MSG("ERROR: clang_Type_getTemplateFullName: not a TemplateSpecializationType or CXXRecordDecl.");
  return cxstring::createEmpty();
}

⇒ This seems to work fine, and with it the error mentioned above is gone :-)

Next error I faced was:

W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\bind_nv_LuaScript.cpp(68): error C2440: 'initializing': cannot convert from 'const nv::String' to 'nv::String &'
This is the same issue as the one I left unsolved in the section “Incorrect type mapping from fully qualified name” above, and this time we are fixing it ;-)

What I eventually realized is that the function I use to retrieve the full name of a type discards the const qualification when it goes through the type declaration cursor. But I can easily restore it when needed from the provided input type, so I updated the code as follows:

CXString clang_getTypeFullSpelling(CXType CT) {
  if (CT.kind == CXType_Invalid)
    return cxstring::createEmpty();

  CXCursor C = clang_getTypeDeclaration(CT);

  // auto& ctx = cxcursor::getCursorContext(C);
  CXTranslationUnit TU = GetTU(CT);
  auto& ctx = cxtu::getASTUnit(TU)->getASTContext();
  PrintingPolicy P = ctx.getPrintingPolicy();
  P.FullyQualifiedName = 1;

  QualType T = GetQualType(CT);
  QualType T2 = clang::TypeName::getFullyQualifiedType(T, ctx);

  std::string fname = clang::TypeName::getFullyQualifiedName(T2, ctx, P);
  
  const Decl *D = cxcursor::getCursorDecl(C);
  if (!D) {
    DEBUG_MSG("WARN: clang_getTypeFullSpelling: no decl found for "<<fname);
    return cxstring::createDup(fname);
  }

  const TypeDecl *TD = dyn_cast<TypeDecl>(D);
  if(TD) {
    QualType T3 = cxcursor::getCursorContext(C).getTypeDeclType(TD);
    std::string tname = T3.getAsString(P);

    if(T.isConstQualified())
    {
      tname = "const "+tname;
    }

    return cxstring::createDup(tname);
  }

  DEBUG_MSG("ERROR: clang_getTypeFullSpelling: Invalid Decl object class for type "<<fname);
  return cxstring::createDup(fname);
}

And this worked just as expected!
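
To see why a lost const qualifier breaks the generated bindings, here is a small standalone sketch of the rule behind error C2440, together with the "put the qualifier back" step applied to plain strings. Everything in it is an assumption made for illustration: std::string stands in for nv::String, and restoreConst is a hypothetical helper, not a libclang or NervLuna function.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A const object cannot initialize a non-const lvalue reference...
static_assert(!std::is_convertible<const std::string&, std::string&>::value,
              "const T cannot bind to T&");
// ...but it can initialize a const reference.
static_assert(std::is_convertible<const std::string&, const std::string&>::value,
              "const T binds to const T&");

// Hypothetical helper mirroring the fix: the declaration-based name has lost
// its qualifier, so it is added back when the original type was const.
std::string restoreConst(const std::string& declName, bool isConstQualified) {
  return isConstQualified ? "const " + declName : declName;
}

// Small usage example.
void restoreConstExample() {
  assert(restoreConst("nv::String", true) == "const nv::String");
  assert(restoreConst("nv::String", false) == "nv::String");
}
```

Note that this sketch, like the function above, only deals with const; a volatile qualifier would need the same kind of treatment.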

  • Next issue was with this binding code:
    std::vector<double,std::allocator<double> >::_Mybase* LunaTraits< std::vector<double,std::allocator<double> >::_Mybase >::construct(lua_State* L) {
    	luaL_error(L, "No public constructor available for class std::vector<double,std::allocator<double> >::_Mybase");
    	return nullptr;
    }

This came from this kind of source code:

template<class _Ty,
	class _Alloc = allocator<_Ty>>
	class vector
		: public _Vector_alloc<_Vec_base_types<_Ty, _Alloc>>
	{	// varying size array of values
private:
	using _Mybase = _Vector_alloc<_Vec_base_types<_Ty, _Alloc>>;
	using _Alty = typename _Mybase::_Alty;
	using _Alty_traits = typename _Mybase::_Alty_traits;
}

And thus, when you try to compile those bindings, the compiler will complain because the type alias is private in this case, so we are not allowed to access _Mybase from outside the class.

⇒ To solve this I updated the handling of the typedef/type alias to ensure that I do not use a type name if it is a protected/private type:

      if baseTypeName ~= tname then
        -- Also if the typedef name is shorter than the base name, we register an alias for it:
        local v = decl:getCXXAccessSpecifier()
        if v==clang.CXXAccessSpecifier.Protected or v==clang.CXXAccessSpecifier.Private then
          logDEBUG("Ignoring non-public typedef name: ", tname)
        else
          resolved:addName(tname)
        end
      end

And that seemed to do the trick once more ;-).
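
The same filtering idea can be sketched in C++ as follows. This is only an illustration under stated assumptions: Access, AliasCandidate and usableNames are hypothetical names, and the enum is a simplified stand-in for the access specifiers reported by clang.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for clang's access specifiers (illustrative names).
enum class Access { Public, Protected, Private };

// One alias name discovered for a type, with the access level of its declaration.
struct AliasCandidate {
  std::string name;
  Access access;
};

// Keep only the alias names that code outside the class is allowed to spell.
std::vector<std::string> usableNames(const std::vector<AliasCandidate>& candidates) {
  std::vector<std::string> names;
  for (const auto& c : candidates) {
    if (c.access == Access::Public) {
      names.push_back(c.name);
    }
  }
  return names;
}

// Small usage example: the private _Mybase alias is dropped.
void usableNamesExample() {
  const std::vector<AliasCandidate> found = {
      {"std::vector<double>::_Mybase", Access::Private},
      {"std::vector<double>::value_type", Access::Public},
  };
  const auto names = usableNames(found);
  assert(names.size() == 1);
  assert(names[0] == "std::vector<double>::value_type");
}
```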

  • Then I got another issue with:
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(354): error C3203: 'STATIC_ASSERTION_FAILURE': unspecialized class template can't be used as a template argument for template parameter 'T', expected a real type
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(356): error C2059: syntax error: '?'
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(356): error C2039: 'Register': is not a member of '`global namespace''
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(361): error C3203: 'EncodedInputStream': unspecialized class template can't be used as a template argument for template parameter 'T', expected a real type
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(373): error C3203: 'SelectIfImpl': unspecialized class template can't be used as a template argument for template parameter 'T', expected a real type
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(374): error C3203: 'AndExprCond': unspecialized class template can't be used as a template argument for template parameter 'T', expected a real type
    W:\Projects\NervSeed\sources\lua_bindings\wip_core\src\luna\register_package.cpp(375): error C3203: 'OrExprCond': unspecialized class template can't be used as a template argument for template parameter 'T', expected a real type

This was due to the fact that a full template specialization also produces a base ClassDecl/StructDecl cursor. In these cases we still want to parse the class as a template type, so I updated my parseClass function accordingly:

function Class:parseClass(cursor, parentScope, className, hasTemplate)
  local cname = className
  local cxtype = cursor:getType()
  local tname = cxtype:getFullName()

  if cname=="" then
    cname = luna:getNextName(luna.anonymous_class_prefix)
    logDEBUG("Registering anonymous struct/class '", cname, "' with type ", tname)
  end

  local scope = nil

  -- Here we might also have a fully specialized template and in that case we should use the template parameters in the class name:
  -- local tname = cur:getType():getFullName()

  local parts = utils.splitEntityName(tname)
  local sname = parts[#parts]
  if sname ~= "" and cname ~= sname then
    logDEBUG("Detected specialized class/struct/enum name: ", cname, " => ", sname," (isTemplated: ", cxtype:isTemplated(), ")")
    cname = sname

    -- if this is a templated type, then we parse it as a template here:
    if cxtype:isTemplated() then
      local t = typeManager:getType(tname)
      if t then
        -- We should already have a class inside this type:
        logDEBUG("Parsing existing template specialization: ", tname)
        scope = t:getTarget()
      else
        -- We have to parse this type:
        logDEBUG("Creating template specialization: ", tname)
        t = self:parseTemplateType(nil, cxtype, tname)
        CHECK(t, "Cannot parse template type: ",tname)
        scope = t:getTarget()

        -- Ensure that this class uses the current scope as parent:
        if scope:getParent() ~= parentScope then
          logDEBUG("Reparenting template specialization class ", tname, " to scope ", parentScope:getFullName())
          scope:setParent(parentScope)
        end
      end
    end
  end

  if not scope then
    scope = hasTemplate and parentScope:getOrCreateClassTemplate(cname) or parentScope:getOrCreateClass(cname)
  end

  -- (...more stuff here...)
end
In my bindings generation I have never hit the “Parsing existing template specialization” or “Reparenting template specialization class” cases so far, but I'm keeping those sections here because I'm not quite sure they can never happen ;-)

And with that fix all my template specialized classes are now correctly named in the bindings, yeepee!
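
For reference, the C3203 errors above come from using a bare template name where a concrete type is required. A reduced sketch of the situation, with made-up names (Stream and Traits are not the actual rapidjson or NervLuna declarations):

```cpp
#include <cassert>
#include <cstring>

// Primary template plus one full specialization, similar in shape to the
// classes that triggered the errors.
template <typename T>
struct Stream {};

// Full specialization: per the observation above, this shows up as a regular
// struct/class declaration cursor during parsing.
template <>
struct Stream<char> {};

// Stand-in for a traits class parameterized on a *type*.
template <typename T>
struct Traits {
  static const char* className() { return "unknown"; }
};

// The name used for the specialization must carry its template arguments.
template <>
struct Traits<Stream<char>> {
  static const char* className() { return "Stream<char>"; }
};

// Traits<Stream> would not compile: "Stream" alone names a template, not a
// type, which is what MSVC reports as C3203.

// Small usage example.
void specializationExample() {
  assert(std::strcmp(Traits<Stream<char>>::className(), "Stream<char>") == 0);
  assert(std::strcmp(Traits<Stream<int>>::className(), "unknown") == 0);
}
```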

Finally, the bindings compilation passes! But of course I now have some more problems at the linking stage:

bind_nv_DataMap.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * const nv::LunaTraits<class boost::any>::className" (?className@?$LunaTraits@Vany@boost@@@nv@@2QBDB)
bind_nv_DataMap.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * const nv::LunaTraits<class boost::any>::fullName" (?fullName@?$LunaTraits@Vany@boost@@@nv@@2QBDB)
bind_nv_DataMap.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * * nv::LunaTraits<class boost::any>::namespaces" (?namespaces@?$LunaTraits@Vany@boost@@@nv@@2PAPEBDA)

⇒ So it seems we have no class binding for boost::any so far, but we are still trying to use it in one of our functions. How could we handle that gracefully ?

Hmmm, in fact we already have a type registered for boost::any since the log contains:

[Debug] 	      Resolving type boost::any
[Debug] 	      Type kind: Elaborated
[Debug] 	      TypeManager: Registering type {
  "boost::any"
}

But resolving that type didn't lead to the creation of the corresponding class.

I eventually figured out that most of those “unresolved types” would actually point to a very simple canonical type, so I updated the handling as follows in that case:

      -- Check if we can resolve a canonical type for that type:
      local ctype = cxtype:getCanonicalType()
      local canName = ctype:getFullName()

      if tname ~= canName then
        -- We resolve the canonical type:
        local resolved = self:resolveType(ctype, ctype:getTypeDeclaration())

        -- Add the alternative name:
        resolved:addName(tname)

        -- And return this:
        return resolved
      else
        --  This is really a type on its own, so we should register it:

        -- This is a record on a class instance.
        -- So we should retrieve that class.
        -- But it might be that the target class is not registered yet
        -- so we should instead store the type directly as target
        local cl = luna:getClass(cxtype)
    
        if cl then
          t:setTarget(cl)
        else
          -- assign a target type to resolve later:
          logDEBUG("Adding type with class target name: ", tname, ", (canonical type: ", cxtype:getCanonicalType():getFullName(),")")
          t:setTargetType(cxtype)
        end
      end
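
The "resolve the canonical type, then add the alternative name" branch can be modelled in a few lines of C++. This is a much reduced sketch under stated assumptions: TypeEntry and TypeRegistry are hypothetical names, and the name-keyed map is only a lookup shortcut for the example, not how the TypeManager stores its types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical, reduced model of a type: one object carrying many names.
struct TypeEntry {
  std::set<std::string> names;
};

class TypeRegistry {
public:
  // Resolve `name`. When it differs from `canonicalName`, reuse the canonical
  // entry and record `name` as one more spelling of the same type.
  std::shared_ptr<TypeEntry> resolve(const std::string& name,
                                     const std::string& canonicalName) {
    auto& entry = byName_[canonicalName];
    if (!entry) {
      entry = std::make_shared<TypeEntry>();
      entry->names.insert(canonicalName);
    }
    if (name != canonicalName) {
      entry->names.insert(name);
      byName_[name] = entry;  // later lookups by alias reach the same entry
    }
    return entry;
  }

private:
  std::map<std::string, std::shared_ptr<TypeEntry>> byName_;
};

// Small usage example with the alias from the beginning of this article.
void registryExample() {
  TypeRegistry reg;
  auto viaAlias = reg.resolve("engine::U64", "unsigned long long");
  auto direct = reg.resolve("unsigned long long", "unsigned long long");
  assert(viaAlias == direct);
  assert(viaAlias->names.count("engine::U64") == 1);
}
```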

Yet, my “boost::any” type would still not be created as a type with this update, so I continued by updating the final type resolution post-processing stage in the type manager itself:

function Class:resolveTargetTypes()
  logDEBUG("Resolving target types...")
  local luna = import "bind.LunaManager"
  local count = 0
  for _, type in ipairs(self._types) do
    local tgtType = type:getTargetType()
    
    if tgtType then
      local tname = type:getName()
      logDEBUG("Resolving target for type: ", tname)
      count = count + 1
      local cl = luna:getClass(tgtType)
      if not cl then
        local tgtName = tgtType:getFullName()
        logDEBUG("Couldn't resolve type target, using target name instead: ", tgtName)

        -- Once we have a type name then we split it in parts:
        local parts = utils.splitEntityName(tgtName)

        local ns = luna:getRootNamespace()
        local nparts = #parts
        for i=1,nparts-1 do
          if ns:hasClass(parts[i]) then
            ns = ns:getOrCreateClass(parts[i])
          else
            ns = ns:getOrCreateNamespace(parts[i])
          end
        end

        -- Then we create our target class:
        cl = ns:getOrCreateClass(parts[nparts])
      end
      
      type:setTarget(cl)

    end
  end

  logDEBUG("Done resolving ",count," target types.")
end

⇒ Now I have my dedicated class for “boost::any”, yeah! :-)
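
The fallback above relies on splitting a fully qualified name into its scope parts. The implementation of utils.splitEntityName is not shown in this article, so the following C++ function is only a guess at what such a split needs to do: it assumes that "::" inside template angle brackets must not separate parts, and it makes no attempt to handle names such as operator<.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split "a::b<c::d>::e" into {"a", "b<c::d>", "e"}: "::" only separates parts
// when we are not inside template angle brackets.
// Hypothetical helper, not the actual utils.splitEntityName.
std::vector<std::string> splitQualifiedName(const std::string& fullName) {
  std::vector<std::string> parts;
  std::string current;
  int depth = 0;  // current nesting level of '<' ... '>'
  for (std::size_t i = 0; i < fullName.size(); ++i) {
    const char c = fullName[i];
    if (c == '<') ++depth;
    if (c == '>') --depth;
    const bool isSeparator = depth == 0 && c == ':' &&
                             i + 1 < fullName.size() && fullName[i + 1] == ':';
    if (isSeparator) {
      parts.push_back(current);
      current.clear();
      ++i;  // skip the second ':'
      continue;
    }
    current += c;
  }
  parts.push_back(current);
  return parts;
}

// Small usage example: all parts but the last are the scopes to walk or create.
void splitExample() {
  const auto parts = splitQualifiedName("boost::any");
  assert(parts.size() == 2);
  assert(parts[0] == "boost");
  assert(parts[1] == "any");
}
```

With the parts in hand, every element except the last one is looked up (or created) as a class or namespace, and the last element becomes the target class, as in the loop above.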

But still more problems on the linking stage:

bind_nv_DataMapArray.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * const nv::LunaTraits<class std::_Vector_iterator<class std::_Vector_val<struct std::_Simple_types<class nv::RefPtr<class nv::DataMap> > > > >::className" (?className@?$LunaTraits@V?$_Vector_iterator@V?$_Vector_val@U?$_Simple_types@V?$RefPtr@VDataMap@nv@@@nv@@@std@@@std@@@std@@@nv@@2QBDB)

⇒ Well, simple: that would be because I'm currently ignoring entities with a name ending with “::iterator”, and that type above is actually called “nv::DataMapArray::ArrayType::iterator”

For the moment, let's just not ignore those iterators anymore.

Eventually, when ignoring a type on user request like that, I should rather ensure that functions using that type are not bound at all.

Now, it seems we are really close to a successful build, but there is still one kind of linking error:

register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * const nv::LunaTraits<struct rapidjson::internal::DiyFp>::className" (?className@?$LunaTraits@UDiyFp@internal@rapidjson@@@nv@@2QBDB)
register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * const nv::LunaTraits<struct rapidjson::internal::DiyFp>::fullName" (?fullName@?$LunaTraits@UDiyFp@internal@rapidjson@@@nv@@2QBDB)
register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static char const * * nv::LunaTraits<struct rapidjson::internal::DiyFp>::namespaces" (?namespaces@?$LunaTraits@UDiyFp@internal@rapidjson@@@nv@@2PAPEBDA)
register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static unsigned __int64 const * const nv::LunaTraits<struct rapidjson::internal::DiyFp>::baseIDs" (?baseIDs@?$LunaTraits@UDiyFp@internal@rapidjson@@@nv@@2QB_KB)
register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static unsigned __int64 const nv::LunaTraits<struct rapidjson::internal::DiyFp>::id" (?id@?$LunaTraits@UDiyFp@internal@rapidjson@@@nv@@2_KB)
register_functions.cpp.obj : error LNK2001: unresolved external symbol "public: static unsigned __int64 const nv::LunaTraits<class rapidjson::internal::BigInteger>::id" (?id@?$LunaTraits@VBigInteger@internal@rapidjson@@@nv@@2_KB)
luaCoreWIP.dll : fatal error LNK1120: 6 unresolved externals

And this was really because I needed to clearly map classes to their corresponding types and then mark a type as non-Lua-convertible when its class is not bindable, so in Type:getLuaConverter() we now have:

    -- If the class is not bindable then we should not provide a converter
    if type(self._target) == "table" and not self._target:isBindable() then
      return nil
    end
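
A reduced C++ model of that rule is sketched below. All names are hypothetical (ClassInfo, TypeInfo, FunctionInfo, hasLuaConverter, isFunctionBindable), and the last step, skipping every function that uses a type without converter, is an assumption about how the missing converter is consumed rather than something shown in this article.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced model of a parsed class.
struct ClassInfo {
  std::string name;
  bool bindable;
};

// A type optionally points to a target class (null for builtin types).
struct TypeInfo {
  std::string name;
  const ClassInfo* target;
};

// A type has no Lua converter when it targets a class that cannot be bound.
bool hasLuaConverter(const TypeInfo& t) {
  return t.target == nullptr || t.target->bindable;
}

struct FunctionInfo {
  std::string name;
  std::vector<TypeInfo> argTypes;
};

// Assumption: a function is only bound when all its argument types convert.
bool isFunctionBindable(const FunctionInfo& f) {
  for (const auto& arg : f.argTypes) {
    if (!hasLuaConverter(arg)) {
      return false;
    }
  }
  return true;
}

// Small usage example with one of the classes from the linker errors.
void converterExample() {
  const ClassInfo diyFp{"rapidjson::internal::DiyFp", false};
  const TypeInfo diyFpType{"rapidjson::internal::DiyFp", &diyFp};
  const TypeInfo doubleType{"double", nullptr};
  assert(!hasLuaConverter(diyFpType));
  assert(hasLuaConverter(doubleType));
}
```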

And now… Miracle! My bindings for the nvCore module are compiling again! That's super great! LOL

So, this was another of those large development sessions that took a lot of energy to complete. But in the end, I'm really satisfied with the results so far: the refactoring of the types handling was really needed and now I feel the complete system is more robust and ready to move to the next level.

Yet, I should really not rush anything and take some additional time to clean up and “polish” the current binding generator: for instance, I'm pretty sure many of the unit tests I built initially are not passing anymore, and I definitely want to have those unit tests working as expected before moving forward. So I'll take care of that first.

  • Last modified: 2020/08/07 13:56