
JIT C++ compiler with LLVM - Part 2

In my previous article on this topic I described how I tried to use LLVM and clang to perform some initial dynamic C++ compilation tests. Now, in this post, I want to push this concept a bit further and build a working JIT compiler that I could eventually use “in production” either directly from C++ code or from Lua or other bindings.

So far I had only been testing a single compilation operation with a single function call (runClang("W:/Projects/NervSeed/temp/test1.cxx")). So the first thing I decided to do next was to encapsulate the LLVM compilation logic in an appropriate class, which I could then use to perform “multiple compilation steps” and multiple “function call steps”. Like this, for instance (in pseudo code):

JIT* jit = new MyJITClass();

jit->loadModule("my_file1.cpp");
typedef int(*Func)(int);
Func func = (Func)jit->lookup("my_print_number_function");

func(42);

jit->loadModule("my_file2.cpp");
typedef int(*AddFunc)(int,int);
AddFunc add = (AddFunc)jit->lookup("my_add_function");
add(3,5);

So, with that usage in mind, I came up with the following initial declaration for my NervJIT class:

#include <llvm_common.h>

namespace nv
{

struct NervJITImpl;

class NVLLVM_EXPORT NervJIT : public nv::RefObject
{
private:
    std::unique_ptr<NervJITImpl> impl;

public:

    NervJIT();
    ~NervJIT();

    void loadModuleFromFiles(const FileList& files) {
        for(auto& f: files) {
            loadModuleFromFile(f);
        }
    }

    void loadModuleFromFile(const std::string& file);

    void loadModuleFromBuffer(const std::string& buffer);
    
    uint64_t lookup(const std::string& name);
};

}

Some key notes concerning the code above:

  • This class just provides support to:
    1. Compile a source file or a source buffer, and load the result as a “module” in the “LLVM JIT context”, from where we can retrieve compiled functions and execute them dynamically. This is done with loadModuleFromFile and loadModuleFromBuffer, obviously.
    2. Retrieve a compiled function pointer that can then be used as a regular function in the host process. This is what we do in the lookup function above.
  • I use a nv::RefObject base class simply to provide a basic “intrusive smart pointer” mechanism that I've always been using (yeah, I know… most people would use std::unique_ptr or std::shared_ptr with modern C++, but I still think intrusive ref count is a better idea sometimes ;-)) ⇒ Anyway, this should really not be a concern here.
  • I used the hidden implementation idiom (ie. that “impl” member above) to hide all of the LLVM stuff: those headers are massive and complex, so I don't want to expose an external dependency on them here.
  • The FileList type is simply a vector of strings (ie: std::vector<std::string>), and I just added a helper method loadModuleFromFiles to be able to compile multiple files at once [but I actually doubt this will be very useful in the end lol] (see the sketch just below).
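
For reference, here is a minimal sketch of the supporting types this interface assumes (the real nv::RefObject is my own intrusive ref-count base class, so treat this as an illustration only):

#include <string>
#include <vector>

namespace nv
{

typedef std::vector<std::string> FileList;

// Simplified stand-in for my actual intrusive ref-counting base class:
class RefObject
{
public:
    virtual ~RefObject() = default;
    void incRef() { ++_refCount; }
    void decRef() { if(--_refCount == 0) delete this; }

private:
    int _refCount = 0;
};

}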

Next, in the implementation file, those functions are simply forwarded to the NervJITImpl member as follows:

NervJIT::NervJIT()
{
    logDEBUG("Creating NervJIT object.");
    impl = std::make_unique<NervJITImpl>();
    logDEBUG("Done creating NervJIT object.");
}

NervJIT::~NervJIT()
{
    logDEBUG("Deleting NervJIT object.");
    impl.reset();
    logDEBUG("Deleted NervJIT object.");
}

void NervJIT::loadModuleFromBuffer(const std::string& buffer)
{
    CHECK(impl, "Invalid NervJIT implementation.");
    auto& compilerInvocation = impl->compilerInstance->getInvocation();
    auto& frontEndOptions = compilerInvocation.getFrontendOpts();
    frontEndOptions.Inputs.clear();
    std::unique_ptr<MemoryBuffer> buf = llvm::MemoryBuffer::getMemBuffer(llvm::StringRef(buffer.data(), buffer.size()));
    // Note: 'buf' owns the memory backing this input, so it must stay alive
    // until the compilation in loadModule() below is done (which is the case here).
    frontEndOptions.Inputs.push_back(clang::FrontendInputFile(buf.get(), clang::InputKind(clang::Language::CXX)));

    impl->loadModule();
}

void NervJIT::loadModuleFromFile(const std::string& file)
{
    CHECK(impl, "Invalid NervJIT implementation.");
    // We prepare the file list:
    auto& compilerInvocation = impl->compilerInstance->getInvocation();
    auto& frontEndOptions = compilerInvocation.getFrontendOpts();

    frontEndOptions.Inputs.clear();
    frontEndOptions.Inputs.push_back(clang::FrontendInputFile(llvm::StringRef(file), clang::InputKind(clang::Language::CXX)));

    impl->loadModule();
}

uint64_t NervJIT::lookup(const std::string& name)
{
    THROW_IF(!impl, "Invalid NervJIT implementation.");
    return impl->lookup(name);
}

In the code above, we see that we already have some interactions with the LLVM objects in the loadModule functions: basically, I retrieve the compilerInstance object that is part of the NervJITImpl, and then update its list of input “files” before calling the actual compilation function (ie. loadModule()). So in the end, whether the inputs are coming from a file or from a memory buffer doesn't make any difference once the frontend “Inputs” list is updated.

Here you would probably wonder why I added the loadModuleFromFiles function then? After all, we could rather just “push_back” multiple files at once into the frontEndOptions.Inputs vector above to achieve the same result more efficiently… Well, in fact, no :-S That doesn't seem to work: when I try to do so, the compiler will really find and parse all the files provided as input, but then the generated “Module” object only contains the functions from the last file in the list… so far I really have no idea why, unfortunately. (The failing variant is sketched just below.)
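
For reference, the failing variant looked roughly like this (all inputs pushed before a single compilation run):

// What I tried first, and which does NOT work as expected:
frontEndOptions.Inputs.clear();
for(const auto& f: files) {
    frontEndOptions.Inputs.push_back(clang::FrontendInputFile(llvm::StringRef(f), clang::InputKind(clang::Language::CXX)));
}
// => The resulting module only contains the functions from the last file!
impl->loadModule();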

And now we come to the core of the compiler implementation.

As mentioned above, the actual compilation is handled inside the NervJITImpl structure, which is declared as follows (just keep in mind that we put this declaration directly inside a .cpp file, so none of it is visible in the NervJIT interface):

#define NV_MAX_FUNCTION_NAME_LENGTH 256

namespace nv
{

typedef std::unordered_map<std::string, std::string> FunctionNameMap;

struct NervJITImpl {
    std::unique_ptr<llvm::orc::LLJIT> lljit;
    clang::IntrusiveRefCntPtr<clang::DiagnosticOptions> diagnosticOptions;
    std::unique_ptr<clang::TextDiagnosticPrinter> textDiagnosticPrinter;
    clang::IntrusiveRefCntPtr<clang::DiagnosticIDs> diagIDs;
    clang::IntrusiveRefCntPtr<clang::DiagnosticsEngine> diagnosticsEngine;

    std::unique_ptr<llvm::orc::ThreadSafeContext> tsContext;
    std::unique_ptr<clang::CodeGenAction> action;
    std::unique_ptr<clang::CompilerInstance> compilerInstance;

    std::unique_ptr<llvm::PassBuilder> passBuilder;
    std::unique_ptr<llvm::ModuleAnalysisManager> moduleAnalysisManager;
    std::unique_ptr<llvm::CGSCCAnalysisManager> cGSCCAnalysisManager;
    std::unique_ptr<llvm::FunctionAnalysisManager> functionAnalysisManager;
    std::unique_ptr<llvm::LoopAnalysisManager> loopAnalysisManager;

    std::unique_ptr<clang::LangOptions> langOptions;

    llvm::ModulePassManager modulePassManager;

    FunctionNameMap functionNames;

    char tmpFuncName[NV_MAX_FUNCTION_NAME_LENGTH];

    NervJITImpl();
    ~NervJITImpl();

    void loadModule();

    uint64_t lookup(const std::string& name);
};

}

As you can see above, this structure is used to keep references to all of the required LLVM objects used during the JIT compilation. And we set those objects up during the construction of our NervJITImpl object with the following constructor:

void checkLLVMError(llvm::Error Err)
{
    if(Err) {
        // logAllUnhandledErrors(std::move(Err), errs(), Banner);
        THROW_MSG("LLVM error: "<<llvm::toString(std::move(Err)));
    }
}

template <typename T> T CHECK_LLVM(llvm::Expected<T> &&E) {
    checkLLVMError(E.takeError());
    return std::move(*E);
}

NervJITImpl::NervJITImpl()
{
    lljit = CHECK_LLVM(LLJITBuilder().setNumCompileThreads(2).create());

    // ES.getMainJITDylib().setGenerator(
    //     cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
    
    // cf. https://github.com/tensorflow/mlir/issues/24
    lljit->getMainJITDylib().addGenerator(cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(lljit->getDataLayout().getGlobalPrefix())));

    diagnosticOptions = new clang::DiagnosticOptions;

    textDiagnosticPrinter = std::make_unique<clang::TextDiagnosticPrinter>(llvm::outs(), diagnosticOptions.get());

    // The diagnostic engine should not own the client below (or it could, if we released our unique_ptr).
    diagnosticsEngine = new clang::DiagnosticsEngine(diagIDs, diagnosticOptions, textDiagnosticPrinter.get(), false);

    compilerInstance = std::make_unique<clang::CompilerInstance>();
    auto& compilerInvocation = compilerInstance->getInvocation();

    std::stringstream ss;
    ss << "-triple=" << llvm::sys::getDefaultTargetTriple();

    std::vector<const char*> itemcstrs;
    std::vector<std::string> itemstrs;
    itemstrs.push_back(ss.str());

    // cf. https://clang.llvm.org/docs/MSVCCompatibility.html
    // cf. https://stackoverflow.com/questions/34531071/clang-cl-on-windows-8-1-compiling-error
    itemstrs.push_back("-x");
    itemstrs.push_back("c++");
    itemstrs.push_back("-fms-extensions");
    itemstrs.push_back("-fms-compatibility");
    itemstrs.push_back("-fdelayed-template-parsing");
    itemstrs.push_back("-fms-compatibility-version=19.00");
    itemstrs.push_back("-std=c++17");

    for (unsigned idx = 0; idx < itemstrs.size(); idx++) {
      // note: if itemstrs is modified after this, itemcstrs will be full
      // of invalid pointers! Could make copies, but would have to clean up then...
      itemcstrs.push_back(itemstrs[idx].c_str());
    }
 
    // clang::CompilerInvocation::CreateFromArgs(compilerInvocation, itemcstrs.data(), itemcstrs.data() + itemcstrs.size(), *diagnosticsEngine.release());
    clang::CompilerInvocation::CreateFromArgs(compilerInvocation, llvm::ArrayRef(itemcstrs.data(), itemcstrs.size()), *diagnosticsEngine.get());

    // langOptions = std::make_unique<clang::LangOptions>();
    // langOptions->CPlusPlus = true;
    // langOptions->CPlusPlus17 = true;

    // compilerInvocation.setLangDefaults(*langOptions, clang::InputKind(clang::Language::CXX), triple, );
    auto* languageOptions = compilerInvocation.getLangOpts();
    auto& preprocessorOptions = compilerInvocation.getPreprocessorOpts();
    auto& targetOptions = compilerInvocation.getTargetOpts();
    
    auto& frontEndOptions = compilerInvocation.getFrontendOpts();
    frontEndOptions.ShowStats = false;

    auto& headerSearchOptions = compilerInvocation.getHeaderSearchOpts();

    headerSearchOptions.Verbose = false;
    headerSearchOptions.UserEntries.clear();
    headerSearchOptions.AddPath("D:/Apps/VisualStudio2017_CE/VC/Tools/MSVC/14.16.27023/include", clang::frontend::System, false, false);
    
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/um", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/shared", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/ucrt", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/winrt", clang::frontend::System, false, false);

    auto& codeGenOptions = compilerInvocation.getCodeGenOpts();

    targetOptions.Triple = llvm::sys::getDefaultTargetTriple();
    compilerInstance->createDiagnostics(textDiagnosticPrinter.get(), false);

    std::unique_ptr<llvm::LLVMContext> context = std::make_unique<llvm::LLVMContext>();
    tsContext = std::make_unique<llvm::orc::ThreadSafeContext>(std::move(context));

    action = std::make_unique<clang::EmitLLVMOnlyAction>(tsContext->getContext());

    // Now we build the optimization passes:
    logDEBUG("Building pass builder.");
    passBuilder = std::make_unique<llvm::PassBuilder>();
    loopAnalysisManager.reset(new llvm::LoopAnalysisManager(codeGenOptions.DebugPassManager));
    functionAnalysisManager.reset(new llvm::FunctionAnalysisManager(codeGenOptions.DebugPassManager));
    cGSCCAnalysisManager.reset(new llvm::CGSCCAnalysisManager(codeGenOptions.DebugPassManager));
    moduleAnalysisManager.reset(new llvm::ModuleAnalysisManager(codeGenOptions.DebugPassManager));
 
    logDEBUG("Registering passes.");
    passBuilder->registerModuleAnalyses(*moduleAnalysisManager);
    passBuilder->registerCGSCCAnalyses(*cGSCCAnalysisManager);
    passBuilder->registerFunctionAnalyses(*functionAnalysisManager);
    passBuilder->registerLoopAnalyses(*loopAnalysisManager);

    logDEBUG("Cross registering proxies.");
    passBuilder->crossRegisterProxies(*loopAnalysisManager, *functionAnalysisManager, *cGSCCAnalysisManager, *moduleAnalysisManager);
 
    logDEBUG("Creating default pipeline.");
    modulePassManager = passBuilder->buildPerModuleDefaultPipeline(llvm::PassBuilder::OptimizationLevel::O3);
}

The code above is somewhat similar to the initial test code I used in the previous article but also contains some important updates that we should discuss here:

1. The CHECK_LLVM helper function

I introduced a couple of helper functions above to handle the LLVM errors in a more “integrated” way: in LLVM, many functions will return Expected<T> values that may contain an error message instead of… a value of type T, obviously. The framework also provides a typical helper class called ExitOnError that you could use to wrap your calls to those LLVM functions, like this for instance:

lljit = ExitOnErr(LLJITBuilder().create());

But the point is, that helper class will print the error message on the standard error output (or at least, whatever is mapped to llvm::errs(), as far as I understand) and then immediately exit the process with a call to exit(exitCode). Instead, I'd rather handle the error display/handling on my own, with my regular macros/log sinks. And that seems to work just as well so far.
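
For comparison, here is the typical ExitOnError usage you will find in the LLVM examples (ie. the mechanism I'm avoiding here):

// Typical ExitOnError setup from the LLVM examples: on error this prints
// to llvm::errs() and then terminates the process with exit().
llvm::ExitOnError ExitOnErr;
ExitOnErr.setBanner("NervJIT: ");
auto jit = ExitOnErr(llvm::orc::LLJITBuilder().create());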

2. Updated header search paths

The second major change in the code above was the update of the header search paths:

    auto& headerSearchOptions = compilerInvocation.getHeaderSearchOpts();

    headerSearchOptions.Verbose = false;
    headerSearchOptions.UserEntries.clear();
    headerSearchOptions.AddPath("D:/Apps/VisualStudio2017_CE/VC/Tools/MSVC/14.16.27023/include", clang::frontend::System, false, false);
    
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/um", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/shared", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/ucrt", clang::frontend::System, false, false);
    headerSearchOptions.AddPath("C:/Program Files (x86)/Windows Kits/10/Include/10.0.18362.0/winrt", clang::frontend::System, false, false);

⇒ Actually I first updated my test2.cxx file to contain the following code:

#include <string>

int nv_add2(int a, int b)
{
    return (a+b)*2;
}

int nv_sub2(int a, int b)
{
    return (a-b)*2;
}

int nv_length(const std::string& input)
{
    return input.size();
}

And, as you can imagine, without the header paths added above, the compiler couldn't find the included <string> header and thus produced an error. The <string> file is part of the Visual Studio headers, hence the first include path; but then of course you get recursively included files, and you eventually reach a dependency on the Windows SDK, which I'm providing with the subsequent 4 include paths.

Note: hardcoding all those paths is clearly not the way to go… so we'll fix that eventually ;-).
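
One possible direction (just a hypothetical sketch; the NV_JIT_INCLUDE_DIRS variable name is made up for this example) would be to read those directories from the environment instead:

// Hypothetical sketch: read the system include dirs from a ';'-separated
// environment variable instead of hardcoding them.
if(const char* dirs = std::getenv("NV_JIT_INCLUDE_DIRS")) {
    llvm::SmallVector<llvm::StringRef, 8> parts;
    llvm::StringRef(dirs).split(parts, ';', -1, /*KeepEmpty=*/false);
    for(const auto& p: parts) {
        headerSearchOptions.AddPath(p, clang::frontend::System, false, false);
    }
}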

3. Updated command line arguments for compiler invocation creation

At the same time I was providing the correct include paths to find the <string> file, I also started to receive a lot of strange errors from clang when trying to compile my file (for instance syntax or undefined type errors… all in those included system headers obviously). So this led me to another major change: the list of command line arguments passed to the helper function we use to set up the compiler invocation for our compilerInstance:

    std::stringstream ss;
    ss << "-triple=" << llvm::sys::getDefaultTargetTriple();
 
    std::vector<const char*> itemcstrs;
    std::vector<std::string> itemstrs;
    itemstrs.push_back(ss.str());
 
    // cf. https://clang.llvm.org/docs/MSVCCompatibility.html
    // cf. https://stackoverflow.com/questions/34531071/clang-cl-on-windows-8-1-compiling-error
    itemstrs.push_back("-x");
    itemstrs.push_back("c++");
    itemstrs.push_back("-fms-extensions");
    itemstrs.push_back("-fms-compatibility");
    itemstrs.push_back("-fdelayed-template-parsing");
    itemstrs.push_back("-fms-compatibility-version=19.00");
    itemstrs.push_back("-std=c++17");

  • All the “-fxxx” flags are required for compatibility with the “Microsoft version” of the standard C++ headers that ship with Visual Studio (such as <string>). Again, this only makes sense if you are programming on Windows with Visual Studio, but from what I understood, clang might still have some incompatibilities with gcc headers too, and those might also require some other command line flags to be fixed: keep that in mind ;-)
  • And perhaps more surprisingly, in the process of trying to fix the clang errors, I decided I should also specify the C++ language level I want on the command line (ie. -std=c++11 / c++14 / c++17), and then I got the following error:
    error: invalid argument '-std=c++17' not allowed with 'C'

⇒ So this means that the default language configured for my compiler invocation was actually 'C' and not 'C++'. And in fact, thinking about it more carefully: in my previous article I could retrieve the symbols for the functions I created (ie. “nv_add”, “nv_sub”) using just those names because there was no name mangling at play! And that, even though I specified clang::InputKind(clang::Language::CXX) as the kind for the frontend input files I provided… So it seems that clang was happily compiling my C++ source files as C sources ;-), and this until I started including the <string> header… don't ask me why :-D

After that, even though I still had the error above at first (“invalid argument '-std=c++17' not allowed with 'C'”), the function names became mangled (such as “?nv_add@@YAHHH@Z”), and I couldn't retrieve them as easily as before. So I think this is when I started to perform some actual C++ compilation for the first time [hey, better late than never ;-)!].
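
To make that difference concrete, here is the same function seen from the symbol table (the C++ symbol below is taken from my own debug outputs, shown further down):

int nv_add(int a, int b) { return a + b; }
// - compiled as C   : the symbol is simply "nv_add" (no mangling)
// - compiled as C++ : the symbol becomes "?nv_add@@YAHHH@Z" (MSVC mangling)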

Then I just wanted to get rid of that error about the default language (ie. it was still set to “C”), so I explicitly asked clang to compile “C++ content” with the command line arguments “-x c++”, and this did the trick.

As one can read from the commented-out line in the code above, I also found the function setLangDefaults that we may apparently call to tweak the default language settings… I considered using it, but then realized it would be even easier to specify the command line arguments directly [but I think that option should work too? Maybe…]
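
For the record, here is roughly what that alternative could look like (untested on my side, so this is just a sketch):

// Untested sketch: requesting C++17 defaults directly instead of passing
// "-x c++ -std=c++17" on the command line.
llvm::Triple triple(llvm::sys::getDefaultTargetTriple());
clang::CompilerInvocation::setLangDefaults(*compilerInvocation.getLangOpts(),
    clang::InputKind(clang::Language::CXX), triple,
    compilerInvocation.getPreprocessorOpts(), clang::LangStandard::lang_cxx17);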

4. Keeping references on PassBuilder and Pass managers

Last but not least, you will also notice that I'm now allocating the resources needed for the IR module optimization on the heap, and not on the stack anymore:

logDEBUG("Building pass builder.");
passBuilder = std::make_unique<llvm::PassBuilder>();
loopAnalysisManager.reset(new llvm::LoopAnalysisManager(codeGenOptions.DebugPassManager));
functionAnalysisManager.reset(new llvm::FunctionAnalysisManager(codeGenOptions.DebugPassManager));
cGSCCAnalysisManager.reset(new llvm::CGSCCAnalysisManager(codeGenOptions.DebugPassManager));
moduleAnalysisManager.reset(new llvm::ModuleAnalysisManager(codeGenOptions.DebugPassManager));

logDEBUG("Registering passes.");
passBuilder->registerModuleAnalyses(*moduleAnalysisManager);
passBuilder->registerCGSCCAnalyses(*cGSCCAnalysisManager);
passBuilder->registerFunctionAnalyses(*functionAnalysisManager);
passBuilder->registerLoopAnalyses(*loopAnalysisManager);

And unfortunately, there is a very good [or rather, very bad] reason for that: one thing I wasn't very careful about in my initial experiments was that there was a silent crash in my test program, just when exiting the runClang() function call :-S. Eventually, I tracked this down to the destruction of the XXXAnalysisManager objects (replace XXX with Loop/Function/CGSCC/Module). It's pretty simple: I simply couldn't find a way to destroy those resources properly once allocated. So it goes even further than just storing them on the heap: I also deliberately leak the memory, not trying to destroy those objects when destroying my NervJITImpl object:

NervJITImpl::~NervJITImpl()
{
    // Note: the memory for the LLJIT object and the analysis managers will leak
    // here because we get a crash when we try to delete them.
    std::cout << "Releasing undestructible pointers." << std::endl;
    lljit.release();

    moduleAnalysisManager.release();
    cGSCCAnalysisManager.release();
    functionAnalysisManager.release();
    loopAnalysisManager.release();

    std::cout << "Done releasing undestructible pointers." << std::endl;
}

As reported above, I made the same unfortunate observation on the lljit pointer itself: I cannot delete that object without a crash… :-| [“Hello Houston… we've got a problem here…”].

First I thought I might be doing something really wrong in my own code, but in fact I could reproduce a similar crash building the “HowToUseLLJIT” example from the LLVM sources [still, note that I'm building with my custom cmake setup, so maybe that's where I'm doing something wrong…]. Then I also tried building LLVM version 10.0.0 instead of the current git version (11.0.0git), but I got the same result… So I'm still not quite sure what this is and how to fix it; I'll leave it as is for now and come back to this issue later.

And this is it for the resource allocation: this happens only once (when creating the NervJITImpl object) and from there, I had good hope I would be able to use and re-use those resources to compile multiple C++ source files, adding more and more stuff to the JIT context. So let's continue our journey with the IR module construction function.

The main function used to compile the C++ sources into an LLVM IR module is the following:

void NervJITImpl::loadModule()
{
    if (!compilerInstance->ExecuteAction(*action))
    {
        logERROR("Cannot execute action with compiler instance!");
        return; // Bail out: there is no module to retrieve.
    }

    std::unique_ptr<llvm::Module> module = action->takeModule();
    if (!module)
    {
        logERROR("Cannot retrieve IR module.");
        return; // Bail out to avoid dereferencing a null module below.
    }

    // List the functions in the module (before optimizations):
    logDEBUG("Module function list: ");
    int i=0;
    for(auto& f: *module)
    {
        logDEBUG("func"<<i++<<": '"<<f.getName().str()<<"'");

        // We try to demangle the function name here:
        // cf. llvm/Demangle/Demangle.h
        size_t len = NV_MAX_FUNCTION_NAME_LENGTH;
        int status = 0;
        char* res = llvm::microsoftDemangle(f.getName().str().c_str(), tmpFuncName, &len, &status, MSDF_NoCallingConvention);
        if(res) {
            logDEBUG("Function demangled name is: '"<<res<<"'");

            // And we map that entry in the function names:
            functionNames.insert(std::make_pair<std::string, std::string>(res, f.getName().str()));
        }
    }

    // We run the optimizations:
    modulePassManager.run(*module, *moduleAnalysisManager);

    auto err = lljit->addIRModule(ThreadSafeModule(std::move(module), *tsContext));
    checkLLVMError(std::move(err));
}

First, as already mentioned above, keep in mind that I update the frontEndOptions.Inputs list just before calling this loadModule() function, so the compiler will receive a new C++ source file (or buffer) to work on.

Then we proceed as follows (starting just as in the previous article):

  1. We execute the compilation action.
  2. Then we retrieve the resulting Module.
  3. Then I added something new here: I list all the functions compiled in this new module, and for each of them I map the function's mangled name to a “somewhat demangled name” ⇒ I will come back to this point a bit later (see below).
  4. We continue with the optimization passes, running the modulePassManager on our newly generated module.
  5. And we end with adding the resulting optimized module to our JIT (as part of the default main JITDylib).

Note that we could technically put different modules in different JITDylibs, but I'm not sure I would have a real need for that… to be clarified later ;-) Also keep in mind that I use the microsoftDemangle function in the code above: that part should be adapted in a different development environment (see the sketch just below).
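
On Linux or macOS for instance, symbols follow the Itanium ABI instead, and LLVM provides a matching entry point in llvm/Demangle/Demangle.h. A minimal sketch of what the demangling step could look like there (not tested on my side):

// Untested sketch: the Itanium-ABI equivalent of the demangling step above,
// for Linux/macOS environments (cf. llvm/Demangle/Demangle.h).
size_t len = NV_MAX_FUNCTION_NAME_LENGTH;
int status = 0;
char* res = llvm::itaniumDemangle(f.getName().str().c_str(), tmpFuncName, &len, &status);
if(res) {
    functionNames.insert(std::make_pair(std::string(res), f.getName().str()));
}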

And this is it: if this function call completes successfully, then it means that our code was successfully compiled and loaded into the “JIT dynamic library”, ready to be retrieved and used! And that's exactly what we do in the next section.

To retrieve a compiled function, we rely on the “symbol lookup” mechanism available in the llvm::orc::LLJIT object, but there is a catch that we must take into account: C++ function names are “mangled”, and thus we cannot retrieve a function just by requesting its plain name. And that's where the function name mapping generated in the loadModule() call above comes into play :-):

uint64_t NervJITImpl::lookup(const std::string& name)
{
    // First we check if we have registered a mangled name for that function:
    auto it = functionNames.find(name);
    std::string fname;
    if(it != functionNames.end()) {
        fname = it->second;
        logDEBUG("Using mangled name: "<<fname);
    }
    else {
        fname = name;
    }
    
    JITEvaluatedSymbol sym = CHECK_LLVM(lljit->lookupLinkerMangled(fname.c_str()));
    return sym.getAddress();
}

Actually, the llvm::orc::LLJIT object also provides some support for retrieving a function from its unmangled name, with the mangle() and mangleAndIntern() functions. I only had a very quick try at those and they didn't seem to produce the results I was expecting; that's why I built this custom mapping myself. But I should probably spend more time studying how to use those core functionalities…
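
For reference, this is the kind of direct call I gave a quick try (LLJIT::lookup() applies the IR-level mangling itself before searching):

// LLJIT::lookup() mangles the given IR-level name itself before searching;
// this resolves extern "C" / unmangled functions, but it won't accept a C++
// signature such as "int nv_add(int, int)".
auto symOrErr = lljit->lookup("nv_add");
if(symOrErr) {
    uint64_t addr = symOrErr->getAddress();
    // ... cast 'addr' to the proper function pointer type and call it.
} else {
    llvm::consumeError(symOrErr.takeError());
}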

With this code, we can request for instance:

	typedef int(*Func)(int, int);
	Func add1 = (Func)jit->lookup("int nv_add(int, int)");

And this will work here, even though nv_add is really a C++ function! (ie. it's not exported as a C function :-), so its “symbol” is not just its plain name!)

Now of course, this kind of usage is somewhat limited:

  • You still have to know the “function signature” and type it exactly as LLVM will demangle it. Obviously, I cheated here and used the debug outputs from loadModule() to figure out what name I should provide as test input:
    [Debug]               func0: '?nv_add@@YAHHH@Z'
    [Debug]               Function demangled name is: 'int nv_add(int, int)'
    [Debug]               func1: '?nv_sub@@YAHHH@Z'
    [Debug]               Function demangled name is: 'int nv_sub(int, int)'
    [Debug]               Before optimization module function list:
    [Debug]               func0: '?nv_add2@@YAHHH@Z'
    [Debug]               Function demangled name is: 'int nv_add2(int, int)'
    [Debug]               func1: '?nv_sub2@@YAHHH@Z'
    [Debug]               Function demangled name is: 'int nv_sub2(int, int)'

But clearly, if you start using even “just slightly” complex types, things can quickly get… well… “far less handy”:

[Debug]               func2: '?nv_length@@YAHAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z'
[Debug]               Function demangled name is: 'int nv_length(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>> const &)'
[Debug]               func3: '?size@?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QEBA_KXZ'
[Debug]               Function demangled name is: 'public: unsigned __int64 std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>::size(void) const'
[Debug]               func4: '?createClass@@YAPEAVMyClass@@H@Z'
[Debug]               Function demangled name is: 'class MyClass * createClass(int)'
[Debug]               func5: '??2@YAPEAX_K@Z'
[Debug]               Function demangled name is: 'void * operator new(unsigned __int64)'
[Debug]               func6: '??0MyClass@@QEAA@H@Z'
[Debug]               Function demangled name is: 'public: MyClass::MyClass(int)'
[Debug]               func7: '?showValue@@YAHPEAVMyClass@@@Z'
[Debug]               Function demangled name is: 'int showValue(class MyClass *)'
[Debug]               func8: '?getValue@MyClass@@QEBAHXZ'
[Debug]               Function demangled name is: 'public: int MyClass::getValue(void) const'
[Debug]               func9: '?_Get_data@?$_String_alloc@U?$_String_base_types@DV?$allocator@D@std@@@std@@@std@@QEBAAEBV?$_String_val@U?$_Simple_types@D@std@@@2@XZ'
[Debug]               Function demangled name is: 'public: class std::_String_val<struct std::_Simple_types<char>> const & std::_String_alloc<struct std::_String_base_types<char, class std::allocator<char>>>::_Get_data(void) const'
[Debug]               func10: '?_Get_second@?$_Compressed_pair@V?$allocator@D@std@@V?$_String_val@U?$_Simple_types@D@std@@@2@$00@std@@QEBAAEBV?$_String_val@U?$_Simple_types@D@std@@@2@XZ'
[Debug]               Function demangled name is: 'public: class std::_String_val<struct std::_Simple_types<char>> const & std::_Compressed_pair<class std::allocator<char>, class std::_String_val<struct std::_Simple_types<char>>, 1>::_Get_second(void) const'
Yet one thing we could do to handle this mess would be to use some kind of partial/regex matching between the computed demangled function name and some part of it provided as input for the lookup function. For instance, we could ensure that an input such as “createClass” would match the demangled name “class MyClass * createClass(int)”, and thus find the symbol "?createClass@@YAPEAVMyClass@@H@Z". But I'm not sure this is really worth it.
  • In most cases, if you are writing the JITed code yourself, it's way easier to just export the functions you want to use as extern “C”: this way you avoid the mangling problem completely (see the example right after this list). But thinking about it, maybe this demangling support could become useful in a situation where we have to call existing C++ code without any proper C interface. Eventually, I should consider this point more carefully: this could be fun!
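
As a quick illustration of that last point (with a made-up nv_mul function), the extern “C” approach looks like this on the JITed source side:

// In the JITed source: C linkage disables the C++ name mangling entirely,
// so the symbol is just "nv_mul".
extern "C" int nv_mul(int a, int b)
{
    return a * b;
}

A plain jit->lookup("nv_mul") should then be enough to retrieve it, no demangling map needed.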

So, finally, to ensure that this JIT implementation was working as expected, I updated my minimal test program with the following code:

#include <llvm_common.h>
#include <iostream>
#include <NervJIT.h>

#ifdef DEBUG_MSG
#undef DEBUG_MSG
#endif

#define DEBUG_MSG(msg) std::cout << msg << std::endl;

int main(int argc, char *argv[])
{
#if 0
	DEBUG_MSG("Running clang compilation...");
    runClang({"W:/Projects/NervSeed/temp/test1.cxx",
			  "W:/Projects/NervSeed/temp/test2.cxx"});
	DEBUG_MSG("Done running clang compilation.");
#else
	DEBUG_MSG("Initializing LLVM...");
	nv::initLLVM();

	DEBUG_MSG("Creating NervJIT...");
	nv::RefPtr<nv::NervJIT> jit = new nv::NervJIT();

	DEBUG_MSG("loading modules");
	jit->loadModuleFromFile("W:/Projects/NervSeed/temp/test1.cxx");
	jit->loadModuleFromFile("W:/Projects/NervSeed/temp/test2.cxx");
	// jit->loadModuleFromFile("W:/Projects/NervSeed/temp/test3.cxx");
	jit->loadModuleFromBuffer(R"(
int nv_add3(int a, int b)
{
    return (a+b)*3;
}

int nv_sub3(int a, int b)
{
    return (a-b)*3;
}
)");

	typedef int(*Func)(int, int);

	Func add1 = (Func)jit->lookup("int nv_add(int, int)");
	CHECK_RET(add1!=nullptr,1,"Invalid nv_add function.");
	DEBUG_MSG("nv_add(12,3) = "<< add1(12,3));
	// Func add2 = (Func)jit->lookup("nv_add2");
	Func add2 = (Func)jit->lookup("int nv_add2(int, int)");
	CHECK_RET(add2!=nullptr,1,"Invalid nv_add2 function.");
	DEBUG_MSG("nv_add(12,3) = "<< add2(12,3));
	Func add3 = (Func)jit->lookup("int nv_add3(int, int)");
	CHECK_RET(add3!=nullptr,1,"Invalid nv_add3 function.");
	DEBUG_MSG("nv_add(12,3) = "<< add3(12,3));

	DEBUG_MSG("Destroying NervJIT...");
	jit = nullptr;

	DEBUG_MSG("Uninitializing LLVM...");
	nv::uninitLLVM();
#endif

	DEBUG_MSG("Exiting...");
	return 0;
}

And this worked just fine, cool! :-)

⇒ In case this code would be of interest to anyone, here is a zip containing all the discussed files in their current version:

nvllvm_20200414.zip

The files in this zip package will not compile out of the box of course! (I'm not sharing my complete NervSeed project here…) But the provided code could be used as a template at least. And there are still quite a few points left to improve:
  • Clearly, I need to provide a more flexible mechanism to specify header files before running a compilation.
  • It might also be worth it to provide support to define macro variables (?)
  • I need to think a bit more about the function name demangling system: still not quite sure if it belongs here or not.
  • I'm also thinking that maybe I should get rid of the PassBuilder/Managers system and rather just use a “legacy FunctionPassManager” implementation as described here: https://llvm.org/docs/tutorial/BuildingAJIT2.html (maybe this way I could get rid of some of those rogue pointers I can't delete for now?)

17/04/2020 Update: If you found this post interesting or helpful, then you might want to continue reading on this topic, with the next article available here: JIT Compiler with LLVM - Part 3 - Fixing the ModulePassManager crash
