
JIT Compiler with LLVM - Part 3 - Fixing the ModulePassManager crash

If you read my previous article on this topic (JIT C++ compiler with LLVM - Part 2), then you probably noticed there was a serious issue with the “NervJIT” implementation I described (as well as with the toy implementation in the runClang() function described in the very first article): I couldn't release the IR optimization pass resources, nor the llvm::orc::LLJIT object, as trying to do so was producing silent crashes. In this new post, we will focus on the steps I took to finally get rid of this problem.

So let's get to it! :-)

So if you recall, in our NervJITImpl destructor, we had to explicitly “release” the offending pointers (thus producing memory leaks unfortunately…) as follows:

NervJITImpl::~NervJITImpl()
{
    // Note: the memory for the LLJIT object will leak here because we get a crash when we try to delete it.
    std::cout << "Releasing undestructible pointers." << std::endl;
    lljit.release();

    moduleAnalysisManager.release();
    cGSCCAnalysisManager.release();
    functionAnalysisManager.release();
    loopAnalysisManager.release();

    std::cout << "Done releasing undestructible pointers." << std::endl;
}
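
To make the cost of those release() calls concrete, here is a minimal sketch of what std::unique_ptr::release() does compared to reset(). The Tracked type and the two helper functions are hypothetical names invented for this illustration (they are not part of NervJIT or LLVM); Tracked only counts how many times its destructor runs.

```cpp
#include <memory>

// Hypothetical stand-in for a heavy object such as the LLJIT instance.
// It only counts destructor calls so we can observe what happens.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// release() gives up ownership and returns the raw pointer:
// no destructor runs, so the object leaks unless someone deletes it later.
int destroyedAfterRelease() {
    Tracked::destroyed = 0;
    auto ptr = std::make_unique<Tracked>();
    Tracked* raw = ptr.release();    // ptr is now empty, the object is still alive
    int count = Tracked::destroyed;  // nothing has been destroyed at this point
    delete raw;                      // cleanup so this demo itself does not leak
    return count;
}

// reset() destroys the owned object immediately, which is also what
// the unique_ptr destructor does when the owner goes out of scope.
int destroyedAfterReset() {
    Tracked::destroyed = 0;
    auto ptr = std::make_unique<Tracked>();
    ptr.reset();
    return Tracked::destroyed;
}
```

In other words, calling release() in a destructor is only a way to skip the object's destructor entirely: it hides the crash but the memory (and anything else the object owns) is never given back.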

Thus, the first thing I decided to focus on was those 4 AnalysisManager objects. And before trying to clarify “why I couldn't delete them properly”, I rather started with the question: “Do I really need that stuff?” ;-) ⇒ So I went in search of the techniques that were discussed online to optimize LLVM IR modules.

The problem at this level is that LLVM is continuously (and quickly) evolving, so it's not easy to find something that is up-to-date with the git version I'm using. But still, I eventually found this page on Stack Overflow: https://stackoverflow.com/questions/53738883/run-default-optimization-pipeline-using-modern-llvm/53739108

And even if this was meant for LLVM version 7.0, the good thing was that it was referencing one of the tools provided with LLVM as a base template: the opt tool, which I could find in the current version of the LLVM source tree!

I had a quick look at the sources inside llvm\tools\opt (mainly the file opt.cpp), and from there I could easily confirm that the code presented in the Stack Overflow answer was still largely similar to what is actually done inside opt. Thus I happily jumped on that train with very high hopes and implemented my own updated “module optimization” logic for my NervJIT component :-):

void addOptPasses(
  llvm::legacy::PassManagerBase &passes,
  llvm::legacy::FunctionPassManager &fnPasses,
  llvm::TargetMachine *machine
) {
  llvm::PassManagerBuilder builder;
  builder.OptLevel = 3;
  builder.SizeLevel = 0;
  builder.Inliner = llvm::createFunctionInliningPass(3, 0, false);
  builder.LoopVectorize = true;
  builder.DisableUnrollLoops = false;
  builder.SLPVectorize = true;
  machine->adjustPassManager(builder);

  builder.populateFunctionPassManager(fnPasses);
  builder.populateModulePassManager(passes);
}

void addLinkPasses(llvm::legacy::PassManagerBase &passes) {
  llvm::PassManagerBuilder builder;
  builder.VerifyInput = true;
  builder.Inliner = llvm::createFunctionInliningPass(3, 0, false);
  builder.populateLTOPassManager(passes);
}

void NervJITImpl::optimizeModule(llvm::Module *module) {
    module->setTargetTriple(targetMachine->getTargetTriple().str());
    module->setDataLayout(targetMachine->createDataLayout());

    // DEBUG_MSG("Creating legacy pass manager.");
    llvm::legacy::PassManager passes;
    passes.add(new llvm::TargetLibraryInfoWrapperPass(targetMachine->getTargetTriple()));
    passes.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));

    // DEBUG_MSG("Creating legacy function pass manager.");
    llvm::legacy::FunctionPassManager fnPasses(module);
    fnPasses.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));

    addOptPasses(passes, fnPasses, targetMachine.get());
    addLinkPasses(passes);

    // DEBUG_MSG("Optimizing functions");
    fnPasses.doInitialization();
    for (llvm::Function &func : *module) {
        fnPasses.run(func);
    }
    fnPasses.doFinalization();

    passes.add(llvm::createVerifierPass());
    // DEBUG_MSG("Running PassManager.");
    passes.run(*module);
    // DEBUG_MSG("Optimization done.");
}

To be able to compile that code above, the only new thing I needed was a proper llvm::TargetMachine pointer: took me some time to figure out how to get that one, but in the end, it's quite straightforward (again, I build that object in my NervJITImpl constructor):

    auto jtmb = CHECK_LLVM(JITTargetMachineBuilder::detectHost());
    targetMachine = CHECK_LLVM(jtmb.createTargetMachine());

Then all I had to do was to replace the call to the ModulePassManager::run() function inside my NervJITImpl::loadModule() method with my new optimizeModule() function:

    // We run the optimizations:
    DEBUG_MSG("Optimizing module...");
    //modulePassManager.run(*module, *moduleAnalysisManager);
    optimizeModule(module.get());

    DEBUG_MSG("Module function list: ");

… And… this failed LAMENTABLY… LOL ⇒ silent crash somewhere after the output “Optimizing module…” [Arrff, I guess I had my hopes too high then…]

So back to investigations, countless “change/build/run tests”, and then I eventually tracked the issue to be related, once more, to the module pass manager! In fact, I got the optimization stage to work if I commented out those two lines (respectively at the very end of addOptPasses() and addLinkPasses()):

    // in addOptPasses():
    // builder.populateModulePassManager(passes);
    
    // in addLinkPasses():
    // builder.populateLTOPassManager(passes);

But of course, this was no acceptable solution: this code was working fine in the llvm opt tool after all! So I thought it was time to change my perspective and focus instead on the opt tool itself, in a quest to understand what I was doing wrong in my own project.

While I was orbiting around the LLVM opt, I did successfully run that tool on the command line with some random LLVM assembly test file (that I found somewhere else in the LLVM source tree), using this kind of command:

opt --O3 --debugify-each --verify-each bcsection.ll -o result.bc

And this would produce this kind of result:

CheckModuleDebugify [Force set function attributes]: PASS
CheckModuleDebugify [Infer set function attributes]: PASS
CheckFunctionDebugify [Call-site splitting]: PASS
CheckModuleDebugify [Interprocedural Sparse Conditional Constant Propagation]: PASS
CheckModuleDebugify [Called Value Propagation]: PASS
CheckModuleDebugify [Global Variable Optimizer]: PASS
CheckFunctionDebugify [Promote Memory to Register]: PASS
CheckModuleDebugify [Dead Argument Elimination]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckModuleDebugify [Globals Alias Analysis]: PASS
CheckFunctionDebugify [SROA]: PASS
CheckFunctionDebugify [Early CSE w/ MemorySSA]: PASS
CheckFunctionDebugify [Speculatively execute instructions if target has divergent branches]: PASS
CheckFunctionDebugify [Jump Threading]: PASS
CheckFunctionDebugify [Value Propagation]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Combine pattern based expressions]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Conditionally eliminate dead library calls]: PASS
CheckFunctionDebugify [PGOMemOPSize]: PASS
CheckFunctionDebugify [Tail Call Elimination]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Reassociate expressions]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [MergedLoadStoreMotion]: PASS
CheckFunctionDebugify [Global Value Numbering]: PASS
CheckFunctionDebugify [MemCpy Optimization]: PASS
CheckFunctionDebugify [Sparse Conditional Constant Propagation]: PASS
CheckFunctionDebugify [Bit-Tracking Dead Code Elimination]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Jump Threading]: PASS
CheckFunctionDebugify [Value Propagation]: PASS
CheckFunctionDebugify [Dead Store Elimination]: PASS
CheckFunctionDebugify [Aggressive Dead Code Elimination]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckModuleDebugify [A No-Op Barrier Pass]: PASS
CheckModuleDebugify [Eliminate Available Externally Globals]: PASS
CheckModuleDebugify [Deduce function attributes in RPO]: PASS
CheckModuleDebugify [Global Variable Optimizer]: PASS
CheckModuleDebugify [Dead Global Elimination]: PASS
CheckModuleDebugify [Globals Alias Analysis]: PASS
CheckFunctionDebugify [Float to int]: PASS
CheckFunctionDebugify [Lower constant intrinsics]: PASS
CheckFunctionDebugify [Loop Distribution]: PASS
CheckFunctionDebugify [Loop Vectorization]: PASS
CheckFunctionDebugify [Optimize scalar/vector ops]: PASS
CheckFunctionDebugify [Early CSE]: PASS
CheckFunctionDebugify [Loop Load Elimination]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [SLP Vectorizer]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Warn about non-applied transformations]: PASS
CheckFunctionDebugify [Alignment from assumptions]: PASS
CheckModuleDebugify [Strip Unused Function Prototypes]: PASS
CheckModuleDebugify [Dead Global Elimination]: PASS
CheckModuleDebugify [Merge Duplicate Global Constants]: PASS
CheckFunctionDebugify [Remove redundant instructions]: PASS
CheckFunctionDebugify [Hoist/decompose integer division and remainder]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS

To be honest, I was still not quite sure if this program was ending just fine or also silently crashing after the last “CheckFunctionDebugify” line reported above, but still, I thought this was a good reference point anyway, and I should at least be able to reproduce those results if I were to rebuild this tool from sources myself, no?

At this point, one will naturally refer to the opt tool CMake files inside the LLVM source tree to figure out how to set up a clone project “out of the LLVM sources”. Usually CMake is a pretty simple language to deal with, right? Well… unfortunately, this time, unless you are an absolute CMake guru, this path will not lead you very far I'm afraid: the LLVM CMake files are really cryptic, so I wish you good luck extracting the information you need from this kind of content:

set(LLVM_LINK_COMPONENTS
  AllTargetsAsmParsers
  AllTargetsCodeGens
  AllTargetsDescs
  AllTargetsInfos
  AggressiveInstCombine
  Analysis
  BitWriter
  CodeGen
  Core
  Coroutines
  IPO
  IRReader
  InstCombine
  Instrumentation
  MC
  ObjCARCOpts
  Remarks
  ScalarOpts
  Support
  Target
  TransformUtils
  Vectorize
  Passes
  )

add_llvm_tool(opt
  AnalysisWrappers.cpp
  BreakpointPrinter.cpp
  GraphPrinters.cpp
  NewPMDriver.cpp
  PassPrinters.cpp
  PrintSCC.cpp
  opt.cpp

  ENABLE_PLUGINS

  DEPENDS
  intrinsics_gen
  SUPPORT_PLUGINS
  )
export_executable_symbols_for_plugins(opt)

if(LLVM_BUILD_EXAMPLES)
    target_link_libraries(opt PRIVATE ExampleIRTransforms)
endif(LLVM_BUILD_EXAMPLES)

If you really feel like giving this a try and want to find the definitions and behavior of all the LLVM-specific functions/macros used in that file, a good starting point is the file ${your_llvm_install_dir}/lib/cmake/llvm/AddLLVM.cmake. But you have been warned… [So, have fun! ;-) and just make sure you stop reading that file before you hang yourself in your bathroom…]

So I decided it could be good enough to just take the source files in that tool folder and build the CMake file myself: after all, I could already build a “working” nvLLVM [Well, except your “working module” is silently crashing or leaking memory Manu, don't forget that lol] with a manually crafted CMakeLists.txt file, so I could maybe use that as a template and see how it goes?

And that's exactly what I did: building a test sub project which I called test_llvm_opt, linking with all the LLVM libraries and using the same CXX flags as in my previous nvLLVM sub project.

Surprisingly, getting this clone of the llvm opt tool to compile was not too hard in the end, but then, it was time to give it a try (using the exact same input as with the official opt.exe tool, obviously):

test_llvm_opt.exe --O3 --debugify-each --verify-each bcsection.ll -o result.bc
CheckModuleDebugify [Force set function attributes]: PASS
CheckModuleDebugify [Infer set function attributes]: PASS
CheckFunctionDebugify [Call-site splitting]: PASS
CheckModuleDebugify [Interprocedural Sparse Conditional Constant Propagation]: PASS
CheckModuleDebugify [Called Value Propagation]: PASS
CheckModuleDebugify [Global Variable Optimizer]: PASS
CheckFunctionDebugify [Promote Memory to Register]: PASS
CheckModuleDebugify [Dead Argument Elimination]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckModuleDebugify [Globals Alias Analysis]: PASS
CheckFunctionDebugify [SROA]: PASS
CheckFunctionDebugify [Early CSE w/ MemorySSA]: PASS
CheckFunctionDebugify [Speculatively execute instructions if target has divergent branches]: PASS
CheckFunctionDebugify [Jump Threading]: PASS
CheckFunctionDebugify [Value Propagation]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Combine pattern based expressions]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Conditionally eliminate dead library calls]: PASS
CheckFunctionDebugify [PGOMemOPSize]: PASS
CheckFunctionDebugify [Tail Call Elimination]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Reassociate expressions]: PASS
CheckFunctionDebugify [Simplify the CFG]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [MergedLoadStoreMotion]: PASS
CheckFunctionDebugify [Global Value Numbering]: PASS
CheckFunctionDebugify [MemCpy Optimization]: PASS
CheckFunctionDebugify [Sparse Conditional Constant Propagation]: PASS
CheckFunctionDebugify [Bit-Tracking Dead Code Elimination]: PASS
CheckFunctionDebugify [Combine redundant instructions]: PASS
CheckFunctionDebugify [Jump Threading]: PASS
CheckFunctionDebugify [Value Propagation]: PASS
CheckFunctionDebugify [Dead Store Elimination]: PASS

And BBAAMMM! The process will die at this exact same location every time I run this command (whereas the official opt tool will continue here with many additional passes: [Aggressive Dead Code Elimination], etc…)
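
When the two logs are as long as the ones above, it helps to locate the stopping point mechanically instead of by eye. The following is only a small sketch of that comparison, assuming both outputs have been captured as text; splitLines() and firstMissingLine() are hypothetical helper names introduced here, not part of LLVM or of my test project.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split captured tool output into non-empty lines (tolerating CRLF endings).
std::vector<std::string> splitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) lines.push_back(line);
    }
    return lines;
}

// Returns the first line of the reference log that the candidate log does not
// reproduce at the same position, or an empty string if the candidate covers
// the whole reference log.
std::string firstMissingLine(const std::string& reference,
                             const std::string& candidate) {
    const std::vector<std::string> ref = splitLines(reference);
    const std::vector<std::string> cand = splitLines(candidate);
    for (std::size_t i = 0; i < ref.size(); ++i) {
        if (i >= cand.size() || cand[i] != ref[i]) return ref[i];
    }
    return std::string();
}
```

Fed with the official opt output as reference and the test_llvm_opt output as candidate, this would return the “[Aggressive Dead Code Elimination]” line. Since each “PASS” line appears to be reported once the corresponding pass has been checked, this suggests (without proving it) that the process dies somewhere after the Dead Store Elimination check, possibly inside the next pass.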

So at this point I was really convinced there was something going wrong in my cmake setup… but what?!

I went back to the official opt cmake files, trying to focus very hard on all the configuration elements, macros, conditions, etc… and this was clearly just overwhelming and unmanageable :-( I was starting to lose faith, but then I realized something that saved my day: cmake is just writing (somewhat) “regular” makefiles in the end anyway! (at least when using the NMake generator like I am (?))

I didn't have any other option left anyway, so I just dived into the generated cmake build files for my LLVM git sources, and this really was an eye opener! :-)

First I found the file tools\opt\CMakeFiles\opt.dir\build.make, where you will find content such as:

tools\opt\CMakeFiles\opt.dir\AnalysisWrappers.cpp.obj: tools\opt\CMakeFiles\opt.dir\flags.make
tools\opt\CMakeFiles\opt.dir\AnalysisWrappers.cpp.obj: W:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt\AnalysisWrappers.cpp
	@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --progress-dir=W:\Projects\NervSeed\deps\build\llvm-20200409\build\CMakeFiles --progress-num=$(CMAKE_PROGRESS_1) "Building CXX object tools/opt/CMakeFiles/opt.dir/AnalysisWrappers.cpp.obj"
	cd W:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt
	D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64\cl.exe @<<
 /nologo /TP $(CXX_DEFINES) $(CXX_INCLUDES) $(CXX_FLAGS) /FoCMakeFiles\opt.dir\AnalysisWrappers.cpp.obj /FdCMakeFiles\opt.dir\ /FS -c W:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt\AnalysisWrappers.cpp

And at the beginning of that file we have:

# Include any dependencies generated for this target.
include tools\opt\CMakeFiles\opt.dir\depend.make

# Include the progress variables for this target.
include tools\opt\CMakeFiles\opt.dir\progress.make

# Include the compile flags for this target's objects.
include tools\opt\CMakeFiles\opt.dir\flags.make

So, naturally, you check the sibling flags.make file, and here, bingo!:

# compile CXX with D:/Apps/VisualStudio2017_CE/VC/Tools/MSVC/14.16.27023/bin/Hostx64/x64/cl.exe
# compile RC with C:/Program Files (x86)/Windows Kits/10/bin/10.0.18362.0/x64/rc.exe
CXX_FLAGS = /DWIN32 /D_WINDOWS   /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG    /EHs-c- /GR-

CXX_DEFINES = -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS

CXX_INCLUDES = -IW:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt -IW:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt -IW:\Projects\NervSeed\deps\build\llvm-20200409\build\include -IW:\Projects\NervSeed\deps\build\llvm-20200409\llvm\include 

⇒ So with these lines, you know exactly which command lines are executed to compile the cpp files in that folder into obj files.

And finally back to the build.make, you will also find the linking target definition:

bin\opt.exe: tools\opt\CMakeFiles\opt.dir\objects1.rsp
	@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --bold --progress-dir=W:\Projects\NervSeed\deps\build\llvm-20200409\build\CMakeFiles --progress-num=$(CMAKE_PROGRESS_9) "Linking CXX executable ..\..\bin\opt.exe"
	cd W:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt
	W:\Projects\NervSeed\tools\windows\cmake-3.9.2\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\opt.dir --manifests  -- D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opt.dir\objects1.rsp @<<
 /out:..\..\bin\opt.exe /implib:..\..\lib\opt.lib /pdb:W:\Projects\NervSeed\deps\build\llvm-20200409\build\bin\opt.pdb /version:0.0  /machine:x64 /STACK:10000000 /INCREMENTAL:NO /subsystem:console ..\..\lib\LLVMAArch64AsmParser.lib ..\..\lib\LLVMAMDGPUAsmParser.lib ..\..\lib\LLVMARMAsmParser.lib ..\..\lib\LLVMAVRAsmParser.lib ..\..\lib\LLVMBPFAsmParser.lib ..\..\lib\LLVMHexagonAsmParser.lib ..\..\lib\LLVMLanaiAsmParser.lib ..\..\lib\LLVMMipsAsmParser.lib ..\..\lib\LLVMMSP430AsmParser.lib ..\..\lib\LLVMPowerPCAsmParser.lib ..\..\lib\LLVMRISCVAsmParser.lib ..\..\lib\LLVMSparcAsmParser.lib ..\..\lib\LLVMSystemZAsmParser.lib ..\..\lib\LLVMWebAssemblyAsmParser.lib ..\..\lib\LLVMX86AsmParser.lib ..\..\lib\LLVMAArch64CodeGen.lib ..\..\lib\LLVMAMDGPUCodeGen.lib ..\..\lib\LLVMARMCodeGen.lib ..\..\lib\LLVMAVRCodeGen.lib ..\..\lib\LLVMBPFCodeGen.lib ..\..\lib\LLVMHexagonCodeGen.lib ..\..\lib\LLVMLanaiCodeGen.lib ..\..\lib\LLVMMipsCodeGen.lib ..\..\lib\LLVMMSP430CodeGen.lib ..\..\lib\LLVMNVPTXCodeGen.lib ..\..\lib\LLVMPowerPCCodeGen.lib ..\..\lib\LLVMRISCVCodeGen.lib ..\..\lib\LLVMSparcCodeGen.lib ..\..\lib\LLVMSystemZCodeGen.lib ..\..\lib\LLVMWebAssemblyCodeGen.lib ..\..\lib\LLVMX86CodeGen.lib ..\..\lib\LLVMXCoreCodeGen.lib ..\..\lib\LLVMAArch64Desc.lib ..\..\lib\LLVMAMDGPUDesc.lib ..\..\lib\LLVMARMDesc.lib ..\..\lib\LLVMAVRDesc.lib ..\..\lib\LLVMBPFDesc.lib ..\..\lib\LLVMHexagonDesc.lib ..\..\lib\LLVMLanaiDesc.lib ..\..\lib\LLVMMipsDesc.lib ..\..\lib\LLVMMSP430Desc.lib ..\..\lib\LLVMNVPTXDesc.lib ..\..\lib\LLVMPowerPCDesc.lib ..\..\lib\LLVMRISCVDesc.lib ..\..\lib\LLVMSparcDesc.lib ..\..\lib\LLVMSystemZDesc.lib ..\..\lib\LLVMWebAssemblyDesc.lib ..\..\lib\LLVMX86Desc.lib ..\..\lib\LLVMXCoreDesc.lib ..\..\lib\LLVMAArch64Info.lib ..\..\lib\LLVMAMDGPUInfo.lib ..\..\lib\LLVMARMInfo.lib ..\..\lib\LLVMAVRInfo.lib ..\..\lib\LLVMBPFInfo.lib ..\..\lib\LLVMHexagonInfo.lib ..\..\lib\LLVMLanaiInfo.lib ..\..\lib\LLVMMipsInfo.lib ..\..\lib\LLVMMSP430Info.lib ..\..\lib\LLVMNVPTXInfo.lib 
..\..\lib\LLVMPowerPCInfo.lib ..\..\lib\LLVMRISCVInfo.lib ..\..\lib\LLVMSparcInfo.lib ..\..\lib\LLVMSystemZInfo.lib ..\..\lib\LLVMWebAssemblyInfo.lib ..\..\lib\LLVMX86Info.lib ..\..\lib\LLVMXCoreInfo.lib ..\..\lib\LLVMAggressiveInstCombine.lib ..\..\lib\LLVMAnalysis.lib ..\..\lib\LLVMBitWriter.lib ..\..\lib\LLVMCodeGen.lib ..\..\lib\LLVMCore.lib ..\..\lib\LLVMCoroutines.lib ..\..\lib\LLVMipo.lib ..\..\lib\LLVMIRReader.lib ..\..\lib\LLVMInstCombine.lib ..\..\lib\LLVMInstrumentation.lib ..\..\lib\LLVMMC.lib ..\..\lib\LLVMObjCARCOpts.lib ..\..\lib\LLVMRemarks.lib ..\..\lib\LLVMScalarOpts.lib ..\..\lib\LLVMSupport.lib ..\..\lib\LLVMTarget.lib ..\..\lib\LLVMTransformUtils.lib ..\..\lib\LLVMVectorize.lib ..\..\lib\LLVMPasses.lib ..\..\lib\LLVMAArch64Utils.lib ..\..\lib\LLVMAMDGPUUtils.lib ..\..\lib\LLVMMIRParser.lib ..\..\lib\LLVMARMUtils.lib ..\..\lib\LLVMHexagonAsmParser.lib ..\..\lib\LLVMHexagonDesc.lib ..\..\lib\LLVMHexagonInfo.lib ..\..\lib\LLVMLanaiAsmParser.lib ..\..\lib\LLVMLanaiDesc.lib ..\..\lib\LLVMLanaiInfo.lib ..\..\lib\LLVMRISCVUtils.lib ..\..\lib\LLVMMCDisassembler.lib ..\..\lib\LLVMCFGuard.lib ..\..\lib\LLVMGlobalISel.lib ..\..\lib\LLVMX86Utils.lib ..\..\lib\LLVMAsmPrinter.lib ..\..\lib\LLVMDebugInfoDWARF.lib ..\..\lib\LLVMSelectionDAG.lib ..\..\lib\LLVMCodeGen.lib ..\..\lib\LLVMCoroutines.lib ..\..\lib\LLVMipo.lib ..\..\lib\LLVMBitWriter.lib ..\..\lib\LLVMIRReader.lib ..\..\lib\LLVMAsmParser.lib ..\..\lib\LLVMFrontendOpenMP.lib ..\..\lib\LLVMLinker.lib ..\..\lib\LLVMInstrumentation.lib ..\..\lib\LLVMScalarOpts.lib ..\..\lib\LLVMAggressiveInstCombine.lib ..\..\lib\LLVMInstCombine.lib ..\..\lib\LLVMTarget.lib ..\..\lib\LLVMVectorize.lib ..\..\lib\LLVMTransformUtils.lib ..\..\lib\LLVMAnalysis.lib ..\..\lib\LLVMProfileData.lib ..\..\lib\LLVMObject.lib ..\..\lib\LLVMMCParser.lib ..\..\lib\LLVMMC.lib ..\..\lib\LLVMDebugInfoCodeView.lib ..\..\lib\LLVMDebugInfoMSF.lib ..\..\lib\LLVMBitReader.lib ..\..\lib\LLVMTextAPI.lib ..\..\lib\LLVMCore.lib 
..\..\lib\LLVMRemarks.lib ..\..\lib\LLVMBitstreamReader.lib ..\..\lib\LLVMBinaryFormat.lib ..\..\lib\LLVMSupport.lib psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib -delayload:shell32.dll -delayload:ole32.dll ..\..\lib\LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib 

I think this is actually a very important lesson every developer should try to keep in mind with this kind of experiment: if you feel overwhelmed by the cmake functions in use in a given project that you want to replicate somewhere else, then check the cmake generated Makefile to find the raw build commands with the CXX and link flags!
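
Once you know where the raw flags live, comparing them with your own project's flags is mostly a matter of splitting strings. Here is a minimal sketch of that idea, assuming the simple “NAME = value” layout seen in the flags.make file above; parseFlagsMake() and missingFlags() are hypothetical helper names written for this illustration, and the whitespace splitting would not cope with include paths containing spaces.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parse "NAME = value" lines from a flags.make-style text.
// Comment lines (starting with '#') and lines without '=' are skipped.
// Each value is split on whitespace into individual flags.
std::map<std::string, std::vector<std::string>>
parseFlagsMake(const std::string& text) {
    std::map<std::string, std::vector<std::string>> vars;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;

        // The variable name is whatever precedes the first '=' (trimmed).
        std::istringstream nameStream(line.substr(0, eq));
        std::string name;
        nameStream >> name;
        if (name.empty()) continue;

        // Everything after the first '=' is the flag list; later '='
        // characters (as in -D_HAS_EXCEPTIONS=0) stay inside their flag.
        std::vector<std::string>& flags = vars[name];
        std::istringstream valueStream(line.substr(eq + 1));
        std::string flag;
        while (valueStream >> flag) flags.push_back(flag);
    }
    return vars;
}

// Flags present in `reference` but absent from `mine`, in reference order.
std::vector<std::string> missingFlags(const std::vector<std::string>& reference,
                                      const std::vector<std::string>& mine) {
    const std::set<std::string> have(mine.begin(), mine.end());
    std::vector<std::string> missing;
    for (const std::string& flag : reference) {
        if (have.count(flag) == 0) missing.push_back(flag);
    }
    return missing;
}
```

The intended usage would be to parse LLVM's generated flags.make, parse (or hand-write) the list of flags used by your own project, and look at what missingFlags() reports for CXX_FLAGS and CXX_DEFINES.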

So of course, with those new inputs, I immediately updated my test_llvm_opt CMakeLists.txt file as follows:

SET(TARGET_NAME "test_llvm_opt")
SET(TARGET_DIR "./")

SET(LLVM_LIBS LLVMAArch64AsmParser.lib LLVMAMDGPUAsmParser.lib LLVMARMAsmParser.lib LLVMAVRAsmParser.lib LLVMBPFAsmParser.lib LLVMHexagonAsmParser.lib LLVMLanaiAsmParser.lib LLVMMipsAsmParser.lib LLVMMSP430AsmParser.lib LLVMPowerPCAsmParser.lib LLVMRISCVAsmParser.lib LLVMSparcAsmParser.lib LLVMSystemZAsmParser.lib LLVMWebAssemblyAsmParser.lib LLVMX86AsmParser.lib LLVMAArch64CodeGen.lib LLVMAMDGPUCodeGen.lib LLVMARMCodeGen.lib LLVMAVRCodeGen.lib LLVMBPFCodeGen.lib LLVMHexagonCodeGen.lib LLVMLanaiCodeGen.lib LLVMMipsCodeGen.lib LLVMMSP430CodeGen.lib LLVMNVPTXCodeGen.lib LLVMPowerPCCodeGen.lib LLVMRISCVCodeGen.lib LLVMSparcCodeGen.lib LLVMSystemZCodeGen.lib LLVMWebAssemblyCodeGen.lib LLVMX86CodeGen.lib LLVMXCoreCodeGen.lib LLVMAArch64Desc.lib LLVMAMDGPUDesc.lib LLVMARMDesc.lib LLVMAVRDesc.lib LLVMBPFDesc.lib LLVMHexagonDesc.lib LLVMLanaiDesc.lib LLVMMipsDesc.lib LLVMMSP430Desc.lib LLVMNVPTXDesc.lib LLVMPowerPCDesc.lib LLVMRISCVDesc.lib LLVMSparcDesc.lib LLVMSystemZDesc.lib LLVMWebAssemblyDesc.lib LLVMX86Desc.lib LLVMXCoreDesc.lib LLVMAArch64Info.lib LLVMAMDGPUInfo.lib LLVMARMInfo.lib LLVMAVRInfo.lib LLVMBPFInfo.lib LLVMHexagonInfo.lib LLVMLanaiInfo.lib LLVMMipsInfo.lib LLVMMSP430Info.lib LLVMNVPTXInfo.lib LLVMPowerPCInfo.lib LLVMRISCVInfo.lib LLVMSparcInfo.lib LLVMSystemZInfo.lib LLVMWebAssemblyInfo.lib LLVMX86Info.lib LLVMXCoreInfo.lib LLVMAggressiveInstCombine.lib LLVMAnalysis.lib LLVMBitWriter.lib LLVMCodeGen.lib LLVMCore.lib LLVMCoroutines.lib LLVMipo.lib LLVMIRReader.lib LLVMInstCombine.lib LLVMInstrumentation.lib LLVMMC.lib LLVMObjCARCOpts.lib LLVMRemarks.lib LLVMScalarOpts.lib LLVMSupport.lib LLVMTarget.lib LLVMTransformUtils.lib LLVMVectorize.lib LLVMPasses.lib LLVMAArch64Utils.lib LLVMAMDGPUUtils.lib LLVMMIRParser.lib LLVMARMUtils.lib LLVMHexagonAsmParser.lib LLVMHexagonDesc.lib LLVMHexagonInfo.lib LLVMLanaiAsmParser.lib LLVMLanaiDesc.lib LLVMLanaiInfo.lib LLVMRISCVUtils.lib LLVMMCDisassembler.lib LLVMCFGuard.lib LLVMGlobalISel.lib LLVMX86Utils.lib 
LLVMAsmPrinter.lib LLVMDebugInfoDWARF.lib LLVMSelectionDAG.lib LLVMCodeGen.lib LLVMCoroutines.lib LLVMipo.lib LLVMBitWriter.lib LLVMIRReader.lib LLVMAsmParser.lib LLVMFrontendOpenMP.lib LLVMLinker.lib LLVMInstrumentation.lib LLVMScalarOpts.lib LLVMAggressiveInstCombine.lib LLVMInstCombine.lib LLVMTarget.lib LLVMVectorize.lib LLVMTransformUtils.lib LLVMAnalysis.lib LLVMProfileData.lib LLVMObject.lib LLVMMCParser.lib LLVMMC.lib LLVMDebugInfoCodeView.lib LLVMDebugInfoMSF.lib LLVMBitReader.lib LLVMTextAPI.lib LLVMCore.lib LLVMRemarks.lib LLVMBitstreamReader.lib LLVMBinaryFormat.lib LLVMSupport.lib psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib)

SET(CMAKE_CXX_FLAGS "/std:c++14 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS /DWIN32 /D_WINDOWS   /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG  /EHs-c- /GR-")

include_directories(${LLVM_CLANG_DIR}/include)
LINK_DIRECTORIES(${LLVM_CLANG_DIR}/lib)

FILE(GLOB_RECURSE SOURCE_FILES "*.cpp" )
set(SOURCE_FILES ${SOURCE_FILES} windows_version_resource.rc)

ADD_EXECUTABLE (${TARGET_NAME} ${SOURCE_FILES})
TARGET_LINK_LIBRARIES(${TARGET_NAME} ${LLVM_LIBS})

INSTALL(TARGETS ${TARGET_NAME}
	RUNTIME DESTINATION ${TARGET_DIR}
  LIBRARY DESTINATION ${TARGET_DIR})

Again, the test project compiled just fine, but this time, running the command test_llvm_opt.exe --O3 --debugify-each --verify-each bcsection.ll -o result.bc produced the same results as the official LLVM opt tool: yyeeeeepppeeee! Victory! :-D
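
I did not isolate which flag or define was actually responsible for the crash, so what follows is only an illustration of one general mechanism by which mismatched compile settings can break a program at runtime, not a diagnosis of this specific case. If a header changes a class layout depending on a preprocessor define, a library and its client compiled with different defines will disagree on where the members live. The two namespaces below simulate “the same header” seen by two differently configured builds; PassState, debugOwner and layoutsDiffer() are hypothetical names and do not come from LLVM.

```cpp
#include <cstddef>

// The hypothetical header as seen by a library built WITHOUT the define.
namespace built_without_define {
struct PassState {
    int id;
    void* payload;
};
}

// The same hypothetical header as seen by client code built WITH the define:
// an extra member is compiled in, shifting everything that follows it.
namespace built_with_define {
struct PassState {
    void* debugOwner;  // only present when the define is set
    int id;
    void* payload;
};
}

// True when the two builds disagree on the object size or on the position
// of the 'payload' member. Code from one build reading an object created by
// the other would then read or write the wrong bytes.
constexpr bool layoutsDiffer() {
    return sizeof(built_with_define::PassState) !=
               sizeof(built_without_define::PassState) ||
           offsetof(built_with_define::PassState, payload) !=
               offsetof(built_without_define::PassState, payload);
}

static_assert(layoutsDiffer(), "the two simulated builds disagree on the layout");
```

Nothing in such a situation necessarily fails at compile or link time, which is why reusing the exact flags and defines of the library build is the safest default.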

From there the path started to smoothly clear itself: updating the cmake build files fixed a silent crash in my test_llvm_opt project, so naturally, the next logical step was to apply the same updates to the nvLLVM project, and its main CMakeLists.txt was thus modified as follows:

SET(TARGET_DIR "./")

include_directories(${LLVM_CLANG_DIR}/include)
LINK_DIRECTORIES(${LLVM_CLANG_DIR}/lib)

INCLUDE_DIRECTORIES(../nvCore/include)

SET(CMAKE_CXX_FLAGS "/std:c++14 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS /DWIN32 /D_WINDOWS   /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG  /EHs-c- /GR-")
ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS)
ADD_DEFINITIONS(-DNOMINMAX)

# Note: used llvm-config.exe --libs to retrieve the list of libraries below:
# LLVM version 11.0.0git
SET(LLVM_LIBS LLVMXRay LLVMWindowsManifest LLVMTableGen LLVMSymbolize LLVMDebugInfoPDB LLVMOrcJIT LLVMOrcError LLVMJITLink LLVMObjectYAML LLVMMCA LLVMLTO LLVMPasses LLVMCoroutines LLVMObjCARCOpts LLVMLineEditor LLVMLibDriver LLVMInterpreter LLVMFuzzMutate LLVMMCJIT LLVMExecutionEngine LLVMRuntimeDyld LLVMDWARFLinker LLVMDlltoolDriver LLVMOption LLVMDebugInfoGSYM LLVMCoverage LLVMXCoreDisassembler LLVMXCoreCodeGen LLVMXCoreDesc LLVMXCoreInfo LLVMX86Disassembler LLVMX86AsmParser LLVMX86CodeGen LLVMX86Desc LLVMX86Utils LLVMX86Info LLVMWebAssemblyDisassembler LLVMWebAssemblyCodeGen LLVMWebAssemblyDesc LLVMWebAssemblyAsmParser LLVMWebAssemblyInfo LLVMSystemZDisassembler LLVMSystemZCodeGen LLVMSystemZAsmParser LLVMSystemZDesc LLVMSystemZInfo LLVMSparcDisassembler LLVMSparcCodeGen LLVMSparcAsmParser LLVMSparcDesc LLVMSparcInfo LLVMRISCVDisassembler LLVMRISCVCodeGen LLVMRISCVAsmParser LLVMRISCVDesc LLVMRISCVUtils LLVMRISCVInfo LLVMPowerPCDisassembler LLVMPowerPCCodeGen LLVMPowerPCAsmParser LLVMPowerPCDesc LLVMPowerPCInfo LLVMNVPTXCodeGen LLVMNVPTXDesc LLVMNVPTXInfo LLVMMSP430Disassembler LLVMMSP430CodeGen LLVMMSP430AsmParser LLVMMSP430Desc LLVMMSP430Info LLVMMipsDisassembler LLVMMipsCodeGen LLVMMipsAsmParser LLVMMipsDesc LLVMMipsInfo LLVMLanaiDisassembler LLVMLanaiCodeGen LLVMLanaiAsmParser LLVMLanaiDesc LLVMLanaiInfo LLVMHexagonDisassembler LLVMHexagonCodeGen LLVMHexagonAsmParser LLVMHexagonDesc LLVMHexagonInfo LLVMBPFDisassembler LLVMBPFCodeGen LLVMBPFAsmParser LLVMBPFDesc LLVMBPFInfo LLVMAVRDisassembler LLVMAVRCodeGen LLVMAVRAsmParser LLVMAVRDesc LLVMAVRInfo LLVMARMDisassembler LLVMARMCodeGen LLVMARMAsmParser LLVMARMDesc LLVMARMUtils LLVMARMInfo LLVMAMDGPUDisassembler LLVMAMDGPUCodeGen LLVMMIRParser LLVMipo LLVMInstrumentation LLVMVectorize LLVMLinker LLVMIRReader LLVMAsmParser LLVMFrontendOpenMP LLVMAMDGPUAsmParser LLVMAMDGPUDesc LLVMAMDGPUUtils LLVMAMDGPUInfo LLVMAArch64Disassembler LLVMMCDisassembler LLVMAArch64CodeGen LLVMCFGuard LLVMGlobalISel LLVMSelectionDAG 
LLVMAsmPrinter LLVMDebugInfoDWARF LLVMCodeGen LLVMTarget LLVMScalarOpts LLVMInstCombine LLVMAggressiveInstCombine LLVMTransformUtils LLVMBitWriter LLVMAnalysis LLVMProfileData LLVMObject LLVMTextAPI LLVMBitReader LLVMCore LLVMRemarks LLVMBitstreamReader LLVMAArch64AsmParser LLVMMCParser LLVMAArch64Desc LLVMMC LLVMDebugInfoCodeView LLVMDebugInfoMSF LLVMBinaryFormat LLVMAArch64Utils LLVMAArch64Info LLVMSupport LLVMDemangle)

SET(FLAVOR_LIBS psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib)

SET(CLANG_LIBS clangAST clangBasic clangLex clangCodeGen clangFrontend clangEdit 
    clangSerialization clangSema clangDriver clangParse clangAnalysis)

INCLUDE_DIRECTORIES (include)

FILE(GLOB_RECURSE PUBLIC_HEADERS "include/*.h")

FILE(GLOB_RECURSE SOURCE_FILES "src/*.cpp" )

ADD_SUBDIRECTORY(src)

Note that in the code above, I don't even need CMake to call find_package() to import the LLVM helper macros anymore…

And once more, this all went very well: I only had to make a few minor changes to the code, and then, the optimizeModule() function call worked just fine!

And in fact, on top of that, I could now also remove the previously required lljit.release() in the NervJITImpl destructor. Yeah! ;-)

⇒ As a result, it seems my nvLLVM module and the NervJIT class I built in there are now memory leak and silent crash free! Pheeeewww… This was hard: it took me a hard day, but it was really worth it!

And this is it for today guys! Again, in case someone wants to have a more careful look at the code, here is an updated zip package containing the current version of the nvLLVM module as well as the test_nvLLVM and test_llvm_opt projects:

nv_llvm_20200416.zip

I didn't mention these above, but the code provided in the zip package also contains some additional small updates, such as a proper mechanism to specify the header search paths, the removal of the dependency on the nvCore module [so it should be much easier for anyone to try to rebuild something from that code], and other minor cleaning considerations.

19/04/2020 Update: If you found this post interesting or helpful, then you might want to continue reading on this topic, with the next article available here: JIT Compiler with LLVM - Part 4 - CRT dependency

  • Last modified: 2021/09/02 13:39
  • by manu