====== JIT Compiler with LLVM - Part 3 - Fixing the ModulePassManager crash ======

{{tag>cpp dev}}

If you read my previous article on this topic ([[blog:2020:0414_jit_cpp_compiler|JIT C++ compiler with LLVM - Part 2]]) then you probably noticed there was a serious issue with the "NervJIT" implementation I described (as well as in the toy **runClang()** implementation described in the very first article): **I couldn't release the IR optimization pass resources**, nor the **llvm::orc::LLJIT** object, as trying to do so was producing silent crashes. In this new post, we will focus on the steps I took to finally get rid of this problem. So let's get to it! :-)

====== ======

===== Updating IR Optimization mechanism =====

So if you recall, in our NervJITImpl destructor, we had to explicitly "release" the offending pointers (thus producing memory leaks unfortunately...) as follows:

  NervJITImpl::~NervJITImpl()
  {
      // Note: the memory for the LLJIT object will leak here because we get a crash when we try to delete it.
      std::cout << "Releasing undestructible pointers." << std::endl;
      lljit.release();
      moduleAnalysisManager.release();
      cGSCCAnalysisManager.release();
      functionAnalysisManager.release();
      loopAnalysisManager.release();
      std::cout << "Done releasing undestructible pointers." << std::endl;
  }

Thus, the first thing I decided to focus on was those 4 AnalysisManager objects. And before trying to clarify "why I couldn't delete them properly", I started with a simpler question: "Do I really need that stuff?" ;-)

=> So I went in search of the techniques discussed online to optimize LLVM IR modules. The problem at this level is that LLVM is continuously (and quickly) evolving, so it's not easy to find something that is up-to-date with the git version I'm using.
But still, I eventually found this page on Stack Overflow: https://stackoverflow.com/questions/53738883/run-default-optimization-pipeline-using-modern-llvm/53739108

And even if this was meant for LLVM version 7.0, the good thing was that it referenced one of the tools provided with LLVM as a base template: the **opt** tool, which I could find in the current version of the LLVM source tree! I had a quick look at the sources inside llvm\tools\opt (mainly the file opt.cpp), and from there I could easily confirm that the code presented in the Stack Overflow answer was still largely similar to what is actually done inside opt. Thus I happily jumped on that train with very high hopes and implemented my own updated "module optimization" logic for my NervJIT component :-):

  void addOptPasses(
      llvm::legacy::PassManagerBase &passes,
      llvm::legacy::FunctionPassManager &fnPasses,
      llvm::TargetMachine *machine)
  {
      llvm::PassManagerBuilder builder;
      // Optimize for speed (OptLevel 3), not for size (SizeLevel 0):
      builder.OptLevel = 3;
      builder.SizeLevel = 0;
      builder.Inliner = llvm::createFunctionInliningPass(3, 0, false);
      builder.LoopVectorize = true;
      builder.DisableUnrollLoops = false;
      builder.SLPVectorize = true;
      // Let the target adjust the builder before the passes are created:
      machine->adjustPassManager(builder);
      builder.populateFunctionPassManager(fnPasses);
      builder.populateModulePassManager(passes);
  }

  void addLinkPasses(llvm::legacy::PassManagerBase &passes)
  {
      llvm::PassManagerBuilder builder;
      builder.VerifyInput = true;
      builder.Inliner = llvm::createFunctionInliningPass(3, 0, false);
      builder.populateLTOPassManager(passes);
  }

  void NervJITImpl::optimizeModule(llvm::Module *module)
  {
      module->setTargetTriple(targetMachine->getTargetTriple().str());
      module->setDataLayout(targetMachine->createDataLayout());
      // DEBUG_MSG("Creating legacy pass manager.");
      llvm::legacy::PassManager passes;
      passes.add(new llvm::TargetLibraryInfoWrapperPass(targetMachine->getTargetTriple()));
      passes.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));
      // DEBUG_MSG("Creating legacy function pass manager.");
      llvm::legacy::FunctionPassManager fnPasses(module);
      fnPasses.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));
      addOptPasses(passes, fnPasses, targetMachine.get());
      addLinkPasses(passes);
      // DEBUG_MSG("Optimizing functions");
      fnPasses.doInitialization();
      for (llvm::Function &func : *module) {
          fnPasses.run(func);
      }
      fnPasses.doFinalization();
      passes.add(llvm::createVerifierPass());
      // DEBUG_MSG("Running PassManager.");
      passes.run(*module);
      // DEBUG_MSG("Optimization done.");
  }

To be able to compile the code above, the only new thing I needed was a proper **llvm::TargetMachine** pointer. It took me some time to figure out how to get that one, but in the end it's quite straightforward (again, I build that object in my NervJITImpl constructor):

  auto jtmb = CHECK_LLVM(JITTargetMachineBuilder::detectHost());
  targetMachine = CHECK_LLVM(jtmb.createTargetMachine());

Then all I had to do was to replace the call to ModulePassManager::run() inside my NervJITImpl::loadModule() method with my new optimizeModule() function:

  // We run the optimizations:
  DEBUG_MSG("Optimizing module...");
  //modulePassManager.run(*module, *moduleAnalysisManager);
  optimizeModule(module.get());
  DEBUG_MSG("Module function list: ");
  ...

And... **this failed LAMENTABLY...** LOL => silent crash somewhere after the output "Optimizing module..." [//Arrff, I guess I had my hopes too high then...//]

So back to investigations and countless "change/build/run" tests, until I eventually tracked the issue down to, once more, the **module pass manager**! In fact, I got the optimization stage to work if I commented out these two lines (respectively at the very end of **addOptPasses()** and **addLinkPasses()**):

  // in addOptPasses():
  // builder.populateModulePassManager(passes);
  // in addLinkPasses():
  // builder.populateLTOPassManager(passes);

But of course, this was not an acceptable solution: this code was working fine in the LLVM opt tool after all!
So I thought it was time to change my perspective and focus instead on the **opt tool** itself, in a quest to understand what I was doing wrong in my own project.

===== Re-building the LLVM opt tool =====

While I was orbiting around the LLVM opt tool, I did successfully run it on the command line with some random LLVM assembly test file (that I found somewhere else in the LLVM source tree), using this kind of command:

  opt --O3 --debugify-each --verify-each bcsection.ll -o result.bc

And this would produce this kind of result:

  CheckModuleDebugify [Force set function attributes]: PASS
  CheckModuleDebugify [Infer set function attributes]: PASS
  CheckFunctionDebugify [Call-site splitting]: PASS
  CheckModuleDebugify [Interprocedural Sparse Conditional Constant Propagation]: PASS
  CheckModuleDebugify [Called Value Propagation]: PASS
  CheckModuleDebugify [Global Variable Optimizer]: PASS
  CheckFunctionDebugify [Promote Memory to Register]: PASS
  CheckModuleDebugify [Dead Argument Elimination]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckModuleDebugify [Globals Alias Analysis]: PASS
  CheckFunctionDebugify [SROA]: PASS
  CheckFunctionDebugify [Early CSE w/ MemorySSA]: PASS
  CheckFunctionDebugify [Speculatively execute instructions if target has divergent branches]: PASS
  CheckFunctionDebugify [Jump Threading]: PASS
  CheckFunctionDebugify [Value Propagation]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Combine pattern based expressions]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Conditionally eliminate dead library calls]: PASS
  CheckFunctionDebugify [PGOMemOPSize]: PASS
  CheckFunctionDebugify [Tail Call Elimination]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Reassociate expressions]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [MergedLoadStoreMotion]: PASS
  CheckFunctionDebugify [Global Value Numbering]: PASS
  CheckFunctionDebugify [MemCpy Optimization]: PASS
  CheckFunctionDebugify [Sparse Conditional Constant Propagation]: PASS
  CheckFunctionDebugify [Bit-Tracking Dead Code Elimination]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Jump Threading]: PASS
  CheckFunctionDebugify [Value Propagation]: PASS
  CheckFunctionDebugify [Dead Store Elimination]: PASS
  CheckFunctionDebugify [Aggressive Dead Code Elimination]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckModuleDebugify [A No-Op Barrier Pass]: PASS
  CheckModuleDebugify [Eliminate Available Externally Globals]: PASS
  CheckModuleDebugify [Deduce function attributes in RPO]: PASS
  CheckModuleDebugify [Global Variable Optimizer]: PASS
  CheckModuleDebugify [Dead Global Elimination]: PASS
  CheckModuleDebugify [Globals Alias Analysis]: PASS
  CheckFunctionDebugify [Float to int]: PASS
  CheckFunctionDebugify [Lower constant intrinsics]: PASS
  CheckFunctionDebugify [Loop Distribution]: PASS
  CheckFunctionDebugify [Loop Vectorization]: PASS
  CheckFunctionDebugify [Optimize scalar/vector ops]: PASS
  CheckFunctionDebugify [Early CSE]: PASS
  CheckFunctionDebugify [Loop Load Elimination]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [SLP Vectorizer]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Warn about non-applied transformations]: PASS
  CheckFunctionDebugify [Alignment from assumptions]: PASS
  CheckModuleDebugify [Strip Unused Function Prototypes]: PASS
  CheckModuleDebugify [Dead Global Elimination]: PASS
  CheckModuleDebugify [Merge Duplicate Global Constants]: PASS
  CheckFunctionDebugify [Remove redundant instructions]: PASS
  CheckFunctionDebugify [Hoist/decompose integer division and remainder]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS

To be honest, I was still not quite sure if this program was ending just fine or also silently crashing after the last "CheckFunctionDebugify" line reported above, but still, I thought this was a good reference point anyway, and I should at least be able to reproduce those results if I were to **rebuild this tool** from sources myself, no?

==== The naive build setup ====

At this point, one will naturally refer to the opt tool CMake files inside the LLVM source tree to figure out how to set up a clone project "out of LLVM sources". After all, CMake is usually a pretty simple language to deal with, right? Well... unfortunately, this time, unless you are an **absolute CMake Guru**, this path will not lead you very far I'm afraid: LLVM CMake files are **really cryptic**, so I wish you good luck trying to extract the information you need from this kind of content:

  set(LLVM_LINK_COMPONENTS
    AllTargetsAsmParsers
    AllTargetsCodeGens
    AllTargetsDescs
    AllTargetsInfos
    AggressiveInstCombine
    Analysis
    BitWriter
    CodeGen
    Core
    Coroutines
    IPO
    IRReader
    InstCombine
    Instrumentation
    MC
    ObjCARCOpts
    Remarks
    ScalarOpts
    Support
    Target
    TransformUtils
    Vectorize
    Passes
    )
  add_llvm_tool(opt
    AnalysisWrappers.cpp
    BreakpointPrinter.cpp
    GraphPrinters.cpp
    NewPMDriver.cpp
    PassPrinters.cpp
    PrintSCC.cpp
    opt.cpp
    ENABLE_PLUGINS
    DEPENDS
    intrinsics_gen
    SUPPORT_PLUGINS
    )
  export_executable_symbols_for_plugins(opt)
  if(LLVM_BUILD_EXAMPLES)
    target_link_libraries(opt PRIVATE ExampleIRTransforms)
  endif(LLVM_BUILD_EXAMPLES)

If you really feel like giving this a try and want to find the definitions and behavior of all the LLVM-specific functions/macros used in that file, a good starting point is the file **${your_llvm_install_dir}/lib/cmake/llvm/AddLLVM.cmake**. But you have been warned... [So, have fun! ;-) and just make sure you stop reading that file before you hang yourself in your bathroom...]
So I decided it could be good enough to just take the source files in that tool folder and write the CMake file myself: after all, I could already build a "working" nvLLVM [Well, except your "working module" is silently crashing **or** leaking memory Manu, don't forget that lol] with a manually crafted CMakeLists.txt file, so maybe I could use that as a template and see how it goes? And that's exactly what I did: building a test sub-project which I called **test_llvm_opt**, linking with all the LLVM libraries and using the same CXX flags as in my previous nvLLVM sub-project. Surprisingly, getting this clone of the LLVM opt tool to compile was not too hard in the end, and then it was time to give it a try (using the exact same input as with the official opt.exe tool, obviously):

  test_llvm_opt.exe --O3 --debugify-each --verify-each bcsection.ll -o result.bc
  CheckModuleDebugify [Force set function attributes]: PASS
  CheckModuleDebugify [Infer set function attributes]: PASS
  CheckFunctionDebugify [Call-site splitting]: PASS
  CheckModuleDebugify [Interprocedural Sparse Conditional Constant Propagation]: PASS
  CheckModuleDebugify [Called Value Propagation]: PASS
  CheckModuleDebugify [Global Variable Optimizer]: PASS
  CheckFunctionDebugify [Promote Memory to Register]: PASS
  CheckModuleDebugify [Dead Argument Elimination]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckModuleDebugify [Globals Alias Analysis]: PASS
  CheckFunctionDebugify [SROA]: PASS
  CheckFunctionDebugify [Early CSE w/ MemorySSA]: PASS
  CheckFunctionDebugify [Speculatively execute instructions if target has divergent branches]: PASS
  CheckFunctionDebugify [Jump Threading]: PASS
  CheckFunctionDebugify [Value Propagation]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Combine pattern based expressions]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Conditionally eliminate dead library calls]: PASS
  CheckFunctionDebugify [PGOMemOPSize]: PASS
  CheckFunctionDebugify [Tail Call Elimination]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Reassociate expressions]: PASS
  CheckFunctionDebugify [Simplify the CFG]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [MergedLoadStoreMotion]: PASS
  CheckFunctionDebugify [Global Value Numbering]: PASS
  CheckFunctionDebugify [MemCpy Optimization]: PASS
  CheckFunctionDebugify [Sparse Conditional Constant Propagation]: PASS
  CheckFunctionDebugify [Bit-Tracking Dead Code Elimination]: PASS
  CheckFunctionDebugify [Combine redundant instructions]: PASS
  CheckFunctionDebugify [Jump Threading]: PASS
  CheckFunctionDebugify [Value Propagation]: PASS
  CheckFunctionDebugify [Dead Store Elimination]: PASS

**And BBAAMMM!** The process dies at this exact same location every time I run this command (whereas the official opt tool continues here with many additional passes: [Aggressive Dead Code Elimination], etc...). So at this point I was really convinced there was something going wrong in my CMake setup... but what?!

==== The correct build setup ====

I went back to the official opt CMake files, trying to focus very hard on all the configuration elements, macros, conditions, etc... and this was clearly just overwhelming and unmanageable :-( I was starting to lose faith, but then I realized something that saved my day: **cmake is anyway just writing (somewhat) "regular" makefiles in the end!** (at least when using the NMake generator like I am (?)) I didn't have any other option left anyway, so I just dived into the **generated cmake build files** for my LLVM git sources, and this really was an eye opener!
:-)

First I found the file **tools\opt\CMakeFiles\opt.dir\build.make**, where you will find content such as:

  tools\opt\CMakeFiles\opt.dir\AnalysisWrappers.cpp.obj: tools\opt\CMakeFiles\opt.dir\flags.make
  tools\opt\CMakeFiles\opt.dir\AnalysisWrappers.cpp.obj: W:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt\AnalysisWrappers.cpp
      @$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --progress-dir=W:\Projects\NervSeed\deps\build\llvm-20200409\build\CMakeFiles --progress-num=$(CMAKE_PROGRESS_1) "Building CXX object tools/opt/CMakeFiles/opt.dir/AnalysisWrappers.cpp.obj"
      cd W:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt
      D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64\cl.exe @<<
   /nologo /TP $(CXX_DEFINES) $(CXX_INCLUDES) $(CXX_FLAGS) /FoCMakeFiles\opt.dir\AnalysisWrappers.cpp.obj /FdCMakeFiles\opt.dir\ /FS -c W:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt\AnalysisWrappers.cpp

And at the beginning of that file we have:

  # Include any dependencies generated for this target.
  include tools\opt\CMakeFiles\opt.dir\depend.make
  # Include the progress variables for this target.
  include tools\opt\CMakeFiles\opt.dir\progress.make
  # Include the compile flags for this target's objects.
  include tools\opt\CMakeFiles\opt.dir\flags.make

So, naturally, you check the sibling **flags.make** file, and here, **bingo!**:

  # compile CXX with D:/Apps/VisualStudio2017_CE/VC/Tools/MSVC/14.16.27023/bin/Hostx64/x64/cl.exe
  # compile RC with C:/Program Files (x86)/Windows Kits/10/bin/10.0.18362.0/x64/rc.exe
  CXX_FLAGS = /DWIN32 /D_WINDOWS /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG /EHs-c- /GR-
  CXX_DEFINES = -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
  CXX_INCLUDES = -IW:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt -IW:\Projects\NervSeed\deps\build\llvm-20200409\llvm\tools\opt -IW:\Projects\NervSeed\deps\build\llvm-20200409\build\include -IW:\Projects\NervSeed\deps\build\llvm-20200409\llvm\include

=> So with these lines, you know exactly which command lines are executed to compile the cpp files in that folder into obj files.
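Flags such as **/GR-** (no RTTI) and **/EHs-c-** (no C++ exceptions) affect how every translation unit is compiled, so it can be handy to have a project report its own settings and compare them with the ones listed above. Here is a minimal sketch of that idea: the **describeBuildConfig()** helper is hypothetical (it is not part of LLVM or nvLLVM) and relies only on predefined compiler macros (_CPPRTTI, _CPPUNWIND, _DLL and _DEBUG on MSVC, <nowiki>__GXX_RTTI</nowiki> and <nowiki>__EXCEPTIONS</nowiki> on GCC/Clang, plus the standard NDEBUG). Note that this post does not pin down which individual flag was responsible for the crash, so this is only a way to compare settings, not a diagnosis.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an LLVM API): summarises a few ABI-relevant
// settings of the translation unit it is compiled in.
std::string describeBuildConfig() {
    std::string out;

    // RTTI: MSVC defines _CPPRTTI under /GR, GCC and Clang define __GXX_RTTI.
#if defined(_CPPRTTI) || defined(__GXX_RTTI)
    out += "rtti=on";
#else
    out += "rtti=off";
#endif

    // Exceptions: MSVC defines _CPPUNWIND when an /EH mode is active,
    // GCC and Clang define __EXCEPTIONS unless -fno-exceptions is used.
#if defined(_CPPUNWIND) || defined(__EXCEPTIONS)
    out += " exceptions=on";
#else
    out += " exceptions=off";
#endif

    // NDEBUG is the standard "release" switch (it also disables assert()).
#if defined(NDEBUG)
    out += " ndebug=on";
#else
    out += " ndebug=off";
#endif

    // MSVC only: which C runtime flavour is selected (/MD, /MDd, /MT, /MTd).
#if defined(_MSC_VER)
  #if defined(_DLL)
    out += " crt=dll";
  #else
    out += " crt=static";
  #endif
  #if defined(_DEBUG)
    out += " crt_debug=on";
  #else
    out += " crt_debug=off";
  #endif
#endif

    // Sanity check: the rtti field is always emitted first.
    assert(!out.empty());
    return out;
}
```

Printing the returned string from both the test project and a tool built inside the LLVM tree gives a quick way to spot a mismatch before digging any deeper.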
And finally back to the **build.make**, you will also find the linking target definition: bin\opt.exe: tools\opt\CMakeFiles\opt.dir\objects1.rsp @$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --bold --progress-dir=W:\Projects\NervSeed\deps\build\llvm-20200409\build\CMakeFiles --progress-num=$(CMAKE_PROGRESS_9) "Linking CXX executable ..\..\bin\opt.exe" cd W:\Projects\NervSeed\deps\build\llvm-20200409\build\tools\opt W:\Projects\NervSeed\tools\windows\cmake-3.9.2\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\opt.dir --manifests -- D:\Apps\VisualStudio2017_CE\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opt.dir\objects1.rsp @<< /out:..\..\bin\opt.exe /implib:..\..\lib\opt.lib /pdb:W:\Projects\NervSeed\deps\build\llvm-20200409\build\bin\opt.pdb /version:0.0 /machine:x64 /STACK:10000000 /INCREMENTAL:NO /subsystem:console ..\..\lib\LLVMAArch64AsmParser.lib ..\..\lib\LLVMAMDGPUAsmParser.lib ..\..\lib\LLVMARMAsmParser.lib ..\..\lib\LLVMAVRAsmParser.lib ..\..\lib\LLVMBPFAsmParser.lib ..\..\lib\LLVMHexagonAsmParser.lib ..\..\lib\LLVMLanaiAsmParser.lib ..\..\lib\LLVMMipsAsmParser.lib ..\..\lib\LLVMMSP430AsmParser.lib ..\..\lib\LLVMPowerPCAsmParser.lib ..\..\lib\LLVMRISCVAsmParser.lib ..\..\lib\LLVMSparcAsmParser.lib ..\..\lib\LLVMSystemZAsmParser.lib ..\..\lib\LLVMWebAssemblyAsmParser.lib ..\..\lib\LLVMX86AsmParser.lib ..\..\lib\LLVMAArch64CodeGen.lib ..\..\lib\LLVMAMDGPUCodeGen.lib ..\..\lib\LLVMARMCodeGen.lib ..\..\lib\LLVMAVRCodeGen.lib ..\..\lib\LLVMBPFCodeGen.lib ..\..\lib\LLVMHexagonCodeGen.lib ..\..\lib\LLVMLanaiCodeGen.lib ..\..\lib\LLVMMipsCodeGen.lib ..\..\lib\LLVMMSP430CodeGen.lib ..\..\lib\LLVMNVPTXCodeGen.lib ..\..\lib\LLVMPowerPCCodeGen.lib ..\..\lib\LLVMRISCVCodeGen.lib ..\..\lib\LLVMSparcCodeGen.lib ..\..\lib\LLVMSystemZCodeGen.lib ..\..\lib\LLVMWebAssemblyCodeGen.lib ..\..\lib\LLVMX86CodeGen.lib ..\..\lib\LLVMXCoreCodeGen.lib ..\..\lib\LLVMAArch64Desc.lib ..\..\lib\LLVMAMDGPUDesc.lib ..\..\lib\LLVMARMDesc.lib 
..\..\lib\LLVMAVRDesc.lib ..\..\lib\LLVMBPFDesc.lib ..\..\lib\LLVMHexagonDesc.lib ..\..\lib\LLVMLanaiDesc.lib ..\..\lib\LLVMMipsDesc.lib ..\..\lib\LLVMMSP430Desc.lib ..\..\lib\LLVMNVPTXDesc.lib ..\..\lib\LLVMPowerPCDesc.lib ..\..\lib\LLVMRISCVDesc.lib ..\..\lib\LLVMSparcDesc.lib ..\..\lib\LLVMSystemZDesc.lib ..\..\lib\LLVMWebAssemblyDesc.lib ..\..\lib\LLVMX86Desc.lib ..\..\lib\LLVMXCoreDesc.lib ..\..\lib\LLVMAArch64Info.lib ..\..\lib\LLVMAMDGPUInfo.lib ..\..\lib\LLVMARMInfo.lib ..\..\lib\LLVMAVRInfo.lib ..\..\lib\LLVMBPFInfo.lib ..\..\lib\LLVMHexagonInfo.lib ..\..\lib\LLVMLanaiInfo.lib ..\..\lib\LLVMMipsInfo.lib ..\..\lib\LLVMMSP430Info.lib ..\..\lib\LLVMNVPTXInfo.lib ..\..\lib\LLVMPowerPCInfo.lib ..\..\lib\LLVMRISCVInfo.lib ..\..\lib\LLVMSparcInfo.lib ..\..\lib\LLVMSystemZInfo.lib ..\..\lib\LLVMWebAssemblyInfo.lib ..\..\lib\LLVMX86Info.lib ..\..\lib\LLVMXCoreInfo.lib ..\..\lib\LLVMAggressiveInstCombine.lib ..\..\lib\LLVMAnalysis.lib ..\..\lib\LLVMBitWriter.lib ..\..\lib\LLVMCodeGen.lib ..\..\lib\LLVMCore.lib ..\..\lib\LLVMCoroutines.lib ..\..\lib\LLVMipo.lib ..\..\lib\LLVMIRReader.lib ..\..\lib\LLVMInstCombine.lib ..\..\lib\LLVMInstrumentation.lib ..\..\lib\LLVMMC.lib ..\..\lib\LLVMObjCARCOpts.lib ..\..\lib\LLVMRemarks.lib ..\..\lib\LLVMScalarOpts.lib ..\..\lib\LLVMSupport.lib ..\..\lib\LLVMTarget.lib ..\..\lib\LLVMTransformUtils.lib ..\..\lib\LLVMVectorize.lib ..\..\lib\LLVMPasses.lib ..\..\lib\LLVMAArch64Utils.lib ..\..\lib\LLVMAMDGPUUtils.lib ..\..\lib\LLVMMIRParser.lib ..\..\lib\LLVMARMUtils.lib ..\..\lib\LLVMHexagonAsmParser.lib ..\..\lib\LLVMHexagonDesc.lib ..\..\lib\LLVMHexagonInfo.lib ..\..\lib\LLVMLanaiAsmParser.lib ..\..\lib\LLVMLanaiDesc.lib ..\..\lib\LLVMLanaiInfo.lib ..\..\lib\LLVMRISCVUtils.lib ..\..\lib\LLVMMCDisassembler.lib ..\..\lib\LLVMCFGuard.lib ..\..\lib\LLVMGlobalISel.lib ..\..\lib\LLVMX86Utils.lib ..\..\lib\LLVMAsmPrinter.lib ..\..\lib\LLVMDebugInfoDWARF.lib ..\..\lib\LLVMSelectionDAG.lib ..\..\lib\LLVMCodeGen.lib 
..\..\lib\LLVMCoroutines.lib ..\..\lib\LLVMipo.lib ..\..\lib\LLVMBitWriter.lib ..\..\lib\LLVMIRReader.lib ..\..\lib\LLVMAsmParser.lib ..\..\lib\LLVMFrontendOpenMP.lib ..\..\lib\LLVMLinker.lib ..\..\lib\LLVMInstrumentation.lib ..\..\lib\LLVMScalarOpts.lib ..\..\lib\LLVMAggressiveInstCombine.lib ..\..\lib\LLVMInstCombine.lib ..\..\lib\LLVMTarget.lib ..\..\lib\LLVMVectorize.lib ..\..\lib\LLVMTransformUtils.lib ..\..\lib\LLVMAnalysis.lib ..\..\lib\LLVMProfileData.lib ..\..\lib\LLVMObject.lib ..\..\lib\LLVMMCParser.lib ..\..\lib\LLVMMC.lib ..\..\lib\LLVMDebugInfoCodeView.lib ..\..\lib\LLVMDebugInfoMSF.lib ..\..\lib\LLVMBitReader.lib ..\..\lib\LLVMTextAPI.lib ..\..\lib\LLVMCore.lib ..\..\lib\LLVMRemarks.lib ..\..\lib\LLVMBitstreamReader.lib ..\..\lib\LLVMBinaryFormat.lib ..\..\lib\LLVMSupport.lib psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib -delayload:shell32.dll -delayload:ole32.dll ..\..\lib\LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib

I think this is actually a very important lesson every developer should keep in mind for this kind of experiment: if you feel overwhelmed by the CMake functions used in a project that you want to replicate somewhere else, then **check the CMake-generated makefiles to find the raw build commands with their CXX and link flags**!
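To make that lesson a bit more mechanical, the relevant variables can be pulled out of a generated **flags.make** with a few lines of code. This is only a sketch under simple assumptions: **readMakeVariable()** is a hypothetical helper, it expects plain "NAME = value" lines like the ones in the excerpt above, and it handles neither line continuations nor makefile macros.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns the value assigned to `name` in makefile-style
// text ("NAME = value"), or an empty string when the variable is not found.
std::string readMakeVariable(const std::string& makefileText, const std::string& name) {
    assert(!name.empty());
    const char* blanks = " \t\r";

    std::istringstream lines(makefileText);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip empty lines and comment lines.
        if (line.empty() || line[0] == '#') continue;

        // The first '=' separates the name from the value; later '=' signs
        // (as in -DGTEST_HAS_RTTI=0) belong to the value.
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;

        // Trim trailing blanks from the name before comparing.
        std::string key = line.substr(0, eq);
        key.erase(key.find_last_not_of(blanks) + 1);
        if (key != name) continue;

        // Trim blanks on both sides of the value.
        const std::string value = line.substr(eq + 1);
        const std::size_t first = value.find_first_not_of(blanks);
        if (first == std::string::npos) return std::string();
        const std::size_t last = value.find_last_not_of(blanks);
        return value.substr(first, last - first + 1);
    }
    return std::string();
}
```

Feeding it the content of flags.make and asking for "CXX_FLAGS" or "CXX_DEFINES" returns the strings that can then be pasted into a hand-written CMakeLists.txt.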
So of course, with those new inputs, I immediately updated my test_llvm_opt CMakeLists.txt file as follow: SET(TARGET_NAME "test_llvm_opt") SET(TARGET_DIR "./") SET(LLVM_LIBS LLVMAArch64AsmParser.lib LLVMAMDGPUAsmParser.lib LLVMARMAsmParser.lib LLVMAVRAsmParser.lib LLVMBPFAsmParser.lib LLVMHexagonAsmParser.lib LLVMLanaiAsmParser.lib LLVMMipsAsmParser.lib LLVMMSP430AsmParser.lib LLVMPowerPCAsmParser.lib LLVMRISCVAsmParser.lib LLVMSparcAsmParser.lib LLVMSystemZAsmParser.lib LLVMWebAssemblyAsmParser.lib LLVMX86AsmParser.lib LLVMAArch64CodeGen.lib LLVMAMDGPUCodeGen.lib LLVMARMCodeGen.lib LLVMAVRCodeGen.lib LLVMBPFCodeGen.lib LLVMHexagonCodeGen.lib LLVMLanaiCodeGen.lib LLVMMipsCodeGen.lib LLVMMSP430CodeGen.lib LLVMNVPTXCodeGen.lib LLVMPowerPCCodeGen.lib LLVMRISCVCodeGen.lib LLVMSparcCodeGen.lib LLVMSystemZCodeGen.lib LLVMWebAssemblyCodeGen.lib LLVMX86CodeGen.lib LLVMXCoreCodeGen.lib LLVMAArch64Desc.lib LLVMAMDGPUDesc.lib LLVMARMDesc.lib LLVMAVRDesc.lib LLVMBPFDesc.lib LLVMHexagonDesc.lib LLVMLanaiDesc.lib LLVMMipsDesc.lib LLVMMSP430Desc.lib LLVMNVPTXDesc.lib LLVMPowerPCDesc.lib LLVMRISCVDesc.lib LLVMSparcDesc.lib LLVMSystemZDesc.lib LLVMWebAssemblyDesc.lib LLVMX86Desc.lib LLVMXCoreDesc.lib LLVMAArch64Info.lib LLVMAMDGPUInfo.lib LLVMARMInfo.lib LLVMAVRInfo.lib LLVMBPFInfo.lib LLVMHexagonInfo.lib LLVMLanaiInfo.lib LLVMMipsInfo.lib LLVMMSP430Info.lib LLVMNVPTXInfo.lib LLVMPowerPCInfo.lib LLVMRISCVInfo.lib LLVMSparcInfo.lib LLVMSystemZInfo.lib LLVMWebAssemblyInfo.lib LLVMX86Info.lib LLVMXCoreInfo.lib LLVMAggressiveInstCombine.lib LLVMAnalysis.lib LLVMBitWriter.lib LLVMCodeGen.lib LLVMCore.lib LLVMCoroutines.lib LLVMipo.lib LLVMIRReader.lib LLVMInstCombine.lib LLVMInstrumentation.lib LLVMMC.lib LLVMObjCARCOpts.lib LLVMRemarks.lib LLVMScalarOpts.lib LLVMSupport.lib LLVMTarget.lib LLVMTransformUtils.lib LLVMVectorize.lib LLVMPasses.lib LLVMAArch64Utils.lib LLVMAMDGPUUtils.lib LLVMMIRParser.lib LLVMARMUtils.lib LLVMHexagonAsmParser.lib LLVMHexagonDesc.lib LLVMHexagonInfo.lib 
LLVMLanaiAsmParser.lib LLVMLanaiDesc.lib LLVMLanaiInfo.lib LLVMRISCVUtils.lib LLVMMCDisassembler.lib LLVMCFGuard.lib LLVMGlobalISel.lib LLVMX86Utils.lib LLVMAsmPrinter.lib LLVMDebugInfoDWARF.lib LLVMSelectionDAG.lib LLVMCodeGen.lib LLVMCoroutines.lib LLVMipo.lib LLVMBitWriter.lib LLVMIRReader.lib LLVMAsmParser.lib LLVMFrontendOpenMP.lib LLVMLinker.lib LLVMInstrumentation.lib LLVMScalarOpts.lib LLVMAggressiveInstCombine.lib LLVMInstCombine.lib LLVMTarget.lib LLVMVectorize.lib LLVMTransformUtils.lib LLVMAnalysis.lib LLVMProfileData.lib LLVMObject.lib LLVMMCParser.lib LLVMMC.lib LLVMDebugInfoCodeView.lib LLVMDebugInfoMSF.lib LLVMBitReader.lib LLVMTextAPI.lib LLVMCore.lib LLVMRemarks.lib LLVMBitstreamReader.lib LLVMBinaryFormat.lib LLVMSupport.lib psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib) SET(CMAKE_CXX_FLAGS "/std:c++14 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS /DWIN32 /D_WINDOWS /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG /EHs-c- /GR-") include_directories(${LLVM_CLANG_DIR}/include) LINK_DIRECTORIES(${LLVM_CLANG_DIR}/lib) FILE(GLOB_RECURSE SOURCE_FILES "*.cpp" ) set(SOURCE_FILES ${SOURCE_FILES} windows_version_resource.rc) ADD_EXECUTABLE (${TARGET_NAME} ${SOURCE_FILES}) TARGET_LINK_LIBRARIES(${TARGET_NAME} ${LLVM_LIBS}) INSTALL(TARGETS 
${TARGET_NAME} RUNTIME DESTINATION ${TARGET_DIR} LIBRARY DESTINATION ${TARGET_DIR})

Again, the test project compiled just fine, but this time, running the command **test_llvm_opt.exe --O3 --debugify-each --verify-each bcsection.ll -o result.bc** produced the same results as the official LLVM opt tool: **yyeeeeepppeeee! Victory!** :-D

===== Updating the nvLLVM build setup =====

From there the path started to clear smoothly: updating the CMake build files fixed a silent crash in my test_llvm_opt project, so the next logical step was to apply the same updates to the nvLLVM project, and its main CMakeLists.txt was thus modified as follows:

SET(TARGET_DIR "./") include_directories(${LLVM_CLANG_DIR}/include) LINK_DIRECTORIES(${LLVM_CLANG_DIR}/lib) INCLUDE_DIRECTORIES(../nvCore/include) SET(CMAKE_CXX_FLAGS "/std:c++14 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS /DWIN32 /D_WINDOWS /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /DNDEBUG /EHs-c- /GR-") ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS) ADD_DEFINITIONS(-DNOMINMAX) # Note: used llvm-config.exe --libs to retrieve the list of libraries below: # LLVM version 11.0.0git SET(LLVM_LIBS LLVMXRay LLVMWindowsManifest LLVMTableGen LLVMSymbolize LLVMDebugInfoPDB LLVMOrcJIT LLVMOrcError LLVMJITLink LLVMObjectYAML LLVMMCA LLVMLTO LLVMPasses LLVMCoroutines LLVMObjCARCOpts LLVMLineEditor LLVMLibDriver LLVMInterpreter LLVMFuzzMutate LLVMMCJIT LLVMExecutionEngine
LLVMRuntimeDyld LLVMDWARFLinker LLVMDlltoolDriver LLVMOption LLVMDebugInfoGSYM LLVMCoverage LLVMXCoreDisassembler LLVMXCoreCodeGen LLVMXCoreDesc LLVMXCoreInfo LLVMX86Disassembler LLVMX86AsmParser LLVMX86CodeGen LLVMX86Desc LLVMX86Utils LLVMX86Info LLVMWebAssemblyDisassembler LLVMWebAssemblyCodeGen LLVMWebAssemblyDesc LLVMWebAssemblyAsmParser LLVMWebAssemblyInfo LLVMSystemZDisassembler LLVMSystemZCodeGen LLVMSystemZAsmParser LLVMSystemZDesc LLVMSystemZInfo LLVMSparcDisassembler LLVMSparcCodeGen LLVMSparcAsmParser LLVMSparcDesc LLVMSparcInfo LLVMRISCVDisassembler LLVMRISCVCodeGen LLVMRISCVAsmParser LLVMRISCVDesc LLVMRISCVUtils LLVMRISCVInfo LLVMPowerPCDisassembler LLVMPowerPCCodeGen LLVMPowerPCAsmParser LLVMPowerPCDesc LLVMPowerPCInfo LLVMNVPTXCodeGen LLVMNVPTXDesc LLVMNVPTXInfo LLVMMSP430Disassembler LLVMMSP430CodeGen LLVMMSP430AsmParser LLVMMSP430Desc LLVMMSP430Info LLVMMipsDisassembler LLVMMipsCodeGen LLVMMipsAsmParser LLVMMipsDesc LLVMMipsInfo LLVMLanaiDisassembler LLVMLanaiCodeGen LLVMLanaiAsmParser LLVMLanaiDesc LLVMLanaiInfo LLVMHexagonDisassembler LLVMHexagonCodeGen LLVMHexagonAsmParser LLVMHexagonDesc LLVMHexagonInfo LLVMBPFDisassembler LLVMBPFCodeGen LLVMBPFAsmParser LLVMBPFDesc LLVMBPFInfo LLVMAVRDisassembler LLVMAVRCodeGen LLVMAVRAsmParser LLVMAVRDesc LLVMAVRInfo LLVMARMDisassembler LLVMARMCodeGen LLVMARMAsmParser LLVMARMDesc LLVMARMUtils LLVMARMInfo LLVMAMDGPUDisassembler LLVMAMDGPUCodeGen LLVMMIRParser LLVMipo LLVMInstrumentation LLVMVectorize LLVMLinker LLVMIRReader LLVMAsmParser LLVMFrontendOpenMP LLVMAMDGPUAsmParser LLVMAMDGPUDesc LLVMAMDGPUUtils LLVMAMDGPUInfo LLVMAArch64Disassembler LLVMMCDisassembler LLVMAArch64CodeGen LLVMCFGuard LLVMGlobalISel LLVMSelectionDAG LLVMAsmPrinter LLVMDebugInfoDWARF LLVMCodeGen LLVMTarget LLVMScalarOpts LLVMInstCombine LLVMAggressiveInstCombine LLVMTransformUtils LLVMBitWriter LLVMAnalysis LLVMProfileData LLVMObject LLVMTextAPI LLVMBitReader LLVMCore LLVMRemarks LLVMBitstreamReader LLVMAArch64AsmParser LLVMMCParser 
LLVMAArch64Desc LLVMMC LLVMDebugInfoCodeView LLVMDebugInfoMSF LLVMBinaryFormat LLVMAArch64Utils LLVMAArch64Info LLVMSupport LLVMDemangle) SET(FLAVOR_LIBS psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib delayimp.lib LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib) SET(CLANG_LIBS clangAST clangBasic clangLex clangCodeGen clangFrontend clangEdit clangSerialization clangSema clangDriver clangParse clangAnalysis) INCLUDE_DIRECTORIES (include) FILE(GLOB_RECURSE PUBLIC_HEADERS "include/*.h") FILE(GLOB_RECURSE SOURCE_FILES "src/*.cpp" ) ADD_SUBDIRECTORY(src)

Note that in the code above, I don't even need CMake to call **find_package()** to import the LLVM helper macros anymore...

And once more, this all went very well: I only had to make a few minor changes to the code, and then **the optimizeModule() function call worked just fine**! On top of that, I could **now also remove** the previously required **lljit.release()** from the NervJITImpl destructor. Yeah! ;-)

=> As a result, it seems my nvLLVM module and the NervJIT class I built in there are now **free of memory leaks and silent crashes**! Feeeewww... This was hard: it took me a hard day, but it was really worth it!

And this is it for today guys! Again, in case someone wants to have a more careful look at the code, here is an updated zip package containing the current version of the nvLLVM module as well as the test_nvLLVM and test_llvm_opt projects:

{{ blog:2020:0416:nv_llvm_20200416.zip }}

I didn't mention these above, but the code provided in the zip package also contains some additional small updates, such as a proper mechanism to specify the header search paths, the removal of the dependency on the nvCore module [so it should be much easier for anyone to try to rebuild something from that code], and other minor cleanups.
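For readers less used to **std::unique_ptr**, here is a small stand-alone sketch of the difference between the old **release()** workaround and an owner that is simply allowed to destroy its object. The **Tracked** type and both functions are hypothetical stand-ins (no LLVM involved): they only count live objects, which is enough to show why calling release() in a destructor leaks.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a heavyweight object such as the JIT instance:
// it only keeps track of how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Old workaround: release() gives up ownership WITHOUT running the destructor.
// The raw pointer is returned here only so that a caller can still clean up;
// in the original destructor it was simply dropped, hence the leak.
Tracked* ownerGoesAwayAfterRelease() {
    std::unique_ptr<Tracked> owner(new Tracked());
    assert(Tracked::alive >= 1);
    return owner.release();  // owner is now empty, the object is still alive
}

// Normal case: when the owner goes out of scope, the object is destroyed.
void ownerGoesAwayNormally() {
    std::unique_ptr<Tracked> owner(new Tracked());
}  // ~Tracked() runs here
```

After calling the first function the live counter stays at one until somebody deletes the returned pointer by hand, whereas the second function leaves it unchanged.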
**19/04/2020 Update**: If you found this post interesting or helpful, then you might want to continue reading on this topic, with the next article available here: [[blog:2020:0418_jit_compiler_part4_crt_dependency|JIT Compiler with LLVM - Part 4 - CRT dependency]] ~~DISCUSSION:off~~