| ======================================================= |
| Building a JIT: Starting out with KaleidoscopeJIT |
| ======================================================= |
| |
| .. contents:: |
| :local: |
| |
| Chapter 1 Introduction |
| ====================== |
| |
| Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This |
| tutorial runs through the implementation of a JIT compiler using LLVM's |
| On-Request-Compilation (ORC) APIs. It begins with a simplified version of the |
| KaleidoscopeJIT class used in the |
| `Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then |
| introduces new features like optimization, lazy compilation and remote |
| execution. |
| |
| The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how |
| these APIs interact with other parts of LLVM, and teach you how to recombine |
| them to build a custom JIT that is suited to your use-case. |
| |
| The structure of the tutorial is: |
| |
| - Chapter #1: Investigate the simple KaleidoscopeJIT class. This will |
| introduce some of the basic concepts of the ORC JIT APIs, including the |
| idea of an ORC *Layer*. |
| |
| - `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding |
| a new layer that will optimize IR and generated code. |
| |
| - `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a |
| Compile-On-Demand layer to lazily compile IR. |
| |
| - `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by |
| replacing the Compile-On-Demand layer with a custom layer that uses the ORC |
| Compile Callbacks API directly to defer IR-generation until functions are |
| called. |
| |
| - `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into |
| a remote process with reduced privileges using the JIT Remote APIs. |
| |
| To provide input for our JIT we will use the Kaleidoscope REPL from |
| `Chapter 7 <LangImpl07.html>`_ of the "Implementing a language with LLVM" tutorial, |
| with one minor modification: We will remove the FunctionPassManager from the |
| code for that chapter and replace it with optimization support in our JIT class |
| in Chapter #2. |
| |
| Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API. |
| It was preceded by MCJIT, and before that by the (now deleted) legacy JIT. |
| These tutorials don't assume any experience with these earlier APIs, but |
| readers acquainted with them will see many familiar elements. Where appropriate |
| we will make this connection with the earlier APIs explicit to help people who |
| are transitioning from them to ORC. |
| |
| JIT API Basics |
| ============== |
| |
| The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed, |
| rather than compiling whole programs to disk ahead of time as a traditional |
| compiler does. To support that aim our initial, bare-bones JIT API will be: |
| |
| 1. Handle addModule(Module &M) -- Make the given IR module available for |
| execution. |
| 2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to |
| symbols (functions or variables) that have been added to the JIT. |
| 3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any |
| memory that had been used for the compiled code. |
| |
| A basic use-case for this API, executing the 'main' function from a module, |
| will look like: |
| |
| .. code-block:: c++ |
| |
| std::unique_ptr<Module> M = buildModule(); |
| JIT J; |
| Handle H = J.addModule(*M); |
| int (*Main)(int, char*[]) = |
| (int(*)(int, char*[]))J.findSymbol("main").getAddress(); |
| int Result = Main(0, nullptr); // no command-line arguments in this sketch |
| J.removeModule(H); |
| |
| The APIs that we build in these tutorials will all be variations on this simple |
| theme. Behind the API we will refine the implementation of the JIT to add |
| support for optimization and lazy compilation. Eventually we will extend the |
| API itself to allow higher-level program representations (e.g. ASTs) to be |
| added to the JIT. |
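| |
| To make the shape of this three-method API concrete, here is a minimal sketch |
| that does no compilation at all. ``ToyJIT`` and ``ToyModule`` are hypothetical |
| names invented for illustration, not LLVM classes: a "module" is just a table |
| of named callables, and the handle is a plain integer. |
| |
```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled module: a table of named functions.
using ToyModule = std::map<std::string, std::function<int()>>;

// Hypothetical toy "JIT" with the same three-method shape as the API above.
// It performs no compilation; it only tracks which symbols are available.
class ToyJIT {
public:
  using Handle = std::size_t;

  // Make the symbols of M available and return a handle for later removal.
  Handle addModule(ToyModule M) {
    Handle H = NextHandle++;
    Modules.emplace(H, std::move(M));
    return H;
  }

  // Search every live module for Name. An empty std::function means
  // "not found".
  std::function<int()> findSymbol(const std::string &Name) const {
    for (const auto &KV : Modules) {
      auto It = KV.second.find(Name);
      if (It != KV.second.end())
        return It->second;
    }
    return nullptr;
  }

  // Drop the module, releasing everything stored for it.
  void removeModule(Handle H) { Modules.erase(H); }

private:
  Handle NextHandle = 0;
  std::map<Handle, ToyModule> Modules;
};
```
| |
| A caller would add a module, look up 'main', invoke it, and then remove the |
| module by its handle. After removal, looking up 'main' again yields an empty |
| result. |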
| |
| KaleidoscopeJIT |
| =============== |
| |
| In the previous section we described our API. Now we examine a simple |
| implementation of it: The KaleidoscopeJIT class [1]_ that was used in the |
| `Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use |
| the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the |
| input for our JIT: Each time the user enters an expression the REPL will add a |
| new IR module containing the code for that expression to the JIT. If the |
| expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also |
| use the findSymbol method of our JIT class to find and execute the code for the |
| expression, and then use the removeModule method to remove the code again |
| (since there's no way to re-invoke an anonymous expression). In later chapters |
| of this tutorial we'll modify the REPL to enable new interactions with our JIT |
| class, but for now we will take this setup for granted and focus our attention on |
| the implementation of our JIT itself. |
| |
| Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the |
| usual include guards and #includes [2]_, we get to the definition of our class: |
| |
| .. code-block:: c++ |
| |
| #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H |
| #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H |
| |
| #include "llvm/ExecutionEngine/ExecutionEngine.h" |
| #include "llvm/ExecutionEngine/RTDyldMemoryManager.h" |
| #include "llvm/ExecutionEngine/Orc/CompileUtils.h" |
| #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h" |
| #include "llvm/ExecutionEngine/Orc/LambdaResolver.h" |
| #include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h" |
| #include "llvm/IR/Mangler.h" |
| #include "llvm/Support/DynamicLibrary.h" |
| |
| namespace llvm { |
| namespace orc { |
| |
| class KaleidoscopeJIT { |
| private: |
| std::unique_ptr<TargetMachine> TM; |
| const DataLayout DL; |
| ObjectLinkingLayer<> ObjectLayer; |
| IRCompileLayer<decltype(ObjectLayer)> CompileLayer; |
| |
| public: |
| typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandleT; |
| |
| Our class begins with four members: A TargetMachine, TM, which will be used |
| to build our LLVM compiler instance; A DataLayout, DL, which will be used for |
| symbol mangling (more on that later), and two ORC *layers*: an |
| ObjectLinkingLayer and an IRCompileLayer. We'll be talking more about layers in |
| the next chapter, but for now you can think of them as analogous to LLVM |
| Passes: they wrap up useful JIT utilities behind an easy-to-compose interface. |
| The first layer, ObjectLinkingLayer, is the foundation of our JIT: it takes |
| in-memory object files produced by a compiler and links them on the fly to make |
| them executable. This JIT-on-top-of-a-linker design was introduced in MCJIT; |
| however, the linker was hidden inside the MCJIT class. In ORC we expose the |
| linker so that clients can access and configure it directly if they need to. In |
| this tutorial our ObjectLinkingLayer will just be used to support the next layer |
| in our stack: the IRCompileLayer, which will be responsible for taking LLVM IR, |
| compiling it, and passing the resulting in-memory object files down to the |
| object linking layer below. |
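| |
| The stacking idea can be sketched without any LLVM types. In the hypothetical |
| model below, ``ToyObjectLayer`` and ``ToyCompileLayer`` are illustrative names |
| only: "IR" and "object files" are plain strings, and the upper layer simply |
| transforms its input and forwards the result to the layer beneath it. |
| |
```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bottom layer: accepts "object files" (plain strings here) and
// records them, standing in for linking them into executable memory.
class ToyObjectLayer {
public:
  void addObject(std::string Obj) { Linked.push_back(std::move(Obj)); }
  const std::vector<std::string> &linked() const { return Linked; }

private:
  std::vector<std::string> Linked;
};

// Hypothetical upper layer: turns "IR" into an "object file" with a
// user-supplied compile function, then hands the result to the layer below.
template <typename BaseLayerT> class ToyCompileLayer {
public:
  using CompileFn = std::function<std::string(const std::string &)>;

  ToyCompileLayer(BaseLayerT &BaseLayer, CompileFn Compile)
      : BaseLayer(BaseLayer), Compile(std::move(Compile)) {}

  // Compile the input and pass the result down the stack.
  void addModule(const std::string &IR) { BaseLayer.addObject(Compile(IR)); }

private:
  BaseLayerT &BaseLayer;
  CompileFn Compile;
};
```
| |
| Because the upper layer is templated on the type of the layer below it, other |
| layers can be slotted into the stack without changing either class. This |
| mirrors how CompileLayer is declared in terms of decltype(ObjectLayer). |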
| |
| That's it for member variables. After that we have a single typedef: |
| ModuleHandleT. This is the handle type that will be returned from our JIT's |
| addModule method, and can be passed to the removeModule method to remove a |
| module. The IRCompileLayer class already provides a convenient handle type |
| (IRCompileLayer::ModuleSetHandleT), so we just alias our ModuleHandleT to this. |
| |
| .. code-block:: c++ |
| |
| KaleidoscopeJIT() |
| : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()), |
| CompileLayer(ObjectLayer, SimpleCompiler(*TM)) { |
| llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); |
| } |
| |
| TargetMachine &getTargetMachine() { return *TM; } |
| |
| Next up we have our class constructor. We begin by initializing TM using the |
| EngineBuilder::selectTarget helper method, which constructs a TargetMachine for |
| the current process. Next we use our newly created TargetMachine to initialize |
| DL, our DataLayout. Then we initialize our IRCompileLayer. Our IRCompileLayer |
| needs two things: (1) A reference to our object linking layer, and (2) a |
| compiler instance to use to perform the actual compilation from IR to object |
| files. We use the off-the-shelf SimpleCompiler instance for now. Finally, in |
| the body of the constructor, we call the DynamicLibrary::LoadLibraryPermanently |
| method with a nullptr argument. Normally the LoadLibraryPermanently method is |
| called with the path of a dynamic library to load, but when passed a null |
| pointer it will 'load' the host process itself, making its exported symbols |
| available for execution. |
| |
| .. code-block:: c++ |
| |
| ModuleHandleT addModule(std::unique_ptr<Module> M) { |
| // Build our symbol resolver: |
| // Lambda 1: Look back into the JIT itself to find symbols that are part of |
| // the same "logical dylib". |
| // Lambda 2: Search for external symbols in the host process. |
| auto Resolver = createLambdaResolver( |
| [&](const std::string &Name) { |
| if (auto Sym = CompileLayer.findSymbol(Name, false)) |
| return Sym; |
| return JITSymbol(nullptr); |
| }, |
| [](const std::string &Name) { |
| if (auto SymAddr = |
| RTDyldMemoryManager::getSymbolAddressInProcess(Name)) |
| return JITSymbol(SymAddr, JITSymbolFlags::Exported); |
| return JITSymbol(nullptr); |
| }); |
| |
| // Build a singleton module set to hold our module. |
| std::vector<std::unique_ptr<Module>> Ms; |
| Ms.push_back(std::move(M)); |
| |
| // Add the set to the JIT with the resolver we created above and a newly |
| // created SectionMemoryManager. |
| return CompileLayer.addModuleSet(std::move(Ms), |
| make_unique<SectionMemoryManager>(), |
| std::move(Resolver)); |
| } |
| |
| Now we come to the first of our JIT API methods: addModule. This method is |
| responsible for adding IR to the JIT and making it available for execution. In |
| this initial implementation of our JIT we will make our modules "available for |
| execution" by adding them straight to the IRCompileLayer, which will |
| immediately compile them. In later chapters we will teach our JIT to be lazier |
| and instead add the Modules to a "pending" list to be compiled if and when they |
| are first executed. |
| |
| To add our module to the IRCompileLayer we need to supply two auxiliary objects |
| (as well as the module itself): a memory manager and a symbol resolver. The |
| memory manager will be responsible for managing the memory allocated to JIT'd |
| machine code, setting memory permissions, and registering exception handling |
| tables (if the JIT'd code uses exceptions). For our memory manager we will use |
| the SectionMemoryManager class: another off-the-shelf utility that provides all |
| the basic functionality we need. The second auxiliary class, the symbol |
| resolver, is more interesting for us. It exists to tell the JIT where to look |
| when it encounters an *external symbol* in the module we are adding. External |
| symbols are any symbol not defined within the module itself, including calls to |
| functions outside the JIT and calls to functions defined in other modules that |
| have already been added to the JIT. It may seem as though modules added to the |
| JIT should "know about one another" by default, but since we would still have to |
| supply a symbol resolver for references to code outside the JIT it turns out to |
| be easier to just re-use this one mechanism for all symbol resolution. This has |
| the added benefit that the user has full control over the symbol resolution |
| process. Should we search for definitions within the JIT first, then fall back |
| on external definitions? Or should we prefer external definitions where |
| available and only JIT code if we don't already have an available |
| implementation? By using a single symbol resolution scheme we are free to choose |
| whatever makes the most sense for any given use case. |
| |
| Building a symbol resolver is made especially easy by the *createLambdaResolver* |
| function. This function takes two lambdas [3]_ and returns a JITSymbolResolver |
| instance. The first lambda is used as the implementation of the resolver's |
| findSymbolInLogicalDylib method, which searches for symbol definitions that |
| should be thought of as being part of the same "logical" dynamic library as this |
| Module. If you are familiar with static linking: this means that |
| findSymbolInLogicalDylib should expose symbols with common linkage and hidden |
| visibility. If all this sounds foreign you can ignore the details and just |
| remember that this is the first method that the linker will use to try to find a |
| symbol definition. If the findSymbolInLogicalDylib method returns a null result |
| then the linker will call the second symbol resolver method, called findSymbol, |
| which searches for symbols that should be thought of as external to (but |
| visible from) the module and its logical dylib. In this tutorial we will adopt |
| the following simple scheme: All modules added to the JIT will behave as if they |
| were linked into a single, ever-growing logical dylib. To implement this our |
| first lambda (the one defining findSymbolInLogicalDylib) will just search for |
| JIT'd code by calling the CompileLayer's findSymbol method. If we don't find a |
| symbol in the JIT itself we'll fall back to our second lambda, which implements |
| findSymbol. This will use the RTDyldMemoryManager::getSymbolAddressInProcess |
| method to search for the symbol within the program itself. If we can't find a |
| symbol definition via either of these paths, the JIT will refuse to accept our |
| module, returning a "symbol not found" error. |
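| |
| The two-stage lookup order described above can be illustrated with a small |
| sketch that is independent of LLVM. ``ToyResolver``, ``ToyLookupFn`` and |
| ``lookupIn`` are hypothetical names, and an address of 0 stands in for a null |
| JITSymbol. The first callable plays the role of findSymbolInLogicalDylib and |
| the second plays the role of findSymbol. |
| |
```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a JIT symbol: an address, where 0 means
// "not found".
using ToyAddress = std::uint64_t;
using ToyLookupFn = std::function<ToyAddress(const std::string &)>;

// Hypothetical two-stage resolver built from two callables.
class ToyResolver {
public:
  ToyResolver(ToyLookupFn InLogicalDylib, ToyLookupFn External)
      : InLogicalDylib(std::move(InLogicalDylib)),
        External(std::move(External)) {}

  // Try the logical dylib first; consult the external lookup only when the
  // first lookup comes back empty.
  ToyAddress resolve(const std::string &Name) const {
    if (ToyAddress Addr = InLogicalDylib(Name))
      return Addr;
    return External(Name);
  }

private:
  ToyLookupFn InLogicalDylib;
  ToyLookupFn External;
};

// Helper for illustration: build a lookup function from a fixed symbol table.
inline ToyLookupFn lookupIn(std::map<std::string, ToyAddress> Table) {
  return [Table](const std::string &Name) -> ToyAddress {
    auto It = Table.find(Name);
    return It == Table.end() ? 0 : It->second;
  };
}
```
| |
| If a name is defined in both tables, the definition in the first table wins. |
| Swapping the two constructor arguments would instead prefer external |
| definitions, which is the kind of policy choice the single resolver mechanism |
| leaves to the client. |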
| |
| Now that we've built our symbol resolver, we're ready to add our module to the |
| JIT. We do this by calling the CompileLayer's addModuleSet method [4]_. Since |
| we only have a single Module and addModuleSet expects a collection, we will |
| create a vector of modules and add our module as the only member. Since we |
| have already typedef'd our ModuleHandleT type to be the same as the |
| CompileLayer's handle type, we can return the handle from addModuleSet |
| directly from our addModule method. |
| |
| .. code-block:: c++ |
| |
| JITSymbol findSymbol(const std::string &Name) { |
| std::string MangledName; |
| raw_string_ostream MangledNameStream(MangledName); |
| Mangler::getNameWithPrefix(MangledNameStream, Name, DL); |
| return CompileLayer.findSymbol(MangledNameStream.str(), true); |
| } |
| |
| void removeModule(ModuleHandleT H) { |
| CompileLayer.removeModuleSet(H); |
| } |
| |
| Now that we can add code to our JIT, we need a way to find the symbols we've |
| added to it. To do that we call the findSymbol method on our IRCompileLayer, |
| but with a twist: We have to *mangle* the name of the symbol we're searching |
| for first. The reason for this is that the ORC JIT components use mangled |
| symbols internally the same way a static compiler and linker would, rather |
| than using plain IR symbol names. The kind of mangling will depend on the |
| DataLayout, which in turn depends on the target platform. To allow us to |
| remain portable and search based on the un-mangled name, we just reproduce |
| this mangling ourselves. |
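| |
| As a rough illustration of why the DataLayout matters here: Mach-O targets |
| prefix global symbol names with an underscore, while typical ELF targets add |
| no prefix. The sketch below models only that one aspect. ``toyMangle`` is a |
| hypothetical helper, not part of LLVM, and the real Mangler class handles |
| additional cases that are ignored here. |
| |
```cpp
#include <string>

// Hypothetical simplification of platform name mangling: prepend the
// platform's global symbol prefix, if it has one. '\0' means "no prefix".
inline std::string toyMangle(const std::string &Name, char GlobalPrefix) {
  if (GlobalPrefix == '\0')
    return Name;
  return std::string(1, GlobalPrefix) + Name;
}
```
| |
| Under these assumptions, looking up 'main' searches for '_main' when the |
| prefix is an underscore and for 'main' when there is no prefix, which is why |
| searching for the plain IR name could fail on some platforms. |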
| |
| We now come to the last method in our JIT API: removeModule. This method is |
| responsible for destructing the MemoryManager and SymbolResolver that were |
| added with a given module, freeing any resources they were using in the |
| process. In our Kaleidoscope demo we rely on this method to remove the module |
| representing the most recent top-level expression, preventing it from being |
| treated as a duplicate definition when the next top-level expression is |
| entered. It is generally good to free any module that you know you won't need |
| to call further, just to free up the resources dedicated to it. However, you |
| don't strictly need to do this: All resources will be cleaned up when your |
| JIT class is destructed, if they haven't been freed before then. |
| |
| This brings us to the end of Chapter 1 of Building a JIT. You now have a basic |
| but fully functioning JIT stack that you can use to take LLVM IR and make it |
| executable within the context of your JIT process. In the next chapter we'll |
| look at how to extend this JIT to produce better quality code, and in the |
| process take a deeper look at the ORC layer concept. |
| |
| `Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_ |
| |
| Full Code Listing |
| ================= |
| |
| Here is the complete code listing for our running example. To build this |
| example, use: |
| |
| .. code-block:: bash |
| |
| # Compile |
| clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy |
| # Run |
| ./toy |
| |
| Here is the code: |
| |
| .. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h |
| :language: c++ |
| |
| .. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a |
| simplifying assumption: symbols cannot be re-defined. This will make it |
| impossible to re-define symbols in the REPL, but will make our symbol |
| lookup logic simpler. Re-introducing support for symbol redefinition is |
| left as an exercise for the reader. (The KaleidoscopeJIT.h used in the |
| original tutorials will be a helpful reference). |
| |
| .. [2] +-----------------------+-----------------------------------------------+ |
| | File | Reason for inclusion | |
| +=======================+===============================================+ |
| | ExecutionEngine.h | Access to the EngineBuilder::selectTarget | |
| | | method. | |
| +-----------------------+-----------------------------------------------+ |
| | | Access to the | |
| | RTDyldMemoryManager.h | RTDyldMemoryManager::getSymbolAddressInProcess| |
| | | method. | |
| +-----------------------+-----------------------------------------------+ |
| | CompileUtils.h | Provides the SimpleCompiler class. | |
| +-----------------------+-----------------------------------------------+ |
| | IRCompileLayer.h | Provides the IRCompileLayer class. | |
| +-----------------------+-----------------------------------------------+ |
| | | Access the createLambdaResolver function, | |
| | LambdaResolver.h | which provides easy construction of symbol | |
| | | resolvers. | |
| +-----------------------+-----------------------------------------------+ |
| | ObjectLinkingLayer.h | Provides the ObjectLinkingLayer class. | |
| +-----------------------+-----------------------------------------------+ |
| | Mangler.h | Provides the Mangler class for platform | |
| | | specific name-mangling. | |
| +-----------------------+-----------------------------------------------+ |
| | DynamicLibrary.h | Provides the DynamicLibrary class, which | |
| | | makes symbols in the host process searchable. | |
| +-----------------------+-----------------------------------------------+ |
| |
| .. [3] Actually they don't have to be lambdas: any object with a call operator |
| will do, including plain old functions or std::functions. |
| |
| .. [4] ORC layers accept sets of Modules, rather than individual ones, so that |
| all Modules in the set could be co-located by the memory manager, though |
| this feature is not yet implemented. |