README.rst revision be849444026ddd3b864164d96068b37c231440c1
===============================================================
libbcc: A Versatile Bitcode Execution Engine for Mobile Devices
===============================================================


Introduction
------------

libbcc is an LLVM bitcode execution engine that compiles the bitcode
to an in-memory executable. libbcc is versatile because:

* it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
  compilation.

* it addresses the constraints of Android devices, which demand fast
  start-up time, small size, and high performance *at the same time*.

* it supports on-device linking. Each device vendor can supply its
  own runtime bitcode library (lib*.bc) that differentiates its
  system. Specialization becomes ecosystem-friendly.

libbcc provides:

* a *just-in-time bitcode compiler*, which translates the LLVM bitcode
  into machine code

* a *caching mechanism*, which can:

  * after each compilation, serialize the in-memory executable into a
    cache file. Note that the compilation is triggered by a cache
    miss.
  * load from the cache file upon a cache hit.

Highlights of libbcc are:

* libbcc supports bitcode from various language frontends, such as
  RenderScript and GLSL (pixelflinger2).

* libbcc strives to balance library size, launch time, and
  steady-state performance:

  * The size of libbcc is aggressively reduced for mobile devices. We
    customize and improve upon the default Execution Engine from
    upstream. Otherwise, libbcc's execution engine can easily become
    at least 2 times bigger.

  * To reduce launch time, we support caching of
    binaries. Just-in-Time compilation is often Just-too-Late
    if the given apps are performance-sensitive. Thus, we implemented
    AOT to get the best of both worlds: fast launch time and high
    steady-state performance.

    AOT is also important for projects such as NDK on LLVM with
    portability enhancement. The launch time reduction after we
    implemented AOT is significant::

      Apps            libbcc without AOT       libbcc with AOT
                      launch time in libbcc    launch time in libbcc
      App_1           1218ms                   9ms
      App_2           842ms                    4ms
      Wallpaper:
        MagicSmoke    182ms                    3ms
        Halo          127ms                    3ms
        Balls         149ms                    3ms
        SceneGraph    146ms                    90ms
        Model         104ms                    4ms
        Fountain      57ms                     3ms

    AOT also masks the launch time overhead of on-device linking
    and helps make it practical.

  * For steady-state performance, we enable VFP3 and aggressive
    optimizations.

* Currently we disable Lazy JITting.



API
---

**Basic:**

* **bccCreateScript** - Create a new bcc script

* **bccRegisterSymbolCallback** - Register the callback function for external
  symbol lookup

* **bccReadBC** - Set the source bitcode for compilation

* **bccReadModule** - Set the llvm::Module for compilation

* **bccLinkBC** - Set the library bitcode for linking

* **bccPrepareExecutable** - Create the in-memory executable by either
  just-in-time compilation or cache loading

* **bccGetFuncAddr** - Get the entry address of the function

* **bccDisposeScript** - Destroy the bcc script and release the resources

* **bccGetError** - *deprecated* - Don't use this


**Reflection:**

* **bccGetExportVarCount** - Get the count of exported variables

* **bccGetExportVarList** - Get the addresses of exported variables

* **bccGetExportFuncCount** - Get the count of exported functions

* **bccGetExportFuncList** - Get the addresses of exported functions

* **bccGetPragmaCount** - Get the count of pragmas

* **bccGetPragmaList** - Get the pragmas


**Debug:**

* **bccGetFuncCount** - Get the count of functions (including non-exported)

* **bccGetFuncInfoList** - Get the function information (name, base, size)



Cache File Format
-----------------

A cache file (denoted as \*.oBCC) for libbcc consists of several sections:
header, string pool, dependencies table, relocation table, exported
variable list, exported function list, pragma list, function information
table, and bcc context. Every section should be aligned to the word size.
Here is a brief description of each section:

* **Header** (OBCC_Header) - The header of a cache file. It contains the
  magic word, version, machine integer type information (the endianness,
  the size of off_t, size_t, and ptr_t), and the size
  and offset of other sections. The header section is guaranteed
  to be at the beginning of the cache file.

* **String Pool** (OBCC_StringPool) - A collection of serialized
  variable-length strings. A strp_index elsewhere in the cache file
  is the index of a string in this string pool.

* **Dependencies Table** (OBCC_DependencyTable) - The dependencies table.
  This table stores the resource name (or file path), the resource
  type (whether in an APK or on the file system), and the SHA1 checksum.

* **Relocation Table** (OBCC_RelocationTable) - *not enabled*

* **Exported Variable List** (OBCC_ExportVarList) -
  The list of the addresses of exported variables.

* **Exported Function List** (OBCC_ExportFuncList) -
  The list of the addresses of exported functions.

* **Pragma List** (OBCC_PragmaList) - The list of pragma key-value pairs.

* **Function Information Table** (OBCC_FuncTable) - This is a table of
  function information, such as function name, function entry address,
  and function binary size. The table should be ordered by
  function name.

* **Context** - The context of the in-memory executable, including
  the code and the data. The offset of the context should be aligned to
  the page size, so that we can mmap the context directly into memory.
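
Two of the layout rules above lend themselves to a small illustration:
rounding a section offset up to the word or page size, and looking up an
entry in a function table that is ordered by name. The following is a
minimal sketch under stated assumptions, not the actual libbcc code:
``align_up``, ``struct func_entry``, ``compare_name``, and ``find_func``
are hypothetical names, and the real on-disk structures are the ones
defined in bcc_cache.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: round `offset` up to the next multiple of
 * `alignment`, which must be a nonzero power of two (for example the
 * word size for ordinary sections, or the page size for the context). */
size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical in-memory view of one function table entry. The real
 * cache layout lives in bcc_cache.h and may differ from this. */
struct func_entry {
    const char *name; /* function name */
    size_t addr;      /* function entry address */
    size_t size;      /* function binary size */
};

/* bsearch comparator: compare the key string with an entry's name. */
int compare_name(const void *key, const void *elem) {
    const struct func_entry *entry = (const struct func_entry *)elem;
    return strcmp((const char *)key, entry->name);
}

/* Because the table is ordered by function name, a binary search can
 * find an entry. Returns NULL when no entry has that name. */
const struct func_entry *find_func(const struct func_entry *table,
                                   size_t count, const char *name) {
    return bsearch(name, table, count, sizeof *table, compare_name);
}
```

With a 4096-byte page, ``align_up(1, 4096)`` yields 4096, while an offset
that is already a multiple of the page size is returned unchanged.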

For further information, see `bcc_cache.h <include/bcc/bcc_cache.h>`_,
`CacheReader.cpp <lib/bcc/CacheReader.cpp>`_, and
`CacheWriter.cpp <lib/bcc/CacheWriter.cpp>`_.



JIT'ed Code Calling Conventions
-------------------------------

1. Calls from Execution Environment or from/to within script:

   On ARM, the first 4 arguments go into r0, r1, r2, and r3, in that order.
   The remaining arguments (if any) are passed on the stack.

   For ext_vec_types such as float2, a set of registers is used. In the case
   of float2, a register pair is used. Specifically, if float2 is the first
   argument in the function prototype, float2.x goes into r0 and float2.y
   goes into r1.

   Note: the stack is aligned to the coarsest-grained argument. In the case of
   float2 above as an argument, the parameter stack is aligned to an 8-byte
   boundary (if the sizes of the other arguments are no greater than 8).

2. Calls from/to a separate compilation unit (e.g., calls to Execution
   Environment if those runtime library callees are not compiled using LLVM):

   On ARM, we use hardfp. Note that a double is placed in a register pair.
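
The register assignment in case 1 can be illustrated with a simplified
model. This is only a sketch under stated assumptions, not libbcc or LLVM
code: ``stack_words`` and ``first_reg`` are hypothetical helpers, each
argument is described by how many 32-bit words it occupies (1 for an int
or float, 2 for a float2), and words are handed out strictly in order. It
deliberately ignores any register or stack alignment padding that a real
ABI applies, so it is only reliable for cases like the float2-first
example in the text.

```c
#include <assert.h>

#define ARM_ARG_REGS 4 /* r0, r1, r2, r3 */

/* Simplified model: the first four argument words go into r0..r3 and
 * the rest go onto the stack. Returns the number of words that end up
 * on the stack. Alignment padding is NOT modelled. */
int stack_words(const int *words, int count) {
    int used = 0;
    for (int i = 0; i < count; ++i) {
        assert(words[i] > 0);
        used += words[i];
    }
    return used > ARM_ARG_REGS ? used - ARM_ARG_REGS : 0;
}

/* Returns the index of the first register (0 means r0) that holds
 * argument `index`, or -1 if that argument starts on the stack. */
int first_reg(const int *words, int index) {
    int used = 0;
    for (int i = 0; i < index; ++i) {
        used += words[i];
    }
    return used < ARM_ARG_REGS ? used : -1;
}
```

For a prototype that takes a float2 followed by three ints, the model
places float2.x and float2.y in r0 and r1, the next two ints in r2 and
r3, and the last int on the stack.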