===============================================================
libbcc: A Versatile Bitcode Execution Engine for Mobile Devices
===============================================================


Introduction
------------

libbcc is an LLVM bitcode execution engine that compiles the bitcode
to an in-memory executable. libbcc is versatile because:

* it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
  compilation.

* it targets Android devices, which demand fast start-up time, small
  size, and high performance *at the same time*. libbcc attempts to
  address these design constraints.

* it supports on-device linking. Each device vendor can supply their
  own runtime bitcode library (lib*.bc) that differentiates their
  system. Specialization becomes ecosystem-friendly.

libbcc provides:

* a *just-in-time bitcode compiler*, which translates the LLVM bitcode
  into machine code

* a *caching mechanism*, which can:

  * after each compilation, serialize the in-memory executable into a
    cache file.  Note that the compilation is triggered by a cache
    miss.
  * load the executable from the cache file on a cache hit.
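The cache-hit and cache-miss paths above can be sketched as follows. This is a minimal illustration, not libbcc's implementation: ``compile_bitcode`` and ``prepare_executable`` are hypothetical names, the "executable" is just a byte buffer, and the sketch skips the validation a real cache reader would need (for example, checking the checksums stored in the dependencies table).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the JIT compiler: it "compiles" by copying
 * the bitcode bytes. A real compiler would emit machine code. */
static size_t compile_bitcode(const char *bitcode, size_t len,
                              char *out, size_t cap)
{
    size_t n = len < cap ? len : cap;
    memcpy(out, bitcode, n);
    return n;
}

/* Returns 1 on a cache hit, 0 on a cache miss (after compiling and
 * writing the cache file), and -1 if the cache file cannot be written. */
int prepare_executable(const char *cache_path,
                       const char *bitcode, size_t len,
                       char *exe, size_t cap, size_t *exe_len)
{
    FILE *f = fopen(cache_path, "rb");
    if (f != NULL) {
        /* Cache hit: load the serialized executable, skip compilation. */
        *exe_len = fread(exe, 1, cap, f);
        fclose(f);
        return 1;
    }

    /* Cache miss: compile, then serialize the result for the next run. */
    *exe_len = compile_bitcode(bitcode, len, exe, cap);
    f = fopen(cache_path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(exe, 1, *exe_len, f);
    fclose(f);
    return written == *exe_len ? 0 : -1;
}
```

The first call for a given cache path takes the miss branch and pays the compilation cost; later calls only read the file back.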

Highlights of libbcc are:

* libbcc supports bitcode from various language frontends, such as
  RenderScript and GLSL (pixelflinger2).

* libbcc strives to balance library size, launch time, and
  steady-state performance:

  * The size of libbcc is aggressively reduced for mobile devices. We
    customize and improve upon the default Execution Engine from
    upstream. Without these changes, libbcc's execution engine could
    easily be at least twice as large.

  * To reduce launch time, we support caching of
    binaries. Just-in-Time compilation is oftentimes Just-too-Late
    if the given apps are performance-sensitive. Thus, we implemented
    AOT to get the best of both worlds: fast launch time and high
    steady-state performance.

    AOT is also important for projects such as NDK on LLVM with
    portability enhancement. Launch time reduction after we
    implemented AOT is significant::


     Apps          libbcc without AOT       libbcc with AOT
                   launch time in libbcc    launch time in libbcc
     App_1            1218ms                   9ms
     App_2            842ms                    4ms
     Wallpaper:
       MagicSmoke     182ms                    3ms
       Halo           127ms                    3ms
     Balls            149ms                    3ms
     SceneGraph       146ms                    90ms
     Model            104ms                    4ms
     Fountain         57ms                     3ms

    AOT also masks the launch-time overhead of on-device linking
    and helps make it practical.

  * For steady-state performance, we enable VFP3 and aggressive
    optimizations.

* Lazy JIT compilation is currently disabled.
78
79
80
81API
82---
83
84**Basic:**
85
86* **bccCreateScript** - Create new bcc script
87
88* **bccRegisterSymbolCallback** - Register the callback function for external
89  symbol lookup
90
91* **bccReadBC** - Set the source bitcode for compilation
92
93* **bccReadModule** - Set the llvm::Module for compilation
94
95* **bccLinkBC** - Set the library bitcode for linking
96
97* **bccPrepareExecutable** - Create the in-memory executable by either
98  just-in-time compilation or cache loading
99
100* **bccGetFuncAddr** - Get the entry address of the function
101
102* **bccDisposeScript** - Destroy bcc script and release the resources
103
104* **bccGetError** - *deprecated* - Don't use this
105
106
107**Reflection:**
108
109* **bccGetExportVarCount** - Get the count of exported variables
110
111* **bccGetExportVarList** - Get the addresses of exported variables
112
113* **bccGetExportFuncCount** - Get the count of exported functions
114
115* **bccGetExportFuncList** - Get the addresses of exported functions
116
117* **bccGetPragmaCount** - Get the count of pragmas
118
119* **bccGetPragmaList** - Get the pragmas
120
121
122**Debug:**
123
124* **bccGetFuncCount** - Get the count of functions (including non-exported)
125
126* **bccGetFuncInfoList** - Get the function information (name, base, size)
127
128
129
130Cache File Format
131-----------------
132
133A cache file (denoted as \*.oBCC) for libbcc consists of several sections:
134header, string pool, dependencies table, relocation table, exported
135variable list, exported function list, pragma list, function information
136table, and bcc context.  Every section should be aligned to a word size.
137Here is the brief description of each sections:
138
139* **Header** (OBCC_Header) - The header of a cache file. It contains the
140  magic word, version, machine integer type information (the endianness,
141  the size of off_t, size_t, and ptr_t), and the size
142  and offset of other sections.  The header section is guaranteed
143  to be at the beginning of the cache file.
144
145* **String Pool** (OBCC_StringPool) - A collection of serialized variable
146  length strings.  The strp_index in the other part of the cache file
147  represents the index of such string in this string pool.
148
149* **Dependencies Table** (OBCC_DependencyTable) - The dependencies table.
150  This table stores the resource name (or file path), the resource
151  type (rather in APK or on the file system), and the SHA1 checksum.
152
153* **Relocation Table** (OBCC_RelocationTable) - *not enabled*
154
155* **Exported Variable List** (OBCC_ExportVarList) -
156  The list of the addresses of exported variables.
157
158* **Exported Function List** (OBCC_ExportFuncList) -
159  The list of the addresses of exported functions.
160
161* **Pragma List** (OBCC_PragmaList) - The list of pragma key-value pair.
162
163* **Function Information Table** (OBCC_FuncTable) - This is a table of
164  function information, such as function name, function entry address,
165  and function binary size.  Besides, the table should be ordered by
166  function name.
167
168* **Context** - The context of the in-memory executable, including
169  the code and the data.  The offset of context should aligned to
170  a page size, so that we can mmap the context directly into memory.
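The two alignment rules above (word-aligned sections, page-aligned context) both reduce to rounding an offset up to a power-of-two boundary. The following is a minimal sketch, assuming the sections are simply laid out one after another. ``align_up`` and ``layout_sections`` are hypothetical helpers, not libbcc functions; see the files referenced in this section for the actual layout code.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `alignment`, which must be
 * a power of two (true for typical word and page sizes). */
size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical layout helper: place `count` sections one after another,
 * beginning at `start`, each aligned to `word_size`. The offset of every
 * section is written to `offsets`. The return value is the page-aligned
 * offset at which the context could then be placed. */
size_t layout_sections(const size_t *sizes, size_t count, size_t start,
                       size_t word_size, size_t page_size, size_t *offsets)
{
    size_t cursor = start;
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, word_size);
        offsets[i] = cursor;
        cursor += sizes[i];
    }
    return align_up(cursor, page_size);
}
```

For example, with sections of 10, 3, and 8 bytes starting at offset 30, an assumed 4-byte word, and an assumed 4096-byte page, the sections land at offsets 32, 44, and 48, and the context at offset 4096. Real code would query the page size from the system rather than hard-code it.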

For further information, see `bcc_cache.h <include/bcc/bcc_cache.h>`_,
`CacheReader.cpp <lib/bcc/CacheReader.cpp>`_, and
`CacheWriter.cpp <lib/bcc/CacheWriter.cpp>`_.
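Keeping the function information table ordered by name presumably allows a lookup by binary search. The sketch below illustrates that idea only. ``struct func_info`` is a hypothetical in-memory view (the cache file itself refers to names through string pool indices), ``find_func`` is not a libbcc function, and the table is assumed to be sorted in ``strcmp`` order.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory view of one function table entry. */
struct func_info {
    const char *name;  /* function name */
    void *addr;        /* entry address */
    size_t size;       /* size of the function's binary code */
};

/* bsearch comparator: the key is a C string, the element a func_info. */
static int compare_name(const void *key, const void *elem)
{
    const struct func_info *info = elem;
    return strcmp((const char *)key, info->name);
}

/* Look up `name` in a table sorted by name (strcmp order).
 * Returns the matching entry, or NULL if there is none. */
const struct func_info *find_func(const struct func_info *table,
                                  size_t count, const char *name)
{
    return bsearch(name, table, count, sizeof table[0], compare_name);
}
```

A lookup then costs O(log n) string comparisons instead of a linear scan over all functions.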



JIT'ed Code Calling Conventions
-------------------------------

1. Calls from Execution Environment or from/to within script:

   On ARM, the first 4 arguments go into r0, r1, r2, and r3, in that order.
   The remaining arguments (if any) are passed on the stack.

   For ext_vec_types such as float2, a set of registers is used. In the case
   of float2, a register pair is used. Specifically, if float2 is the first
   argument in the function prototype, float2.x goes into r0 and float2.y
   into r1.

   Note: the stack is aligned to the coarsest-grained argument. In the case of
   float2 above as an argument, the parameter stack is aligned to an 8-byte
   boundary (if the sizes of the other arguments are no greater than 8 bytes).

2. Calls from/to a separate compilation unit (e.g., calls to Execution
   Environment if those runtime library callees are not compiled using LLVM):

   On ARM, we use hardfp.  Note that a double is placed in a register pair.
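The register assignment described in case 1 can be modeled with a few lines of C. This is a deliberately simplified sketch, not the compiler's actual lowering: ``assign_arg_regs`` is a hypothetical helper, argument sizes are given in 32-bit words (1 for an int or float, 2 for a float2), and the model ignores the register-pair alignment rules, stack padding, and the hardfp convention of case 2.

```c
#include <stddef.h>

enum { NUM_ARG_REGS = 4, ON_STACK = -1 };

/* Simplified model of case 1: arguments fill r0..r3 in order and the
 * rest go to the stack. `words[i]` is the size of argument i in 32-bit
 * words. `first_reg[i]` receives the index of the first register used
 * by argument i (0 means r0), or ON_STACK. Returns the number of words
 * passed on the stack. */
size_t assign_arg_regs(const int *words, size_t count, int *first_reg)
{
    int next_reg = 0;
    size_t stack_words = 0;

    for (size_t i = 0; i < count; ++i) {
        if (next_reg + words[i] <= NUM_ARG_REGS) {
            first_reg[i] = next_reg;
            next_reg += words[i];
        } else {
            /* Once one argument spills, all later ones spill as well. */
            first_reg[i] = ON_STACK;
            next_reg = NUM_ARG_REGS;
            stack_words += (size_t)words[i];
        }
    }
    return stack_words;
}
```

For a prototype such as ``f(float2 v, int a, int b, int c)``, the model places ``v`` in r0 and r1, ``a`` in r2, ``b`` in r3, and ``c`` on the stack.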