==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
   :local:
   :backlinks: none
   :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described; full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** for short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.

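The linkage rule can be illustrated with a minimal sketch. The ``@helper``
function and the ``i32`` parameter of ``_start`` are assumptions made for this
example only; they are not prescribed by this section.

.. naclcode::
  :prettyprint: 0

    ; Hypothetical helper: every function other than _start is internal.
    define internal i32 @helper(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; The single external function. In textual LLVM IR, a definition
    ; without a linkage keyword has external linkage. The i32 parameter
    ; is an assumption of this sketch.
    define void @_start(i32 %info) {
      %unused = call i32 @helper(i32 %info)
      ret void
    }
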
Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc``, the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

.. naclcode::
  :prettyprint: 0

     [SIZE x i8] c"DATA"
     [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

.. naclcode::
  :prettyprint: 0

     ptrtoint (TYPE* @GLOBAL to i32)
     add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.

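To make the two initializer forms concrete, here is a short sketch of global
variables built from them. All global names (``@bytes``, ``@zeros``, ``@addr``,
``@addr_plus_2``, ``@record``) are hypothetical and chosen only for
illustration.

.. naclcode::
  :prettyprint: 0

    ; SimpleElements: an i8 array literal and a zeroinitializer.
    @bytes = internal global [4 x i8] c"abc\00"
    @zeros = internal global [16 x i8] zeroinitializer

    ; SimpleElements: the address of a global, without and with an addend.
    @addr = internal global i32 ptrtoint ([4 x i8]* @bytes to i32)
    @addr_plus_2 = internal global i32
        add (i32 ptrtoint ([4 x i8]* @bytes to i32), i32 2)

    ; CompoundElement: an unnamed, packed struct of two SimpleElements.
    @record = internal global <{ [4 x i8], i32 }>
        <{ [4 x i8] c"xyz\00", i32 ptrtoint ([16 x i8]* @zeros to i32) }>
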
Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions. In addition, the following
are not supported for functions:

* Function attributes (whether on the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by the PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.

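One consequence of the integer size rules is that narrow integers must be
widened at function boundaries. The following sketch, using a hypothetical
function ``@add_bytes``, passes and returns ``i32`` values while doing the
arithmetic on ``i8``:

.. naclcode::
  :prettyprint: 0

    ; Arguments and return value are i32; i8 is used only inside the body.
    define internal i32 @add_bytes(i32 %a, i32 %b) {
      %a8 = trunc i32 %a to i8
      %b8 = trunc i32 %b to i8
      %sum8 = add i8 %a8, %b8          ; wraps modulo 256
      %sum = zext i8 %sum8 to i32
      ret i32 %sum
    }
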
Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integers with element size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).

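As an illustrative sketch, a per-element maximum over ``<4 x i32>`` uses a
``<4 x i1>`` mask, which has the same element count as the 128-bit vector it is
derived from. The function name ``@vmax`` is hypothetical, and the sketch
assumes that vector values may be passed to and returned from functions, which
this section does not state explicitly.

.. naclcode::
  :prettyprint: 0

    define internal <4 x i32> @vmax(<4 x i32> %a, <4 x i32> %b) {
      ; The comparison yields one i1 per element: a <4 x i1> mask.
      %gt = icmp sgt <4 x i32> %a, %b
      %r = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %r
    }
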
Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above, except for
  ``i1``.
* Pointers to valid PNaCl bitcode vector types, as specified above, except for
  ``<? x i1>``.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.

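The distinction between inherent and normalized pointers can be sketched as
follows. The function ``@copy_word`` and its ``i32`` address parameter are
hypothetical; the alignments follow the rules for integer memory accesses
given in the instruction reference.

.. naclcode::
  :prettyprint: 0

    define internal i32 @copy_word(i32 %addr) {
      %buf = alloca i8, i32 4, align 4       ; inherent: result of alloca
      %buf.i32 = bitcast i8* %buf to i32*    ; normalized: result of bitcast
      %src = inttoptr i32 %addr to i32*      ; normalized: result of inttoptr
      %v = load i32* %src, align 1
      store i32 %v, i32* %buf.i32, align 1
      %copy = load i32* %buf.i32, align 1
      ret i32 %copy
    }
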
Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have similar semantics to
libc math functions, but differ in the treatment of ``errno``. While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``, ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, or
  the denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the types ``i1`` and
  ``<? x i1>`` are not supported.

  These instructions must obey the following alignment restrictions:

  * On integer memory accesses: ``align 1``.
  * On ``float`` memory accesses: ``align 1`` or ``align 4``.
  * On ``double`` memory accesses: ``align 1`` or ``align 8``.
  * On vector memory accesses: alignment at the vector's element width, for
    example ``<4 x i32>`` must be ``align 4``.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the resulting
  integer type must be i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
  (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``

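The ``load`` and ``store`` alignment rules and the ``frem`` semantics listed
above can be sketched together. The functions ``@rem_in_place`` and
``@load_vec`` and their ``i32`` address parameters are hypothetical.

.. naclcode::
  :prettyprint: 0

    define internal double @rem_in_place(i32 %addr, double %k) {
      %p = inttoptr i32 %addr to double*   ; normalized pointer
      %x = load double* %p, align 8        ; double: align 1 or align 8
      %r = frem double %x, %k              ; fmod semantics, errno untouched
      store double %r, double* %p, align 8
      ret double %r
    }

    define internal <4 x i32> @load_vec(i32 %addr) {
      %vp = inttoptr i32 %addr to <4 x i32>*
      %v = load <4 x i32>* %vp, align 4    ; vector: element-width alignment
      ret <4 x i32> %v
    }
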
.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
  :prettyprint: 0

    %buf = alloca i8, i32 8, align 4

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
  are only supported with the i32 and i64 argument types (the types supported
  by C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  sqrt function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable-sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target-dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.

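A brief sketch of how some of the intrinsics listed above might be used. The
declarations follow the LLVM 3.3 intrinsic signatures; the functions
``@leading_zeros_swapped`` and ``@scoped`` are hypothetical.

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.bswap.i32(i32)
    declare i32 @llvm.ctlz.i32(i32, i1)
    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)

    ; Byte-swap a value, then count its leading zeros. The i1 false
    ; argument asks for a defined result (32) when the input is zero.
    define internal i32 @leading_zeros_swapped(i32 %x) {
      %swapped = call i32 @llvm.bswap.i32(i32 %x)
      %n = call i32 @llvm.ctlz.i32(i32 %swapped, i1 false)
      ret i32 %n
    }

    ; Scoped variable-sized allocation: the saved stack state is used
    ; only as the argument of llvm.stackrestore.
    define internal void @scoped(i32 %n) {
      %sp = call i8* @llvm.stacksave()
      %vla = alloca i8, i32 %n, align 1
      store i8 0, i8* %vla, align 1
      call void @llvm.stackrestore(i8* %sp)
      ret void
    }
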
.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
  :prettyprint: 0

    declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
  :prettyprint: 0

    declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
    declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.

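A minimal sketch of the buffer requirements in use: the buffer is a 1024-byte
``alloca`` with 8-byte (64-bit) alignment. The function ``@guarded`` and the
value ``1`` passed to ``longjmp`` are assumptions made for illustration.

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.setjmp(i8*)
    declare void @llvm.nacl.longjmp(i8*, i32)

    define internal i32 @guarded() {
      %jmpbuf = alloca i8, i32 1024, align 8
      %rc = call i32 @llvm.nacl.setjmp(i8* %jmpbuf)
      %first = icmp eq i32 %rc, 0
      br i1 %first, label %body, label %recovered
    body:
      ; Jump back to the setjmp call, which then returns 1.
      call void @llvm.nacl.longjmp(i8* %jmpbuf, i32 1)
      unreachable
    recovered:
      ret i32 %rc
    }
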
.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
  :prettyprint: 0

    declare iN @llvm.nacl.atomic.load.<size>(
            iN* <source>, i32 <memory_order>)
    declare void @llvm.nacl.atomic.store.<size>(
            iN <operand>, iN* <destination>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.rmw.<size>(
            i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.cmpxchg.<size>(
            iN* <object>, iN <expected>, iN <desired>,
            i32 <memory_order_success>, i32 <memory_order_failure>)
    declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
    declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integral types of
8, 16, 32 and 64-bit width are supported for these arguments.

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

 - ``add``
 - ``sub``
 - ``or``
 - ``and``
 - ``xor``
 - ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.

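As a hedged sketch, an atomic fetch-and-add on a 32-bit value might look like
the following. The ``.i32`` overload suffix, the function ``@fetch_add``, and
the integer encodings ``1`` (for ``add``) and ``6`` (for sequentially
consistent ordering) are assumptions of this example; the authoritative values
are those in ``"llvm/IR/NaClAtomicIntrinsics.h"``.

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)

    define internal i32 @fetch_add(i32 %addr, i32 %delta) {
      %p = inttoptr i32 %addr to i32*
      ; Assumed encodings: computation 1 = add, memory order 6 = seq_cst.
      ; Both must be compile-time constants.
      %old = call i32 @llvm.nacl.atomic.rmw.i32(i32 1, i32* %p,
                                                i32 %delta, i32 6)
      ret i32 %old                      ; value at %p before the addition
    }
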
.. Note::
  :class: note

    These intrinsics allow PNaCl to support C11/C++11 style atomic
    operations as well as some legacy GCC-style ``__sync_*`` builtins
    while remaining stable as the LLVM codebase changes. The user isn't
    expected to use these intrinsics directly.

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.

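A small sketch of a query for 4-byte atomic operations. The function
``@is_lock_free_4`` and its ``i32`` address parameter are hypothetical; the
``i1`` result is widened to ``i32`` because ``i1`` is not an allowed return
type for functions.

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

    define internal i32 @is_lock_free_4(i32 %addr) {
      %p = inttoptr i32 %addr to i8*
      ; The byte size (4) must be a compile-time constant.
      %lf = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %p)
      %r = zext i1 %lf to i32
      ret i32 %r
    }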