==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
   :local:
   :backlinks: none
   :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described; full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** for short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.
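
As a rough illustration of the linkage rule, the sketch below shows what the
skeleton of a module might look like. The names ``@counter`` and ``@helper``
are hypothetical, and the signature and body of ``_start`` are assumptions
made for the example; this document does not specify them.

.. naclcode::
  :prettyprint: 0

  ; Hypothetical global: internal linkage, with an i8 array initializer.
  @counter = internal global [4 x i8] zeroinitializer

  ; Hypothetical helper: every function other than _start is internal.
  define internal i32 @helper(i32 %x) {
    %r = add i32 %x, 1
    ret i32 %r
  }

  ; The single external function. The i32 parameter is an assumption.
  define void @_start(i32 %info) {
    %unused = call i32 @helper(i32 %info)
    ret void
  }

Note that neither definition carries a visibility style, a calling convention
other than the default ``ccc``, or function attributes.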

Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc``, the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

.. naclcode::
  :prettyprint: 0

  [SIZE x i8] c"DATA"
  [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

.. naclcode::
  :prettyprint: 0

  ptrtoint (TYPE* @GLOBAL to i32)
  add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.

Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions. In addition, the following
are not supported for functions:

* Function attributes (whether for the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See the
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.

Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integers with element size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).

Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above, except for
  ``i1``.
* Pointers to valid PNaCl bitcode vector types, as specified above, except for
  ``<? x i1>``.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.
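
The distinction between *inherent* and *normalized* pointers can be sketched
with a small function. This is only an illustration, assuming the ``alloca``,
``load``/``store`` and cast restrictions described later in this document; the
function name ``@sum_two_words`` is hypothetical.

.. naclcode::
  :prettyprint: 0

  define internal i32 @sum_two_words(i32 %a, i32 %b) {
    ; The result of alloca is an inherent pointer.
    %buf = alloca i8, i32 8, align 4
    ; A bitcast of an inherent pointer yields a normalized pointer.
    %p0 = bitcast i8* %buf to i32*
    store i32 %a, i32* %p0, align 1
    ; Address arithmetic is done on i32 values, not on pointers.
    %addr = ptrtoint i8* %buf to i32
    %addr4 = add i32 %addr, 4
    ; The result of inttoptr is a normalized pointer.
    %p1 = inttoptr i32 %addr4 to i32*
    store i32 %b, i32* %p1, align 1
    %x = load i32* %p0, align 1
    %y = load i32* %p1, align 1
    %sum = add i32 %x, %y
    ret i32 %sum
  }

Every pointer passed to ``load`` or ``store`` here comes directly from a
``bitcast`` or an ``inttoptr``, and the only pointer that is ever bitcast is
the inherent ``%buf``.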

Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have similar semantics to
libc math functions, but differ in the treatment of ``errno``. While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``, ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, or the
  denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the types ``i1`` and
  ``<? x i1>`` are not supported.

  These instructions must obey the following alignment restrictions:

  * On integer memory accesses: ``align 1``.
  * On ``float`` memory accesses: ``align 1`` or ``align 4``.
  * On ``double`` memory accesses: ``align 1`` or ``align 8``.
  * On vector memory accesses: alignment at the vector's element width, for
    example ``<4 x i32>`` must be ``align 4``.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the resulting
  integer type must be i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``

.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
  :prettyprint: 0

  %buf = alloca i8, i32 8, align 4

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
  are only supported with the i32 and i64 argument types (the types supported
  by C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  ``sqrt`` function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.

.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
  :prettyprint: 0

  declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
  :prettyprint: 0

  declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
  declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.

.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
  :prettyprint: 0

  declare iN @llvm.nacl.atomic.load.<size>(
          iN* <source>, i32 <memory_order>)
  declare void @llvm.nacl.atomic.store.<size>(
          iN <operand>, iN* <destination>, i32 <memory_order>)
  declare iN @llvm.nacl.atomic.rmw.<size>(
          i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
  declare iN @llvm.nacl.atomic.cmpxchg.<size>(
          iN* <object>, iN <expected>, iN <desired>,
          i32 <memory_order_success>, i32 <memory_order_failure>)
  declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
  declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integral types of
8, 16, 32 and 64-bit width are supported for these arguments.

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

 - ``add``
 - ``sub``
 - ``or``
 - ``and``
 - ``xor``
 - ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.
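
As a minimal sketch of how a read-modify-write call might look, the function
below performs an atomic fetch-and-add on a 32-bit location. The function name
``@fetch_and_increment`` is hypothetical. The constants ``1`` (for ``add``)
and ``6`` (for sequentially consistent ordering) are assumptions about the
encoding and should be checked against ``"llvm/IR/NaClAtomicIntrinsics.h"``.

.. naclcode::
  :prettyprint: 0

  declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)

  define internal i32 @fetch_and_increment(i32 %addr) {
    ; Pointers are passed around as i32 and converted at the point of use.
    %ptr = inttoptr i32 %addr to i32*
    ; Assumed encoding: computation 1 = add, memory order 6 = seq_cst.
    %old = call i32 @llvm.nacl.atomic.rmw.i32(i32 1, i32* %ptr, i32 1, i32 6)
    ; %old holds the value stored at %ptr before the addition.
    ret i32 %old
  }

Both the computation and the memory order are written as literal constants,
since they must be known at compile time.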

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.

.. Note::
  :class: note

  These intrinsics allow PNaCl to support C11/C++11 style atomic
  operations as well as some legacy GCC-style ``__sync_*`` builtins
  while remaining stable as the LLVM codebase changes. The user isn't
  expected to use these intrinsics directly.

.. naclcode::
  :prettyprint: 0

  declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.
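
A call to this intrinsic might be sketched as follows. The function name
``@word_is_lock_free`` is hypothetical, and the 4-byte size is chosen only
for illustration.

.. naclcode::
  :prettyprint: 0

  declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

  define internal i32 @word_is_lock_free(i32 %addr) {
    %ptr = inttoptr i32 %addr to i8*
    ; byte_size is a compile-time constant; here, 4 bytes.
    %free = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %ptr)
    ; i1 is not allowed as a function return type, so widen to i32.
    %result = zext i1 %free to i32
    ret i32 %result
  }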