pickletools.py revision 10085c656f864763237e30194d61041cadc0143b
15445594e20baf1ac0c0c1aef30d55d789c4b9694Skip Montanaro'''"Executable documentation" for the pickle module. 28ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 38ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersExtensive comments about the pickle protocols and pickle-machine opcodes 48ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterscan be found here. Some functions meant for external use: 58ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 68ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersgenops(pickle) 78ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Generate all the opcodes in a pickle, as (opcode, arg, position) triples. 88ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9d0c53fedd039714451f3022b46dc7f18fab6336fAndrew M. Kuchlingdis(pickle, out=None, memo=None, indentlevel=4) 108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Print a symbolic disassembly of a pickle. 115445594e20baf1ac0c0c1aef30d55d789c4b9694Skip Montanaro''' 128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger__all__ = ['dis', 'genops', 'optimize'] 1490cf212cefd270615a6d7ffa043e1cf3bd9ec9f2Tim Peters 158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Other ideas: 168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# 178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# - A pickle verifier: read a pickle and check it exhaustively for 18c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters# well-formedness. dis() does a lot of this already. 198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# 208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# - A protocol identifier: examine a pickle and return its protocol number 218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# (== the highest .proto attr value among all the opcodes in the pickle). 22c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters# dis() already prints this info at the end. 238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# 248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# - A pickle optimizer: for example, tuple-building code is sometimes more 258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# elaborate than necessary, catering for the possibility that the tuple 268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# is recursive. Or lots of times a PUT is generated that's never accessed 278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# by a later GET. 288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters""" 318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters"A pickle" is a program for a virtual pickle machine (PM, but more accurately 328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterscalled an unpickling machine). It's a sequence of opcodes, interpreted by the 338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersPM, building an arbitrarily complex Python object. 348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersFor the most part, the PM is very simple: there are no looping, testing, or 368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersconditional instructions, no arithmetic and no function calls. Opcodes are 378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersexecuted once each, from first to last, until a STOP opcode is reached. 388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersThe PM has two data areas, "the stack" and "the memo". 408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersMany opcodes push Python objects onto the stack; e.g., INT pushes a Python 428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersinteger object on the stack, whose value is gotten from a decimal string 438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersliteral immediately following the INT opcode in the pickle bytestream. Other 448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersopcodes take Python objects off the stack. The result of unpickling is 458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterswhatever object is left on the stack when the final STOP opcode is executed. 468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersThe memo is simply an array of objects, or it can be implemented as a dict 488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersmapping little integers to objects. The memo serves as the PM's "long term 498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersmemory", and the little integers indexing the memo are akin to variable 508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersnames. Some opcodes pop a stack object into the memo at a given index, 518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersand others push a memo object at a given index onto the stack again. 528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersAt heart, that's all the PM has. Subtleties arise for these reasons: 548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters+ Object identity. Objects can be arbitrarily complex, and subobjects 568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters may be shared (for example, the list [a, a] refers to the same object a 578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters twice). It can be vital that unpickling recreate an isomorphic object 588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters graph, faithfully reproducing sharing. 598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters+ Recursive objects. For example, after "L = []; L.append(L)", L is a 618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters list, and L[0] is the same list. This is related to the object identity 628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters point, and some sequences of pickle opcodes are subtle in order to 638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters get the right result in all cases. 648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters+ Things pickle doesn't know everything about. Examples of things pickle 668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters does know everything about are Python's builtin scalar and container 678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters types, like ints and tuples. They generally have opcodes dedicated to 688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters them. For things like module references and instances of user-defined 698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters classes, pickle's knowledge is limited. Historically, many enhancements 708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters have been made to the pickle protocol in order to do a better (faster, 718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters and/or more compact) job on those. 728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters+ Backward compatibility and micro-optimization. As explained below, 748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters pickle opcodes never go away, not even when better ways to do a thing 758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters get invented. The repertoire of the PM just keeps growing over time. 76fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters For example, protocol 0 had two opcodes for building Python integers (INT 77fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters and LONG), protocol 1 added three more for more-efficient pickling of short 78fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters integers, and protocol 2 added two more for more-efficient pickling of 79fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters long integers (before protocol 2, the only ways to pickle a Python long 80fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters took time quadratic in the number of digits, for both pickling and 81fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters unpickling). "Opcode bloat" isn't so much a subtlety as a source of 828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters wearying complication. 838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersPickle protocols: 868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersFor compatibility, the meaning of a pickle opcode never changes. Instead new 888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspickle opcodes get added, and each version's unpickler can handle all the 898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspickle opcodes in all protocol versions to date. So old pickles continue to 908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersbe readable forever. The pickler can generally be told to restrict itself to 918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersthe subset of opcodes available under previous protocol versions too, so that 928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersusers can create pickles under the current version readable by older 938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersversions. However, a pickle does not contain its version number embedded 948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterswithin it. If an older unpickler tries to read a pickle using a later 958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersprotocol, the result is most likely an exception due to seeing an unknown (in 968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersthe older unpickler) opcode. 978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersThe original pickle used what's now called "protocol 0", and what was called 998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters"text mode" before Python 2.3. The entire pickle bytestream is made up of 1008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersprintable 7-bit ASCII characters, plus the newline character, in protocol 0. 101fdc03462b3e0796ae6474da6f0f9844773d1da8fTim PetersThat's why it was called text mode. Protocol 0 is small and elegant, but 102fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peterssometimes painfully inefficient. 1038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersThe second major set of additions is now called "protocol 1", and was called 1058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters"binary mode" before Python 2.3. This added many opcodes with arguments 1068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersconsisting of arbitrary bytes, including NUL bytes and unprintable "high bit" 1078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersbytes. Binary mode pickles can be substantially smaller than equivalent 1088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterstext mode pickles, and sometimes faster too; e.g., BININT represents a 4-byte 1098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersint as 4 bytes following the opcode, which is cheaper to unpickle than the 110fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters(perhaps) 11-character decimal string attached to INT. Protocol 1 also added 111fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Petersa number of opcodes that operate on many stack elements at once (like APPENDS 11281098ac1c8c88f9f1dcf7511555d2ad1f7124575Tim Petersand SETITEMS), and "shortcut" opcodes (like EMPTY_DICT and EMPTY_TUPLE). 1138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersThe third major set of additions came in Python 2.3, and is called "protocol 115fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters2". This added: 116fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 117fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- A better way to pickle instances of new-style classes (NEWOBJ). 118fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 119fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- A way for a pickle to identify its protocol (PROTO). 120fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 121fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- Time- and space- efficient pickling of long ints (LONG{1,4}). 122fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 123fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- Shortcuts for small tuples (TUPLE{1,2,3}}. 124fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 125fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- Dedicated opcodes for bools (NEWTRUE, NEWFALSE). 126fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 127fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters- The "extension registry", a vector of popular objects that can be pushed 128fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters efficiently by index (EXT{1,2,4}). This is akin to the memo and GET, but 129fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters the registry contents are predefined (there's nothing akin to the memo's 130fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters PUT). 131ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum 1325445594e20baf1ac0c0c1aef30d55d789c4b9694Skip MontanaroAnother independent change with Python 2.3 is the abandonment of any 1335445594e20baf1ac0c0c1aef30d55d789c4b9694Skip Montanaropretense that it might be safe to load pickles received from untrusted 134ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossumparties -- no sufficient security analysis has been done to guarantee 1355445594e20baf1ac0c0c1aef30d55d789c4b9694Skip Montanarothis and there isn't a use case that warrants the expense of such an 136ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossumanalysis. 137ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum 138ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van RossumTo this end, all tests for __safe_for_unpickling__ or for 139dffbf5f5421cbeb20237280c0bd70f989269f844Georg Brandlcopy_reg.safe_constructors are removed from the unpickling code. 140ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van RossumReferences to these variables in the descriptions below are to be seen 141ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossumas describing unpickling in Python 2.2 and before. 1428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters""" 1438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Meta-rule: Descriptions are stored in instances of descriptor objects, 1458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# with plain constructors. No meta-language is defined from which 1468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# descriptors could be constructed. If you want, e.g., XML, write a little 1478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# program to generate XML from the objects. 1488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 1508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Some pickle opcodes have an argument, following the opcode in the 1518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# bytestream. An argument is of a specific type, described by an instance 1528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# of ArgumentDescriptor. These are not to be confused with arguments taken 1538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# off the stack -- ArgumentDescriptor applies only to arguments embedded in 1548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# the opcode stream, immediately following an opcode. 1558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Represents the number of bytes consumed by an argument delimited by the 1578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# next newline character. 1588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersUP_TO_NEWLINE = -1 1598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Represents the number of bytes consumed by a two-argument opcode where 1618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# the first argument gives the number of bytes in the second argument. 162fdb8cfab085d0f412411b090796e9f856ee0cab5Tim PetersTAKEN_FROM_ARGUMENT1 = -2 # num bytes is 1-byte unsigned int 163fdb8cfab085d0f412411b090796e9f856ee0cab5Tim PetersTAKEN_FROM_ARGUMENT4 = -3 # num bytes is 4-byte signed little-endian int 1648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersclass ArgumentDescriptor(object): 1668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters __slots__ = ( 1678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # name of descriptor record, also a module global name; a string 1688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'name', 1698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # length of argument, in bytes; an int; UP_TO_NEWLINE and 171fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters # TAKEN_FROM_ARGUMENT{1,4} are negative values for variable-length 172fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters # cases 1738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'n', 1748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # a function taking a file-like object, reading this kind of argument 1768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # from the object at the current position, advancing the current 1778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # position by n bytes, and returning the value of the argument 1788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'reader', 1798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # human-readable docs for this arg descriptor; a string 1818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'doc', 1828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ) 1838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters def __init__(self, name, n, reader, doc): 1858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(name, str) 1868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.name = name 1878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(n, int) and (n >= 0 or 189fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n in (UP_TO_NEWLINE, 190fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters TAKEN_FROM_ARGUMENT1, 191fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters TAKEN_FROM_ARGUMENT4)) 1928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.n = n 1938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.reader = reader 1958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(doc, str) 1978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.doc = doc 1988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersfrom struct import unpack as _unpack 2008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_uint1(f): 20255762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 2038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 20455762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_uint1(StringIO.StringIO('\xff')) 2058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 255 2068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 2078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(1) 2098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if data: 2108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return ord(data) 2118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("not enough data in stream to read uint1") 2128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersuint1 = ArgumentDescriptor( 2148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='uint1', 2158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=1, 2168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_uint1, 2178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="One-byte unsigned integer.") 2188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_uint2(f): 22155762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 2228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 22355762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_uint2(StringIO.StringIO('\xff\x00')) 2248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 255 22555762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_uint2(StringIO.StringIO('\xff\xff')) 2268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 65535 2278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 2288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(2) 2308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == 2: 2318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return _unpack("<H", data)[0] 2328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("not enough data in stream to read uint2") 2338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersuint2 = ArgumentDescriptor( 2358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='uint2', 2368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=2, 2378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_uint2, 2388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Two-byte unsigned integer, little-endian.") 2398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_int4(f): 24255762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 2438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 24455762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_int4(StringIO.StringIO('\xff\x00\x00\x00')) 2458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 255 24655762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_int4(StringIO.StringIO('\x00\x00\x00\x80')) == -(2**31) 2478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters True 2488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 2498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(4) 2518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == 4: 2528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return _unpack("<i", data)[0] 2538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("not enough data in stream to read int4") 2548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersint4 = ArgumentDescriptor( 2568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='int4', 2578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=4, 2588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_int4, 2598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Four-byte signed integer, little-endian, 2's complement.") 2608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_stringnl(f, decode=True, stripquotes=True): 26355762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 2648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 26555762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl(StringIO.StringIO("'abcd'\nefg\n")) 2668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'abcd' 2678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 26855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl(StringIO.StringIO("\n")) 2698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 2708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 2718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: no string quotes around '' 2728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 27355762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl(StringIO.StringIO("\n"), stripquotes=False) 2748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters '' 2758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 27655762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl(StringIO.StringIO("''\n")) 2778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters '' 2788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> read_stringnl(StringIO.StringIO('"abcd"')) 2808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 2818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 2828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: no newline found when trying to read stringnl 2838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Embedded escapes are undone in the result. 28555762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl(StringIO.StringIO(r"'a\n\\b\x00c\td'" + "\n'e'")) 28655762f5f804c4848bbce323b085101d450f89ff6Tim Peters 'a\n\\b\x00c\td' 2878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 2888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.readline() 2908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not data.endswith('\n'): 2918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("no newline found when trying to read stringnl") 2928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = data[:-1] # lose the newline 2938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if stripquotes: 2958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for q in "'\"": 2968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if data.startswith(q): 2978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not data.endswith(q): 2988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("strinq quote %r not found at both " 2998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "ends of %r" % (q, data)) 3008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = data[1:-1] 3018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters break 3028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters else: 3038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("no string quotes around %r" % data) 3048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # I'm not sure when 'string_escape' was added to the std codecs; it's 3068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # crazy not to use it if it's there. 3078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if decode: 3088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = data.decode('string_escape') 3098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return data 3108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstringnl = ArgumentDescriptor( 3128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='stringnl', 3138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 3148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_stringnl, 3158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated string. 3168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is a repr-style string, with embedded escapes, and 3188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters bracketing quotes. 3198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 3208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_stringnl_noescape(f): 3228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return read_stringnl(f, decode=False, stripquotes=False) 3238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstringnl_noescape = ArgumentDescriptor( 3258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='stringnl_noescape', 3268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 3278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_stringnl_noescape, 3288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated string. 3298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is a str-style string, without embedded escapes, 3318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters or bracketing quotes. It should consist solely of 3328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters printable ASCII characters. 3338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 3348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_stringnl_noescape_pair(f): 33655762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 3378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 33855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_stringnl_noescape_pair(StringIO.StringIO("Queue\nEmpty\njunk")) 339d916cf4ec7014e9f6b25b8d63728bda01a17d3f9Tim Peters 'Queue Empty' 3408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 3418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 342d916cf4ec7014e9f6b25b8d63728bda01a17d3f9Tim Peters return "%s %s" % (read_stringnl_noescape(f), read_stringnl_noescape(f)) 3438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstringnl_noescape_pair = ArgumentDescriptor( 3458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='stringnl_noescape_pair', 3468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 3478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_stringnl_noescape_pair, 3488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A pair of newline-terminated strings. 3498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters These are str-style strings, without embedded 3518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters escapes, or bracketing quotes. They should 3528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters consist solely of printable ASCII characters. 3538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The pair is returned as a single string, with 354d916cf4ec7014e9f6b25b8d63728bda01a17d3f9Tim Peters a single blank separating the two strings. 3558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 3568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_string4(f): 35855762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 3598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 36055762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_string4(StringIO.StringIO("\x00\x00\x00\x00abc")) 3618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters '' 36255762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_string4(StringIO.StringIO("\x03\x00\x00\x00abcdef")) 3638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'abc' 36455762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_string4(StringIO.StringIO("\x00\x00\x00\x03abcdef")) 3658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 3668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 3678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: expected 50331648 bytes in a string4, but only 6 remain 3688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 3698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n = read_int4(f) 3718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if n < 0: 3728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("string4 byte count < 0: %d" % n) 3738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(n) 3748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == n: 3758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return data 3768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("expected %d bytes in a string4, but only %d remain" % 3778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (n, len(data))) 3788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstring4 = ArgumentDescriptor( 3808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="string4", 381fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n=TAKEN_FROM_ARGUMENT4, 3828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_string4, 3838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A counted string. 3848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The first argument is a 4-byte little-endian signed int giving 3868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the number of bytes in the string, and the second argument is 3878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters that many bytes. 3888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 3898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 3918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_string1(f): 39255762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 3938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 39455762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_string1(StringIO.StringIO("\x00")) 3958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters '' 39655762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_string1(StringIO.StringIO("\x03abcdef")) 3978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'abc' 3988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 3998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n = read_uint1(f) 4018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert n >= 0 4028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(n) 4038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == n: 4048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return data 4058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("expected %d bytes in a string1, but only %d remain" % 4068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (n, len(data))) 4078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstring1 = ArgumentDescriptor( 4098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="string1", 410fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n=TAKEN_FROM_ARGUMENT1, 4118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_string1, 4128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A counted string. 4138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The first argument is a 1-byte unsigned int giving the number 4158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of bytes in the string, and the second argument is that many 4168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters bytes. 4178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 4188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_unicodestringnl(f): 42155762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 4228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 42355762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_unicodestringnl(StringIO.StringIO("abc\uabcd\njunk")) 42455762f5f804c4848bbce323b085101d450f89ff6Tim Peters u'abc\uabcd' 4258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 4268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.readline() 4288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not data.endswith('\n'): 4298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("no newline found when trying to read " 4308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "unicodestringnl") 4318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = data[:-1] # lose the newline 4328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return unicode(data, 'raw-unicode-escape') 4338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersunicodestringnl = ArgumentDescriptor( 4358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='unicodestringnl', 4368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 4378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_unicodestringnl, 4388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated Unicode string. 4398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is raw-unicode-escape encoded, so consists of 4418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters printable ASCII characters, and may contain embedded 4428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters escape sequences. 4438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 4448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_unicodestring4(f): 44655762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 4478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 44855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> s = u'abcd\uabcd' 4498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> enc = s.encode('utf-8') 4508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> enc 45155762f5f804c4848bbce323b085101d450f89ff6Tim Peters 'abcd\xea\xaf\x8d' 4528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> n = chr(len(enc)) + chr(0) * 3 # little-endian 4-byte length 4538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> t = read_unicodestring4(StringIO.StringIO(n + enc + 'junk')) 4548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> s == t 4558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters True 4568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> read_unicodestring4(StringIO.StringIO(n + enc[:-1])) 4588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 4598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 4608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: expected 7 bytes in a unicodestring4, but only 6 remain 4618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 4628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n = read_int4(f) 4648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if n < 0: 4658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("unicodestring4 byte count < 0: %d" % n) 4668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(n) 4678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == n: 4688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return unicode(data, 'utf-8') 4698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("expected %d bytes in a unicodestring4, but only %d " 4708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "remain" % (n, len(data))) 4718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersunicodestring4 = ArgumentDescriptor( 4738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="unicodestring4", 474fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n=TAKEN_FROM_ARGUMENT4, 4758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_unicodestring4, 4768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A counted Unicode string. 4778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The first argument is a 4-byte little-endian signed int 4798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters giving the number of bytes in the string, and the second 4808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters argument-- the UTF-8 encoding of the Unicode string -- 4818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters contains that many bytes. 4828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 4838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_decimalnl_short(f): 48655762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 4878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 48855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_decimalnl_short(StringIO.StringIO("1234\n56")) 4898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1234 4908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 49155762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_decimalnl_short(StringIO.StringIO("1234L\n56")) 4928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 4938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 4948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: trailing 'L' not allowed in '1234L' 4958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 4968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 4978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters s = read_stringnl(f, decode=False, stripquotes=False) 4988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if s.endswith("L"): 4998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("trailing 'L' not allowed in %r" % s) 5008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # It's not necessarily true that the result fits in a Python short int: 5028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # the pickle may have been written on a 64-bit box. There's also a hack 5038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # for True and False here. 5048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if s == "00": 5058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return False 5068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters elif s == "01": 5078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return True 5088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters try: 5108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return int(s) 5118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters except OverflowError: 5128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return long(s) 5138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_decimalnl_long(f): 51555762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 5168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 5178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 51855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_decimalnl_long(StringIO.StringIO("1234\n56")) 5198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Traceback (most recent call last): 5208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ... 5218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ValueError: trailing 'L' required in '1234' 5228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Someday the trailing 'L' will probably go away from this output. 5248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 52555762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_decimalnl_long(StringIO.StringIO("1234L\n56")) 5268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1234L 5278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 52855762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_decimalnl_long(StringIO.StringIO("123456789012345678901234L\n6")) 5298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 123456789012345678901234L 5308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 5318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters s = read_stringnl(f, decode=False, stripquotes=False) 5338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not s.endswith("L"): 5348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("trailing 'L' required in %r" % s) 5358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return long(s) 5368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdecimalnl_short = ArgumentDescriptor( 5398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='decimalnl_short', 5408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 5418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_decimalnl_short, 5428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated decimal integer literal. 5438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This never has a trailing 'L', and the integer fit 5458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters in a short Python int on the box where the pickle 5468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters was written -- but there's no guarantee it will fit 5478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters in a short Python int on the box where the pickle 5488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is read. 5498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 5508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdecimalnl_long = ArgumentDescriptor( 5528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='decimalnl_long', 5538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 5548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_decimalnl_long, 5558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated decimal integer literal. 5568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This has a trailing 'L', and can represent integers 5588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of any size. 5598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 5608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_floatnl(f): 56355762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 5648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO 56555762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_floatnl(StringIO.StringIO("-1.25\n6")) 5668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters -1.25 5678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 5688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters s = read_stringnl(f, decode=False, stripquotes=False) 5698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return float(s) 5708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersfloatnl = ArgumentDescriptor( 5728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='floatnl', 5738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=UP_TO_NEWLINE, 5748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_floatnl, 5758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""A newline-terminated decimal floating literal. 5768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters In general this requires 17 significant digits for roundtrip 5788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters identity, and pickling then unpickling infinities, NaNs, and 5798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters minus zero doesn't work across boxes, or on some boxes even 5808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters on itself (e.g., Windows can't read the strings it produces 5818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for infinities or NaNs). 5828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 5838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef read_float8(f): 58555762f5f804c4848bbce323b085101d450f89ff6Tim Peters r""" 5868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> import StringIO, struct 5878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> raw = struct.pack(">d", -1.25) 5888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters >>> raw 58955762f5f804c4848bbce323b085101d450f89ff6Tim Peters '\xbf\xf4\x00\x00\x00\x00\x00\x00' 59055762f5f804c4848bbce323b085101d450f89ff6Tim Peters >>> read_float8(StringIO.StringIO(raw + "\n")) 5918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters -1.25 5928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 5938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters data = f.read(8) 5958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if len(data) == 8: 5968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return _unpack(">d", data)[0] 5978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("not enough data in stream to read float8") 5988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 5998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 6008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersfloat8 = ArgumentDescriptor( 6018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='float8', 6028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters n=8, 6038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters reader=read_float8, 6048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""An 8-byte binary representation of a float, big-endian. 6058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 6068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The format is unique to Python, and shared with the struct 6078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters module (format string '>d') "in theory" (the struct and cPickle 6088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters implementations don't share the code -- they should). It's 6098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters strongly related to the IEEE-754 double format, and, in normal 6108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters cases, is in fact identical to the big-endian 754 double format. 6118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters On other boxes the dynamic range is limited to that of a 754 6128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters double, and "add a half and chop" rounding is used to reduce 6138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the precision to 53 bits. However, even on a 754 box, 6148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters infinities, NaNs, and minus zero may not be handled correctly 6158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (may not survive roundtrip pickling intact). 6168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 6178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 6185a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum# Protocol 2 formats 6195a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 620c0c12b57070a5b494662bebc418e3958bf5bdbeeTim Petersfrom pickle import decode_long 6215a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6225a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossumdef read_long1(f): 6235a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum r""" 6245a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> import StringIO 6254b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters >>> read_long1(StringIO.StringIO("\x00")) 6264b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters 0L 6275a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long1(StringIO.StringIO("\x02\xff\x00")) 6285a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 255L 6295a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long1(StringIO.StringIO("\x02\xff\x7f")) 6305a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 32767L 6315a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long1(StringIO.StringIO("\x02\x00\xff")) 6325a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum -256L 6335a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long1(StringIO.StringIO("\x02\x00\x80")) 6345a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum -32768L 6355a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum """ 6365a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6375a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum n = read_uint1(f) 6385a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum data = f.read(n) 6395a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum if len(data) != n: 6405a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum raise ValueError("not enough data in stream to read long1") 6415a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum return decode_long(data) 6425a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6435a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossumlong1 = ArgumentDescriptor( 6445a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum name="long1", 645fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n=TAKEN_FROM_ARGUMENT1, 6465a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum reader=read_long1, 6475a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum doc="""A binary long, little-endian, using 1-byte size. 6485a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6495a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum This first reads one byte as an unsigned size, then reads that 650bdbe74183ceb7aedd71dd1afdf8755ee93e8c3caTim Peters many bytes and interprets them as a little-endian 2's-complement long. 6514b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters If the size is 0, that's taken as a shortcut for the long 0L. 6525a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum """) 6535a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6545a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossumdef read_long4(f): 6555a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum r""" 6565a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> import StringIO 6575a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long4(StringIO.StringIO("\x02\x00\x00\x00\xff\x00")) 6585a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 255L 6595a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long4(StringIO.StringIO("\x02\x00\x00\x00\xff\x7f")) 6605a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 32767L 6615a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long4(StringIO.StringIO("\x02\x00\x00\x00\x00\xff")) 6625a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum -256L 6635a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum >>> read_long4(StringIO.StringIO("\x02\x00\x00\x00\x00\x80")) 6645a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum -32768L 6654b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters >>> read_long1(StringIO.StringIO("\x00\x00\x00\x00")) 6664b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters 0L 6675a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum """ 6685a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6695a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum n = read_int4(f) 6705a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum if n < 0: 671784a3f53a8f78995d4c8ca22f612a68828bc4838Neal Norwitz raise ValueError("long4 byte count < 0: %d" % n) 6725a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum data = f.read(n) 6735a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum if len(data) != n: 674784a3f53a8f78995d4c8ca22f612a68828bc4838Neal Norwitz raise ValueError("not enough data in stream to read long4") 6755a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum return decode_long(data) 6765a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6775a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossumlong4 = ArgumentDescriptor( 6785a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum name="long4", 679fdb8cfab085d0f412411b090796e9f856ee0cab5Tim Peters n=TAKEN_FROM_ARGUMENT4, 6805a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum reader=read_long4, 6815a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum doc="""A binary representation of a long, little-endian. 6825a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6835a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum This first reads four bytes as a signed size (but requires the 6845a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum size to be >= 0), then reads that many bytes and interprets them 6854b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters as a little-endian 2's-complement long. If the size is 0, that's taken 6864b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters as a shortcut for the long 0L, although LONG1 should really be used 6874b23f2b44bdd13758eab6808d6a08b951fbfc4ddTim Peters then instead (and in any case where # of bytes < 256). 6885a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum """) 6895a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6905a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 6918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 6928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Object descriptors. The stack used by the pickle machine holds objects, 6938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# and in the stack_before and stack_after attributes of OpcodeInfo 6948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# descriptors we need names to describe the various types of objects that can 6958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# appear on the stack. 6968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 6978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersclass StackObject(object): 6988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters __slots__ = ( 6998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # name of descriptor record, for info only 7008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'name', 7018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # type of object, or tuple of type objects (meaning the object can 7038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # be of any type in the tuple) 7048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'obtype', 7058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # human-readable docs for this kind of stack object; a string 7078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'doc', 7088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ) 7098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters def __init__(self, name, obtype, doc): 7118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(name, str) 7128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.name = name 7138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(obtype, type) or isinstance(obtype, tuple) 7158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if isinstance(obtype, tuple): 7168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for contained in obtype: 7178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(contained, type) 7188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.obtype = obtype 7198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(doc, str) 7218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.doc = doc 7228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 723c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters def __repr__(self): 724c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters return self.name 725c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 7268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspyint = StackObject( 7288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='int', 7298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=int, 7308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A short (as opposed to long) Python integer object.") 7318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspylong = StackObject( 7338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='long', 7348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=long, 7358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A long (as opposed to short) Python integer object.") 7368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspyinteger_or_bool = StackObject( 7388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='int_or_bool', 7398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=(int, long, bool), 7408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python integer object (short or long), or " 7418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "a Python bool.") 7428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7435a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossumpybool = StackObject( 7445a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum name='bool', 7455a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum obtype=(bool,), 7465a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum doc="A Python bool object.") 7475a2d8f5e9af0cbd513f02eb5576ff497e3693ffeGuido van Rossum 7488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspyfloat = StackObject( 7498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='float', 7508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=float, 7518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python float object.") 7528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspystring = StackObject( 7548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='str', 7558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=str, 7568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python string object.") 7578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspyunicode = StackObject( 7598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='unicode', 7608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=unicode, 7618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python Unicode string object.") 7628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspynone = StackObject( 7648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="None", 7658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=type(None), 7668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="The Python None object.") 7678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspytuple = StackObject( 7698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="tuple", 7708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=tuple, 7718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python tuple object.") 7728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspylist = StackObject( 7748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="list", 7758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=list, 7768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python list object.") 7778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterspydict = StackObject( 7798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="dict", 7808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=dict, 7818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="A Python dict object.") 7828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersanyobject = StackObject( 7848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name='any', 7858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=object, 7868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Any kind of object whatsoever.") 7878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersmarkobject = StackObject( 7898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="mark", 7908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=StackObject, 7918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""'The mark' is a unique object. 7928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 7938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Opcodes that operate on a variable number of objects 7948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters generally don't embed the count of objects in the opcode, 7958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters or pull it off the stack. Instead the MARK opcode is used 7968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters to push a special marker object on the stack, and then 7978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters some other opcodes grab all the objects from the top of 7988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the stack down to (but not including) the topmost marker 7998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters object. 8008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 8018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersstackslice = StackObject( 8038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name="stackslice", 8048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters obtype=StackObject, 8058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""An object representing a contiguous slice of the stack. 8068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is used in conjuction with markobject, to represent all 8088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of the stack following the topmost markobject. For example, 8098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the POP_MARK opcode changes the stack from 8108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters [..., markobject, stackslice] 8128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters to 8138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters [...] 8148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters No matter how many object are on the stack after the topmost 8168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters markobject, POP_MARK gets rid of all of them (including the 8178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters topmost markobject too). 8188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """) 8198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 8218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Descriptors for pickle opcodes. 8228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersclass OpcodeInfo(object): 8248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters __slots__ = ( 8268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # symbolic name of opcode; a string 8278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'name', 8288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # the code used in a bytestream to represent the opcode; a 8308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # one-character string 8318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'code', 8328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # If the opcode has an argument embedded in the byte string, an 8348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # instance of ArgumentDescriptor specifying its type. Note that 8358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # arg.reader(s) can be used to read and decode the argument from 8368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # the bytestream s, and arg.doc documents the format of the raw 8378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # argument bytes. If the opcode doesn't have an argument embedded 8388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # in the bytestream, arg should be None. 8398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'arg', 8408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # what the stack looks like before this opcode runs; a list 8428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'stack_before', 8438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # what the stack looks like after this opcode runs; a list 8458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'stack_after', 8468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # the protocol number in which this opcode was introduced; an int 8488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'proto', 8498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # human-readable docs for this opcode; a string 8518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'doc', 8528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ) 8538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters def __init__(self, name, code, arg, 8558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before, stack_after, proto, doc): 8568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(name, str) 8578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.name = name 8588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(code, str) 8608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert len(code) == 1 8618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.code = code 8628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert arg is None or isinstance(arg, ArgumentDescriptor) 8648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.arg = arg 8658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(stack_before, list) 8678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for x in stack_before: 8688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(x, StackObject) 8698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.stack_before = stack_before 8708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(stack_after, list) 8728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for x in stack_after: 8738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(x, StackObject) 8748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.stack_after = stack_after 8758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(proto, int) and 0 <= proto <= 2 8778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.proto = proto 8788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert isinstance(doc, str) 8808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters self.doc = doc 8818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersI = OpcodeInfo 8838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersopcodes = [ 8848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to spell integers. 8868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='INT', 8888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='I', 8898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=decimalnl_short, 8908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 8918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyinteger_or_bool], 8928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 8938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push an integer or bool. 8948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The argument is a newline-terminated decimal literal string. 8968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 8978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The intent may have been that this always fit in a short Python int, 8988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters but INT can be generated in pickles written on a 64-bit box that 8998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters require a Python long on a 32-bit box. The difference between this 9008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters and LONG then is that INT skips a trailing 'L', and produces a short 9018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters int whenever possible. 9028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Another difference is due to that, when bool was introduced as a 9048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters distinct type in 2.3, builtin names True and False were also added to 9058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2.2.2, mapping to ints 1 and 0. For compatibility in both directions, 9068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters True gets pickled as INT + "I01\\n", and False as INT + "I00\\n". 9078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Leading zeroes are never produced for a genuine integer. The 2.3 9088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (and later) unpicklers special-case these and return bool instead; 9098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters earlier unpicklers ignore the leading "0" and return the int. 9108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 9118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BININT', 9138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='J', 9148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=int4, 9158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 9168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyint], 9178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 9188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a four-byte signed integer. 9198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This handles the full range of Python (short) integers on a 32-bit 9218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters box, directly as binary bytes (1 for the opcode and 4 for the integer). 9228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If the integer is non-negative and fits in 1 or 2 bytes, pickling via 9238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters BININT1 or BININT2 saves space. 9248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 9258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BININT1', 9278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='K', 9288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=uint1, 9298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 9308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyint], 9318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 9328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a one-byte unsigned integer. 9338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is a space optimization for pickling very small non-negative ints, 9358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters in range(256). 9368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 9378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BININT2', 9398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='M', 9408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=uint2, 9418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 9428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyint], 9438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 9448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a two-byte unsigned integer. 9458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is a space optimization for pickling small positive ints, in 9478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters range(256, 2**16). Integers in range(256) can also be pickled via 9488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters BININT2, but BININT1 instead saves a byte. 9498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 9508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 951fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='LONG', 952fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='L', 953fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=decimalnl_long, 954fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 955fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pylong], 956fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=0, 957fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Push a long integer. 958fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 959fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters The same as INT, except that the literal ends with 'L', and always 960fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters unpickles to a Python long. There doesn't seem a real purpose to the 961fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters trailing 'L'. 962fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 963fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters Note that LONG takes time quadratic in the number of digits when 964fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters unpickling (this is simply due to the nature of decimal->binary 965fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters conversion). Proto 2 added linear-time (in C; still quadratic-time 966fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters in Python) LONG1 and LONG4 opcodes. 967fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 968fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 969fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name="LONG1", 970fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x8a', 971fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=long1, 972fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 973fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pylong], 974fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 975fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Long integer using one-byte length. 976fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 977fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters A more efficient encoding of a Python long; the long1 encoding 978fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters says it all."""), 979fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 980fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name="LONG4", 981fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x8b', 982fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=long4, 983fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 984fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pylong], 985fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 986fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Long integer using found-byte length. 987fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 988fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters A more efficient encoding of a Python long; the long4 encoding 989fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters says it all."""), 990fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 9918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to spell strings (8-bit, not Unicode). 9928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 9938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='STRING', 9948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='S', 9958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=stringnl, 9968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 9978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pystring], 9988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 9998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a Python string object. 10008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The argument is a repr-style string, with bracketing quote characters, 10028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters and perhaps embedded escapes. The argument extends until the next 10038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters newline character. 10048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 10058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINSTRING', 10078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='T', 10088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=string4, 10098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pystring], 10118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 10128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a Python string object. 10138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters There are two arguments: the first is a 4-byte little-endian signed int 10158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters giving the number of bytes in the string, and the second is that many 10168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters bytes, which are taken literally as the string content. 10178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 10188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='SHORT_BINSTRING', 10208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='U', 10218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=string1, 10228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pystring], 10248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 10258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a Python string object. 10268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters There are two arguments: the first is a 1-byte unsigned int giving 10288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the number of bytes in the string, and the second is that many bytes, 10298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters which are taken literally as the string content. 10308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 10318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to spell None. 10338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='NONE', 10358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='N', 10368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 10378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pynone], 10398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 10408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Push None on the stack."), 10418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1042fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters # Ways to spell bools, starting with proto 2. See INT for how this was 1043fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters # done before proto 2. 1044fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1045fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='NEWTRUE', 1046fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x88', 1047fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1048fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1049fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pybool], 1050fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1051fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""True. 1052fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1053fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters Push True onto the stack."""), 1054fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1055fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='NEWFALSE', 1056fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x89', 1057fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1058fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1059fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pybool], 1060fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1061fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""True. 1062fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1063fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters Push False onto the stack."""), 1064fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 10658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to spell Unicode strings. 10668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='UNICODE', 10688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='V', 10698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=unicodestringnl, 10708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyunicode], 10728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, # this may be pure-text, but it's a later addition 10738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a Python Unicode string object. 10748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The argument is a raw-unicode-escape encoding of a Unicode string, 10768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters and so may contain embedded escape sequences. The argument extends 10778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters until the next newline character. 10788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 10798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINUNICODE', 10818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='X', 10828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=unicodestring4, 10838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyunicode], 10858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 10868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a Python Unicode string object. 10878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters There are two arguments: the first is a 4-byte little-endian signed int 10898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters giving the number of bytes in the string. The second is that many 10908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters bytes, and is the UTF-8 encoding of the Unicode string. 10918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 10928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to spell floats. 10948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 10958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='FLOAT', 10968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='F', 10978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=floatnl, 10988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 10998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyfloat], 11008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 11018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Newline-terminated decimal float literal. 11028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The argument is repr(a_float), and in general requires 17 significant 11048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters digits for roundtrip conversion to be an identity (this is so for 11058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters IEEE-754 double precision values, which is what Python float maps to 11068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters on most boxes). 11078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters In general, FLOAT cannot be used to transport infinities, NaNs, or 11098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters minus zero across boxes (or even on a single box, if the platform C 11108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters library can't read the strings it produces for such things -- Windows 11118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is like that), but may do less damage than BINFLOAT on boxes with 11128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters greater precision or dynamic range than IEEE-754 double. 11138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 11148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINFLOAT', 11168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='G', 11178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=float8, 11188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 11198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pyfloat], 11208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 11218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Float stored in binary form, with 8 bytes of data. 11228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This generally requires less than half the space of FLOAT encoding. 11248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters In general, BINFLOAT cannot be used to transport infinities, NaNs, or 11258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters minus zero, raises an exception if the exponent exceeds the range of 11268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters an IEEE-754 double, and retains no more than 53 bits of precision (if 11278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters there are more than that, "add a half and chop" rounding is used to 11288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters cut it back to 53 significant bits). 11298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 11308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to build lists. 11328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='EMPTY_LIST', 11348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code=']', 11358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 11378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pylist], 11388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 11398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Push an empty list."), 11408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='APPEND', 11428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='a', 11438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[pylist, anyobject], 11458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pylist], 11468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 11478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Append an object to a list. 11488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... pylist anyobject 11508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... pylist+[anyobject] 115181098ac1c8c88f9f1dcf7511555d2ad1f7124575Tim Peters 115281098ac1c8c88f9f1dcf7511555d2ad1f7124575Tim Peters although pylist is really extended in-place. 11538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 11548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='APPENDS', 11568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='e', 11578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[pylist, markobject, stackslice], 11598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pylist], 11608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 11618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Extend a list by a slice of stack objects. 11628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... pylist markobject stackslice 11648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... pylist+stackslice 116581098ac1c8c88f9f1dcf7511555d2ad1f7124575Tim Peters 116681098ac1c8c88f9f1dcf7511555d2ad1f7124575Tim Peters although pylist is really extended in-place. 11678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 11688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='LIST', 11708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='l', 11718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, stackslice], 11738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pylist], 11748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 11758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Build a list out of the topmost stack slice, after markobject. 11768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters All the stack entries following the topmost markobject are placed into 11788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters a single Python list, which single list object replaces all of the 11798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack from the topmost markobject onward. For example, 11808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... markobject 1 2 3 'abc' 11828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... [1, 2, 3, 'abc'] 11838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 11848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to build tuples. 11868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='EMPTY_TUPLE', 11888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code=')', 11898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 11918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pytuple], 11928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 11938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Push an empty tuple."), 11948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 11958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='TUPLE', 11968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='t', 11978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 11988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, stackslice], 11998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pytuple], 12008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 12018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Build a tuple out of the topmost stack slice, after markobject. 12028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters All the stack entries following the topmost markobject are placed into 12048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters a single Python tuple, which single tuple object replaces all of the 12058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack from the topmost markobject onward. For example, 12068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... markobject 1 2 3 'abc' 12088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... (1, 2, 3, 'abc') 12098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 12108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1211fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='TUPLE1', 1212fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x85', 1213fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1214fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[anyobject], 1215fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pytuple], 1216fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1217fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""One-tuple. 1218fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1219fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters This code pops one value off the stack and pushes a tuple of 1220fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters length 1 whose one item is that value back onto it. IOW: 1221fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1222fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack[-1] = tuple(stack[-1:]) 1223fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1224fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1225fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='TUPLE2', 1226fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x86', 1227fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1228fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[anyobject, anyobject], 1229fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pytuple], 1230fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1231fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""One-tuple. 1232fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1233fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters This code pops two values off the stack and pushes a tuple 1234fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters of length 2 whose items are those values back onto it. IOW: 1235fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1236fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack[-2:] = [tuple(stack[-2:])] 1237fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1238fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1239fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='TUPLE3', 1240fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x87', 1241fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1242fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[anyobject, anyobject, anyobject], 1243fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[pytuple], 1244fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1245fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""One-tuple. 1246fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1247fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters This code pops three values off the stack and pushes a tuple 1248fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters of length 3 whose items are those values back onto it. IOW: 1249fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1250fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack[-3:] = [tuple(stack[-3:])] 1251fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1252fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 12538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to build dicts. 12548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='EMPTY_DICT', 12568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='}', 12578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 12588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 12598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pydict], 12608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 12618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Push an empty dict."), 12628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='DICT', 12648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='d', 12658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 12668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, stackslice], 12678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pydict], 12688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 12698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Build a dict out of the topmost stack slice, after markobject. 12708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters All the stack entries following the topmost markobject are placed into 12728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters a single Python dict, which single dict object replaces all of the 12738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack from the topmost markobject onward. The stack slice alternates 12748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters key, value, key, value, .... For example, 12758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... markobject 1 2 3 'abc' 12778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... {1: 2, 3: 'abc'} 12788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 12798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='SETITEM', 12818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='s', 12828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 12838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[pydict, anyobject, anyobject], 12848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pydict], 12858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 12868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Add a key+value pair to an existing dict. 12878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... pydict key value 12898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... pydict 12908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters where pydict has been modified via pydict[key] = value. 12928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 12938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 12948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='SETITEMS', 12958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='u', 12968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 12978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[pydict, markobject, stackslice], 12988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[pydict], 12998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 13008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Add an arbitrary number of key+value pairs to an existing dict. 13018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The slice of the stack following the topmost markobject is taken as 13038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters an alternating sequence of keys and values, added to the dict 13048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters immediately under the topmost markobject. Everything at and after the 13058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters topmost markobject is popped, leaving the mutated dict at the top 13068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of the stack. 13078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... pydict markobject key_1 value_1 ... key_n value_n 13098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... pydict 13108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters where pydict has been modified via pydict[key_i] = value_i for i in 13128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1, 2, ..., n, and in that order. 13138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Stack manipulation. 13168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='POP', 13188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='0', 13198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 13208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject], 13218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 13228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 13238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Discard the top stack item, shrinking the stack by one item."), 13248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='DUP', 13268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='2', 13278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 13288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject], 13298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject, anyobject], 13308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 13318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="Push the top stack item onto the stack again, duplicating it."), 13328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='MARK', 13348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='(', 13358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 13368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 13378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[markobject], 13388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 13398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push markobject onto the stack. 13408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters markobject is a unique object, used by other opcodes to identify a 13428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters region of the stack containing a variable number of objects for them 13438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters to work on. See markobject.doc for more detail. 13448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='POP_MARK', 13478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='1', 13488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 13498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, stackslice], 13508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 13518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 13528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Pop all the stack objects at and above the topmost markobject. 13538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters When an opcode using a variable number of stack objects is done, 13558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters POP_MARK is used to remove those objects, and to remove the markobject 13568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters that delimited their starting position on the stack. 13578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Memo manipulation. There are really only two operations (get and put), 13608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # each in all-text, "short binary", and "long binary" flavors. 13618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='GET', 13638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='g', 13648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=decimalnl_short, 13658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 13668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 13678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 13688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Read an object from the memo and push it on the stack. 13698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo object to push is given by the newline-teriminated 13718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters decimal string following. BINGET and LONG_BINGET are space-optimized 13728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters versions. 13738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINGET', 13768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='h', 13778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=uint1, 13788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 13798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 13808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 13818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Read an object from the memo and push it on the stack. 13828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo object to push is given by the 1-byte unsigned 13848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters integer following. 13858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='LONG_BINGET', 13888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='j', 13898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=int4, 13908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 13918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 13928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 13938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Read an object from the memo and push it on the stack. 13948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo object to push is given by the 4-byte signed 13968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters little-endian integer following. 13978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 13988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 13998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='PUT', 14008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='p', 14018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=decimalnl_short, 14028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 14038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 14048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 14058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Store the stack top into the memo. The stack is not popped. 14068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo location to write into is given by the newline- 14088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters terminated decimal string following. BINPUT and LONG_BINPUT are 14098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters space-optimized versions. 14108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 14118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINPUT', 14138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='q', 14148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=uint1, 14158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 14168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 14178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 14188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Store the stack top into the memo. The stack is not popped. 14198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo location to write into is given by the 1-byte 14218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters unsigned integer following. 14228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 14238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='LONG_BINPUT', 14258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='r', 14268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=int4, 14278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 14288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 14298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 14308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Store the stack top into the memo. The stack is not popped. 14318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The index of the memo location to write into is given by the 4-byte 14338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters signed little-endian integer following. 14348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 14358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1436fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters # Access the extension registry (predefined objects). Akin to the GET 1437fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters # family. 1438fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1439fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='EXT1', 1440fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x82', 1441fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=uint1, 1442fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1443fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[anyobject], 1444fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1445fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Extension code. 1446fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1447fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters This code and the similar EXT2 and EXT4 allow using a registry 1448fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters of popular objects that are pickled by name, typically classes. 1449fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters It is envisioned that through a global negotiation and 1450fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters registration process, third parties can set up a mapping between 1451fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters ints and object names. 1452fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1453fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters In order to guarantee pickle interchangeability, the extension 1454fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code registry ought to be global, although a range of codes may 1455fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters be reserved for private use. 1456fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1457fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters EXT1 has a 1-byte integer argument. This is used to index into the 1458fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters extension registry, and the object at that index is pushed on the stack. 1459fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1460fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1461fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='EXT2', 1462fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x83', 1463fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=uint2, 1464fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1465fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[anyobject], 1466fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1467fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Extension code. 1468fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1469fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters See EXT1. EXT2 has a two-byte integer argument. 1470fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1471fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1472fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='EXT4', 1473fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x84', 1474fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=int4, 1475fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1476fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[anyobject], 1477fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1478fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Extension code. 1479fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1480fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters See EXT1. EXT4 has a four-byte integer argument. 1481fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1482fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 14838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Push a class object, or module function, on the stack, via its module 14848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # and name. 14858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='GLOBAL', 14878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='c', 14888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=stringnl_noescape_pair, 14898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 14908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 14918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 14928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push a global object (module.attr) on the stack. 14938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 14948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Two newline-terminated strings follow the GLOBAL opcode. The first is 14958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters taken as a module name, and the second as a class name. The class 14968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters object module.class is pushed on the stack. More accurately, the 14978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters object returned by self.find_class(module, class) is pushed on the 14988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack, so unpickling subclasses can override this form of lookup. 14998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 15008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to build objects of classes pickle doesn't know about directly 15028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # (user-defined classes). I despair of documenting this accurately 15038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # and comprehensibly -- you really have to read the pickle code to 15048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # find all the special cases. 15058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='REDUCE', 15078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='R', 15088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 15098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject, anyobject], 15108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 15118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 15128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push an object built from a callable and an argument tuple. 15138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The opcode is named to remind of the __reduce__() method. 15158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... callable pytuple 15178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... callable(*pytuple) 15188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The callable and the argument tuple are the first two items returned 15208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters by a __reduce__ method. Applying the callable to the argtuple is 15218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters supposed to reproduce the original object, or at least get it started. 15228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If the __reduce__ method returns a 3-tuple, the last component is an 15238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters argument to be passed to the object's __setstate__, and then the REDUCE 15248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters opcode is followed by code to create setstate's argument, and then a 15258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters BUILD opcode to apply __setstate__ to that argument. 15268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If type(callable) is not ClassType, REDUCE complains unless the 1528dffbf5f5421cbeb20237280c0bd70f989269f844Georg Brandl callable has been registered with the copy_reg module's 15298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters safe_constructors dict, or the callable has a magic 15308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters '__safe_for_unpickling__' attribute with a true value. I'm not sure 15318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters why it does this, but I've sure seen this complaint often enough when 15328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I didn't want to <wink>. 15338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 15348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BUILD', 15368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='b', 15378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 15388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject, anyobject], 15398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 15408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 15418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Finish building an object, via __setstate__ or dict update. 15428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... anyobject argument 15448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... anyobject 15458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters where anyobject may have been mutated, as follows: 15478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If the object has a __setstate__ method, 15498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters anyobject.__setstate__(argument) 15518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is called. 15538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Else the argument must be a dict, the object must have a __dict__, and 15558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the object is updated via 15568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters anyobject.__dict__.update(argument) 15588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This may raise RuntimeError in restricted execution mode (which 15608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters disallows access to __dict__ directly); in that case, the object 15618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is updated instead via 15628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for k, v in argument.items(): 15648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters anyobject[k] = v 15658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 15668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='INST', 15688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='i', 15698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=stringnl_noescape_pair, 15708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, stackslice], 15718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 15728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 15738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Build a class instance. 15748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is the protocol 0 version of protocol 1's OBJ opcode. 15768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters INST is followed by two newline-terminated strings, giving a 15778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters module and class name, just as for the GLOBAL opcode (and see 15788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters GLOBAL for more details about that). self.find_class(module, name) 15798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is used to get a class object. 15808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters In addition, all the objects on the stack following the topmost 15828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters markobject are gathered into a tuple and popped (along with the 15838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters topmost markobject), just as for the TUPLE opcode. 15848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Now it gets complicated. If all of these are true: 15868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters + The argtuple is empty (markobject was at the top of the stack 15888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters at the start). 15898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters + It's an old-style class object (the type of the class object is 15918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ClassType). 15928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters + The class object does not have a __getinitargs__ attribute. 15948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 15958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters then we want to create an old-style class instance without invoking 15968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters its __init__() method (pickle has waffled on this over the years; not 15978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters calling __init__() is current wisdom). In this case, an instance of 15988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters an old-style dummy class is created, and then we try to rebind its 15998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters __class__ attribute to the desired class object. If this succeeds, 16008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the new instance object is pushed on the stack, and we're done. In 16018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters restricted execution mode it can fail (assignment to __class__ is 16028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters disallowed), and I'm not really sure what happens then -- it looks 16038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters like the code ends up calling the class object's __init__ anyway, 16048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters via falling into the next case. 16058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Else (the argtuple is not empty, it's not an old-style class object, 16078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters or the class object does have a __getinitargs__ attribute), the code 16088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters first insists that the class object have a __safe_for_unpickling__ 16098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters attribute. Unlike as for the __safe_for_unpickling__ check in REDUCE, 16108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters it doesn't matter whether this attribute has a true or false value, it 1611ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum only matters whether it exists (XXX this is a bug; cPickle 1612ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum requires the attribute to be true). If __safe_for_unpickling__ 1613ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum doesn't exist, UnpicklingError is raised. 16148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Else (the class object does have a __safe_for_unpickling__ attr), 16168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the class object obtained from INST's arguments is applied to the 16178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters argtuple obtained from the stack, and the resulting instance object 16188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is pushed on the stack. 16192b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters 16202b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters NOTE: checks for __safe_for_unpickling__ went away in Python 2.3. 16218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 16228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='OBJ', 16248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='o', 16258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 16268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[markobject, anyobject, stackslice], 16278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 16288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 16298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Build a class instance. 16308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters This is the protocol 1 version of protocol 0's INST opcode, and is 16328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters very much like it. The major difference is that the class object 16338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is taken off the stack, allowing it to be retrieved from the memo 16348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters repeatedly if several instances of the same class are created. This 16358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters can be much more efficient (in both time and space) than repeatedly 16368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters embedding the module and class names in INST opcodes. 16378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Unlike INST, OBJ takes no arguments from the opcode stream. Instead 16398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the class object is taken off the stack, immediately above the 16408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters topmost markobject: 16418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack before: ... markobject classobject stackslice 16438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Stack after: ... new_instance_object 16448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters As for INST, the remainder of the stack above the markobject is 16468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters gathered into an argument tuple, and then the logic seems identical, 1647ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum except that no __safe_for_unpickling__ check is done (XXX this is 1648ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum a bug; cPickle does test __safe_for_unpickling__). See INST for 1649ecb1104342af0e2dca191f7666c60d5ca65069a8Guido van Rossum the gory details. 16502b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters 16512b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters NOTE: In Python 2.3, INST and OBJ are identical except for how they 16522b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters get the class object. That was always the intent; the implementations 16532b93c4c70820d0314e8e297c2817d9e03d73df62Tim Peters had diverged for accidental reasons. 16548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 16558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1656fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='NEWOBJ', 1657fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x81', 1658fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=None, 1659fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[anyobject, anyobject], 1660fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[anyobject], 1661fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1662fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Build an object instance. 1663fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1664fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters The stack before should be thought of as containing a class 1665fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters object followed by an argument tuple (the tuple being the stack 1666fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters top). Call these cls and args. They are popped off the stack, 1667fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters and the value returned by cls.__new__(cls, *args) is pushed back 1668fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters onto the stack. 1669fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1670fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 16718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Machine control. 16728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1673fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters I(name='PROTO', 1674fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters code='\x80', 1675fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters arg=uint1, 1676fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_before=[], 1677fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters stack_after=[], 1678fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters proto=2, 1679fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters doc="""Protocol version indicator. 1680fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 1681fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters For protocol 2 and above, a pickle must start with this opcode. 1682fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters The argument is the protocol version, an int in range(2, 256). 1683fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters """), 1684fdc03462b3e0796ae6474da6f0f9844773d1da8fTim Peters 16858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='STOP', 16868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='.', 16878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 16888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject], 16898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[], 16908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 16918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Stop the unpickling machine. 16928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Every pickle ends with this opcode. The object at the top of the stack 16948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is popped, and that's the result of unpickling. The stack should be 16958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters empty then. 16968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 16978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 16988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Ways to deal with persistent IDs. 16998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='PERSID', 17018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='P', 17028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=stringnl_noescape, 17038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[], 17048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 17058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=0, 17068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push an object identified by a persistent ID. 17078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The pickle module doesn't define what a persistent ID means. PERSID's 17098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters argument is a newline-terminated str-style (no embedded escapes, no 17108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters bracketing quote characters) string, which *is* "the persistent ID". 17118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters The unpickler passes this string to self.persistent_load(). Whatever 17128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters object that returns is pushed on the stack. There is no implementation 17138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of persistent_load() in Python's unpickler: it must be supplied by an 17148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters unpickler subclass. 17158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 17168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters I(name='BINPERSID', 17188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code='Q', 17198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg=None, 17208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_before=[anyobject], 17218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stack_after=[anyobject], 17228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters proto=1, 17238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters doc="""Push an object identified by a persistent ID. 17248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Like PERSID, except the persistent ID is popped off the stack (instead 17268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters of being a string embedded in the opcode bytestream). The persistent 17278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters ID is passed to self.persistent_load(), and whatever object that 17288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters returns is pushed on the stack. See PERSID for more detail. 17298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """), 17308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters] 17318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdel I 17328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Verify uniqueness of .name and .code members. 17348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersname2i = {} 17358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterscode2i = {} 17368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersfor i, d in enumerate(opcodes): 17388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if d.name in name2i: 17398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("repeated name %r at indices %d and %d" % 17408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (d.name, name2i[d.name], i)) 17418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if d.code in code2i: 17428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("repeated code %r at indices %d and %d" % 17438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (d.code, code2i[d.code], i)) 17448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name2i[d.name] = i 17468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code2i[d.code] = i 17478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdel name2i, code2i, i, d 17498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 17518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Build a code2op dict, mapping opcode characters to OpcodeInfo records. 17528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# Also ensure we've got the same stuff as pickle.py, although the 17538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# introspection here is dicey. 17548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peterscode2op = {} 17568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersfor d in opcodes: 17578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code2op[d.code] = d 17588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdel d 17598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef assure_pickle_consistency(verbose=False): 17618ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters import pickle, re 17628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17638ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters copy = code2op.copy() 17648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for name in pickle.__all__: 17658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not re.match("[A-Z][A-Z0-9_]+$", name): 17668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if verbose: 17678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters print "skipping %r: it doesn't look like an opcode name" % name 17688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters continue 17698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters picklecode = getattr(pickle, name) 17708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if not isinstance(picklecode, str) or len(picklecode) != 1: 17718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if verbose: 17728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters print ("skipping %r: value %r doesn't look like a pickle " 17738ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "code" % (name, picklecode)) 17748ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters continue 17758ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if picklecode in copy: 17768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if verbose: 17778ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters print "checking name %r w/ code %r for consistency" % ( 17788ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name, picklecode) 17798ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters d = copy[picklecode] 17808ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if d.name != name: 17818ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("for pickle code %r, pickle.py uses name %r " 17828ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "but we're using name %r" % (picklecode, 17838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters name, 17848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters d.name)) 17858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # Forget this one. Any left over in copy at the end are a problem 17868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # of a different kind. 17878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters del copy[picklecode] 17888ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters else: 17898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("pickle.py appears to have a pickle opcode with " 17908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters "name %r and code %r, but we don't" % 17918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters (name, picklecode)) 17928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if copy: 17938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters msg = ["we appear to have pickle opcodes that pickle.py doesn't have:"] 17948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for code, d in copy.items(): 17958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters msg.append(" name %r with code %r" % (d.name, code)) 17968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("\n".join(msg)) 17978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 17988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersassure_pickle_consistency() 1799c0c12b57070a5b494662bebc418e3958bf5bdbeeTim Petersdel assure_pickle_consistency 18008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 18028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# A pickle opcode generator. 18038ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef genops(pickle): 1805a72ded9bc81bd278bf119e92feaeaea2aa16f02dGuido van Rossum """Generate all the opcodes in a pickle. 18068ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18078ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'pickle' is a file-like object, or string, containing the pickle. 18088ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18098ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Each opcode in the pickle is generated, from the current pickle position, 18108ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters stopping after a STOP opcode is delivered. A triple is generated for 18118ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters each opcode: 18128ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18138ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters opcode, arg, pos 18148ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters opcode is an OpcodeInfo record, describing the current opcode. 18168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If the opcode has an argument embedded in the pickle, arg is its decoded 18188ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters value, as a Python object. If the opcode doesn't have an argument, arg 18198ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters is None. 18208ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters If the pickle has a tell() method, pos was the value of pickle.tell() 18228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters before reading the current opcode. If the pickle is a string object, 18238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters it's wrapped in a StringIO object, and the latter's tell() result is 18248ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters used. Else (the pickle doesn't have a tell(), and it's not obvious how 18258ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters to query its current position) pos is None. 18268ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 18278ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18288ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters import cStringIO as StringIO 18298ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if isinstance(pickle, str): 18318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters pickle = StringIO.StringIO(pickle) 18328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if hasattr(pickle, "tell"): 18348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters getpos = pickle.tell 18358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters else: 18368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters getpos = lambda: None 18378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters while True: 18398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters pos = getpos() 18408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code = pickle.read(1) 18418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters opcode = code2op.get(code) 18428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if opcode is None: 18438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if code == "": 18448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("pickle exhausted before seeing STOP") 18458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters else: 18468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters raise ValueError("at position %s, opcode %r unknown" % ( 18478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters pos is None and "<unknown>" or pos, 18488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters code)) 18498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if opcode.arg is None: 18508ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg = None 18518ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters else: 18528ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters arg = opcode.arg.reader(pickle) 18538ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters yield opcode, arg, pos 18548ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if code == '.': 18558ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters assert opcode.name == 'STOP' 18568ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters break 18578ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters############################################################################## 1859da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger# A pickle optimizer. 1860da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger 1861da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettingerdef optimize(p): 1862da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger 'Optimize a pickle string by removing unused PUT opcodes' 1863da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger gets = set() # set of args used by a GET opcode 1864da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger puts = [] # (arg, startpos, stoppos) for the PUT opcodes 1865da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger prevpos = None # set to pos if previous opcode was a PUT 1866da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger for opcode, arg, pos in genops(p): 1867da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger if prevpos is not None: 1868da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger puts.append((prevarg, prevpos, pos)) 1869da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger prevpos = None 1870da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger if 'PUT' in opcode.name: 1871da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger prevarg, prevpos = arg, pos 1872da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger elif 'GET' in opcode.name: 1873da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger gets.add(arg) 1874da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger 1875da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger # Copy the pickle string except for PUTS without a corresponding GET 1876da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger s = [] 1877da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger i = 0 1878da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger for arg, start, stop in puts: 1879da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger j = stop if (arg in gets) else start 1880da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger s.append(p[i:j]) 1881da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger i = stop 1882da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger s.append(p[i:]) 1883da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger return ''.join(s) 1884da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger 1885da614dcc4f56bfb136c53b04d60889870d969926Raymond Hettinger############################################################################## 18868ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters# A symbolic pickle disassembler. 18878ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 188862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Petersdef dis(pickle, out=None, memo=None, indentlevel=4): 18898ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """Produce a symbolic disassembly of a pickle. 18908ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18918ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 'pickle' is a file-like object, or string, containing a (at least one) 18928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters pickle. The pickle is disassembled from the current position, through 18938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters the first STOP opcode encountered. 18948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 18958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Optional arg 'out' is a file-like object to which the disassembly is 18968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters printed. It defaults to sys.stdout. 18978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 189862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters Optional arg 'memo' is a Python dict, used as the pickle's memo. It 189962235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters may be mutated by dis(), if the pickle contains PUT or BINPUT opcodes. 190062235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters Passing the same memo object to another dis() call then allows disassembly 190162235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters to proceed across multiple pickles that were all created by the same 190262235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters pickler with the same memo. Ordinarily you don't need to worry about this. 190362235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 19048ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters Optional arg indentlevel is the number of blanks by which to indent 19058ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters a new MARK level. It defaults to 4. 1906c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1907c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters In addition to printing the disassembly, some sanity checks are made: 1908c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1909c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + All embedded opcode arguments "make sense". 1910c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1911c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + Explicit and implicit pop operations have enough items on the stack. 1912c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1913c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + When an opcode implicitly refers to a markobject, a markobject is 1914c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters actually on the stack. 1915c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1916c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + A memo entry isn't referenced before it's defined. 1917c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1918c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + The markobject isn't stored in the memo. 1919c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1920c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters + A memo entry isn't redefined. 19218ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters """ 19228ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1923c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # Most of the hair here is for sanity checks, but most of it is needed 1924c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # anyway to detect when a protocol 0 POP takes a MARK off the stack 1925c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # (which in turn is needed to indent MARK blocks correctly). 1926c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1927c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack = [] # crude emulation of unpickler stack 192862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters if memo is None: 192962235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters memo = {} # crude emulation of unpicker memo 1930c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters maxproto = -1 # max protocol number seen 1931c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters markstack = [] # bytecode positions of MARK opcodes 19328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters indentchunk = ' ' * indentlevel 1933c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = None 19348ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters for opcode, arg, pos in genops(pickle): 19358ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if pos is not None: 19368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters print >> out, "%5d:" % pos, 19378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1938d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters line = "%-4s %s%s" % (repr(opcode.code)[1:-1], 1939d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters indentchunk * len(markstack), 1940d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters opcode.name) 19418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 1942c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters maxproto = max(maxproto, opcode.proto) 1943c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters before = opcode.stack_before # don't mutate 1944c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters after = opcode.stack_after # don't mutate 194543277d64eb5e008e59009bd03887d66c134445efTim Peters numtopop = len(before) 194643277d64eb5e008e59009bd03887d66c134445efTim Peters 194743277d64eb5e008e59009bd03887d66c134445efTim Peters # See whether a MARK should be popped. 19488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters markmsg = None 1949c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if markobject in before or (opcode.name == "POP" and 1950c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack and 1951c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack[-1] is markobject): 1952c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters assert markobject not in after 195343277d64eb5e008e59009bd03887d66c134445efTim Peters if __debug__: 195443277d64eb5e008e59009bd03887d66c134445efTim Peters if markobject in before: 195543277d64eb5e008e59009bd03887d66c134445efTim Peters assert before[-1] is stackslice 1956c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if markstack: 1957c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters markpos = markstack.pop() 1958c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if markpos is None: 1959c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters markmsg = "(MARK at unknown opcode offset)" 1960c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters else: 1961c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters markmsg = "(MARK at %d)" % markpos 1962c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # Pop everything at and after the topmost markobject. 1963c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters while stack[-1] is not markobject: 1964c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack.pop() 1965c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack.pop() 196643277d64eb5e008e59009bd03887d66c134445efTim Peters # Stop later code from popping too much. 1967c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters try: 196843277d64eb5e008e59009bd03887d66c134445efTim Peters numtopop = before.index(markobject) 1969c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters except ValueError: 1970c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters assert opcode.name == "POP" 197143277d64eb5e008e59009bd03887d66c134445efTim Peters numtopop = 0 1972c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters else: 1973c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = markmsg = "no MARK exists on stack" 1974c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1975c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # Check for correct memo usage. 1976c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if opcode.name in ("PUT", "BINPUT", "LONG_BINPUT"): 197743277d64eb5e008e59009bd03887d66c134445efTim Peters assert arg is not None 1978c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if arg in memo: 1979c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = "memo key %r already defined" % arg 1980c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters elif not stack: 1981c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = "stack is empty -- can't store into memo" 1982c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters elif stack[-1] is markobject: 1983c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = "can't store markobject in the memo" 1984c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters else: 1985c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters memo[arg] = stack[-1] 1986c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 1987c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters elif opcode.name in ("GET", "BINGET", "LONG_BINGET"): 1988c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if arg in memo: 1989c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters assert len(after) == 1 1990c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters after = [memo[arg]] # for better stack emulation 1991c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters else: 1992c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters errormsg = "memo key %r has never been stored into" % arg 19938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 19948ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if arg is not None or markmsg: 19958ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters # make a mild effort to align arguments 19968ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters line += ' ' * (10 - len(opcode.name)) 19978ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if arg is not None: 19988ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters line += ' ' + repr(arg) 19998ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters if markmsg: 20008ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters line += ' ' + markmsg 20018ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters print >> out, line 20028ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2003c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if errormsg: 2004c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # Note that we delayed complaining until the offending opcode 2005c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # was printed. 2006c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters raise ValueError(errormsg) 2007c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 2008c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters # Emulate the stack effects. 200943277d64eb5e008e59009bd03887d66c134445efTim Peters if len(stack) < numtopop: 201043277d64eb5e008e59009bd03887d66c134445efTim Peters raise ValueError("tries to pop %d items from stack with " 201143277d64eb5e008e59009bd03887d66c134445efTim Peters "only %d items" % (numtopop, len(stack))) 201243277d64eb5e008e59009bd03887d66c134445efTim Peters if numtopop: 201343277d64eb5e008e59009bd03887d66c134445efTim Peters del stack[-numtopop:] 2014c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if markobject in after: 201543277d64eb5e008e59009bd03887d66c134445efTim Peters assert markobject not in before 20168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters markstack.append(pos) 20178ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2018c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters stack.extend(after) 2019c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 2020c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters print >> out, "highest protocol among opcodes =", maxproto 2021c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters if stack: 2022c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters raise ValueError("stack not empty after STOP: %r" % stack) 20238ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 202490718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters# For use in the doctest, simply as an example of a class to pickle. 202590718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Petersclass _Example: 202690718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters def __init__(self, value): 202790718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters self.value = value 202890718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 202903e35327f26d353db126a8f840a8890b3593f18aGuido van Rossum_dis_test = r""" 20308ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> import pickle 20318ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> x = [1, 2, (3, 4), {'abc': u"def"}] 2032570283584af6a9aff47d2341d6154055572aaff5Guido van Rossum>>> pkl = pickle.dumps(x, 0) 2033570283584af6a9aff47d2341d6154055572aaff5Guido van Rossum>>> dis(pkl) 2034d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ( MARK 2035d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: l LIST (MARK at 0) 2036d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: p PUT 0 2037d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: I INT 1 2038d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 8: a APPEND 2039d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 9: I INT 2 2040d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 12: a APPEND 2041d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 13: ( MARK 2042d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: I INT 3 2043d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 17: I INT 4 2044d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 20: t TUPLE (MARK at 13) 2045d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 21: p PUT 1 2046d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 24: a APPEND 2047d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 25: ( MARK 2048d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 26: d DICT (MARK at 25) 2049d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 27: p PUT 2 2050d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 30: S STRING 'abc' 2051d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 37: p PUT 3 2052d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 40: V UNICODE u'def' 2053d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 45: p PUT 4 2054d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 48: s SETITEM 2055d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 49: a APPEND 2056d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 50: . STOP 2057c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 0 20588ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 20598ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTry again with a "binary" pickle. 20608ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2061570283584af6a9aff47d2341d6154055572aaff5Guido van Rossum>>> pkl = pickle.dumps(x, 1) 2062570283584af6a9aff47d2341d6154055572aaff5Guido van Rossum>>> dis(pkl) 2063d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ] EMPTY_LIST 2064d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: q BINPUT 0 2065d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: ( MARK 2066d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 4: K BININT1 1 2067d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 6: K BININT1 2 2068d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 8: ( MARK 2069d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 9: K BININT1 3 2070d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 11: K BININT1 4 2071d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 13: t TUPLE (MARK at 8) 2072d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: q BINPUT 1 2073d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 16: } EMPTY_DICT 2074d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 17: q BINPUT 2 2075d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 19: U SHORT_BINSTRING 'abc' 2076d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 24: q BINPUT 3 2077d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 26: X BINUNICODE u'def' 2078d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 34: q BINPUT 4 2079d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 36: s SETITEM 2080d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 37: e APPENDS (MARK at 3) 2081d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 38: . STOP 2082c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 1 20838ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 20848ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersExercise the INST/OBJ/BUILD family. 20858ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 208610085c656f864763237e30194d61041cadc0143bMark Dickinson>>> import pickletools 208710085c656f864763237e30194d61041cadc0143bMark Dickinson>>> dis(pickle.dumps(pickletools.dis, 0)) 208810085c656f864763237e30194d61041cadc0143bMark Dickinson 0: c GLOBAL 'pickletools dis' 208910085c656f864763237e30194d61041cadc0143bMark Dickinson 17: p PUT 0 209010085c656f864763237e30194d61041cadc0143bMark Dickinson 20: . STOP 2091c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 0 20928ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 209390718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters>>> from pickletools import _Example 209490718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters>>> x = [_Example(42)] * 2 2095f29d3d6011e41b40282994375454f2020a429d79Guido van Rossum>>> dis(pickle.dumps(x, 0)) 2096d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ( MARK 2097d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: l LIST (MARK at 0) 2098d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: p PUT 0 2099d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: ( MARK 210090718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 6: i INST 'pickletools _Example' (MARK at 5) 2101d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 28: p PUT 1 2102d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 31: ( MARK 2103d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 32: d DICT (MARK at 31) 2104d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 33: p PUT 2 210590718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 36: S STRING 'value' 210690718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 45: p PUT 3 210790718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 48: I INT 42 210890718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 52: s SETITEM 210990718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 53: b BUILD 211090718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 54: a APPEND 211190718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 55: g GET 1 211290718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 58: a APPEND 211390718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 59: . STOP 2114c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 0 21158ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 21168ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> dis(pickle.dumps(x, 1)) 2117d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ] EMPTY_LIST 2118d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: q BINPUT 0 2119d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: ( MARK 2120d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 4: ( MARK 212190718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 5: c GLOBAL 'pickletools _Example' 2122d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 27: q BINPUT 1 2123d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 29: o OBJ (MARK at 4) 2124d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 30: q BINPUT 2 2125d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 32: } EMPTY_DICT 2126d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 33: q BINPUT 3 212790718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 35: U SHORT_BINSTRING 'value' 212890718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 42: q BINPUT 4 212990718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 44: K BININT1 42 213090718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 46: s SETITEM 213190718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 47: b BUILD 213290718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 48: h BINGET 2 213390718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 50: e APPENDS (MARK at 3) 213490718a4eb5a21d903d9cefdb7f8cdb50e847187bTim Peters 51: . STOP 2135c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 1 21368ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 21378ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTry "the canonical" recursive-object test. 21388ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 21398ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> L = [] 21408ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> T = L, 21418ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> L.append(T) 21428ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> L[0] is T 21438ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTrue 21448ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> T[0] is L 21458ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTrue 21468ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> L[0][0] is L 21478ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTrue 21488ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> T[0][0] is T 21498ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim PetersTrue 2150f29d3d6011e41b40282994375454f2020a429d79Guido van Rossum>>> dis(pickle.dumps(L, 0)) 2151d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ( MARK 2152d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: l LIST (MARK at 0) 2153d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: p PUT 0 2154d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: ( MARK 2155d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 6: g GET 0 2156d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 9: t TUPLE (MARK at 5) 2157d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: p PUT 1 2158d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 13: a APPEND 2159d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: . STOP 2160c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 0 2161c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 21628ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> dis(pickle.dumps(L, 1)) 2163d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ] EMPTY_LIST 2164d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: q BINPUT 0 2165d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: ( MARK 2166d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 4: h BINGET 0 2167d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 6: t TUPLE (MARK at 3) 2168d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 7: q BINPUT 1 2169d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 9: a APPEND 2170d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: . STOP 2171c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 1 21728ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2173c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim PetersNote that, in the protocol 0 pickle of the recursive tuple, the disassembler 2174c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershas to emulate the stack in order to realize that the POP opcode at 16 gets 2175c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petersrid of the MARK at 0. 21768ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 2177f29d3d6011e41b40282994375454f2020a429d79Guido van Rossum>>> dis(pickle.dumps(T, 0)) 2178d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ( MARK 2179d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: ( MARK 2180d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: l LIST (MARK at 1) 2181d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: p PUT 0 2182d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 6: ( MARK 2183d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 7: g GET 0 2184d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: t TUPLE (MARK at 6) 2185d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 11: p PUT 1 2186d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: a APPEND 2187d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 15: 0 POP 2188c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 16: 0 POP (MARK at 0) 2189c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 17: g GET 1 2190c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 20: . STOP 2191c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 0 2192c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Peters 21938ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters>>> dis(pickle.dumps(T, 1)) 2194d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: ( MARK 2195d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 1: ] EMPTY_LIST 2196d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: q BINPUT 0 2197d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 4: ( MARK 2198d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: h BINGET 0 2199d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 7: t TUPLE (MARK at 4) 2200d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 8: q BINPUT 1 2201d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: a APPEND 2202d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 11: 1 POP_MARK (MARK at 0) 2203d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 12: h BINGET 1 2204d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: . STOP 2205c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 1 2206d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2207d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim PetersTry protocol 2. 2208d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2209d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters>>> dis(pickle.dumps(L, 2)) 2210d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: \x80 PROTO 2 2211d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: ] EMPTY_LIST 2212d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: q BINPUT 0 2213d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: h BINGET 0 2214d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 7: \x85 TUPLE1 2215d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 8: q BINPUT 1 2216d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: a APPEND 2217d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 11: . STOP 2218c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 2 2219d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2220d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters>>> dis(pickle.dumps(T, 2)) 2221d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 0: \x80 PROTO 2 2222d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 2: ] EMPTY_LIST 2223d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 3: q BINPUT 0 2224d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 5: h BINGET 0 2225d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 7: \x85 TUPLE1 2226d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 8: q BINPUT 1 2227d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 10: a APPEND 2228d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 11: 0 POP 2229d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 12: h BINGET 1 2230d0f7c86a20e9cef8cbf2c8fb676fcc8d8a7000b4Tim Peters 14: . STOP 2231c1c2b3e0e2af6eb05346b4908f4c4d56618fb7b6Tim Petershighest protocol among opcodes = 2 22328ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters""" 22338ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 223462235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters_memo_test = r""" 223562235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> import pickle 223662235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> from StringIO import StringIO 223762235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> f = StringIO() 223862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> p = pickle.Pickler(f, 2) 223962235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> x = [1, 2, 3] 224062235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> p.dump(x) 224162235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> p.dump(x) 224262235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> f.seek(0) 224362235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> memo = {} 224462235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> dis(f, memo=memo) 224562235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 0: \x80 PROTO 2 224662235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 2: ] EMPTY_LIST 224762235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 3: q BINPUT 0 224862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 5: ( MARK 224962235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 6: K BININT1 1 225062235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 8: K BININT1 2 225162235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 10: K BININT1 3 225262235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 12: e APPENDS (MARK at 5) 225362235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 13: . STOP 225462235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Petershighest protocol among opcodes = 2 225562235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters>>> dis(f, memo=memo) 225662235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 14: \x80 PROTO 2 225762235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 16: h BINGET 0 225862235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 18: . STOP 225962235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Petershighest protocol among opcodes = 2 226062235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters""" 226162235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 2262570283584af6a9aff47d2341d6154055572aaff5Guido van Rossum__test__ = {'disassembler_test': _dis_test, 226362235e701e377fd1e4934bc029b9b86d8dc3ed95Tim Peters 'disassembler_memo_test': _memo_test, 22648ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters } 22658ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 22668ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersdef _test(): 22678ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters import doctest 22688ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters return doctest.testmod() 22698ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters 22708ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Petersif __name__ == "__main__": 22718ecfc8ef9d6ffe0d9c732a438cb36e1e11480a19Tim Peters _test() 2272