Exploring Python Code Objects (PyOhio)

Python is an interpreted language, right? Wrong! In this talk, dive deep into Python bytecode, and learn what actually happens in everyone's favorite Python program, 'print "Hello world"'. Learn to use the compile() function and the exec statement, understand what your Python code is doing with the dis and compiler modules, and discover new ways to explore and enjoy Python at a low level.

PyOhio, July 29, 2012.

"The Known Universe" video from American Museum of Natural History and the Hayden Planetarium. See http://www.youtube.com/watch?v=17jymDn0W6U.

dcrosta

July 29, 2012
Transcript

  1. "This popular meme is incorrect, or, rather, constructed upon a misunderstanding of (natural) language levels: a similar mistake would be to say 'the Bible is a hardcover book'."
    "I've been given to understand that Python is an interpreted language..."
    Alex Martelli, http://stackoverflow.com/questions/2998215
  2. CPYTHON
    • Compiles Python source to "bytecode"
    • On demand, when modules are loaded
    • Virtual machine for this bytecode
  3. BYTECODE?
    • About 150 primitive instructions
    • Associates data with those operations
    • CPython implements a virtual processor
    • Stored in .pyc files
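The instruction set described on this slide can be inspected from Python itself. The following is a minimal sketch, written for Python 3 rather than the 2.x interpreter used in the talk, so the opcode numbers and the total count it prints will differ from the slide's figure.

```python
import dis

# dis.opmap maps opcode names to numbers; dis.opname is the reverse lookup table.
load_name = dis.opmap["LOAD_NAME"]
print("LOAD_NAME is opcode", load_name)
print("round trip:", dis.opname[load_name])

# The total depends on the interpreter version.
print("opcodes defined by this interpreter:", len(dis.opmap))
```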
  4. MAKE YOUR OWN
    • compile() (we will use this)
    • Many other ways:
      • compiler
      • parser
      • compileall
      • py_compile
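As one illustration of the "other ways", here is a sketch that uses py_compile to write a .pyc file and then reads the code object back out of it. It assumes Python 3.7 or later, where the .pyc header is 16 bytes; the helper name compile_to_pyc and the file names are hypothetical, chosen only for this example.

```python
import marshal
import os
import py_compile
import tempfile


def compile_to_pyc(source):
    """Byte-compile source via a temporary file and return the stored code object."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "hello.py")   # hypothetical file name
        pyc_path = os.path.join(tmp, "hello.pyc")  # hypothetical file name
        with open(src_path, "w") as f:
            f.write(source)
        py_compile.compile(src_path, cfile=pyc_path, doraise=True)
        with open(pyc_path, "rb") as f:
            f.read(16)  # skip the .pyc header (16 bytes on Python 3.7+)
            return marshal.load(f)  # the rest is a marshalled code object


code_obj = compile_to_pyc('print("Hello, world")\n')
print(type(code_obj).__name__, code_obj.co_name)
```

The code object recovered this way is the same kind of object that compile() returns directly, which is why the talk sticks to compile().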
  6. MAKE YOUR OWN
    >>> code_str = """
    ... print "Hello, world"
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> code_obj
    <code object <module> at 0x1054c74b0, file "<string>", line 2>
  7. LOOK INSIDE
    >>> code_obj.co_filename
    '<string>'
    >>> code_obj.co_name
    '<module>'
    >>> dir(code_obj)
    ['co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
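The session above is Python 2. A rough Python 3 equivalent might look like the sketch below, where print is a function and dir() is filtered down to the co_* names by hand. The exact set of attributes depends on the interpreter version.

```python
code_str = """
print("Hello, world")
"""
code_obj = compile(code_str, "<string>", "exec")

print(code_obj.co_filename)
print(code_obj.co_name)

# Keep only the co_* attributes, mirroring the list shown on the slide.
co_attrs = [name for name in dir(code_obj) if name.startswith("co_")]
print(co_attrs)
```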
  9. MAKE IT GO
    >>> code_obj()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'code' object is not callable
    >>> exec code_obj
    Hello, world
  11. MAKE IT GO
    >>> code_str = """
    ... tmp = max(a, b)
    ... out = tmp + 1
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> myglobals = {}
    >>> mylocals = {'a': 5, 'b': 10}
    >>> exec code_obj in myglobals, mylocals
    >>> mylocals['out']
    11
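In Python 3 the exec statement became the exec() function, and the namespaces are passed as arguments instead of with "in". The following sketch repeats the slide's example under that assumption; the variable names my_globals and my_locals are illustrative.

```python
code_str = """
tmp = max(a, b)
out = tmp + 1
"""
code_obj = compile(code_str, "<string>", "exec")

# A code object still cannot be called directly.
print(callable(code_obj))

my_globals = {}
my_locals = {"a": 5, "b": 10}

# exec() adds __builtins__ to my_globals, which is how max is found.
exec(code_obj, my_globals, my_locals)
print(my_locals["out"])
```

Names assigned by the executed code (tmp and out) land in the locals mapping, which is why the result can be read back from it afterwards.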
  12. MAKE IT GO
    >>> code_str = """
    ... tmp = max(a, b)
    ... out = tmp + 1
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> code_obj.co_consts
    (1, None)
    >>> code_obj.co_names
    ('max', 'a', 'b', 'tmp', 'out')
    >>> code_obj.co_stacksize
    3
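The same three attributes exist on Python 3 code objects. This sketch prints them without promising particular values, since the contents of co_consts and the reported stack size can change between interpreter versions.

```python
code_str = """
tmp = max(a, b)
out = tmp + 1
"""
code_obj = compile(code_str, "<string>", "exec")

# Constants referenced by the bytecode; exact contents vary by version.
print(code_obj.co_consts)
# Names the bytecode loads or stores.
print(code_obj.co_names)
# Maximum depth the value stack needs while running this code.
print(code_obj.co_stacksize)
```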
  13. UNDER THE HOOD
    >>> import dis
    >>> dis.dis(code_obj)
      2           0 LOAD_NAME                0 (max)
                  3 LOAD_NAME                1 (a)
                  6 LOAD_NAME                2 (b)
                  9 CALL_FUNCTION            2
                 12 STORE_NAME               3 (tmp)

      3          15 LOAD_NAME                3 (tmp)
                 18 LOAD_CONST               0 (1)
                 21 BINARY_ADD
                 22 STORE_NAME               4 (out)
                 25 LOAD_CONST               1 (None)
                 28 RETURN_VALUE
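On Python 3.4 and later, dis.get_instructions() yields the same information as structured objects, which is handier than parsing the printed listing. This is a sketch only: recent interpreters name some opcodes differently from the 2.x listing above (for example CALL and BINARY_OP in place of CALL_FUNCTION and BINARY_ADD), and the offsets differ as well.

```python
import dis

code_obj = compile("tmp = max(a, b)\nout = tmp + 1\n", "<string>", "exec")

# Each Instruction carries its byte offset, opcode name, and a readable argument.
for instr in dis.get_instructions(code_obj):
    print(instr.offset, instr.opname, instr.argrepr)

opnames = [instr.opname for instr in dis.get_instructions(code_obj)]
```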
  14-41. UNDER THE HOOD (value stack and locals after each instruction)
    locals at start: 'a' => 5, 'b' => 10, 'tmp' => (unset), 'out' => (unset)

     0 LOAD_NAME      0 (max)     stack: <built-in function max>
     3 LOAD_NAME      1 (a)       stack: <built-in function max>, 5
     6 LOAD_NAME      2 (b)       stack: <built-in function max>, 5, 10
     9 CALL_FUNCTION  2           stack: 10
    12 STORE_NAME     3 (tmp)     stack: (empty); 'tmp' => 10
    15 LOAD_NAME      3 (tmp)     stack: 10
    18 LOAD_CONST     0 (1)       stack: 10, 1
    21 BINARY_ADD                 stack: 11
    22 STORE_NAME     4 (out)     stack: (empty); 'out' => 11
    25 LOAD_CONST     1 (None)    stack: None
    28 RETURN_VALUE               stack: (empty)

    locals at end: 'a' => 5, 'b' => 10, 'tmp' => 10, 'out' => 11
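The walkthrough above can be mimicked with a toy stack machine. The sketch below is not how CPython is implemented; it is a hypothetical run() function that handles only the six opcodes in this listing, with the program transcribed by hand from the slide's disassembly and global lookups left out for brevity.

```python
import builtins

# (opname, argument) pairs transcribed from the slide's disassembly.
PROGRAM = [
    ("LOAD_NAME", "max"),
    ("LOAD_NAME", "a"),
    ("LOAD_NAME", "b"),
    ("CALL_FUNCTION", 2),
    ("STORE_NAME", "tmp"),
    ("LOAD_NAME", "tmp"),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("STORE_NAME", "out"),
    ("LOAD_CONST", None),
    ("RETURN_VALUE", None),
]


def run(program, local_ns):
    """Execute the toy program, mutating local_ns, and return its return value."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_NAME":
            # Look in locals first, then builtins (globals are omitted in this toy).
            value = local_ns[arg] if arg in local_ns else getattr(builtins, arg)
            stack.append(value)
        elif opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "STORE_NAME":
            local_ns[arg] = stack.pop()
        elif opname == "CALL_FUNCTION":
            # Arguments were pushed left to right, so popping yields them reversed.
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unsupported opcode: %s" % opname)
        print("%-14s stack=%r" % (opname, stack))


local_ns = {"a": 5, "b": 10}
result = run(PROGRAM, local_ns)
print(result, local_ns)
```

The loop-and-dispatch shape is the same idea as the for/switch in ceval.c shown on the following slides, only with far fewer cases.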
  42. CEVAL.C
    PyObject *
    PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
    {
        PyObject *retval = NULL;
        PyCodeObject *co = f->f_code;
        for (;;) {
            opcode = NEXTOP();
            switch (opcode) {
                /* eventually retval is set */
            }
            if (retval)
                break;
        }
        return retval;
    }
  43. CEVAL.C
    case BINARY_ADD:
        w = POP();
        v = TOP();
        if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
            long a, b, i;
            a = PyInt_AS_LONG(v);
            b = PyInt_AS_LONG(w);
            /* cast to avoid undefined behaviour on overflow */
            i = (long)((unsigned long)a + b);
            x = PyInt_FromLong(i);
        }
        Py_DECREF(v);
        Py_DECREF(w);
        SET_TOP(x);
  44. CLASSES ARE CODE, TOO
    >>> def factory():
    ...     class MyClass(object):
    ...         def method(self, arg):
    ...             return arg
    ...     return MyClass
    ...
    >>> dis.dis(factory.func_code)
      2           0 LOAD_CONST               1 ('MyClass')
                  3 LOAD_GLOBAL              0 (object)
                  6 BUILD_TUPLE              1
                  9 LOAD_CONST               2 (<code object MyClass>)
                 12 MAKE_FUNCTION            0
                 15 CALL_FUNCTION            0
                 18 BUILD_CLASS
                 19 STORE_FAST               0 (MyClass)

      5          22 LOAD_FAST                0 (MyClass)
                 25 RETURN_VALUE
  45. CLASSES ARE CODE, TOO
    >>> dis.dis(factory.func_code.co_consts[2])
      2           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)

      3           6 LOAD_CONST               0 (<code object method>)
                  9 MAKE_FUNCTION            0
                 12 STORE_NAME               2 (method)
                 15 LOAD_LOCALS
                 16 RETURN_VALUE
    >>> dis.dis(factory.func_code.co_consts[2].co_consts[0])
      4           0 LOAD_FAST                1 (arg)
                  3 RETURN_VALUE
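The nesting shown above (the class body's code object is a constant of factory, and the method's code object is a constant of the class body) can be walked generically rather than by hard-coded indexes. This is a sketch for Python 3, where func_code is spelled __code__; the helper name walk_code is hypothetical.

```python
import types


def factory():
    class MyClass(object):
        def method(self, arg):
            return arg
    return MyClass


def walk_code(code, depth=0):
    """Yield (depth, name) for a code object and each code object nested in its constants."""
    yield depth, code.co_name
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from walk_code(const, depth + 1)


for depth, name in walk_code(factory.__code__):
    print("  " * depth + name)

names = [name for _, name in walk_code(factory.__code__)]
```

Walking co_consts by type avoids depending on the specific index positions (2 and 0 on the slide), which are an implementation detail of one interpreter version.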
  46. CLOSURES
    >>> def outer(x):
    ...     def inner(y):
    ...         return x + y
    ...     return inner
    ...
    >>> dis.dis(outer)
      2           0 LOAD_CLOSURE             0 (x)
                  3 BUILD_TUPLE              1
                  6 LOAD_CONST               1 (<code object inner>)
                  9 MAKE_CLOSURE             0
                 12 STORE_FAST               1 (inner)

      4          15 LOAD_FAST                1 (inner)
                 18 RETURN_VALUE
  48. CLOSURES
    >>> inner = outer(1)
    >>> dis.dis(inner)
      3           0 LOAD_DEREF               0 (x)
                  3 LOAD_FAST                0 (y)
                  6 BINARY_ADD
                  7 RETURN_VALUE
    >>> inner.func_closure
    (<cell at 0x1050ef210: int object at 0x104f131b8>,)
    >>> inner.func_closure[0].cell_contents
    1
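For readers on Python 3, the closure cells are reached through __closure__ instead of func_closure. The sketch below also prints co_cellvars and co_freevars, which record on each code object which variables are shared between the two functions.

```python
def outer(x):
    def inner(y):
        return x + y
    return inner


inner = outer(1)

# outer owns the cell for x; inner refers to it as a free variable.
print(outer.__code__.co_cellvars)
print(inner.__code__.co_freevars)

# The cell itself holds the captured value.
print(inner.__closure__[0].cell_contents)
print(inner(2))
```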
  49. ZOOMING OUT
    • I hope this was interesting
    • This is all only for CPython (2.x)!
    • Go exploring, share what you find
    • Questions?