Ruby VM Internals: TMI

RubyConf 2015 talk.

Aaron Patterson

November 15, 2015

Transcript

  1. Time Breakdown of "Feeling Successful"
    (Chart: segments of 1%, 9%, 11%, 11%, 31% and 37%, labeled Fail, Fail, Fail, Fail, Fail and Success)
  2. NO

  3. Stuff that happens before the program runs: Lexer, Parser, Compiler
    Running the actual program: Virtual Machine
    (Callouts: About 2 Slides Each, Most of Our Time)
  4. PARSER
    Tokens: name: 'array', DOT, name: 'each', DO, END
    AST (Abstract Syntax Tree): Method Call (array, each, block)
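One way to look at a tree for the snippet on this slide is the standard library's Ripper. This is only a sketch: Ripper's nested arrays are an assumption standing in for the slide's diagram, and they are not the internal NODE structures that MRI's compiler consumes.

```ruby
require 'ripper'
require 'pp'

# The call from the slide: a receiver, a DOT, a method name, and a do/end block.
source = "array.each do\nend"

# Ripper.sexp returns the syntax tree as nested arrays (an S-expression).
sexp = Ripper.sexp(source)

# The root is :program; the call and its block are nested underneath it.
pp sexp
```

The exact node names vary a little between Ruby versions, so treat the printed tree as illustrative rather than canonical.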
  5. global _start
    section .text
    _start:
      ; write(1, message, 13)
      mov rax, 1          ; system call 1 is write
      mov rdi, 1          ; file handle 1 is stdout
      mov rsi, message    ; address of string to output
      mov rdx, 13         ; number of bytes
      syscall             ; invoke operating system to do the write
      ; exit(0)
      mov eax, 60         ; system call 60 is exit
      xor rdi, rdi        ; exit code 0
      syscall             ; invoke operating system to exit
    message:
      db "Hello, World", 10   ; note the newline at the end
  6. *

  7. RubyVM::InstructionSequence
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 trace 1 ( 1)
    0002 putspecialobject 1
    0004 putspecialobject 2
    0006 putobject :foo
    0008 putiseq foo
    0010 opt_send_without_block <callinfo!mid:core#define_method, argc:3, ARGS_SIMPLE>, <callcache>
    0013 pop
    0014 trace 1 ( 5)
    0016 putself
    0017 putself
    0018 opt_send_without_block <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
    0021 putstring "world"
    0023 opt_plus <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
    0026 opt_send_without_block <callinfo!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
    0029 leave
    == disasm: #<ISeq:foo@<compiled>>=======================================
    0000 trace 8 ( 1)
    0002 trace 1 ( 2)
    0004 putstring "hello "
    0006 trace 16 ( 3)
    0008 leave ( 2)
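A listing like the one on this slide can be produced from a small program. The source below is a hypothetical reconstruction inferred from the method name, strings and line numbers in the listing. It is not taken from the talk, and newer Ruby versions print different instruction names and offsets.

```ruby
# Hypothetical reconstruction of the program behind the listing (an assumption).
source = <<~RUBY
  def foo
    "hello "
  end

  puts foo + "world"
RUBY

# Compile without running, then print the top-level sequence and the one for foo.
iseq = RubyVM::InstructionSequence.new(source)
listing = iseq.disasm
puts listing
```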
  8. AST

  9. Which Branch To Follow?
    VALUE
    rb_iseq_compile_node(rb_iseq_t *iseq, NODE *node)
    {
        DECL_ANCHOR(ret);
        INIT_ANCHOR(ret);
        if (node == 0) {
            COMPILE(ret, "nil", node);
            iseq_set_local_table(iseq, 0);
        }
        else if (nd_type(node) == NODE_SCOPE) {
  10. Reach Inside
    require 'fiddle'
    include Fiddle
    func = Function.new Handle::DEFAULT['rb_compile_string'],
                        [TYPE_VOIDP, TYPE_VOIDP, TYPE_INT], TYPE_VOIDP
    node = func.call "", Fiddle.dlwrap("def foo; end"), 0
    p Fiddle.dlunwrap node
  11. MUWAHAHAHA
    $ ruby test2.rb
    test2.rb:9:in `p': method `inspect' called on unexpected T_NODE object (0x007facab8e6570 flags=0x1) (NotImplementedError)
        from test2.rb:9:in `<main>'
  12. Which Branch To Follow?
    VALUE
    rb_iseq_compile_node(rb_iseq_t *iseq, NODE *node)
    {
        DECL_ANCHOR(ret);
        INIT_ANCHOR(ret);
        if (node == 0) {
            COMPILE(ret, "nil", node);
            iseq_set_local_table(iseq, 0);
        }
        else if (nd_type(node) == NODE_SCOPE) {
  13. iseq_compile_each
    /**
      compile each node
      self:  InstructionSequence
      node:  Ruby compiled node
      poped: This node will be poped
     */
    static int
    iseq_compile_each(rb_iseq_t *iseq, LINK_ANCHOR *ret, NODE * node, int poped)
  14. `true`
    case NODE_TRUE:{
      if (!poped) {
        ADD_INSN1(ret, line, putobject, Qtrue);
      }
      break;
    }
    (Callouts: NODE_TRUE is the AST Node Type; ADD_INSN1 is Add New LL Item; putobject is the Instruction Name; Qtrue is the Instruction Arg)
  15. `p true`
    $ ruby -e'puts RubyVM::InstructionSequence.new("p true").disasm'
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 trace 1 ( 1)
    0002 putself
    0003 putobject true
    0005 opt_send_without_block <callinfo!mid:p, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
    0008 leave
    (Callouts: `putobject` is the Instruction; `true` is the Parameter)
  16. If Statements
    then_label = NEW_LABEL(line);
    else_label = NEW_LABEL(line);
    end_label = NEW_LABEL(line);
    compile_branch_condition(iseq, cond_seq, node->nd_cond, then_label, else_label);
    COMPILE_(then_seq, "then", node->nd_body, poped);
    COMPILE_(else_seq, "else", node->nd_else, poped);
    ADD_SEQ(ret, cond_seq);
    ADD_LABEL(ret, then_label);
    ADD_SEQ(ret, then_seq);
    ADD_INSNL(ret, line, jump, end_label);
    ADD_LABEL(ret, else_label);
    ADD_SEQ(ret, else_seq);
    ADD_LABEL(ret, end_label);
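The condition / then / else layout built by the C code on this slide shows up in the disassembly of an `if` expression. A minimal sketch, assuming the one-line `if` below as the input (it is not from the talk). After optimization the branch usually appears as `branchunless`, and the `jump` to the end label may have been replaced by `leave`.

```ruby
# A conditional whose test is a method call, so nothing is known at compile time.
source = "if foo then 1 else 2 end"

iseq = RubyVM::InstructionSequence.new(source)
listing = iseq.disasm
puts listing

# Pick out the conditional branch emitted for the condition sequence.
branches = listing.lines.grep(/\bbranch(if|unless)\b/)
puts branches
```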
  17. Optimization Settings
    $ ruby -e'p RubyVM::InstructionSequence.compile_option'
    {:inline_const_cache=>true, :peephole_optimization=>true,
     :tailcall_optimization=>false, :specialized_instruction=>true,
     :operands_unification=>true, :instructions_unification=>false,
     :stack_caching=>false, :trace_instruction=>true,
     :frozen_string_literal=>false, :frozen_string_literal_debug=>false,
     :debug_level=>0}
  18. Optimization Settings
    $ ruby -e'p RubyVM::InstructionSequence.compile_option'
    {:inline_const_cache=>true, :peephole_optimization=>true,
     :tailcall_optimization=>false, :specialized_instruction=>true,
     :operands_unification=>true, :instructions_unification=>false,
     :stack_caching=>false, :trace_instruction=>true,
     :frozen_string_literal=>false, :frozen_string_literal_debug=>false,
     :debug_level=>0}
  19. Remove Useless Instructions
    /*
     * useless jump elimination:
     *     jump LABEL1
     *     ...
     *   LABEL1:
     *     jump LABEL2
     *
     * => in this case, first jump instruction should jump to
     * LABEL2 directly
     */
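The rewrite described in that comment can be modeled on a toy instruction list. This is an illustrative sketch, not MRI's implementation: `thread_jumps` and the array-based instruction format are hypothetical, with symbols standing in for labels and arrays for instructions.

```ruby
# Toy peephole pass: retarget a jump whose destination is itself a jump.
# Symbols are labels; arrays are instructions such as [:jump, :LABEL1].
def thread_jumps(insns)
  # Remember where each label sits in the list.
  label_at = {}
  insns.each_with_index { |insn, i| label_at[insn] = i if insn.is_a?(Symbol) }

  insns.map do |insn|
    next insn unless insn.is_a?(Array) && insn.first == :jump

    target = insn[1]
    seen = [target] # guards against jump cycles
    loop do
      # First real instruction that follows the target label.
      nxt = insns.drop(label_at.fetch(target) + 1).find { |i| i.is_a?(Array) }
      break unless nxt && nxt.first == :jump && !seen.include?(nxt[1])

      target = nxt[1]
      seen << target
    end
    [:jump, target]
  end
end

program = [
  [:jump, :LABEL1],
  [:putobject, 1],
  :LABEL1,
  [:jump, :LABEL2],
  [:putobject, 2],
  :LABEL2,
  [:leave]
]

optimized = thread_jumps(program)
p optimized.first # the first jump now goes straight to LABEL2
```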
  20. Regular Method Dispatch (foo.bar)
    $ ruby -e"puts RubyVM::InstructionSequence.new(\"foo.bar\", nil, nil, 0, :specialized_instruction => false).disasm"
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 putself
    0001 send <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>, nil
    0005 send <callinfo!mid:bar, argc:0, ARGS_SIMPLE>, <callcache>, nil
    0009 leave
    (Callout: send)
  21. Regular Method Dispatch (foo + bar)
    $ ruby -e"puts RubyVM::InstructionSequence.new(\"foo + bar\", nil, nil, 0, :specialized_instruction => false).disasm"
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 putself
    0001 send <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>, nil
    0005 putself
    0006 send <callinfo!mid:bar, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>, nil
    0010 send <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>, nil
    0014 leave
    (Callout: send)
  22. Specialized Instructions
    $ ruby -e"puts RubyVM::InstructionSequence.new(\"foo + bar\", nil, nil, 0, :specialized_instruction => true).disasm"
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 putself
    0001 opt_send_without_block <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
    0004 putself
    0005 opt_send_without_block <callinfo!mid:bar, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
    0008 opt_plus <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
    0011 leave
    (Callout: send)
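The two compilations of `foo + bar` can be compared from a single script. A minimal sketch: `disasm_for` is a hypothetical helper, and the options hash is passed as the fifth argument the same way the slides do.

```ruby
# Compile a snippet with the :specialized_instruction option on or off
# and return its disassembly as a string.
def disasm_for(source, specialized)
  options = { specialized_instruction: specialized }
  RubyVM::InstructionSequence.new(source, nil, nil, 1, options).disasm
end

generic     = disasm_for("foo + bar", false)
specialized = disasm_for("foo + bar", true)

# opt_plus only shows up when specialized instructions are enabled.
puts "generic uses opt_plus:     #{generic.include?('opt_plus')}"
puts "specialized uses opt_plus: #{specialized.include?('opt_plus')}"
```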
  23. Raw Instructions
    /**
      ruby insn object list -> raw instruction sequence
     */
    static int
    iseq_set_sequence(rb_iseq_t *iseq, LINK_ANCHOR *anchor)
    {
        LABEL *lobj;
        INSN *iobj;
        struct iseq_line_info_entry *line_info_table;
        unsigned int last_line = 0;
        LINK_ELEMENT *list;
        VALUE *generated_iseq;
    (Callout: List of actual instructions and parameters)
  24. Adds instruction to the list
    generated_iseq[code_index] = insn;
    …
    case TS_VALUE: /* VALUE */
      {
        VALUE v = operands[j];
        generated_iseq[code_index + 1 + j] = v;
        /* to mark ruby object */
        iseq_add_mark_object(iseq, v);
        break;
      }
    (Callouts: Put the instruction in, Add Parameter)
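From Ruby, a rough view of that instruction-plus-parameters list is available through `RubyVM::InstructionSequence#to_a`. This is a sketch of the Ruby-level representation only. It assumes the array form mirrors the idea on the slide and does not expose the raw `generated_iseq` memory.

```ruby
iseq = RubyVM::InstructionSequence.new("puts 'foo'")

# to_a returns a header followed by metadata; the last element is the body.
data = iseq.to_a
body = data.last

# The body mixes line numbers, labels/events, and instruction arrays.
body.each { |entry| p entry }

# Each instruction is an array: the instruction name followed by its operands.
instructions = body.select { |entry| entry.is_a?(Array) }
```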
  25. puts 'foo'
    $ ruby -e"puts RubyVM::InstructionSequence.new(\"puts 'foo'\").disasm"
    == disasm: #<ISeq:<compiled>@<compiled>>================================
    0000 trace 1 ( 1)
    0002 putself
    0003 putstring "foo"
    0005 opt_send_without_block <callinfo!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
    0008 leave
    (Callouts: insn, "foo")
  26. Instruction Format
    @c: category
    @e: english description
    @j: japanese description
    instruction form:
      DEFINE_INSN
      instruction_name
      (instruction_operands, ..)
      (pop_values, ..)
      (return value)
      {
         .. // insn body
      }
    (Callouts: Byte Code, Stack)
  27. putstring Instruction
    /**
      @c put
      @e put string val. string will be copied.
      @j Copies the string and pushes it onto the stack.
     */
    DEFINE_INSN
    putstring
    (VALUE str)
    ()
    (VALUE val)
    {
      val = rb_str_resurrect(str);
    }
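The "string will be copied" behavior is visible from Ruby: each evaluation of a string literal yields a separate object. A small sketch, assuming the default setting where string literals are not frozen; the `greeting` method is hypothetical.

```ruby
# Every call re-executes the string literal, which pushes a fresh copy.
def greeting
  "foo"
end

first  = greeting
second = greeting

puts first == second      # same contents
puts first.equal?(second) # but not the same object
```

With `frozen_string_literal` enabled the literal would be shared instead, so the second check depends on that setting.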
  28. VM Optimization List
    $ ruby -e'p RubyVM::OPTS'
    ["direct threaded code", "operands unification", "inline method cache"]
  29. AHA! If it’s just a list of integers, we should be able to save that list to a file, then load it again.
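Ruby 2.3 and later expose this idea as `RubyVM::InstructionSequence#to_binary` and `.load_from_binary`. The sketch below assumes such a Ruby version and uses a temporary file. The binary format is tied to the Ruby build that produced it, so this is a round trip within one interpreter, not a portable format.

```ruby
require 'tempfile'

# Compile a tiny program without running it.
iseq = RubyVM::InstructionSequence.compile("1 + 2")

result = nil
Tempfile.create('iseq') do |file|
  # Save the compiled instruction sequence to disk.
  file.binmode
  file.write(iseq.to_binary)
  file.flush

  # Load it back and execute the loaded copy.
  loaded = RubyVM::InstructionSequence.load_from_binary(File.binread(file.path))
  result = loaded.eval
end

puts result
```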
  30. Adds instruction to the list
    generated_iseq[code_index] = insn;
    …
    case TS_VALUE: /* VALUE */
      {
        VALUE v = operands[j];
        generated_iseq[code_index + 1 + j] = v;
        /* to mark ruby object */
        iseq_add_mark_object(iseq, v);
        break;
      }
    (Callout: Add Parameter)
  31. Interpolation vs String#+
    # "#{foo}#{bar}" vs foo + bar
    z = RubyVM::InstructionSequence.new <<-eoruby
    x = "\#{foo}\#{bar}"
    eoruby
    puts z.disasm
    z = RubyVM::InstructionSequence.new <<-eoruby
    x = foo + bar
    eoruby
    puts z.disasm
  32. Which is faster?
    TABLE = {
      'foo' => 'bar'.freeze,
      'bar' => 'baz'.freeze
    }
    def table_lookup x
      TABLE[x]
    end
    def case_lookup x
      case x
      when 'foo' then 'bar'.freeze
      when 'bar' then 'baz'.freeze
      end
    end
  33. Now which is faster?
    TABLE = {
      'foo' => 'bar'.freeze,
      'bar' => 'baz'.freeze
    }
    def table_lookup x
      TABLE[x] || 'omg'.freeze
    end
    def case_lookup x
      case x
      when 'foo' then 'bar'.freeze
      when 'bar' then 'baz'.freeze
      when nil then 'omg'.freeze
      end
    end
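The question on this slide can be answered by measuring on your own machine. The harness below is a rough sketch: `time_calls`, the key list and the iteration count are assumptions, and the printed timings depend on the Ruby version and hardware.

```ruby
TABLE = {
  'foo' => 'bar'.freeze,
  'bar' => 'baz'.freeze
}

def table_lookup(x)
  TABLE[x] || 'omg'.freeze
end

def case_lookup(x)
  case x
  when 'foo' then 'bar'.freeze
  when 'bar' then 'baz'.freeze
  when nil then 'omg'.freeze
  end
end

# Hypothetical helper: wall-clock seconds spent running the block n times.
def time_calls(n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

keys = ['foo', 'bar', nil]
n = 100_000

table_time = time_calls(n) { keys.each { |k| table_lookup(k) } }
case_time  = time_calls(n) { keys.each { |k| case_lookup(k) } }

printf("table_lookup: %.4fs\n", table_time)
printf("case_lookup:  %.4fs\n", case_time)
```

For careful comparisons, a dedicated benchmarking tool with warm-up and multiple runs gives steadier numbers than a single timing loop.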
  34. yarvarch.ja
    #title YARV Architecture
    #set author Nihon Ruby no Kai, Koichi Sasada
    - 2005-03-03(Thu) 00:31:12 +0900 Rewrote various parts
    ----
    * What is this?
    These are design notes for [[YARV: Yet Another RubyVM|http://www.atdot.net/yarv]].
    YARV provides the following features for Ruby programs:
    - Compiler
    - VM Generator
    - VM (Virtual Machine)
    - Assembler
    - Dis-Assembler
    - (experimental) JIT Compiler
    - (experimental) AOT Compiler
    YARV is currently implemented as an extension library for the Ruby interpreter. …