High-Performance Asynchronous Applications with (or without) Ruby - By Scott Tadman

Scott Tadman's presentation at Burlington Ruby Conf. 2012

adam bouchard

July 30, 2012

Transcript

  1. PostageApp • Simply: “Email as a Service” • Created in 2009 as an experiment. • Boils down to: Laziness, Impatience and Hubris. 2 Monday, July 30, 12
  2. Why PostageApp? • Sending email is generally not fun. • Sending email correctly is even less fun.
  3. Why PostageApp? • Sending email is generally not fun: • Email support in most frameworks is usually primitive. • SMTP is not a reliable transport. • Creating a background job is not always possible.
  4. Why PostageApp? • Sending email correctly is even less fun: • Standards for composing HTML email are tricky. • Changes to templates usually require re-deploying. • Previews not always easy to simulate.
  5. Mission Accomplished • Prototype is a Ruby on Rails application • Sending engine is a Rails-based process. • 100% ActiveRecord and MySQL
  6. How does Rails Scale? • Front-end performance is great! • Rails background workers...not so great.
  7. Rails-Based Workers • Each worker process was limited: • Could only send one email at a time. • Every email required a new connection. • Process footprint was large (~100MB)
  8. Why Use Rails for Workers? • Easy. • Literally. • Synchronous, blocking code is: • Easy to write. • Not hard to debug.
  9. Why is Rails Synchronous? • Long tradition in PHP, Python, Perl, etc. • Database calls are supposed to be fast. • Not fast enough? • Optimize. • Cache. • Optimize some more! Get creative.
  10. Limits of Synchronous • Some things can never be cached or optimized: • Third-party external servers. • Third-party external networks. • Basically stuff you have no control over.
  11. Problems with Blocking • Entire process grinds to a halt while waiting. • Memory is still in use. • Lights on, nobody home. • More workers requires more memory.
  12. If only... • Your processing application could: • Do other things while waiting for a response? • Juggle multiple jobs at the same time? • Make use of multiple CPU cores?
  13. Multi-Threaded Code • So you make your application multi-threaded. • Now yohaveu protwoblems: • Partition resources carefully? • Lock shared resources aggressively? • No magic bullet here. • Oh no. Deadlocks.
  14. Let’s Go Async! • All the cool kids are doing it. • JavaScript in the browser: • jQuery $.ajax • JavaScript on the server: • Node.js
  15. Asynchronous JavaScript • In the browser: • Blocking calls can lock the JavaScript VM. • Your computer never needs more beach-ball. • The “A” in AJAX actually stands for “Awesome” • The “X” stands for “JSON”
  16. Asynchronous Server? • Services implemented this way: • IRC • DNS • SNMP • SMTP!
  17. select() • Workhorse of traditional UNIX networking. • Improved upon with epoll and kqueue • Want to use those in Ruby? • EventMachine
  18. EventMachine • Ruby’s answer to Node.js • Implements the “Event Machine” pattern. • Built on similar libraries. • Recognizes slow networking is the problem to solve.
  19. Network Calls • Your application is heavily networked: • Database queries. • HTTP requests. • External APIs. • Shared cache systems.
  20. Event Driven • An “event” is something that happens. • Client example: Clicking on a page. • Server example: Receiving DNS request.
  21. Event Loop • In a nutshell: • Wait for stuff to happen. • When stuff happens, deal with it... • ...quickly. • Seriously, are you done yet?
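
The loop described above can be sketched with nothing more than Ruby's built-in `IO.select`. This is a toy illustration and not how EventMachine itself is implemented: the `handlers` table, the pipe standing in for a network socket, and the one-second timeout are all assumptions made for the example.

```ruby
# Toy event loop on top of IO.select (illustrative only; EventMachine's
# real reactor is far more elaborate). A pipe stands in for a socket.
reader, writer = IO.pipe

received = []   # lines delivered to the handler so far
handlers = {}   # hypothetical table: IO object => callable run when readable

handlers[reader] = lambda do |io|
  line = io.gets
  if line.nil?
    # End of stream: stop watching this IO so the loop can finish.
    handlers.delete(io)
    io.close
  else
    received << line.chomp   # keep each unit of work small and quick
  end
end

# Simulate two events arriving, then the peer hanging up.
writer.puts("first event")
writer.puts("second event")
writer.close

until handlers.empty?
  # Wait (up to 1 second) for stuff to happen...
  ready, = IO.select(handlers.keys, nil, nil, 1.0)
  break if ready.nil?   # timed out with nothing to do
  # ...and when it does, deal with it quickly.
  ready.each { |io| handlers[io].call(io) }
end
```

Each handler does one small piece of work and returns, which is what keeps the loop free to service other descriptors.
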
  22. Event Responses • Break up longer operations into small units of work. • Keep the event stream flowing. • For best results: • Use external resources to do heavy lifting (DB, etc.) • Enable long-running tasks to pause and resume. • “Break time”
  23. Blocks • A block of code that’s passed in to a function. • Many ways to create: • lambda • Proc.new • do ... end • Blocks make Ruby very flexible.
  24. Definition of a Block • A small unit of code passed to a method. • May be executed zero or more times. • May be executed immediately... • ...or at some unspecified time in the future.
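
A short sketch of those points in plain Ruby. The method names `twice`, `never` and `later` are made up for the illustration; only `lambda`, `Proc.new`, `yield` and the `&block` parameter are standard Ruby.

```ruby
# Two ways to create a callable chunk of code as an object.
square = lambda { |n| n * n }
double = Proc.new { |n| n * 2 }

# A method can run the block it was given immediately, more than once...
def twice
  yield
  yield
end

# ...not at all...
def never
  :block_ignored
end

# ...or capture it with & and leave it to be run later.
def later(queue, &block)
  queue << block
end

count = 0
twice do
  count += 1
end

never { count += 100 }

deferred = []
later(deferred) { count += 10 }
count_before = count      # the deferred block has not run yet
deferred.each(&:call)     # "some unspecified time in the future"
```

The last case, a block stored now and called later, is the one asynchronous code depends on.
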
  25. lambda vs. function() • Ruby has lambda { } • JavaScript has function () • Superficially very similar. • Ruby’s syntax advantage: • Append do...end to any method call. • Makes passing methods almost too easy.
  26. Async JavaScript • Example: • async_method(function(r) { ... }); • async_method({ callback: function(r) { ... } }); • $.ajax(...).done(function (data) { ... });
  27. Async JavaScript • Example from jQuery’s documentation:

        var jqxhr = $.ajax( "example.php" )
          .done(function() { alert("success"); })
          .fail(function() { alert("error"); })
          .always(function() { alert("complete"); });

    • That looks easy enough. • What about chaining operations? • Uh oh...
  28. Asynchronous Ruby • Give a method a block to call: • ...when the operation is complete. • ...when something went wrong. • ...when it timed out.
  29. Asynchronous Ruby • Chaining operations: • Often the result of one action informs the next: • Fetching additional records. • Error recovery. • May skip steps if data already cached.
  30. Nested Calls • Typical Example: • User... • ...belongs to an Account... • ...which has Notices. • ...has many Messages.
  31. Nested Calls • Synchronous Implementation: • user = User.find(...) • account = user.account • notice = Notice.find(account.last_notice_id) • m_count = Message.where(:user_id => user.id).count
  32. Nested Calls • Asynchronous Implementation: • User.async_find(...) do |user| • Account.async_find(user.account_id) do |account| • Notice.async_find(account.last_notice_id) do |notice| • Message.where(:user_id => user.id).async_count ...
  33. Asynchronous Calls • Without convention you have anarchy: • Multiple callback styles. • Differing return types and result codes. • Predictable, dependable behavior is essential. • Limitations imposed by Ruby need to be respected: • Don’t try to make it something it isn’t.
  34. Asynchronous House Rules • A well-behaving asynchronous method will... • Call the supplied block with a well-defined response: • Once and once only. • Always. • Even if something unexpected happens. • Never trigger any exceptions it can’t handle.
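
One way these rules might look in practice, as a minimal sketch. `TinyReactor`, `Response` and `async_lookup` are hypothetical names invented here, and the queue only imitates an event loop; none of this comes from EventMachine or from the talk.

```ruby
# A stand-in for an event loop's queue of pending work (hypothetical).
class TinyReactor
  def initialize
    @pending = []
  end

  def defer(&work)
    @pending << work
  end

  def run
    @pending.shift.call until @pending.empty?
  end
end

# The well-defined response: a status plus a value or error message.
Response = Struct.new(:status, :value)

# Hypothetical async lookup that follows the house rules: the block is
# called exactly once, with a Response, whether or not the lookup fails.
def async_lookup(reactor, table, key, &callback)
  reactor.defer do
    response =
      begin
        Response.new(:ok, table.fetch(key))
      rescue StandardError => e
        # Handle the failure here; the original caller cannot rescue it later.
        Response.new(:error, e.message)
      end
    callback.call(response)
  end
  nil   # nothing useful to return; the result arrives via the block
end

reactor = TinyReactor.new
table = { "alice" => 1 }
seen = []

async_lookup(reactor, table, "alice") { |r| seen << r.status }
async_lookup(reactor, table, "bob")   { |r| seen << r.status }
seen_before_run = seen.dup   # callbacks have not fired yet
reactor.run
```

The callback is invoked outside the `begin`/`rescue`, so an error raised by the callback itself cannot cause a second invocation.
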
  35. Asynchronous Exceptions • Don’t do things that might cause trouble: • Be aware of what exceptions methods can cause. • Catch and handle them where they occur. • Exceptions will not be caught by the caller. • Exceptions will crash your entire application.
  36. Asynchronous Conditions

        conditional_async_method do |result_id|
          if (cached = @cache[result_id])
            yield(cached)
          else
            fetch_and_cache(result_id) do |result|
              yield(result)
            end
          end
        end
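
A runnable variant of the snippet above, under the assumption that it is meant as the body of a method taking `result_id` and a block. The `CachedFinder` class, its Hash backend and the `deliver` method that simulates a late reply are all hypothetical scaffolding.

```ruby
# Hypothetical wrapper; reads the slide's snippet as a method body.
class CachedFinder
  attr_reader :fetches

  def initialize(backend)
    @backend = backend   # any Hash-like store standing in for a slow service
    @cache = {}
    @pending = []        # callbacks waiting for the simulated reply
    @fetches = 0
  end

  def conditional_async_method(result_id)
    if (cached = @cache[result_id])
      yield(cached)                      # cache hit: answer immediately
    else
      fetch_and_cache(result_id) do |result|
        yield(result)                    # cache miss: answer once fetched
      end
    end
  end

  # Simulates the replies arriving later.
  def deliver
    @pending.shift.call until @pending.empty?
  end

  private

  def fetch_and_cache(result_id, &callback)
    @fetches += 1
    @pending << lambda do
      callback.call(@cache[result_id] = @backend[result_id])
    end
  end
end

finder = CachedFinder.new({ 7 => "seven" })
results = []
finder.conditional_async_method(7) { |r| results << r }   # miss: queued
finder.deliver
finder.conditional_async_method(7) { |r| results << r }   # hit: immediate
```

Either branch hands the caller's block exactly one result, so the caller does not need to know whether the cache was used.
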
  37. Leaving the Nest • Multiply nested asynchronous calls: • Tend to grow more complicated. • Optional steps are hard to express. • End up very hard to debug. • Make for a very deep stack. • Maybe there’s a better way.
  38. Leaving the Nest • Reasons to find an alternative: • Do you know what your application is doing? • What callbacks are still outstanding? • Why the application is not responding? • Where that asynchronous call was initiated?
  39. Stack Trace • EventMachine’s core is an event loop. • while (true) do ... end • The ... is the magical EventMachine stuff. • Asynchronous code not reflected in stack trace... • ...unless executing at that exact moment. • ...which is unlikely.
  40. State Machine Pattern • A state machine is: • One or more formally defined steps: • To complete an operation. • To handle conditions. • Basically like a flow-chart.
  41. State Machine Benefits • Using a state machine to track async code: • Encapsulates a multi-stage process. • Provides insight into completion status. • Easy to hook in to for logging and benchmarking. • Makes it easy to find where things have stalled. • Easily rendered as a diagram.
  42. Use Case: SMTP • Line-based protocol • Very simple syntax • Well documented in various RFCs
  43. Use Case: SMTP • Example command stream sent to remote server: • HELO myhostname.net • MAIL FROM:<someone@example.com> • RCPT TO:<someone@yahoo.com> • DATA • QUIT
  44. State Machine Example • Example from remailer library:

        state :helo do
          enter do
            send_line("HELO #{hostname}")
          end

          interpret(250) do
            if (requires_authentication?)
              enter_state(:auth)
            else
              enter_state(:established)
            end
          end
        end
  45. State Machine Example • Example from remailer library:

        state :auth do
          enter do
            send_line("AUTH PLAIN #{encode_authentication(username, password)}")
          end

          interpret(235) do
            enter_state(:established)
          end

          interpret(535) do |reply_message, continues|
            handle_reply_continuation(535, reply_message, continues) do |reply_code, reply_message|
              error_notification(reply_code, reply_message)
              enter_state(:quit)
            end
          end
        end
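
The remailer excerpts rely on that library's own DSL. The same idea can be sketched without any dependencies, as below. This is not the remailer API: the `SmtpDialog` class, its state names and the `receive` method are invented for illustration, and the sketch skips authentication, multi-line replies and dot-stuffing of the message body.

```ruby
# Dependency-free sketch of a reply-code-driven state machine.
# NOT the remailer API; class and state names are illustrative.
class SmtpDialog
  attr_reader :state, :sent, :history

  def initialize(hostname, from, to, body)
    @hostname = hostname
    @from = from
    @to = to
    @body = body
    @state = :connecting
    @sent = []             # lines we would write to the socket
    @history = [@state]    # every state visited, handy for debugging
  end

  def finished?
    [:done, :failed].include?(@state)
  end

  # Feed in one numeric reply code from the server.
  def receive(code)
    return if finished?

    case [@state, code]
    when [:connecting, 220] then enter(:helo, "HELO #{@hostname}")
    when [:helo, 250]       then enter(:mail_from, "MAIL FROM:<#{@from}>")
    when [:mail_from, 250]  then enter(:rcpt_to, "RCPT TO:<#{@to}>")
    when [:rcpt_to, 250]    then enter(:data, "DATA")
    when [:data, 354]       then enter(:body, "#{@body}\r\n.")
    when [:body, 250]       then enter(:quit, "QUIT")
    when [:quit, 221]       then enter(:done)
    else                         enter(:failed)
    end
  end

  private

  def enter(new_state, line = nil)
    @state = new_state
    @history << new_state
    @sent << line if line
  end
end

dialog = SmtpDialog.new("myhostname.net", "someone@example.com",
                        "someone@yahoo.com", "Hello")
[220, 250, 250, 250, 354, 250, 221].each { |code| dialog.receive(code) }

rejected = SmtpDialog.new("myhostname.net", "someone@example.com",
                          "someone@yahoo.com", "Hello")
[220, 250, 550].each { |code| rejected.receive(code) }
```

Because every transition passes through `enter`, the `history` array shows exactly where a stalled or failed conversation stopped.
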
  46. State Machine Use Cases • Best applied to problems that: • Have a complicated, multi-step procedure. • May need to recover from crashes. • Require a standardized way of tracking progress. • Need insight into what’s “going on”.
  47. Fibers • Facility within Ruby since 1.8.6 • Even more confusing than blocks to the uninitiated. • Not many libraries make use of them. • ...but that seems to be changing.
  48. Fibers in Ruby • Fibers are like blocks that you can pause and resume. • Asynchronous operations often involve a lot of that. • So... • Fibers + asynchronous code are best friends?
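
A minimal illustration of pausing and resuming, using only Ruby's core `Fiber` class. The log messages and values are arbitrary.

```ruby
log = []

fiber = Fiber.new do
  log << "started"
  reply = Fiber.yield(:waiting)     # pause here; hand :waiting to the resumer
  log << "resumed with #{reply}"
  :finished                         # the block's value is returned by the last resume
end

first = fiber.resume                # runs the fiber until Fiber.yield
log << "main code runs while the fiber is paused"
second = fiber.resume("a result")   # continues right after Fiber.yield
```

The value passed to the second `resume` becomes the return value of `Fiber.yield` inside the fiber, which is the hook an asynchronous result can be delivered through.
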
  49. High Fiber Ruby • Instead of passing callback blocks to async methods... • ...yield control to that method... • ...then resume control when a response is received. • Seems simple enough, right? • Simplicity is a compelling reason to use them.
  50. High Fiber Ruby • Blocking synchronous code: • user = User.find(...) • account = user.account • ...
  51. High Fiber Ruby • Callback-driven asynchronous code: • User.find_async(...) do |user| • Account.where(...).find_async do |account| • ...
  52. High Fiber Ruby • Fibered asynchronous code: • user = User.find(...) • account = user.account • ...
  53. Fiber Wrapper Example

        def find_async(id)
          fiber = Fiber.current
          standard_async_query(...) do |result|
            fiber.resume(result)
          end
          Fiber.yield
        end

        Fiber.new do
          user = find_async(id)
        end
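
To see the wrapper run end to end, it needs something that fires the callback later. The sketch below supplies that with a hypothetical `FakeReactor` queue and a `UserStore` backed by a Hash; both are assumptions made for the example, while `find_async` and `standard_async_query` keep the names from the slide.

```ruby
# On Ruby versions before 3.1, Fiber.current needs: require 'fiber'

# Hypothetical stand-in for an event loop: runs queued work later.
class FakeReactor
  def initialize
    @queue = []
  end

  def defer(&work)
    @queue << work
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

# Hypothetical data source with a callback-style query method.
class UserStore
  def initialize(reactor, rows)
    @reactor = reactor
    @rows = rows
  end

  # Callback-style API: the result arrives later, via the block.
  def standard_async_query(id, &callback)
    @reactor.defer { callback.call(@rows[id]) }
  end

  # Fiber wrapper: reads like a blocking call, but only pauses this fiber.
  def find_async(id)
    fiber = Fiber.current
    standard_async_query(id) do |result|
      fiber.resume(result)   # wake the paused fiber with the result
    end
    Fiber.yield              # pause until the callback resumes us
  end
end

reactor = FakeReactor.new
store = UserStore.new(reactor, { 1 => "alice" })

user = nil
Fiber.new { user = store.find_async(1) }.resume
user_while_waiting = user    # still nil: the fiber is paused
reactor.run                  # the callback fires and resumes the fiber
```

The wrapper assumes the callback runs later, from the reactor. If `standard_async_query` invoked its block synchronously, the fiber would try to resume itself while still running and Ruby would raise a `FiberError`.
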
  54. Fiber Future • Making Rails more Fiber-friendly. • Need more Fiber-aware libraries. • Not unlike being “thread-aware” or “thread-safe” • All non-trivial applications can benefit. • ...if the cost of change is low. • Fibers can make it easy.
  55. GitHub Resources • https://github.com/eventmachine/eventmachine • eventmachine • https://github.com/igrigorik • em-synchrony, em-websocket, etc. • em-mysqlplus based on em-mysql • https://github.com/tmm1/ • em-mysql