Kavya Joshi - A tale of concurrency through creativity in Python: a deep dive into how gevent works.

gevent is an open source Python library for asynchronous I/O. It provides a powerful construct to build concurrent applications; think threads, except lightweight and cooperatively scheduled. We will delve into how gevent is architected from its building blocks — sophisticated coroutines, an event loop, and a dash of creativity to neatly integrate them.

https://us.pycon.org/2016/schedule/presentation/2015/

PyCon 2016

May 29, 2016

Transcript

  1. def download_photos(user):
         # Open a connection to the server
         conn = get_authenticated_connection(user)
         # Download all photos
         photos = get_photos(conn)
         # Save for later display
         save_photos(user, photos)
  2. import multiprocessing as mp
     def downloader():
         pool = []
         for user in users:
             # Process takes the callable as target= and its arguments as args=
             p = mp.Process(target=download_photos, args=(user,))
             pool.append(p)
             p.start()
         for p in pool:
             p.join()
  3. import threading
     def downloader():
         pool = []
         for user in users:
             # Thread takes the callable as target= and its arguments as args=
             t = threading.Thread(target=download_photos, args=(user,))
             pool.append(t)
             t.start()
         for t in pool:
             t.join()
  4. import twisted
     def download_photos():
         # Modify this to add callbacks
     def downloader():
         # Something something
         loop.run()
  5. green threads
     user space: the OS does not create or manage them
     cooperatively scheduled: the OS does not schedule or preempt them
     lightweight
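
As a rough illustration of the "user space, cooperatively scheduled" idea on slide 5, here is a minimal sketch built on plain Python generators rather than greenlets. It is an analogy and not how gevent is implemented: `worker`, `run_all`, and the task names are hypothetical, and each `yield` stands in for the point where a green thread would voluntarily give up control.

```python
from collections import deque


def worker(name, steps, log):
    """A task that does a little work, then voluntarily gives up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative yield point; nothing can preempt the task before this


def run_all(tasks):
    """Round-robin scheduler that lives entirely in user space."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task until its next yield
        except StopIteration:
            continue  # the task finished, so drop it
        ready.append(task)  # otherwise reschedule it at the back of the queue


log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

The OS sees a single thread throughout; the interleaving comes only from the tasks handing control back to the scheduler.
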
  6. import gevent
     from gevent import monkey; monkey.patch_all()
     def downloader():
         pool = []
         for user in users:
             g = gevent.Greenlet(download_photos, user)
             g.start()
             pool.append(g)
         gevent.joinall(pool)
  7. from greenlet import greenlet
     ...
     class Greenlet(greenlet):
         """
         A light-weight cooperatively-scheduled execution unit.
         """
         ...
     ?
     g = gevent.Greenlet(download_photos, user)
  8. from greenlet import greenlet
     def print_red():
         print('red')
         gr2.switch()
         print('red done!')
     def print_blue():
         print('blue')
         gr1.switch()
         print('blue done!')
     gr1 = greenlet(print_red)
     gr2 = greenlet(print_blue)
     gr1.switch()
     Output:
     red
     blue
     red done!
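  9. [Diagram: C STACK. Stack pointers SP1, SP2, SP3 are marked across the calls gr1.switch(), gr2.switch(), gr1.switch(); the saved stack regions are labeled { base = SP1 }, { base = SP1, start = SP2 }, and { base = SP2 }.]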
  10. import gevent
      from gevent import monkey; monkey.patch_all()
      def downloader():
          pool = []
          for user in users:
              g = gevent.Greenlet(download_photos, user)
              g.start()
              pool.append(g)
          gevent.joinall(pool)
  11. def start(self):
          """
          Schedule the greenlet to run in this loop iteration.
          """
          if self._start_event is None:
              self._start_event = \
                  ...loop.run_callback(self.switch)
      g.start()
  12. “Hey loop, wait for a write on this socket and call parse_recv() when that happens.”
  13. fd = make_nonblocking(socket_fd)
      loop.io_watch(fd, write, callback_fn)
      loop.run()
      while True:
          call all pre_block_watchers
          block for I/O
          call all post_block_watchers
          call pending io_watchers
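
The loop pseudocode on slide 13 can be approximated with the standard library. This is a minimal sketch that uses `selectors` in place of the event loop library gevent wraps; the `Loop` class and its `unwatch` method are hypothetical, `io_watch` only borrows its name from the slide, and the example watches for readability rather than a write so that it can run against a local `socket.socketpair()`.

```python
import selectors
import socket


class Loop:
    """Tiny event loop: register (socket, event, callback), then run()."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def io_watch(self, sock, event, callback):
        self.selector.register(sock, event, callback)

    def unwatch(self, sock):
        self.selector.unregister(sock)

    def run(self):
        # Keep going while at least one watcher is registered.
        while self.selector.get_map():
            for key, _ in self.selector.select():  # block for I/O
                key.data(key.fileobj)  # call the pending watcher's callback


received = []
loop = Loop()
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)


def parse_recv(sock):
    received.append(sock.recv(1024))
    loop.unwatch(sock)  # nothing left to wait for, so run() can return


loop.io_watch(right, selectors.EVENT_READ, parse_recv)
left.sendall(b"photo bytes")
loop.run()
left.close()
right.close()
print(received)  # [b'photo bytes']
```

The sketch leaves out the pre-block and post-block watchers; it only covers the "block for I/O, then call pending watchers" core of the loop.
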
  14. always call pre_block_watchers
      Hook to integrate other event mechanisms into the loop.
      “Hey loop, if there are coroutines ready to run, run them. Then, block for a write on...”
  15. import gevent
      from gevent import monkey; monkey.patch_all()
      def downloader():
          pool = []
          for user in users:
              g = gevent.Greenlet(download_photos, user)
              g.start()
              pool.append(g)
          gevent.joinall(pool)
  16. HUB

  17. Hub resumes
      loop.run():
          call pre_block_watchers = [] ...
          block for I/O
          call pending io_watchers = [g1.switch]
      g1.switch() -> g1: download_photos(user1)
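
The hub behaviour sketched on slides 14 to 17 (run whatever is ready, then block for I/O, then resume the tasks whose sockets fired) can be imitated with generators and `selectors`. This is a simplified stand-in under stated assumptions, not gevent's Hub: the `Hub` class, `spawn`, and this `download_photos` signature are hypothetical, and a task "switches to the hub" by yielding the socket it wants to read from.

```python
import selectors
import socket
from collections import deque


class Hub:
    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.callbacks = deque()  # tasks ready to run before the loop blocks

    def spawn(self, task):
        self.callbacks.append(task)  # schedule the task for this loop iteration

    def _step(self, task):
        try:
            sock = next(task)  # run the task until it needs I/O
        except StopIteration:
            return  # the task finished
        self.selector.register(sock, selectors.EVENT_READ, task)

    def run(self):
        while self.callbacks or self.selector.get_map():
            while self.callbacks:  # 1. run everything that is ready
                self._step(self.callbacks.popleft())
            if not self.selector.get_map():
                break  # nobody is waiting on I/O
            for key, _ in self.selector.select():  # 2. block for I/O
                self.selector.unregister(key.fileobj)
                self.callbacks.append(key.data)  # 3. task becomes ready again


def download_photos(user, sock, results):
    yield sock  # hand control to the hub until sock is readable
    results[user] = sock.recv(1024)


hub = Hub()
results = {}
pairs = [socket.socketpair() for _ in range(2)]
for index, (server, client) in enumerate(pairs, start=1):
    server.sendall(f"photos-{index}".encode())
    hub.spawn(download_photos(f"user{index}", client, results))
hub.run()
for server, client in pairs:
    server.close()
    client.close()
print(results)
```

Both downloads make progress on one thread: each task runs until it would block, and the hub resumes it only once its socket is readable.
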
  18. minuses
      no parallelism
      non-cooperative code will block the entire process:
          C-extensions -> use pure Python libraries
          compute-bound greenlets -> use gevent.sleep(0)
          -> use greenlet blocking detection
      monkey-patch may have confusing implications
          order of imports matters
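
To make the "blocking detection" remedy on slide 18 concrete, here is a minimal sketch of the idea using a generator-based scheduler: time each uninterrupted step and flag any task that holds the loop longer than a threshold. It does not use gevent's or greenlet's actual detection hooks; `run_with_detection`, `polite`, `hog`, and the 0.1 second threshold are all hypothetical.

```python
import time
from collections import deque


def run_with_detection(tasks, max_step_seconds):
    """Round-robin over generator tasks and report steps that hog the loop."""
    offenders = []
    ready = deque(tasks.items())
    while ready:
        name, task = ready.popleft()
        started = time.monotonic()
        try:
            next(task)  # run until the task yields control back
            finished = False
        except StopIteration:
            finished = True
        if time.monotonic() - started > max_step_seconds:
            offenders.append(name)  # this step blocked every other task
        if not finished:
            ready.append((name, task))
    return offenders


def polite():
    for _ in range(3):
        yield  # yields often, so other tasks get to run


def hog():
    time.sleep(0.2)  # stand-in for a compute-bound or non-cooperative call
    yield


offenders = run_with_detection(
    {"polite": polite(), "hog": hog()}, max_step_seconds=0.1
)
print(offenders)
```

While `hog` is inside its long step, `polite` cannot run at all, which is the single-process stall the slide warns about. Breaking the long step up with extra yield points is the analogue of sprinkling `gevent.sleep(0)` through a compute-bound greenlet.
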
  19. …but excellent for workloads that are:
      I/O bound, highly concurrent -> 20-30k concurrent connections!
      Used at “web scale” at:
      Pinterest, Facebook, Mixpanel, PayPal, Disqus, Nylas…