November 05, 2014

Rudy Gilmore - Parallel processing - PyDSLA meetup - Nov 2014

  1. Intro to Multiprocessing with Python

     Rudy Gilmore
     Data Scientist, TrueCar Analytics Team
     PyData Meetup, 11/3/14
  2. Code Parallelization

     • Modern processor cores are not becoming much faster, but are becoming more numerous
     • Many problems in analytics are easily parallelizable
     • Writing parallel code will often allow you to finish in 1/nth the time on n cores
     • Amdahl’s Law: the serial fraction of a program caps the achievable speedup
     • Python has some barriers to parallelization, but there are simple workarounds

     There are many options for high-performance parallel computing:
       Ø Cluster computing?
       Ø Hadoop?
       Ø Distributed processing?
       Ø GPGPUs?

     Let’s start simple: how to get multiple cores on one machine into the action.
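The slide names Amdahl's Law without stating it. As a minimal sketch, the usual form is S(n) = 1 / ((1 - p) + p/n), where p is the fraction of the runtime that can be parallelized and n is the number of workers. The function name `amdahl_speedup` and the 90% figure below are illustrative assumptions, not taken from the talk.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Theoretical speedup when `parallel_fraction` of the runtime scales across `n_workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of the worker count.
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


if __name__ == "__main__":
    # Hypothetical program that is 90% parallelizable.
    for n in (1, 2, 4, 8, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With 90% of the work parallelizable, the speedup can never reach 10x however many cores are added, which is why "1/nth the time" holds only for almost fully parallel workloads.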
  3. “Embarrassingly Parallel” (processes completely independent)

     Examples:
       1. Independent for loops
       2. .map operations on a dataset
       3. Integration
       4. Monte Carlo methods
       5. Some ML problems

     “Inherently Serial” (difficult or impossible to run in parallel)
     Example: numerical PDEs

     “Somewhat Parallelizable” (some communication needed)
     Example: sorting

     Parallel algorithms can be classified by the data transfer required between processes; this transfer can be done via message passing or shared memory.
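As an illustration of the "embarrassingly parallel" category, here is a minimal Python 3 sketch of a Monte Carlo estimate of pi, split into chunks that never need to talk to each other. The names `count_hits` and `estimate_pi`, the chunk sizes, and the use of seeds as chunk identifiers are all assumptions made for this example.

```python
import random
import multiprocessing as mp


def count_hits(task):
    """Count random points in the unit square that fall inside the quarter circle."""
    seed, n_samples = task
    rng = random.Random(seed)  # private generator, so each chunk is independent
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_chunks=4, samples_per_chunk=100000):
    """Farm the chunks out to a pool; the only communication is the final counts."""
    tasks = [(seed, samples_per_chunk) for seed in range(n_chunks)]
    with mp.Pool(processes=n_chunks) as pool:
        hits = pool.map(count_hits, tasks)
    return 4.0 * sum(hits) / (n_chunks * samples_per_chunk)


if __name__ == "__main__":
    print(estimate_pi())
```

Each worker receives only a small tuple and returns a single integer, so the data transfer between processes is negligible compared with the computation.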
  4. Python’s Global Interpreter Lock (GIL)

     Only one thread may execute code in the Python interpreter at a time.

     • Multiple threads automatically take turns, switching at a standard interval
     • The GIL is a feature of CPython, the reference implementation; some other implementations, such as Jython and IronPython, do not have this limitation
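The "standard interval" can be inspected from Python itself. This small sketch assumes Python 3.2 or later, where the interval is time-based; CPython 2.7, which the talk uses, instead exposed `sys.getcheckinterval()`, counted in bytecode instructions.

```python
import sys

# Python 3.2+: how long a thread may hold the GIL before it is asked to release it.
interval = sys.getswitchinterval()
print("GIL switch interval: %.3f s" % interval)
```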
  5. Python’s thread and threading modules

     • Provide resources for splitting a program into multiple threads
     • However, for CPU-intensive tasks there will not be any speedup from multithreading alone
     • The GIL is still in effect
     • So what good is multithreading anyway?
     • CPU-bound vs. I/O-bound: threading is useful in the latter but not the former

     [Figure: “What you want” vs. “What you’re gonna get”]
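The I/O-bound case can be sketched with `time.sleep` standing in for a blocking download or query, since a sleeping thread releases the GIL much as a thread waiting on I/O does. The helper names and delays below are hypothetical, and the code targets Python 3.

```python
import threading
import time


def fake_download(delay, results, index):
    """Stand-in for a blocking I/O call; the GIL is released while sleeping."""
    time.sleep(delay)
    results[index] = delay


def run_serial(delays):
    """Run the waits one after another and return the elapsed time."""
    results = [None] * len(delays)
    t0 = time.time()
    for i, d in enumerate(delays):
        fake_download(d, results, i)
    return time.time() - t0


def run_threaded(delays):
    """Run each wait in its own thread and return the elapsed time."""
    results = [None] * len(delays)
    threads = [threading.Thread(target=fake_download, args=(d, results, i))
               for i, d in enumerate(delays)]
    t0 = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - t0


if __name__ == "__main__":
    delays = [0.2] * 4
    print("serial:   %.2f s" % run_serial(delays))
    print("threaded: %.2f s" % run_threaded(delays))
```

The threaded run takes roughly as long as the single longest wait, while the serial run takes the sum of all waits.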
  6. multiprocessing module

     • Part of the standard library as of Python 2.6
     • Launches multiple processes
     • Each process includes a separate interpreter, and therefore a separate GIL
     • Each process operates on a separate copy of memory from the time of launch
     • Similar syntax to threading
     • Beware: processes have significant startup overhead on some operating systems, notably Windows

     [Figure: two processes, each with its own GIL (“GIL 1”, “GIL 2”)]
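The "separate copy of memory" point can be demonstrated with a module-level variable that a child process changes. This is a minimal Python 3 sketch; `counter` and `bump_and_report` are names invented for the example.

```python
import multiprocessing as mp

counter = 0  # module-level state; each process works on its own copy


def bump_and_report(q):
    """Modify the global in the child and send the child's view back to the parent."""
    global counter
    counter += 100
    q.put(counter)


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=bump_and_report, args=(q,))
    p.start()
    child_value = q.get(timeout=30)  # read the result before joining
    p.join()
    print("child saw:", child_value)     # 100
    print("parent still has:", counter)  # 0
```

The parent's `counter` is untouched, so any result a child computes has to be sent back explicitly, here through a `Queue`.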
  7. Some simple examples of threading and multiprocessing

     Running CPython v2.7.6

     First, let’s set up a CPU-bound task:

         def isprime(n):
             # Trial division up to sqrt(n).
             # Caveat: isprime(1) returns True (the loop body never runs),
             # so 1 is counted and prime(N) returns the (N-1)th true prime.
             for i in range(2, int(n**(0.5)) + 1):
                 if n % i == 0:
                     return False
             return True

         def prime(Nth, q=None):  # returns the Nth number accepted by isprime()
             n_found = 0
             i = 0
             while n_found < Nth:
                 i += 1
                 n_found = n_found + int(isprime(i))
             if q:
                 q.put(i)  # send to Queue object if set
             return i
  8. Timing serial, multithreaded, and multiprocessing runs:

         # Uses isprime() and prime() from the previous slide (Python 2 syntax).
         import time
         import threading as th
         import multiprocessing as mp

         start = 20000

         if __name__ == '__main__':
             t1 = time.time()  # time serial segment
             print prime(start), prime(start+1), prime(start+2), prime(start+3)
             print 'Serial test took', time.time() - t1, 'seconds'

             q = mp.Queue()  # must exist before the threads below reference it
             t2 = time.time()  # time multithreaded segment
             jobs = [th.Thread(target=prime, args=(start, q)),
                     th.Thread(target=prime, args=(start+1, q)),
                     th.Thread(target=prime, args=(start+2, q)),
                     th.Thread(target=prime, args=(start+3, q))]
             for j in jobs:
                 j.start()
             for j in jobs:
                 j.join()
             print 'Multithreaded test took', time.time() - t2, 'seconds'

             q = mp.Queue()
             t3 = time.time()  # time multiprocessing segment
             jobs = [mp.Process(target=prime, args=(start, q)),
                     mp.Process(target=prime, args=(start+1, q)),
                     mp.Process(target=prime, args=(start+2, q)),
                     mp.Process(target=prime, args=(start+3, q))]
             for j in jobs:
                 j.start()
             for j in jobs:
                 j.join()
             print 'Multiprocessing test took', time.time() - t3, 'seconds'
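The benchmark puts each result on a queue but never reads it back. A minimal Python 3 sketch of collecting those results follows; it drains the queue before joining, which the multiprocessing documentation recommends for processes that have put items on a queue. The helpers `is_prime` and `nth_prime` are stand-ins written for this example (with 1 excluded from the primes, unlike the slide's version), and the targets are kept small so it runs quickly.

```python
import multiprocessing as mp


def is_prime(n):
    """Trial division; numbers below 2 are not prime."""
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True


def nth_prime(nth, q=None):
    """Return the nth prime (nth >= 1); optionally report (nth, prime) on a queue."""
    found = 0
    candidate = 1
    while found < nth:
        candidate += 1
        if is_prime(candidate):
            found += 1
    if q is not None:
        q.put((nth, candidate))  # tag the result so arrival order does not matter
    return candidate


if __name__ == "__main__":
    q = mp.Queue()
    targets = [2000, 2001, 2002, 2003]
    jobs = [mp.Process(target=nth_prime, args=(n, q)) for n in targets]
    for j in jobs:
        j.start()
    results = dict(q.get() for _ in jobs)  # drain the queue before joining
    for j in jobs:
        j.join()
    print([results[n] for n in targets])
```

Tagging each result with its input lets the parent restore the original order, since processes may finish in any sequence.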
  9. Output:

         224729 224737 224743 224759
         Serial test took 3.68699979782 seconds
         Multithreaded test took 5.64900016785 seconds
         Multiprocessing test took 1.29299998283 seconds
  10. multiprocessing.Pool() provides a map-like interface with automatic parallelization among a pool of workers

         # converting into a pool process
         t4 = time.time()
         pool = mp.Pool(processes=4)
         result = pool.map(prime, range(start, start+4))
         print result
         print 'Pool test took', time.time() - t4, 'seconds'

      Output:

         Serial test took 3.68699979782 seconds
         Multithreaded test took 5.64900016785 seconds
         Multiprocessing test took 1.29299998283 seconds
         [224729, 224737, 224743, 224759]
         Pool test took 1.31299996376 seconds

      Notes:

      • Tasks should be roughly equal in size; adjust manually if possible
      • map() blocks until the job is complete; use map_async() to get a result handle back immediately
      • Multiple arguments need to be combined into a single list (or tuple) and unpacked with * inside the worker
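The last two notes can be sketched together: `map_async()` hands back an `AsyncResult` right away, and a thin wrapper unpacks a tuple of arguments with `*`. This is a Python 3 sketch; `power` and `power_star` are hypothetical functions chosen only to show the pattern.

```python
import multiprocessing as mp


def power(base, exponent):
    """A worker that naturally takes two arguments."""
    return base ** exponent


def power_star(args):
    """Unpack one tuple of arguments, since Pool.map passes a single item per call."""
    return power(*args)


if __name__ == "__main__":
    tasks = [(2, 10), (3, 3), (5, 2)]
    with mp.Pool(processes=2) as pool:
        pending = pool.map_async(power_star, tasks)  # returns an AsyncResult at once
        # ... other work could happen here while the pool is busy ...
        print(pending.get(timeout=30))  # blocks until the results are ready
```

In Python 3.3 and later, `Pool.starmap()` does the unpacking itself, so the wrapper is mainly needed on Python 2.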
  11. Further reading:

      • multiprocessing supports inter-process communication using Queue() and Pipe()
      • Support for sharing objects in memory using Value() and Array()
      • “Premature optimization is the root of all evil.” Discuss.

      In conclusion:

      • Use threading if you have a potentially blocking I/O procedure, like a download or SQL query
      • Use multiprocessing.Process() and multiprocessing.Pool() to run CPU-intensive tasks in parallel

      References:

      http://sebastianraschka.com/Articles/2014_multiprocessing_intro.html#An-introduction-to-parallel-programming-using-Python%27s-multiprocessing-module
      http://www.quantstart.com/articles/Parallelising-Python-with-Threading-and-Multiprocessing
      http://www.dabeaz.com/python/GIL.pdf
      http://calcul.math.cnrs.fr/Documents/Ecoles/2010/cours_multiprocessing.pdf
      http://pymotw.com/2/multiprocessing/communication.html#process-pools
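The shared-memory option listed under further reading can be sketched with `Value()`: several processes increment one shared integer, taking the value's built-in lock because `+=` on a shared value is not atomic. This is a Python 3 sketch; `add_many` and the counts are assumptions made for the example.

```python
import multiprocessing as mp


def add_many(counter, n):
    """Increment a shared integer n times, holding its lock for each update."""
    for _ in range(n):
        with counter.get_lock():  # read-modify-write must not interleave
            counter.value += 1


if __name__ == "__main__":
    counter = mp.Value("i", 0)  # "i" is the typecode for a signed C int
    workers = [mp.Process(target=add_many, args=(counter, 1000)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)  # 4000
```

Without the lock, updates from different processes could overwrite each other and the final count could come out lower than expected.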