QCON NYC 2014: Scaling Foursquare: from check-ins to recommendations.

Foursquare has grown from a simple check-in application to one that can tell you the best place to grab a burger. Along the way we've also become the best source of local data for apps like Instagram and Pinterest. I'll talk about the general architecture, storage systems and development practices that we've created to handle the ever increasing volume and complexity. Foursquare started out as a PHP MySQL app running on a single virtual server. Early on we switched to Scala and Mongo and have recently been aggressively splitting a monolithic application into multiple services while maintaining a single code base. I'll talk about the somewhat unique approach we've taken in our service oriented architecture. I'll also go over how we run Mongo and the additional storage systems we're running for more specialized tasks like search, key-value lookups, and offline tasks.

Jon Hoffman

June 11, 2014

Transcript

1. This is a talk about how we scaled Foursquare over the years. And by scale I mean designing our systems to handle more incoming requests, storing more data, and managing complexity. I'll give you a bit of background on where we started and how we got to our current design. This story's been told many times, by many different people, and will continue to be told several more times in this very room. So I'll try to highlight some of the more unique approaches we've taken.
2. Foursquare is a recommendation engine for finding the best places to go for things like food and drinks. We also have a newly released location app called Swarm for keeping up and meeting up with your friends. (Subtitle of talk.) Based in NYC. Around 50 million users.
3. I'll break this up into two parts. In part one I'll talk about scaling the data storage layer, and in part two I'll talk about scaling our codebase and tooling.
4. Development of Foursquare started in 2009. Initially we were using MySQL, but we quickly switched to Postgres because it happened to have better support in the ORM library we were using.
5. We overcame initial scaling challenges fairly easily (see how confident those goats look) with bigger boxes for our Postgres server and memcache in front of it to fully cache some types of objects.
6. But then things started to get a bit more challenging as we got our first million users in early 2010.
7. Soon we started to split some of the very large and fast-growing tables onto their own dedicated servers, and we switched from doing SQL joins to joining things in our application code.
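As a rough illustration of that switch, here is a minimal sketch of an application-level join; the types and field names (User, Checkin, userId) are illustrative, not Foursquare's actual schema.

```scala
// Hypothetical records standing in for rows that used to live in one database.
case class User(id: Long, name: String)
case class Checkin(id: Long, userId: Long, venue: String)

object AppJoin {
  // Instead of "SELECT ... FROM checkins JOIN users ...", fetch each side from
  // its own server and join the results in memory by key.
  def joinCheckinsWithUsers(
      checkins: Seq[Checkin],
      fetchUsers: Set[Long] => Seq[User]
  ): Seq[(Checkin, User)] = {
    val userIds   = checkins.map(_.userId).toSet               // keys for the second query
    val usersById = fetchUsers(userIds).map(u => u.id -> u).toMap
    checkins.flatMap(c => usersById.get(c.userId).map(c -> _))
  }

  def main(args: Array[String]): Unit = {
    val checkins = Seq(Checkin(1, 10, "Burger Joint"), Checkin(2, 11, "Coffee Bar"))
    val users    = Map(10L -> User(10, "Ada"), 11L -> User(11, "Grace"))
    joinCheckinsWithUsers(checkins, ids => ids.toSeq.flatMap(users.get)).foreach(println)
  }
}
```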
8. We also sent some query traffic to read-only replicas that we already had set up for failover.
9. In early 2010, our checkin rate was growing exponentially. It was clear that in a few months we'd get to a point where that single table would get too large for the available memory, or where the write rate would be too high for the available IOPS. So we knew we'd have to split that data up across multiple machines.
10. Splitting up a single type of data across multiple servers is often referred to as sharding. None of us had any prior experience doing that, but we knew that we could either build something ourselves on top of Postgres or use something that purported to handle it for us. We knew that one big challenge with sharding is rebalancing data as you add new nodes, or as data sizes become unbalanced due to the way things are split, so our inclination was to not try to solve that problem ourselves. We were already interested in Mongo because of the geo index features and schema flexibility, and they also had sharding and autobalancing on their roadmap, but those features were not yet implemented. We decided to take a gamble that Mongo would have stable sharding features by the time we really needed them.
11. So it took around a year and a half, but eventually we got all of our data into Mongo, and it's still our system of record for all user data. That's a screenshot of our real-time monitoring dashboard. We're currently running over 15 clusters with over 600 replicas, which handle over one million queries per second.
12. Mongo is great for general purpose storage, but there are a few cases where it's not the best fit for us: we have key-value data that we generate nightly and want to be able to swap in an entirely new dataset; there are some specialized queries that are difficult to make performant within Mongo; and there is some highly queried data where we're very sensitive to latency variability.
13. A lot of the data we use to service online recommendation queries is calculated offline. That data might include venue ratings based on aggregated signals, score data from machine learning models, etc. So we run MapReduce jobs to generate special files (HFiles) that contain efficient indexes preceding the actual data. Those HFiles are written out to the distributed file system; the data may be split up so that we can fit the entire data set into multiple machines' memory. The HFile server processes read the data from HDFS, and the application servers use Thrift RPC to send key lookups to the HFile servers. Zookeeper is used to coordinate everything. I'll mention Zookeeper a few more times; it's become the indispensable glue that holds everything together.
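A minimal sketch of that read path follows, with a plain Scala trait standing in for the Thrift-generated interface; the names (KeyValueLookup, InMemorySegment, SegmentedClient) are hypothetical, and in the real system the mapping of data slices to servers is coordinated through Zookeeper rather than a simple hash.

```scala
// Hypothetical stand-in for the Thrift RPC interface the application servers
// call; each "hfile server" owns a slice of the keyspace loaded into memory
// from HDFS.
trait KeyValueLookup {
  def get(key: String): Option[Array[Byte]]
}

// One server's slice of the offline-generated key-value data.
class InMemorySegment(data: Map[String, Array[Byte]]) extends KeyValueLookup {
  def get(key: String): Option[Array[Byte]] = data.get(key)
}

// Client-side view: pick the segment responsible for a key, then do the lookup.
// (Here the segment choice is a simple hash; in production the assignment
// would come from the coordination layer.)
class SegmentedClient(segments: Vector[KeyValueLookup]) extends KeyValueLookup {
  def get(key: String): Option[Array[Byte]] = {
    val idx = (key.hashCode & Int.MaxValue) % segments.size
    segments(idx).get(key)
  }
}
```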
14. We tail the Mongo operation log and send data updates to a Kafka queue. One or more consumers might read the data from Kafka and send it to a cache server. The cache server contains the business logic around how to index that data into Redis, and it also exposes RPC methods for clients to send queries. And again, all the accounting for where things live is handled by Zookeeper.
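A minimal sketch of that pipeline's consumer side, with the queue and Redis clients stubbed out as traits so the example stays self-contained; the names (OplogEntry, CacheIndexer) and the indexing choices are hypothetical, not Foursquare's actual cache service.

```scala
// Hypothetical update message produced by tailing the Mongo oplog.
case class OplogEntry(collection: String, docId: String, fields: Map[String, String])

// Stubs standing in for the Kafka consumer and the Redis client.
trait UpdateQueue { def poll(): Option[OplogEntry] }
trait RedisLike {
  def hset(key: String, field: String, value: String): Unit
  def sadd(key: String, member: String): Unit
}

class CacheIndexer(queue: UpdateQueue, redis: RedisLike) {
  // The business logic for how a document gets indexed lives here, in the
  // cache server, not in the producers or the queue.
  def indexOne(e: OplogEntry): Unit = {
    e.fields.foreach { case (f, v) => redis.hset(s"${e.collection}:${e.docId}", f, v) }
    e.fields.get("userId").foreach(uid => redis.sadd(s"byUser:$uid", e.docId))
  }

  def runOnce(): Unit = queue.poll().foreach(indexOne)
}
```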
15. Part 2: Code. The year is 2009, and Foursquare would launch at SXSW in March of that year. Dennis and Naveen, the co-founders, split up dev responsibilities, with Dennis on the server side and Naveen on the iPhone app. Dennis wrote the server-side code in PHP and did a pretty impressive job given his limited prior experience, but the code was an unmaintainable mess of inline SQL, layout, and logic. No models, no views, no controllers. Later that year, Foursquare would hire their first server engineer: Harry Heymann, who's now the head of engineering. Harry came from a Java development background and was really intrigued by Scala, so he rewrote Foursquare using a Scala web framework called Lift. That effort was really about code maintainability, not scaling.
16. We built a trace tool to account for all the RPC and database calls executed by API endpoints. I think this is one of the most important things we built early on; it was a huge help in investigating performance problems. You'll notice that you can also see the call site for the Mongo query.
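A minimal sketch of a per-request trace collector in the same spirit; the names (RequestTrace, traced) and the call-site capture are hypothetical, not the actual tool.

```scala
import scala.collection.mutable.ArrayBuffer

// One recorded RPC or database call with its timing and where it was issued from.
case class TracedCall(name: String, millis: Long, callSite: String)

class RequestTrace {
  private val calls = ArrayBuffer.empty[TracedCall]

  // Wrap an RPC / query so its duration and call site get recorded.
  def traced[T](name: String)(body: => T): T = {
    val site  = Thread.currentThread.getStackTrace.lift(2).map(_.toString).getOrElse("unknown")
    val start = System.currentTimeMillis()
    try body
    finally calls += TracedCall(name, System.currentTimeMillis() - start, site)
  }

  // Dumped at the end of the request, e.g. into the endpoint's trace view.
  def report(): String =
    calls.map(c => f"${c.millis}%5d ms  ${c.name}  @ ${c.callSite}").mkString("\n")
}
```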
17. We would also graph the number of RPC calls executed by each endpoint so we would know when we introduced unintended regressions. This was one of the first graphs we looked at when we noticed a performance regression after a code push.
18. We also built a system that allows us to dynamically switch code on or off; we call that throttles. We can switch a feature on for specific user ids, or for a consistent-hashed percentage of users. We use it for developing new features and doing gradual rollouts, among other things. This is another indispensable tool for us.
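A minimal sketch of what such a throttle check can look like, assuming a hash-based bucket per (feature, user) pair; the names (Throttle, isEnabled) and the hashing choice are illustrative, not Foursquare's actual implementation.

```scala
// Hypothetical feature "throttle": on for an explicit set of user ids, or for
// a consistent-hashed percentage of all users, so a given user's answer stays
// stable as the rollout percentage grows.
case class Throttle(name: String, allowedUserIds: Set[Long], rolloutPercent: Int)

object Throttles {
  // Bucket the (feature, user) pair into 0..99 so different features roll out
  // to different slices of the user base.
  private def bucket(featureName: String, userId: Long): Int =
    ((featureName, userId).hashCode & Int.MaxValue) % 100

  def isEnabled(t: Throttle, userId: Long): Boolean =
    t.allowedUserIds.contains(userId) || bucket(t.name, userId) < t.rolloutPercent

  def main(args: Array[String]): Unit = {
    val newVenuePage = Throttle("new-venue-page", allowedUserIds = Set(42L), rolloutPercent = 10)
    println(isEnabled(newVenuePage, 42L))   // true: explicitly allowed
    println(isEnabled(newVenuePage, 1234L)) // true for roughly 10% of users, stable per user
  }
}
```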
19. For all that time we continued to run all of Foursquare within the same binary. We had a giant monolithic codebase.
20. Scala is very slow to compile, and our builds took many minutes. With very high code churn, there was a high probability that something broke or regressed, in which case the entire process would have to be rolled back.
21. Our first effort to break up the monolith was to break apart the dependencies between distinct chunks of functionality.
22. But we were worried that the ease of service creation would lead to a Cambrian explosion of services.
23. Recently, as we started to build Swarm, we had smaller engineering teams which own front-to-back feature development of the iPhone/Android applications along with the server backends. Those teams wanted to run quick experiments with new API endpoints exposed via the public API, so that they could access them from the iPhone clients.
24. This is basically all it takes to implement a new endpoint. An engineer can run that from their development machine in EC2, and it will get traffic sent to those paths at api.foursquare.com. By default, the authorization layer only allows employee access to the paths.
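The slide's actual code isn't reproduced in this transcript, but a minimal sketch in the same spirit might look like the following; the types and names (EndpointService, Request, Response, HelloEndpoint) are hypothetical, not Foursquare's real framework API.

```scala
// Hypothetical minimal request/response shapes.
case class Request(path: String, params: Map[String, String])
case class Response(status: Int, body: String)

trait EndpointService {
  // Paths this service advertises; the router learns about them (via
  // Zookeeper in the real system) and forwards matching requests here.
  def paths: Set[String]
  def handle(req: Request): Response
}

// A complete, if trivial, new endpoint.
object HelloEndpoint extends EndpointService {
  val paths = Set("/v2/hello")
  def handle(req: Request): Response = {
    val name = req.params.getOrElse("name", "world")
    Response(200, s"""{"greeting": "hello, $name"}""")
  }
}
```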
25. The apirouter is a standalone process that takes the incoming HTTP request and looks up the requested path in the routing table stored in Zookeeper. If it finds a service that can handle the request, it will do authentication and authorization and forward the request to one of the servers in the pool that advertised handling for that path. The communication from the router to the remote endpoint is Thrift RPC.
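A minimal sketch of that forwarding decision, assuming the routing table has already been materialized from Zookeeper into a local map; the names (ApiRouter, Backend) and the random pool choice are illustrative, and the actual Thrift call to the chosen backend is omitted.

```scala
import scala.util.Random

// A backend server that advertised it can handle some set of paths.
case class Backend(host: String, port: Int)

// Hypothetical router core: in the real system the routing table is kept in
// sync with Zookeeper; here it is just a map from path to the pool of
// backends that advertised that path.
class ApiRouter(
    routingTable: Map[String, Seq[Backend]],
    authorize: (String, String) => Boolean // (userId, path) => allowed?
) {
  // Returns either an HTTP status to send back, or the backend to forward the
  // request to (via Thrift RPC in production).
  def route(path: String, userId: String): Either[Int, Backend] =
    routingTable.getOrElse(path, Nil) match {
      case Nil                           => Left(404) // no service advertised this path
      case _ if !authorize(userId, path) => Left(403) // e.g. employee-only endpoints
      case pool                          => Right(pool(Random.nextInt(pool.size)))
    }
}
```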