Upgrade to Pro — share decks privately, control downloads, hide ads and more …

The Path to Smoother Database Migrations

Nathan Duthoit
September 03, 2013

The Path to Smoother Database Migrations

Migrations are often applied as part of the deploy process. For most Django applications, this is fine. However, when your data becomes large, slow migrations and locked tables can result in downtime. Running migrations outside of the deploy process helps minimize the impact. This talk (presented at DjangoCon US 2013) will go over this and other strategies I have found successful in minimizing downtime during database migrations.

Nathan Duthoit

September 03, 2013
Tweet

More Decks by Nathan Duthoit

Other Decks in Programming

Transcript

  1. @nduthoit 8 Step 1: Add the field to your model

    1 from django.db import models 2 3 4 class Book(models.Model): 5 author = models.ForeignKey(Author) 6 title = models.CharField(max_length=100) 7 pages = models.IntegerField(default=0) 8 Thursday, 5 September, 13
  2. @nduthoit 9 Step 2: Create and run the schema migration

    $ python manage.py schemamigration library --auto add_pages_field_to_book ... $ python manage.py migrate library takes  15s Thursday, 5 September, 13
  3. @nduthoit 10 Step 3: Use the new field in your

    code 1 class LargeBooksListView(ListView): 2 ... 3 def get_queryset(self): 4 qset = super(LargeBooksListView, self).get_queryset() 5 qset = qset.filter(pages__gte=500) 6 return qset 7 ... 8 Thursday, 5 September, 13
  4. @nduthoit 11 Step 4: Deploy and migrate 1 @task 2

    def deploy(version): 3 ... 4 execute(download_code, version) 5 execute(install_requirements, version) 6 execute(create_symlinks, version) 7 ... 8 execute(migrate_db) 9 execute(restart_app) 10 ... Thursday, 5 September, 13
  5. @nduthoit 12 Challenges 1 @task 2 def deploy(version): 3 ...

    4 execute(download_code, version) 5 execute(install_requirements, version) 6 execute(create_symlinks, version) 7 ... 8 execute(migrate_db) 9 execute(restart_app) 10 ... new  schema  is  live with  old  code takes  10min   on  prod Thursday, 5 September, 13
  6. @nduthoit 13 Challenges 1. code  and  db  schema  are  

    out  of  sync 2. migra6ons  can:  take  a  long   6me,  fail,  lock  tables Thursday, 5 September, 13
  7. @nduthoit 15 Challenges It  only  gets  worse  as  you  

    scale  (database  size  and   server  count) Thursday, 5 September, 13
  8. @nduthoit 16 Dealing with the challenges 1. code  and  db

     schema  are   out  of  sync 2. migra6ons  can:  take  a  long   6me,  fail,  lock  tables Thursday, 5 September, 13
  9. @nduthoit 19 out-of-band migrations deploy  and  migrate FROM deploy  1

    migrate TO deploy  2 Thursday, 5 September, 13
  10. @nduthoit 21 migration-agnostic code deploy  1 migrate deploy  2 code

     works  before  and   aAer  the  migraBon code  depends  on   migraBon  being   completed Thursday, 5 September, 13
  11. @nduthoit 22 example 1 adding a new model 2  branches

    1  schema  migra6on Thursday, 5 September, 13
  12. @nduthoit 23 adding a new model (branchA) • add  the

     new  model • do  not  use  the  new  model  (assume   it  does  not  exist) • create  the  schema  migraBon   [migraBonOne] branchA master Thursday, 5 September, 13
  13. @nduthoit 24 adding a new model (branchB) • add  code

     that  uses  the  new  model branchA master branchB Thursday, 5 September, 13
  14. @nduthoit 25 1. merge  and  deploy  branchA 2. apply  migraBonOne

    3. merge  and  deploy  branchB adding a new model (deploying and migrating) deploy branchA apply migraBonOne deploy branchB Thursday, 5 September, 13
  15. @nduthoit 26 example 2 adding a required field 3  branches

    1  schema  migra6on 1  data  migra6on Thursday, 5 September, 13
  16. @nduthoit 28 adding a required field (branchA) 1. add  the

     new  field  (nullable) 2. do  not  use  the  new  field  (assume  it   does  not  exist) 3. create  the  schema  migraBon   [migraBonOne]  with  south 4. comment  out  the  new  field  in  your   model  definiBon Thursday, 5 September, 13
  17. @nduthoit 29 adding a required field (branchB) 1. un-­‐comment  the

     new  field 2. add  code  that  populates  the  new   field  for  new  instances 3. create  the  data  migraBon   [migraBonTwo]  that  populates  the   new  field  for  exisBng  instances   (backfill) 4. do  not  include  code  that  relies  on   the  new  field’s  value  (not   guaranteed  to  exist) Thursday, 5 September, 13
  18. @nduthoit 30 adding a required field (branchC) • add  code

     that  depends  on  the  field   being  populated Thursday, 5 September, 13
  19. @nduthoit 31 adding a required field (deploying and migrating) deploy

    branchA apply migraBonOne deploy branchB apply migraBonTwo deploy branchC from  this  point  forward,   the  field  is  defined  in   the  db  (set  to  null  for  all   instances) at  this  point  the  field   does  not  exist  in  the  db from  this  point  forward,  new   instances  will  have  a  proper   value  set  for  the  field at  this  point  all   instances  have  a  proper   value  for  the  new  field Thursday, 5 September, 13
  20. @nduthoit 32 adding a required field (follow-up) • make  the

     field  required  (1  branch,  1   schema  migraBon) Thursday, 5 September, 13
  21. @nduthoit 34 Running out-of-band migrations (review) PROS CONS • code

     and  db  schema  are  never  out   of  sync • code  deploys  can  be  safely  rolled   back • database  migraBons  can  be  safely   rolled  back • more  complexity  (example:  4   branches  and  3  migraBons  vs.  1   branch  and  1  migraBon) • more  coordinaBon  required   (ensure  order) Thursday, 5 September, 13
  22. @nduthoit 35 1. code  and  db  schema  are   out

     of  sync 2. migra6ons  can:  take  a  long   6me,  fail,  lock  tables Dealing with the challenges Thursday, 5 September, 13
  23. @nduthoit 36 Mitigating long running migrations • test  your  migraBon

     against  producBon-­‐ sized  database • be  aware  of  the  effect  of  producBon-­‐ scale  load • run  migraBons  during  low  usage  periods Thursday, 5 September, 13
  24. @nduthoit 38 Dealing with long-running schema migrations some  schema  migraBons

     can  take  a  long   Bme  to  complete  when  running  it  with  south (not  south’s  fault,  more  oAen  the  RDBMS’) Thursday, 5 September, 13
  25. @nduthoit 39 Dealing with long-running schema migrations pt-­‐online-­‐schema-­‐change “alters  a

     table’s  structure  without   blocking  reads  or  writes” h^p://www.percona.com/doc/percona-­‐toolkit/2.2/pt-­‐online-­‐schema-­‐ change.html Thursday, 5 September, 13
  26. @nduthoit 40 pt-online-schema-change $ sudo pt-online-schema-change --alter ..... ... Altering

    `mydb`.`library_books`... Creating new table... Created new table mydb._library_books_new OK. Altering new table... Altered `mydb`.`_library_books_new` OK. Creating triggers... Created triggers OK. Copying approximately 5023 rows... Copied rows OK. Swapping tables... Swapped original and new tables OK. Dropping old table... Dropped old table `mydb`.`_library_books_old` OK. Dropping triggers... Dropped triggers OK. Successfully altered `mydb`.`library_books`. Thursday, 5 September, 13
  27. @nduthoit 41 pt-online-schema-change (primer) sudo pt-online-schema-change --alter "DROP COLUMN pages"

    D=mydb,t=library_book --set-vars="lock_wait_timeout=7, innodb_lock_wait_timeout=7" --alter-foreign-keys-method auto --defaults-file=/root/.my.cnf --execute ALTER TABLE library_book DROP COLUMN pages Thursday, 5 September, 13
  28. @nduthoit 42 • -­‐-­‐alter  -­‐  it  should  contain  whatever  comes

     aAer  the   name  of  the  table  in  the  "ALTER  TABLE"  statement • D  -­‐  name  of  the  database • t  -­‐  name  of  the  table  from  the  "ALTER  TABLE"  statement • -­‐-­‐set-­‐vars="lock_wait_Bmeout=7,   innodb_lock_wait_Bmeout=7"  -­‐  determines  how  long   pt-­‐online-­‐schema-­‐change  will  wait  to  acquire  a  lock   during  execuBon  (note  that,  by  default,  it  will  retry  the   operaBon  up  to  10  Bmes  before  giving  up) • -­‐-­‐defaults-­‐file=/root/.my.cnf  -­‐  loads  seengs  required  to   connect  to  the  database • -­‐-­‐dry-­‐run  vs  -­‐-­‐execute  -­‐  You  should  always  run  the   command  with  the  -­‐-­‐dry-­‐run  argument  before  running   them  with  the  -­‐-­‐execute  argument. pt-online-schema-change (args) Thursday, 5 September, 13
  29. @nduthoit 43 pt-online-schema-change (process) 1.generate  schema  migraBon  with  south 2.apply

     the  migraBon  locally  and  record  the   SQL  “ALTER  TABLE”  commands  executed   by  south  (use  -­‐-­‐verbosity) 3.generate  the  pt-­‐online-­‐schema-­‐change   command 4.test  on  producBon-­‐sized  db 5.run  with  -­‐-­‐dry-­‐run  on  producBon 6.run  with  -­‐-­‐execute  on  producBon 7.fake  the  south  migraBon  (use  -­‐-­‐fake) Thursday, 5 September, 13
  30. @nduthoit 45 Dealing with long running data migrations (challenges) •

    dropped  ssh  connec6ons • failures • slowness Thursday, 5 September, 13
  31. @nduthoit 46 1 ... 2 def forwards(self, orm): 3 import

    random, sha, string 4 for user in orm.User.objects.all(): 5 random_letters = [random.choice(string.letters) for i in 6 range(8)] 7 user.password_salt = "".join(random_letters) 8 user.password_hash = sha.sha(user.password_salt 9 + user.password).hexdigest() 10 user.save() 11 ... data migration Thursday, 5 September, 13
  32. @nduthoit 47 run in a screen session will  prevent  6meouts

     on  the   ssh  connec6on  from  affec6ng   the  running  migra6on Thursday, 5 September, 13
  33. @nduthoit 49 1 ... 2 def hash_password(user): 3 random_letters =

    [random.choice(string.letters) for i in 4 range(8)] 5 user.password_salt = "".join(random_letters) 6 user.password_hash = sha.sha(user.password_salt 7 + user.password).hexdigest() 8 user.save() 9 ... management command Thursday, 5 September, 13
  34. @nduthoit 50 management command (benefits) • re-­‐run  many  6mes •

    pass  arguments • run  in  batches  (in  parallel,   restart  failed  batches) • more  work  :-­‐( $ python manage.py hash_user_passwords --start_date=2012-01-01 \ --end_date=2012-02-01 Thursday, 5 September, 13
  35. @nduthoit 51 1 ... 2 @task 3 def hash_password(user): 4

    random_letters = [random.choice(string.letters) for i in 5 range(8)] 6 user.password_salt = "".join(random_letters) 7 user.password_hash = sha.sha(user.password_salt 8 + user.password).hexdigest() 9 user.save() 10 ... celery tasks $ python manage.py hash_user_passwords --start_date=2012-01-01 \ --end_date=2012-02-01 Thursday, 5 September, 13
  36. @nduthoit 52 celery tasks (benefits) • easy  paralleliza6on:  workers  

    perform  tasks • use  exis6ng  infrastructure • monitor  and  tweak:  workers,   db  connec6ons,  etc. • use  a  dedicated  queue Thursday, 5 September, 13
  37. @nduthoit 53 Dealing with the challenges 1. code  and  db

     schema  are   out  of  sync 2. migra6ons  can:  take  a  long   6me,  fail,  lock  tables Thursday, 5 September, 13