Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Vitess: Online Schema Migration Automation

Shlomi Noach
October 21, 2020

Vitess: Online Schema Migration Automation

For many, running an online schema migration operation is still a manual job: from building the correct command, through identifying where the migration should run and which servers are to be affected, to auditing progress and completing the migration. Sharded environment poses an additional burden, as any logical migration must be applied multiple times, once for each shard.

What if you could just issue an ALTER TABLE ... statement, and have all that complexity automated away?

Vitess, an open source sharding framework for MySQL, is in a unique position to do just that. This session shows how Vitess's proxy/agent/topology architecture, together with gh-ost or pt-online-schema-change, are used to hide schema change complexity, and carefully schedule and apply schema migrations.

Shlomi Noach

October 21, 2020
Tweet

More Decks by Shlomi Noach

Other Decks in Programming

Transcript

  1. Engineer at PlanetScale Author of orchestrator, gh-ost, freno and others

    Blog at openark.org github.com/shlomi-noach @ShlomiNoach About me
  2. • Founded in February 2018 • Venture backed: a16z, SignalFire

    • ~40 employees distributed • Authors Vitess • Offers PlanetScaleDB for cloud offering and PlanetScaleDB BYOK: Bring Your Own Kubernetes PlanetScale
  3. Vitess • CNCF graduated project • Maintained by the community

    • PlanetScale is a major contributor A database clustering system for horizontal scaling of MySQL
  4. Vitess architecture vttablet • A daemon/sidecar • Controls the mysqld

    process • Interacts with the mysqld server • Typically on same host as mysqld
  5. Vitess architecture vtgate vtgate • A smart proxy • Speaks

    mysql protocol • Impersonates the database • Relays queries to tablets
  6. Vitess architecture commerce shard 0 commerce shard 1 internal unsharded

    USE commerce SELECT * FROM orders WHERE customer_id=4
  7. Vitess architecture commerce shard 0 commerce shard 1 internal unsharded

    topo topo • Vitess state backend • etcd/zk/consul • Info: • Sharding schemes • Schemas • Clusters • Tablets, servers, • Operations • ...
  8. Vitess architecture commerce shard 0 commerce shard 1 internal unsharded

    topo vtctld • Daemon • API server • Operations
  9. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives Installing, formalizing, command line options
  10. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives • Discovery commerce shard 0 commerce shard 1 internal unsharded
  11. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives • Discovery • Running commerce shard 0 commerce shard 1
  12. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives • Discovery • Running • Throttling commerce shard 0 commerce shard 1
  13. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives • Discovery • Running • Throttling • Visibility commerce shard 0 commerce shard 1
  14. Issues with schema migrations • ALTER TABLE is blocking •

    gh-ost, pt-online-schema-change, fb-osc as alternatives • Discovery • Running • Throttling • Visibility • Control commerce shard 0 commerce shard 1
  15. Online DDL flow commerce shard 0 commerce shard 1 internal

    unsharded topo vtgate • Intercepts • Parses • Persists
  16. Online DDL flow commerce shard 0 commerce shard 1 internal

    unsharded topo vtctld • Observes • Discovers • Distributes • Validates
  17. Online DDL flow commerce shard 0 commerce shard 1 internal

    unsharded topo vttablet • Has all the fun
  18. vttablet on primary • Schedules • Prepares script(s) • Runs

    gh-ost or pt-osc • Tracks • Throttles • Cleans up • Tracks state in local _vt meta schema Online DDL flow
  19. Q: Which MySQL account and privileges do I need for

    the migration user? How do I tell Vitess the credentials? A: Vitess creates a migration user with random password for each migration, on your behalf. You do not need to do anything.
  20. Q: How do I setup the gh-ost hooks? A: Vitess

    creates gh-ost hooks directory and scripts for you. Vitess creates a pt-osc plugin for you. You do not need to do anything.
  21. Q: Where do I install gh-ost? A: On linux amd64

    and with recent glibc, Vitess comes with gh-ost binary pre-compiled and embedded, and will extract the binary on your behalf. You do not need to do anything.
  22. Q: Where do I install gh-ost? (Cont.) Otherwise use -gh-ost-path.

    pt-online-schema-change is not bundled. Install it on all MySQL servers. Use -pt-osc-path
  23. Q: How do I configure the throttling replicas? A: Each

    vttablet runs a throttler service, which automatically and dynamically detects the throttling replicas. It evaluates replication lag autonomically. Both gh-ost and pt-osc are configured to use it. You do not need to do anything (but you can configure -throttle_tablet_types="replica,rdonly" )
  24. Q: How do I drop pt-osc leftover triggers in case

    of failure? A: Vitess will identify a failed migration, even if vttablet itself fails, and will drop the triggers on your behalf. You do not need to do anything.
  25. Q: How do I drop the leftover tables? A: Each

    primary vttablet runs a table lifecycle process. It will collect the leftover tables (whether migration is successful or failed) and will safely and slowly (using the throttler mechanism) purge and drop them. You do not need to do anything.
  26. Vitess as an infrastructure framework None of the above requires

    you to run traffic through Vitess. You may keep running your production traffic as normal, if you wish, and only use Vitess for infrastructure.
  27. • GitHub and GitHub team: • github.com/github/gh-ost • github.com/github/freno •

    Lessons learned • Percona • percona.com/doc/percona-toolkit • Community • Vitess maintainers • PlanetScale team Acknowledgements