Upgrade to Pro — share decks privately, control downloads, hide ads and more …

its-time-for-relational-databases-to-focus-on-d...

 its-time-for-relational-databases-to-focus-on-developers_shlomi-noach.pdf

The relational model is one of the oldest surviving models in computer science. And while it is still popular, it is gradually losing the developers’ good opinion. While relational databases have invested efforts in optimizing speed, latency and overall read and write performance, they have not adapted to the evolving change in developer’s needs.

The relational model comes with an operational complexity, and schema changes remain an operational blocker, outside the domain and ownership of the very developers who plan those changes. As a result, developers perceive relational databases unlike any other production system and seek ways to avoid schema changes.

Exactly what went wrong? And what would it take for relational databases to close the gap and regain the developer’s confidence? This session highlights the problems around making and deploying schema changes. We discuss scheduling, operational conflicts, production impact and blast radius, blue/green deployments, idempotency and more. We provide a paradigm that relational databases can follow, illustrated by Vitess, an OSS CNCF project which implements that paradigm.

Shlomi Noach

October 25, 2022
Tweet

More Decks by Shlomi Noach

Other Decks in Technology

Transcript

  1. ALTER TABLE node_health ADD COLUMN extra_info varchar(128) NOT NULL, DROP

    KEY instance_tag_idx DROP TABLE instance_tags Schema changes (DDLs)
  2. About me Engineer at PlanetScale; previously at GitHub, Booking.com, Outbrain

    Author of open source projects orchestrator, gh-ost, freno and others Maintainer for Vitess Blog at http://openark.org github.com/shlomi-noach @ShlomiNoach
  3. Vitess A database clustering system for horizontal scaling of MySQL

    • CNCF graduated project • Open source, Apache 2.0 licence • Contributors from around the community • Originally developed at YouTube to mass scale their load
  4. ALTER TABLE node_health ADD COLUMN extra_info varchar(128) NOT NULL, DROP

    KEY instance_tag_idx DROP TABLE instance_tags Avoiding schema changes
  5. Schema change as an operation The relational database does not

    have a concept of developers. It only has the concept of operations people.
  6. Schema change as a deployment We should apply our deployment

    terminology and practices to schema changes
  7. Operational time ALTER TABLE node_health ADD COLUMN extra_info varchar(128) NOT

    NULL; [Three days later…] Query OK, 0 rows affected (70h 12m 56s)
  8. app

  9. app

  10. app

  11. app

  12. app

  13. app

  14. app

  15. app

  16. app

  17. app

  18. CREATE TABLE IF NOT EXISTS node_health ( hostname varchar(128) CHARACTER

    SET ascii NOT NULL, token varchar(128) NOT NULL, last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (hostname, token) ) ENGINE=InnoDB DEFAULT CHARSET=ascii “Base schema”
  19. ALTER TABLE node_health DROP PRIMARY KEY, ADD PRIMARY KEY (hostname,

    token) ALTER TABLE node_health ADD COLUMN extra_info varchar(128) CHARACTER SET utf8 NOT NULL ALTER TABLE node_health MODIFY last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP “patches”
  20. if flags2&BinlogThroughGTID != 0 { dataSize, pos, ok := readUint32(data,

    pos) if !ok { return logFile, logPos, position, readPacketErr } if gtid := string(data[pos : pos+int(dataSize)]); gtid != "" { position, err = DecodePosition(gtid) if err != nil { return logFile, logPos, position, err } } } return logFile, logPos, position, nil Source code
  21. ALTER TABLE node_health DROP PRIMARY KEY, ADD PRIMARY KEY (hostname,

    token) ALTER TABLE node_health ADD COLUMN extra_info varchar(128) CHARACTER SET utf8 NOT NULL ALTER TABLE node_health MODIFY last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP Human calculated diff
  22. CREATE TABLE IF NOT EXISTS node_health ( hostname varchar(128) CHARACTER

    SET ascii NOT NULL, token varchar(128) NOT NULL, last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (hostname, token) ) ENGINE=InnoDB DEFAULT CHARSET=ascii Declarative schema
  23. CREATE TABLE IF NOT EXISTS node_health ( hostname varchar(128) CHARACTER

    SET ascii NOT NULL, token varchar(128) NOT NULL, last_seen_active timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, extra_info varchar(128) CHARACTER SET utf8 NOT NULL PRIMARY KEY (hostname, token) ) ENGINE=InnoDB DEFAULT CHARSET=ascii Declarative schema
  24. ALTER TABLE node_health ADD COLUMN extra_info varchar(128) CHARACTER SET utf8

    NOT NULL Imperative deployments via DDL statements
  25. More to discuss Asynchronous and scheduled deployments, job management, load

    shedding and throttling, discoverability, multi change deployments, multi target deployments, artifacts and garbage collection