Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Denormalise Your Database - Schema Design for M...

alcaeus
September 27, 2019

Denormalise Your Database - Schema Design for Modern Database Systems

For years we have designed our databases around the rules of database normalisation. Modern database systems offer more than just scalar fields, and it’s time we start using them: we’ll take a look at how to leverage complex data types in relational databases. We’re also going to explore document databases like MongoDB to better design our schema while putting our data first instead of our database.

alcaeus

September 27, 2019
Tweet

More Decks by alcaeus

Other Decks in Programming

Transcript

  1. Unnormalised ID album founded published tracks 1 Anastacia - Not

    That Kind 1999 2000 {1. Not That Kind, 2. I’m Outta Love, 3. Cowboys & Kisses, … } 2 Pink Floyd - Wish You Were Here 1965 1975 {1. Shine On You Crazy Diamond} 3 Anastacia - Freak Of Nature 1999 2001 {1. Paid My Dues }
  2. 1 NF: mixed data ID album artist founded published track

    title 1 Not That Kind Anastacia 1999 2000 1 Not That Kind 1 Not That Kind Anastacia 1999 2000 2 I’m Outta Love 1 Not That Kind Anastacia 1999 2000 3 Cowboys & Kisses 2 Wish You Were Here Pink Floyd 1965 1975 1 Shine On You Crazy Diamond 3 Freak Of Nature Anastacia 1999 2001 1 Paid My Dues
  3. 2 NF: Tracks album_id track title 1 1 Not That

    Kind 1 2 I’m Outta Love 1 3 Cowboys & Kisses 2 1 Shine On You Crazy Diamond 3 1 Paid my Dues
  4. 2 NF: Albums album_id album artist founded published 1 Not

    That Kind Anastacia 1999 2000 2 Wish You Were Here Pink Floyd 1965 1975 3 Freak Of Nature Anastacia 1999 2001
  5. 3 NF: Albums album_id album artist_id published 1 Not That

    Kind 1 2000 2 Wish You Were Here 2 1975 3 Freak Of Nature 1 2001
  6. Entities /** @ORM\Entity() */ class Artist { /** @ORM\Column(type="string", length=255)

    */ private $name; /** @ORM\Column(type="integer") */ private $founded; /** @ORM\OneToMany(targetEntity=Album::class, mappedBy="artist") */ private $albums; }
  7. Entities /** @ORM\Entity() */ class Album { /** @ORM\Column(type="string", length=255)

    */ private $title; /** * @ORM\ManyToOne(targetEntity=Artist::class, inversedBy="albums") * @ORM\JoinColumn(nullable=false) */ private $artist; /** @ORM\Column(type="integer") */ private $published; /** @ORM\OneToMany(targetEntity=Track::class, mappedBy="album") */ private $tracks; }
  8. Entities /** @ORM\Entity() */ class Track { /** @ORM\Column(type="integer") */

    private $trackNumber; /** @ORM\Column(type="string", length=255) */ private $title; /** * @ORM\ManyToOne(targetEntity=Album::class, inversedBy="tracks") * @ORM\JoinColumn(nullable=false) */ private $album; }
  9. Fetching Data SELECT * FROM artists WHERE name = 'Anastacia';

    SELECT * FROM albums WHERE artist_id = 1; SELECT * FROM tracks WHERE album_id = 1; SELECT * FROM tracks WHERE album_id = 3;
  10. Fetching Single Album SELECT * FROM albums JOIN artists ON

    albums.artist_id = artists.id WHERE albums.title = 'Freak Of Nature’ AND artists.name = 'Anastacia';
  11. Fetching Single Track SELECT * FROM tracks JOIN albums ON

    tracks.album_id = albums.id JOIN artists ON albums.artist_id = artists.id WHERE artists.name = 'Anastacia' AND albums.title = 'Not That Kind' AND tracks.title = 'Not That Kind';
  12. Using MongoDB db.artists.insert( { "name": "Anastacia", "founded": 1999, "albums": [

    { "title": "Not That Kind", "published": 2000, "tracks": [ { "trackNumber": 1, "title": "Not That Kind" }, { "trackNumber": 2, "title": "I'm Outta Love" }, { "trackNumber": 3, "title": "Cowboys & Kisses" } ] }, { "title": "Freak Of Nature", "published": 2001, "tracks": [ { "trackNumber": 1, "title": "Paid My Dues"} ] } ] } );
  13. Documents /** @ODM\Document() */ class Artist { /** @ODM\Field(type="string") */

    private $name; /** @ODM\Field(type="int") */ private $founded; /** @ODM\EmbedMany(targetDocument=Album::class) */ private $albums; }
  14. Documents /** @ODM\EmbeddedDocument() */ class Album { /** @ODM\Field(type="string") */

    private $title; private $artist; /** @ODM\Field(type="int") */ private $published; /** @ODM\EmbedMany(targetDocument=Tracks::class) */ private $tracks; }
  15. Documents /** @ODM\EmbeddedDocument() */ class Track { /** @ODM\Field(type="int") */

    private $trackNumber; /** @ODM\Field(type="string") */ private $title; private $album; }
  16. Fetching Single Album db.artists.find( { name: 'Anastacia'}, { albums: {

    $elemMatch: {'title': 'Not That Kind'} } } );
  17. Query Result { "albums": [ { "title": "Freak Of Nature",

    "published": 2001, "tracks": [ { "trackNumber": 1, "title": "Paid My Dues"} ] } ] }
  18. Fetching Single Track db.artists.find( { name: 'Anastacia'}, { albums: {

    $elemMatch: {'title': 'Not That Kind'} }, 'albums.tracks': { $elemMatch: {'title': 'Not That Kind'} } } );
  19. Oops { "ok" : 0, "errmsg" : "Cannot use $elemMatch

    projection on a nested field.", "code" : 2, "codeName" : "BadValue" }
  20. Aggregation { $match: { 'name': 'Anastacia' } }, { $unwind:

    '$albums' }, { $match: { 'albums.title': 'Not That Kind' } }, { $unwind: '$albums.tracks' }, { $match: { 'albums.tracks.title': 'Not That Kind' } }, db.artists.aggregate([ ]);
  21. Query Result { "name": "Anastacia", "founded": 1999, "albums": { "title":

    "Not That Kind", "published": 2000, "tracks": { "trackNumber": 1, "title": "Not That Kind" } } }
  22. JSON Columns CREATE TABLE artists ( id INTEGER NOT NULL

    PRIMARY KEY, name TEXT NOT NULL, founded INTEGER NOT NULL, albums JSON );
  23. Inserting JSON Data INSERT INTO artists (id, name, founded, albums)

    VALUES ( 1, 'Anastacia', 1999, '[ { "title": "Not That Kind", "published": 2000, "tracks": [ { "trackNumber": 1, "title": "Not That Kind" }, { "trackNumber": 2, "title": "I''m Outta Love" }, { "trackNumber": 3, "title": "Cowboys & Kisses" } ] }, { "title": "Freak Of Nature", "published": 2001, "tracks": [{ "trackNumber": 1, "title": "Paid My Dues"}] } ]' );
  24. Fetching Data SELECT artists.name, album.albumTitle, track.* FROM artists, JSON_TABLE( artists.albums,

    "$[*]" COLUMNS( albumTitle TEXT PATH "$.title", published INT PATH "$.published", tracks JSON PATH "$.tracks" ) ) as album, JSON_TABLE( album.tracks, "$[*]" COLUMNS( trackNumber INT PATH "$.trackNumber", title TEXT PATH "$.title" ) ) as track;
  25. Fetching Data SELECT name, album->>'title' albumTitle, track->>'trackNumber' trackNumber, track->>'title' title

    FROM artists, json_array_elements(albums) album, json_array_elements(album->'tracks') track ORDER BY album->>'published', track->>'trackNumber';
  26. Normalising MongoDB { "_id": 1, "name": "Not That Kind”, "artist":

    { "$ref": "artists", "$id": 1 }, "published": 2000, "tracks": [ { "trackNumber": 1, "title": "Not That Kind" }, { "trackNumber": 2, "title": "I'm Outta Love" }, { "trackNumber": 3, "title": "Cowboys & Kisses" } ] }
  27. Normalising MongoDB { "_id": 1, "name": "Anastacia", "founded": 1999, "albums":

    [ { "$ref": "albums", "$id": 1 }, { "$ref": "albums", "$id": 2 } ] }
  28. DBRef Objects • Store collection name and identifier • Work

    across databases • Don’t check referential integrity (remember MyISAM?) • Don’t work well in aggregation pipelines
  29. Alternative References { "_id": 1, "name": "Not That Kind”, "artist_id":

    1, "published": 2000, "tracks": [ { "trackNumber": 1, "title": "Not That Kind" }, { "trackNumber": 2, "title": "I'm Outta Love" }, { "trackNumber": 3, "title": "Cowboys & Kisses" } ] }
  30. Fetching Single Track artist_id = db.artists.findOne({ name: 'Anastacia' })._id; db.albums.find(

    { 'artist.$id': artist_id, name: 'Not That Kind' }, { 'tracks': { $elemMatch: {'title': 'Not That Kind'} } } );
  31. Denormalising Data { "_id": 1, "title": "Not That Kind", "artist":

    { "$ref": "artist", "$id": 1, "name": "Anastacia" }, "published": 2000, "tracks": [ { "trackNumber": 1, "title": "Not That Kind" }, { "trackNumber": 2, "title": "I'm Outta Love" }, { "trackNumber": 3, "title": "Cowboys & Kisses" } ] }
  32. Fetching Single Track db.albums.find( { 'artist.name': 'Anastacia', name: 'Not That

    Kind' }, { 'tracks': { $elemMatch: {'title': 'Not That Kind'} } } );
  33. Denormalising Data { "_id": 1, "name": "Anastacia", "founded": 1999, "albums":

    [ { "$ref": "albums", "$id": 1, "title": "Not That Kind" }, { "$ref": "albums", "$id": 2, "title": "Freak Of Nature" } ] }
  34. Denormalising Data CREATE TABLE albums ( id INT PRIMARY KEY

    NOT NULL, title VARCHAR(255) NOT NULL, artist_id INT NOT NULL, artist_name VARCHAR(255) NOT NULL, published INT NOT NULL, FOREIGN KEY fk_artist(artist_id) REFERENCES artists(id) ON UPDATE CASCADE ON DELETE CASCADE, FOREIGN KEY fk_artist_name(artist_name) REFERENCES artists(name) ON UPDATE CASCADE ON DELETE CASCADE );
  35. Fetching Single Track SELECT * FROM tracks JOIN albums ON

    tracks.album_id = albums.id WHERE albums.artist_name = 'Anastacia' AND albums.title = 'Not That Kind' AND tracks.title = 'Not That Kind';
  36. Composite Types CREATE TYPE artist_ref AS ( id INT, name

    VARCHAR(255) ); CREATE TABLE albums ( id INT PRIMARY KEY NOT NULL, title VARCHAR(255) NOT NULL, artist artist_ref NOT NULL, published INT NOT NULL );
  37. Inserting Data INSERT INTO artists VALUES (1, 'Anastacia', 1999); INSERT

    INTO albums VALUES (1, 'Not That Kind', ROW (1, 'Anastacia'), 2000); SELECT (artist).name, title FROM albums WHERE (artist).id = 1;
  38. Foreign Key Limitations CREATE TABLE albums ( id INT PRIMARY

    KEY NOT NULL, title VARCHAR(255) NOT NULL, artist artist_ref NOT NULL, published INT NOT NULL, FOREIGN KEY (artist.id) REFERENCES artists(id) ON UPDATE CASCADE ON DELETE CASCADE, FOREIGN KEY (artist.name) REFERENCES artists(name) ON UPDATE CASCADE ON DELETE CASCADE );
  39. Foreign Keys CREATE TABLE artists ( artist artist_ref PRIMARY KEY,

    founded INT NOT NULL ); CREATE TABLE albums ( id INT PRIMARY KEY NOT NULL, title VARCHAR(255) NOT NULL, artist artist_ref NOT NULL, published INT NOT NULL, FOREIGN KEY (artist) REFERENCES artists(artist) ON UPDATE CASCADE ON DELETE CASCADE );
  40. Array Fields CREATE TYPE album_ref AS ( id INT, title

    VARCHAR(255) ); CREATE TABLE artists ( artist artist_ref PRIMARY KEY, founded INT NOT NULL, albums album_ref[] );
  41. Inserting Data INSERT INTO artists VALUES (ROW(1, 'Anastacia'), 1999); INSERT

    INTO albums VALUES (1, 'Not That Kind', ROW (1, 'Anastacia'), 2000); UPDATE artists SET albums = ARRAY[CAST(ROW(1, 'Not That Kind') AS album_ref)] WHERE (artist).name = 'Anastacia';
  42. Modifying Data INSERT INTO albums VALUES (1, 'Freak of Nature',

    ROW (1, 'Anastacia'), 2001); UPDATE artists SET albums = albums || ARRAY[CAST(ROW(1, 'Freak of Nature') AS album_ref)] WHERE (artist).name = 'Anastacia';
  43. Value Objects CREATE TYPE price AS ( currency VARCHAR(3), amount

    INT ); CREATE TABLE products ( id INT PRIMARY KEY NOT NULL, price price NOT NULL, sale_price price NOT NULL );
  44. Array Types CREATE TYPE price AS ( currency VARCHAR(3), amount

    INT ); CREATE TABLE products ( id INT PRIMARY KEY NOT NULL, price price[] NOT NULL, sale_price price[] NOT NULL ); INSERT INTO products VALUES ( 1, ARRAY[CAST(ROW('EUR', 1499) AS price), CAST(ROW('USD', 1999) AS price)], ARRAY[CAST(ROW('EUR', 999) AS price), CAST(ROW('USD', 1499) AS price)] );
  45. Embed vs. Reference • Embed data when it’s owned by

    another entity • Reference data when it’s needed by itself • Use hybrid references to avoid joins • Beware of write penalties when duplicating data
  46. Entity Validation class Price { /** * @var string */

    private $currency; /** * @var int * @GreaterThan(0) */ private $price; }
  47. Entity Validation public function __invoke(Request $request, EntityManagerInterface $entityManager, PriceForm $form)

    { $form->handleRequest($request); if (!$form->isSubmitted() || !$form->isValid()) { throw new BadRequestHttpException(); } $price = $form->getData(); assert($price instanceof Price); $entityManager->persist($price); $entityManager->flush(); }
  48. Check Constraints CREATE TABLE products ( id INT PRIMARY KEY

    NOT NULL, price price NOT NULL, sale_price price, CONSTRAINT positive_price CHECK ((price).amount > 0) );
  49. Check Constraints CREATE DOMAIN price_amount AS INT CHECK ( VALUE

    > 0 ); CREATE TYPE price AS ( currency VARCHAR(3), amount price_amount );
  50. Check Constraints CREATE TABLE products ( id INT PRIMARY KEY

    NOT NULL, price price NOT NULL, sale_price price, CONSTRAINT valid_price CHECK ((price).amount > (sale_price).amount), CONSTRAINT same_currency CHECK ((price).currency = (sale_price).currency) );
  51. Schema Validation db.artists.createCollection('artists', { validator: { $jsonSchema: { bsonType: 'object',

    required: [ 'name', 'founded', 'albums' ], properties: { name: { bsonType: 'string' }, ... } } } });
  52. Schema Validation db.createCollection('products', { validator: { $and: [ { $jsonSchema:

    { ... } }, { price: { $gt: 0 } }, { $expr: { $gt: [ '$price.amount', '$salePrice.amount' ] }}, { $or: [ { salePrice: { $exists: false } }, { $expr: { $eq: ['$price.currency', '$salePrice.currency'] }} ]}, ] } });