Amazon DynamoDB - A serverless database for everyone

Aleksandar Simovic

November 05, 2019

Transcript

  1. simalexan
    Aleksandar Simovic
    Senior Software Engineer @ ScienceExchange
    AWS Serverless Hero
    Coauthor of the “Serverless Applications with Node.js” book
    AWS SAM & Lambda Builders contributor
    Co-organizer of the JS Belgrade, Serverless Belgrade, and Wardley Maps Belgrade meetups
  2. Cost of Resources
    (1967-71): $1 million per 1 MB; $200 per 92k IPS
    (2019): $0.02 per 1 GB; $500 per 300 billion IPS
  3. SQL: Relational; Vertical Scale; Queries; Maximize Storage
    NoSQL: Hierarchical (Denormalized); Horizontal Scale; Instantiated Views; Maximize compute (CPU)
  4. Amazon DynamoDB
    • Serverless DB (no patches / updates)
    • Consistent and fast (4M transactions / sec)
    • Document or key-value
    • Scales to any load
    • Access control (fine-grained access: table, items, attributes, values)
    • Event-driven model (connected to AWS Lambda)
  5. Built by AWS / Used By AWS
    • 2007, published paper
    • One of the authors is Werner Vogels (CTO of Amazon)
    • Used by Amazon from day 1: Amazon e-commerce Shopping Cart
    • 2012, GA
  6. Managed for you
    • Hardware provisioning
    • Cross-availability zone replication (replicas whenever needed)
    • Monitoring and handling of failures
    • Patches / updates / fixes
  7. Pay-per-use database!
    • First 25 GB stored per month is free
    • $0.25 per GB-month
    • Write: $1.25 / million request units
    • Read: $0.25 / million request units
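
The pay-per-use prices above can be turned into a rough monthly estimate. The sketch below is only an illustration: the function name and the sample workload are made up, and it assumes the free 25 GB is simply subtracted from the stored volume before billing.

```python
# Prices as quoted on the slide (2019).
FREE_STORAGE_GB = 25            # first 25 GB stored per month is free
STORAGE_PRICE_PER_GB = 0.25     # USD per GB-month beyond the free tier
WRITE_PRICE_PER_MILLION = 1.25  # USD per million write request units
READ_PRICE_PER_MILLION = 0.25   # USD per million read request units


def estimate_monthly_cost(storage_gb, write_units, read_units):
    """Return an estimated monthly bill in USD (hypothetical helper)."""
    billable_gb = max(0, storage_gb - FREE_STORAGE_GB)
    storage = billable_gb * STORAGE_PRICE_PER_GB
    writes = write_units / 1_000_000 * WRITE_PRICE_PER_MILLION
    reads = read_units / 1_000_000 * READ_PRICE_PER_MILLION
    return round(storage + writes + reads, 2)


# Example workload: 100 GB stored, 10 million writes, 50 million reads.
print(estimate_monthly_cost(100, 10_000_000, 50_000_000))  # prints 43.75
```

In this example the storage part is 75 GB x $0.25 = $18.75, and writes and reads add $12.50 each.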
  8. Performance
    • Single-digit (1-9) millisecond Put / Get
    • Custom SSD-based platform
      - performance independent of table size
      - no need for working set to fit in memory
  9. Global Tables
    • “Replication? We don’t need stinking replication”
    • “Automated”
    • “multi-region”
    • “multiple replica” tables (one per region that you choose)
    • DynamoDB treats them as a single unit
  10. Data Types Available
    • Scalar data types:
      - string
      - number
      - binary
    • MultiValue data types (string set, number set, binary set)
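
In DynamoDB's low-level API each attribute value is tagged with a type descriptor: `S`, `N`, `B` for the scalar types and `SS`, `NS`, `BS` for the sets. The helper below is a hypothetical sketch of that encoding for the types listed above; it is not part of any SDK, and it leaves binary values as raw bytes rather than Base64 text.

```python
def to_attribute_value(value):
    """Encode a Python value as a typed attribute value (illustrative helper)."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside the types listed on the slide")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are transmitted as strings
    if isinstance(value, (bytes, bytearray)):
        return {"B": bytes(value)}
    if isinstance(value, (set, frozenset)) and value:  # sets must be non-empty
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"unsupported value: {value!r}")


# Hypothetical item; the attribute names are assumptions for illustration.
item = {
    "scientist": to_attribute_value("Marie Curie"),
    "year": to_attribute_value(1911),
    "fields": to_attribute_value({"Physics", "Chemistry"}),
}
print(item)
```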
  11. Indexing
    • Data is indexed by a primary key
    • Types of primary keys:
      - partitionKey (hash)
      - partitionKey + range
    hash + range example:
    Scientist       field      year
    Marie Curie     Chemistry  1911
    Marie Curie     Physics    1903
    John Bardeen    Physics    1956
    Leonid Hurwicz  Economics  2007
  12. Partition Key
    • Uniquely identifies a single item
    • Unordered hash index
    • Allows table partitioning for scale (the table is chopped up and the pieces are placed on storage nodes; service requests are routed automatically)
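
The idea of hashing a partition key to decide where an item lives can be sketched in a few lines. This is a toy model only: DynamoDB's real hash function and partition management are internal to the service, so SHA-256, the modulo step, and the function name are all stand-ins chosen for illustration.

```python
import hashlib


def pick_storage_node(partition_key, node_count):
    """Map a partition key to a node number in [0, node_count)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % node_count


for name in ["Marie Curie", "John Bardeen", "Leonid Hurwicz"]:
    print(name, "-> node", pick_storage_node(name, node_count=4))
```

The same key always lands on the same node, which is what lets a request be routed without scanning every node.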
  13. Partition + Sort Key
    • Two attributes to uniquely identify an item
    • Items arranged by the sort key
    • No limits on the number of items per partition key
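
As a rough sketch, a partition + sort key could be declared with the request parameters below, in the shape accepted by the DynamoDB `CreateTable` operation. The table name is hypothetical, and using `scientist` as the partition key with `year` as the sort key is an assumption; the slides do not say which attribute plays which role.

```python
def build_key_schema(partition_key, sort_key=None):
    """Return a KeySchema list: HASH is the partition key, RANGE the sort key."""
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


create_table_params = {
    "TableName": "Scientists",  # hypothetical table name
    "AttributeDefinitions": [
        {"AttributeName": "scientist", "AttributeType": "S"},  # string
        {"AttributeName": "year", "AttributeType": "N"},       # number
    ],
    "KeySchema": build_key_schema("scientist", "year"),
    "BillingMode": "PAY_PER_REQUEST",  # pay-per-use, no provisioned capacity
}

# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**create_table_params)
print(create_table_params["KeySchema"])
```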
  14. Local Secondary Index
    • Local index to a single partition key
    • Another sort key attribute (alternate range key)
  15. NO Local Secondary Index
    Country  Scientist     field      year  topic
    France   Marie Curie   Physics    1903  Radioactivity
    France   Esther Duflo  Economics  2019  Alleviating Global Poverty
    Get items:
      country = “France”
      scientist begins_with “Marie”
  16. Local Secondary Index
    Country  Scientist     field      year  topic
    France   Marie Curie   Physics    1903  Radioactivity
    France   Esther Duflo  Economics  2019  Alleviating Global Poverty
    Get items:
      country = “France”
      year > 2000
    LSI
      Partition Key: country (same as the table)
      range: year
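
The difference between the two queries can be modelled in memory. The sketch below is not how DynamoDB is implemented; it only assumes that the base table is keyed by country + scientist (inferred from the first query) and that the local secondary index keeps the same partition key with `year` as the alternate sort key. All function names are made up for the example.

```python
from bisect import bisect_right
from collections import defaultdict

# Sample rows from the slides, reduced to the attributes the queries use.
ITEMS = [
    {"country": "France", "scientist": "Marie Curie", "year": 1903},
    {"country": "France", "scientist": "Esther Duflo", "year": 2019},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition key, each group ordered by the sort key."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda it: it[sort_attr])
    return index


def query_begins_with(index, sort_attr, partition_value, prefix):
    """Items in one partition whose sort key starts with `prefix`."""
    return [it for it in index.get(partition_value, [])
            if it[sort_attr].startswith(prefix)]


def query_greater_than(index, sort_attr, partition_value, threshold):
    """Items in one partition whose sort key is strictly above `threshold`."""
    group = index.get(partition_value, [])
    sort_values = [it[sort_attr] for it in group]
    return group[bisect_right(sort_values, threshold):]


base_table = build_index(ITEMS, "country", "scientist")  # table keys
lsi_by_year = build_index(ITEMS, "country", "year")      # alternate sort key

print(query_begins_with(base_table, "scientist", "France", "Marie"))
print(query_greater_than(lsi_by_year, "year", "France", 2000))
```

The first query is served by the table's own sort key; the second needs items ordered by `year` inside the same partition, which is what the index provides.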

  17. Global Secondary Index (GSI)
    • Index across all partitions
    • Alternate partition (+sort) key
    • Use composite sort keys for compound indexes
  18. NO Global Secondary Index
    Artist                     Song                            Year of Release  Album
    AC/DC                      Shoot to Thrill                 1980             Back in Black
    Florian Pellieser Quintet  Coup de foudre a Thessalonique  2018             Coup de foudre a Thessalonique
    Get items:
      artist = “AC/DC”
      order by song ASC
  19. Global Secondary Index
    Artist                     Song                            Year of Release  Album
    AC/DC                      Shoot to Thrill                 1980             Back in Black
    Florian Pellieser Quintet  Coup de foudre a Thessalonique  2018             Coup de foudre a Thessalonique
    Get items:
      album = “Back in Black”
      order by song DESC
    GSI
      Partition Key: album
      range: song
    * this is why the GSI costs extra *
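
A query against such an index could be expressed with the parameters of the DynamoDB `Query` operation, roughly as sketched below. The table name and index name are hypothetical; `IndexName` selects the GSI and `ScanIndexForward=False` returns items in descending sort key order, which corresponds to "order by song DESC".

```python
def build_gsi_query(album, descending=True):
    """Build Query parameters for a GSI keyed by album (hash) and song (range)."""
    return {
        "TableName": "Music",             # hypothetical table name
        "IndexName": "album-song-index",  # hypothetical GSI name
        "KeyConditionExpression": "#album = :album",
        "ExpressionAttributeNames": {"#album": "album"},
        "ExpressionAttributeValues": {":album": {"S": album}},
        # False means descending order of the index sort key (song).
        "ScanIndexForward": not descending,
    }


params = build_gsi_query("Back in Black")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
print(params)
```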
  20. Composite Key
    1. Join two columns
    2. Use “begins with”
    Before:
    is_verified  date
    TRUE         11-05-2019
    TRUE         11-05-2019
    FALSE
    After:
    is_verified_date
    TRUE_11-05-2019
    TRUE_11-05-2019
    FALSE
    Useful for multivalue filter (and sort sometimes)
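
The two steps can be sketched in plain Python: join the columns into one string, then filter on a prefix. The helper name is made up, and `str.startswith` only stands in for the “begins with” condition a key query would use.

```python
def composite_key(is_verified, date=None):
    """Join is_verified and date into one attribute value."""
    flag = "TRUE" if is_verified else "FALSE"
    return f"{flag}_{date}" if date else flag


# The three rows from the slide.
rows = [(True, "11-05-2019"), (True, "11-05-2019"), (False, None)]
keys = [composite_key(verified, date) for verified, date in rows]
print(keys)  # ['TRUE_11-05-2019', 'TRUE_11-05-2019', 'FALSE']

# Step 2: a "begins with" filter on the joined value.
verified_only = [k for k in keys if k.startswith("TRUE_")]
print(verified_only)  # ['TRUE_11-05-2019', 'TRUE_11-05-2019']
```

Sorting on the joined value is only chronological when the date part sorts correctly as text, which may be the reason for the “sort sometimes” caveat; a year-first format such as 2019-11-05 would sort in date order.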
  21. Limitations
    Item max size 400 KB
    Max 5 LSI
    Initial limit of 20 max GSI
    Nested attributes max 32 levels deep
    Max 40,000 read and 40,000 write request units / table
    256 tables per account per AWS Region
  22. SQL vs NoSQL
    SQL: SELECT * FROM aircrafts INNER JOIN bookings WHERE…
    NoSQL: SELECT * FROM aircrafts
  23. Modeling steps (Rick Houlihan)
    Use Case Context
    • Nature of the app (OLTP, OLAP, DSS)
    • ER Model
    • Data Lifecycle (TTL, Backup)
    Access patterns
    • Read Write Workloads
    • Query dimensions and aggregations
    • Data Sources
    • Query aggregations
    • Document all workflows
    Data modeling
    • Avoid relational database patterns, use one table
    • 1 application service = 1 table
    • Identify primary keys
    • Define indexes for secondary access patterns
    Review -> Repeat -> Review
  24. Event Driven Model
    • As data enters a database, it can also leave.
    • AWS DynamoDB Streams
    • API Gateway -> DynamoDB
      DynamoDB -> Lambda
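
The DynamoDB -> Lambda leg can be sketched as a handler that receives a batch of stream records. The handler name and the sample event are hypothetical; the `Records`, `eventName`, and `dynamodb.Keys` fields follow the shape of DynamoDB Streams events delivered to Lambda.

```python
def handler(event, context):
    """Hypothetical Lambda entry point for a DynamoDB Streams trigger."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record.get("eventName")
        if name in summary:
            summary[name] += 1
        # Keys identify the item that changed, in typed attribute format.
        keys = record.get("dynamodb", {}).get("Keys", {})
        print(name, keys)
    return summary


# Local dry run with a hand-written event; no AWS access is needed.
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"Keys": {"scientist": {"S": "Marie Curie"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"scientist": {"S": "John Bardeen"}}}},
    ]
}
print(handler(sample_event, None))
```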
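  25. When (not) to use DynamoDB
    Bad use cases
    • Real-time analytics and queries
    • Complex queries and joins
    • OLAP
    Good use cases
    • Very high read/write
    • Key-value simple queries
    • Consistently low-latency
    • Autosharding / multiple node scaling
    • No tuning
    • No size or throughput limits
    • OLTP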
    Complex queries and joins Bad use cases Good use cases Very high read/write Key-value simple queries Consistently low-latency Autosharding / multiple node scaling No tuning No size throughout put limits OLTP OLAP