Table, item, attribute. Hash key: id. Hash key + range key: id + date.
id = 100  date = 2012-05-16-09-00-10  total = 25.00
id = 101  date = 2012-05-15-15-00-11  total = 35.00
id = 101  date = 2012-05-16-12-00-10  total = 100.00
id = 102  date = 2012-03-20-18-23-10  total = 20.00
id = 102  date = 2012-03-20-18-23-10  total = 120.00
One API call, multiple items. BatchGet returns multiple items by primary key. BatchWrite performs up to 25 put or delete operations. Throughput is measured by IO, not API calls.
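The 25-operation limit means a larger set of writes has to be split before it is sent. A minimal sketch of that splitting, assuming the requests are plain Python objects; the helper name and the request shape are hypothetical and no DynamoDB client is involved:

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")

MAX_BATCH_WRITE_OPS = 25  # limit stated above for a single BatchWrite call


def chunk_write_requests(
    requests: List[T], size: int = MAX_BATCH_WRITE_OPS
) -> Iterator[List[T]]:
    """Yield successive slices of at most `size` requests."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]


# 60 hypothetical put requests (illustrative shape, not the wire format).
puts = [{"put": {"id": i}} for i in range(60)]
batch_sizes = [len(batch) for batch in chunk_write_requests(puts)]
```

Each yielded batch would be sent as one API call. Consumed throughput still depends on the items written, not on the number of calls.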
Query patterns. Retrieve all items by hash key. Range key conditions: ==, <, >, >=, <=, begins with, between. Counts. Top and bottom n values. Paged responses.
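The query patterns above can be illustrated with an in-memory stand-in. This is a sketch of the semantics only, not the DynamoDB API: the `query`, `begins_with` and `between` helpers are hypothetical, and the items are assumed to use id as hash key and date as range key.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, object]

ITEMS: List[Item] = [
    {"id": 100, "date": "2012-05-16-09-00-10", "total": "25.00"},
    {"id": 101, "date": "2012-05-15-15-00-11", "total": "35.00"},
    {"id": 101, "date": "2012-05-16-12-00-10", "total": "100.00"},
]


def begins_with(prefix: str) -> Callable[[str], bool]:
    return lambda value: value.startswith(prefix)


def between(low: str, high: str) -> Callable[[str], bool]:
    return lambda value: low <= value <= high


def query(
    items: List[Item],
    hash_value: object,
    range_condition: Callable[[str], bool] = lambda value: True,
    limit: Optional[int] = None,
    descending: bool = False,
) -> List[Item]:
    """Return items for one hash key, filtered and ordered by range key."""
    matches = sorted(
        (it for it in items
         if it["id"] == hash_value and range_condition(str(it["date"]))),
        key=lambda it: str(it["date"]),
        reverse=descending,
    )
    return matches if limit is None else matches[:limit]


all_101 = query(ITEMS, 101)                            # all items by hash key
may_16 = query(ITEMS, 101, begins_with("2012-05-16"))  # range key condition
latest = query(ITEMS, 101, limit=1, descending=True)   # top n by range key
count_101 = len(all_101)                               # count
```

Dates in this fixed-width format sort lexicographically in chronological order, which is why string comparison is enough for the range conditions.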
Data model example: online gaming. Storing scores and leaderboards. Players with high scores. Leaderboard for each game.
Players: hash key
user_id = mza  location = Cambridge  joined = 2011-07-04
user_id = jeffbarr  location = Seattle  joined = 2012-01-20
user_id = werner  location = Worldwide  joined = 2011-05-15
Data model example: secondary indices. Storing more than 64k across items.
Users: hash key
user_id = mza  first_name = Matt  last_name = Wood
user_id = mattfox  first_name = Matt  last_name = Fox
user_id = werner  first_name = Werner  last_name = Vogels
First name index: composite keys
first_name = Matt  user_id = mza
first_name = Matt  user_id = mattfox
first_name = Werner  user_id = werner
Second name index: composite keys
last_name = Wood  user_id = mza
last_name = Fox  user_id = mattfox
last_name = Vogels  user_id = werner
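The index tables above are kept in step with the Users table by the application. A minimal in-memory sketch of that pattern, assuming dictionaries and sets stand in for the three tables; `put_user` and `find_by_first_name` are hypothetical helper names:

```python
from typing import Dict, List, Set, Tuple

users: Dict[str, Dict[str, str]] = {}        # Users table: hash key user_id
first_name_index: Set[Tuple[str, str]] = set()  # (first_name, user_id)
last_name_index: Set[Tuple[str, str]] = set()   # (last_name, user_id)


def put_user(user_id: str, first_name: str, last_name: str) -> None:
    """Write the user item and one entry in each index table."""
    users[user_id] = {
        "user_id": user_id,
        "first_name": first_name,
        "last_name": last_name,
    }
    first_name_index.add((first_name, user_id))
    last_name_index.add((last_name, user_id))


def find_by_first_name(first_name: str) -> List[Dict[str, str]]:
    """Look up user_ids in the index, then fetch the full items."""
    ids = sorted(uid for name, uid in first_name_index if name == first_name)
    return [users[uid] for uid in ids]


put_user("mza", "Matt", "Wood")
put_user("mattfox", "Matt", "Fox")
put_user("werner", "Werner", "Vogels")

matts = [user["user_id"] for user in find_by_first_name("Matt")]
```

The sketch does not handle renames: changing a user's name would also require deleting the old index entry.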
Patterns: 4. Time series data. Logging, click-through, ad views, gameplay data, application usage. Non-uniform access patterns. Newer data is ‘live’. Older data is read-only.
Data model example: time series data. Rolling tables for hot and cold data.
Events table: composite keys
event_id = 1000  timestamp = 2012-05-16-09-59-01  key = value
event_id = 1001  timestamp = 2012-05-16-09-59-02  key = value
event_id = 1002  timestamp = 2012-05-16-09-59-02  key = value
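One way to roll tables is to derive the table name from the event timestamp. The sketch below assumes one table per calendar month and an `events_YYYY_MM` naming scheme; both are illustrative choices, not something the slides specify.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d-%H-%M-%S"  # matches the timestamps shown above


def table_for(timestamp: str, prefix: str = "events") -> str:
    """Return the (hypothetical) monthly table an event belongs to."""
    moment = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
    return f"{prefix}_{moment:%Y_%m}"


def is_hot(table_name: str, now: datetime, prefix: str = "events") -> bool:
    """Only the current month's table takes live writes in this sketch."""
    return table_name == f"{prefix}_{now:%Y_%m}"


table = table_for("2012-05-16-09-59-01")
hot = is_hot(table, datetime(2012, 5, 20))
cold = is_hot("events_2012_04", datetime(2012, 5, 20))
```

Under this scheme the hot table would carry most of the provisioned throughput, and older, read-only tables could be provisioned lower.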
Uniform workloads. DynamoDB divides table data into multiple partitions. Data is distributed primarily by hash key. Provisioned throughput is divided evenly across the partitions.
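The even split of provisioned throughput is simple arithmetic. A sketch with illustrative numbers; the figures of 1000 units and 4 partitions are assumptions, not values from the slides:

```python
def per_partition_throughput(provisioned_units: float, partitions: int) -> float:
    """Throughput available to each partition under an even split."""
    if partitions < 1:
        raise ValueError("a table has at least one partition")
    return provisioned_units / partitions


# Illustrative: 1000 provisioned units spread over 4 partitions.
share = per_partition_throughput(1000, 4)
```

Requests for a single hash key all land on one partition, so a single key can draw on at most one partition's share. This is why a uniform workload across hash keys matters.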
Data model example: hash key selection. Well-distributed workloads.
Users
user_id = mza  first_name = Matt  last_name = Wood
user_id = jeffbarr  first_name = Jeff  last_name = Barr
user_id = werner  first_name = Werner  last_name = Vogels
user_id = mattfox  first_name = Matt  last_name = Fox
...
Lots of users with unique user_id. Workload well distributed across user partitions.
Data model example: small hash value range. Non-uniform workload.
Status responses
status = 200  date = 2012-04-01-00-00-01
status = 404  date = 2012-04-01-00-00-01
status = 404  date = 2012-04-01-00-00-01
status = 404  date = 2012-04-01-00-00-01
Small number of status codes. Uneven, non-uniform workload.
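The difference between the two hash key choices can be seen by counting how many partitions each set of keys touches. This is a simulation under stated assumptions: MD5 stands in for the real partitioning hash, which the slides do not describe, and the partition count of 8 is arbitrary.

```python
import hashlib
from collections import Counter
from typing import Iterable


def partition_for(hash_key: str, partitions: int) -> int:
    """Map a hash key to a partition number (MD5 is an assumed stand-in)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def partition_load(keys: Iterable[str], partitions: int = 8) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partitions) for key in keys)


# Many unique user ids versus a handful of status codes.
user_load = partition_load(f"user-{i}" for i in range(1000))
status_load = partition_load(["200", "404", "404", "404"])
```

With many distinct user ids the requests spread over every partition. With only two distinct status codes, at most two partitions ever receive traffic, however many partitions the table has.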
Data model example: uneven access pattern by key. Non-uniform access workload.
Devices
mobile_id = 100  access_date = 2012-04-01-00-00-01
mobile_id = 100  access_date = 2012-04-01-00-00-02
mobile_id = 100  access_date = 2012-04-01-00-00-03
mobile_id = 100  access_date = 2012-04-01-00-00-04
...
Large number of devices. A small number are much more popular than the others. Workload unevenly distributed.
In summary...
DynamoDB: Predictable performance. Provisioned throughput. Libraries & mappers.
Data modeling: Tables & items. Read & write patterns. Time series data.
Partitioning: Automatic partitioning. Hot and cold data. Size/throughput ratio.
Analytics: Elastic MapReduce. Hive queries. Backup & restore.