Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Database Performance for Ruby on Rails Applicat...

Leonard Chin
November 01, 2018

Database Performance for Ruby on Rails Applications

A few real-world stories from on database performance from Cookpad's Ruby on Rails application.

Presented at
- RubyWorld Conference 2018 (Matsue, Japan)
- Tokyo Rails #37 (Tokyo, Japan)

Leonard Chin

November 01, 2018
Tweet

More Decks by Leonard Chin

Other Decks in Technology

Transcript

  1. Simple Complex 10 100 1k 10k 100k 1m 10m 100m

    1b+ Total Users Ruby on Rails apps at Cookpad * for illustrative purposes only
  2. Simple Complex 10 100 1k 10k 100k 1m 10m 100m

    1b+ Total Users Ruby on Rails apps at Cookpad * for illustrative purposes only Internal admin / business apps Internal tools Supporting Services New and peripheral products
  3. Simple Complex 10 100 1k 10k 100k 1m 10m 100m

    1b+ Total Users Ruby on Rails apps at Cookpad * for illustrative purposes only Internal admin / business apps Internal tools Supporting Services New and peripheral products Two “Cookpads”
  4. #

  5. Simple Complex 10 100 1k 10k 100k 1m 10m 100m

    1b+ Total Users Ruby on Rails apps at Cookpad * for illustrative purposes only Two “Cookpads”
  6. Rack::Timeout::RequestTimeoutException: Request ran for longer than 19000ms app/views/bookmarks/_list.html.erb:4:in `block in

    _app_views_bookmarks__list_html_erb__1896250766765096406_70283982598900' <%= render partial: "bookmarks/bookmark", collection: bookmarks %> app/views/bookmarks/_list.html.erb:2:in `tap' <% bookmarks.page(params[page]).tap do |bookmarks| %> app/views/bookmarks/_list.html.erb:2:in `_app_views_bookmarks__list_html_erb__1896250766765096406_70283982598900' <% bookmarks.page(params[page]).tap do |bookmarks| %> app/views/layouts/_bookmarks_all.html.erb:5:in `_app_views_layouts__bookmarks_all_html_erb__1020300051023946889_702839690775 00' <%= render "bookmarks/list", bookmarks: current_user.bookmarks.listing.recent, name: "recent_bookmarks" %> app/views/layouts/_sidebar.html.erb:87:in `_app_views_layouts__sidebar_html_erb___3139679681373029435_70284006194440' <%= render "layouts/bookmarks_all" %>
  7. Response Time (ms) 45 54 44 42 42 41 51

    40 250 24 35 36 32 32 49 28 40 28 980 18 50 30 30 42 50 38 39 60 32 42 * This is illustrative example data • Average: 70ms • Median: 40ms • 90th Percentile: 54.6ms • 95th Percentile: 193ms • 99th Percentile: 812ms • Maximum: 980ms
  8. THE IMPLEMENTATION # In the view current_user.bookmarks.listing.recent # As SQL

    SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id` = ? ORDER BY visited_at DESC LIMIT ? OFFSET ?
  9. You could EXPLAIN locally… # In the view current_user.bookmarks.listing.recent.explain #

    As SQL EXPLAIN SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id` = ?;
  10. # In the view current_user.bookmarks.listing.recent.explain # As SQL EXPLAIN SELECT

    `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id` = ?; JUST ADD EXPLAIN You could EXPLAIN locally…
  11. FILESORT … THIS IS NOT GOOD LOTS OF ROWS EXAMINED…

    ALSO NOT GOOD GREAT! WE’RE USING AN INDEX
  12. Ruby on Rails App Database Database Client Database Server SQL

    Query Result Set Let’s find out what goes on in here
  13. Database Internals (the easy version) ✅ Your data is stored

    as rows in a table on disk ✅ Indexes are stored separately as B-Trees, with leaf nodes pointing to table rows
  14. B-Trees (the easy version) ✅ A sorted, tree data structure

    ✅ It self-balances on insert/delete
  15. B-Trees (the easy version) ✅ A sorted, tree data structure

    ✅ It self-balances on insert/delete ✅ It’s fast: O(log n) for searching
  16. Index as a B-Tree 25 40 1 6 13 32

    3 4 28 35 1 99 110 120 123 78 12 54 65 90 75 78 89 41 2 7 Leaf nodes are primary keys (row ids) pointing to the table record
  17. # In the view current_user.bookmarks.listing.recent # As SQL SELECT `bookmarks`.*

    FROM `bookmarks` WHERE `bookmarks`.`user_id` = ? ORDER BY visited_at DESC LIMIT ? OFFSET ?
  18. # In the view current_user.bookmarks.listing.recent # As SQL SELECT `bookmarks`.*

    FROM `bookmarks` WHERE `bookmarks`.`user_id` = ? ORDER BY visited_at DESC LIMIT ? OFFSET ? Why is this slow?
  19. The query was using an index # index_bookmarks_on_user_id t.index [“user_id"]

    SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id` = ? ORDER BY visited_at DESC LIMIT ? OFFSET ? It matches the query, doesn’t it?
  20. # index_bookmarks_on_user_id t.index [“user_id"] SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id`

    = ? ORDER BY visited_at DESC LIMIT ? OFFSET ? But…. look at this ORDER BY
  21. What is the database doing? ✅ Find all records for

    user_id from the index (fast) ✅ Retrieves all these records from the table (despite the LIMIT)
  22. What is the database doing? ✅ Find all records for

    user_id from the index (fast) ✅ Retrieves all these records from the table (despite the LIMIT) ✅ Sorts all these records by visited_at
  23. What is the database doing? ✅ Find all records for

    user_id from the index (fast) ✅ Retrieves all these records from the table (despite the LIMIT) ✅ Sorts all these records by visited_at ✅ Returns LIMIT records at OFFSET
  24. t.index [“user_id”, “visited_at”] SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`user_id` =

    ? ORDER BY visited_at DESC LIMIT ? OFFSET ? Index with the sorted column
  25. 25, Feb 1 40, Jan 40 1, Mar 1 6,

    Feb 3 6, Feb 5 6, Mar 12 13, Feb 22 32 3 99 42 12 Branch Node Leaf Node For an index like (user_id, visited_at), you can visualise the sorting on the second column as above Composite indexes as B-Trees
  26. t.index [“user_id”, “recipe_id”, “created_at”] Given 1, 22, Mar 1 6,

    14 Feb 3 6, 14 Feb 5 6, 15 Mar 12 13, 5 Feb 22 32 3 99 42 12 Leaf Node
  27. t.index [“user_id”, “recipe_id”, “created_at”] Given ✅ user_id ✅ user_id, recipe_id

    ❌ recipe_id ❌ created_at ❌ recipe_id, user_id ❌ user_id, created_at
  28. SELECT COUNT(*) FROM `users` INNER JOIN `follows` ON `users`.`id` =

    `follows`.`follower_id` WHERE `follows`.`followee_id` = ?
  29. SELECT COUNT(*) FROM `users` INNER JOIN `follows` ON `users`.`id` =

    `follows`.`follower_id` WHERE `follows`.`followee_id` = ? It’s counting the number of followers for a user
  30. TABLES JE OBNF 1 Mary 2 Jenny 3 Alice JE

    GPMMPXFS@JE GPMMPXFF@JE 1 1 2 2 1 3 3 2 3 Follows Users
  31. TABLES JE OBNF 1 Mary 2 Jenny 3 Alice JE

    GPMMPXFS@JE GPMMPXFF@JE 1 1 2 2 1 3 3 2 3 Follows Users
  32. TABLES JE OBNF 1 Mary 2 Jenny 3 Alice JE

    GPMMPXFS@JE GPMMPXFF@JE 1 1 2 2 1 3 3 2 3 Follows Users
  33. # follow.rb belongs_to :follower, foreign_key: :follower_id, class_name: "User", counter_cache: :followees_count

    belongs_to :followee, foreign_key: :followee_id, class_name: "User", counter_cache: :followers_count # user.rb has_many :incoming_follows, class_name: "Follow", foreign_key: :followee_id, dependent: :destroy has_many :outgoing_follows, class_name: "Follow", foreign_key: :follower_id, dependent: :destroy has_many :followers, through: :incoming_follows has_many :followees, through: :outgoing_follows
  34. # follow.rb belongs_to :follower, foreign_key: :follower_id, class_name: "User", counter_cache: :followees_count

    belongs_to :followee, foreign_key: :followee_id, class_name: "User", counter_cache: :followers_count # user.rb has_many :incoming_follows, class_name: "Follow", foreign_key: :followee_id, dependent: :destroy has_many :outgoing_follows, class_name: "Follow", foreign_key: :follower_id, dependent: :destroy has_many :followers, through: :incoming_follows has_many :followees, through: :outgoing_follows counter_cache is already configured!
  35. [2] pry(main)> user.followers.size [slave/1] (3752.5ms) SELECT COUNT(*) FROM `users` INNER

    JOIN `follows` ON `users`.`id` = `follows`.`follower_id` WHERE `follows`.`followee_id` = 2008217 => 1712 [3] pry(main)> user.followees.size [slave/1] (71.0ms) SELECT COUNT(*) FROM `users` INNER JOIN `follows` ON `users`.`id` = `follows`.`followee_id` WHERE `follows`.`follower_id` = 2008217 => 1087
  36. [2] pry(main)> user.followers.size [slave/1] (3752.5ms) SELECT COUNT(*) FROM `users` INNER

    JOIN `follows` ON `users`.`id` = `follows`.`follower_id` WHERE `follows`.`followee_id` = 2008217 => 1712 [3] pry(main)> user.followees.size [slave/1] (71.0ms) SELECT COUNT(*) FROM `users` INNER JOIN `follows` ON `users`.`id` = `follows`.`followee_id` WHERE `follows`.`follower_id` = 2008217 => 1087 Nope
  37. ✅ The counter cache on Follow is set to followers_count

    ✅ The counter cache column exists in the database
  38. ✅ The counter cache on Follow is set to followers_count

    ✅ The counter cache column exists in the database ✅ The counter cache increments/decrements correctly
  39. ✅ The counter cache on Follow is set to followers_count

    ✅ The counter cache column exists in the database ✅ The counter cache increments/decrements correctly ✅ The counter cache name follows the convention of association_name_count
  40. pry(main)> followers = User. reflect_on_all_associations(:has_many). find { |r| r.name ==

    :followers } => #<ActiveRecord::Reflection::ThroughReflection: 0x00007fe0f0b7a8d0>, pry(main)> followers.has_cached_counter? => nil pry(main)> followers.inverse_of => nil pry(main)> followers.counter_cache_column => "followers_count"
  41. pry(main)> followers = User. reflect_on_all_associations(:has_many). find { |r| r.name ==

    :followers } => #<ActiveRecord::Reflection::ThroughReflection: 0x00007fe0f0b7a8d0>, pry(main)> followers.has_cached_counter? => nil pry(main)> followers.inverse_of => nil pry(main)> followers.counter_cache_column => "followers_count" This is not what I expected
  42. ✅ The association has the correct counter_cache column ❌ The

    association does not believe it has a cached counter
  43. ActiveRecord::Reflection # Returns whether a counter cache should be used

    for this association. # # The counter_cache option must be given on either the owner or inverse # association, and the column must be present on the owner. def has_cached_counter? options[:counter_cache] || inverse_which_updates_counter_cache && inverse_which_updates_counter_cache.options[:counter_cache ] && !!active_record.columns_hash[counter_cache_column] end
  44. The association expects to be able to find a counterpart

    belongs_to on its inverse association
  45. # follow.rb belongs_to :follower, foreign_key: :follower_id, class_name: "User", counter_cache: :followees_count

    belongs_to :followee, foreign_key: :followee_id, class_name: "User", counter_cache: :followers_count # user.rb has_many :incoming_follows, class_name: "Follow", foreign_key: :followee_id, dependent: :destroy has_many :outgoing_follows, class_name: "Follow", foreign_key: :follower_id, dependent: :destroy has_many :followers, through: :incoming_follows has_many :followees, through: :outgoing_follows This is a self-referential has_many :through many-to-many association!
  46. ActiveRecord::Reflection # Returns whether a counter cache should be used

    for this association. # # The counter_cache option must be given on either the owner or inverse # association, and the column must be present on the owner. def has_cached_counter? options[:counter_cache] || inverse_which_updates_counter_cache && inverse_which_updates_counter_cache.options[:counter_cache ] && !!active_record.columns_hash[counter_cache_column] end
  47. 4.3.2 Options for has_many 4.3.2.4 :counter_cache This option can be

    used to configure a custom named :counter_cache. You only need this option when you customized the name of your :counter_cache on the belongs_to association.
  48. The solution has_many :followers, -> { order("follows.id DESC") }, through:

    :incoming_follows, counter_cache: :followers_count has_many :followees, -> { order("follows.id DESC") }, through: :outgoing_follows, counter_cache: :followees_count
  49. While trying to improve the performance of our slowest, high

    traffic endpoint, I noticed an outlier
  50. Mysql2::Error: MySQL server has gone away: SELECT COUNT(*) FROM `recipe_likes`

    WHERE `recipe_likes`.`recipe_id` = 4612396 AND `recipe_likes`.`user_id` IN (11526355, 10243093, 10233195, 11675568, 10241820, 11343209, 10239509, 10227928, 10226294, 10230604, 10225640, 10499978, 10223459, 10230036, 10228518, 10228876, 9111946, 11598081, 11598079, 11598074, 11598083, 10225955, 11598080, 11598070, 11598072, 11598061, 11598066, 11598058, 11598037, 11598068, 11598062, 11598043, 11598041, 11598038, 11598034, 11598046, 11598044, 11598042, 11598014, 11598009, 11598040, 11598036, 11598006, 11598003, 11597999, 11597996, 11598035, 11598031, 11598039, 11598030, 10229548, 11598025, 11598023, 11597989, 11598029, 11598024, 11597991, 11598020, 11597990, 11597987, 11598021, 11598019, 11598018, 11598013, 11598008, 11598012, 11598015, 11598011, 11597997, 11598001, 11597994, 11597993, 11597992, 11597988, 11597971, 11597967, 11597966, 11597960, 11597959, 11597954, 11597951, 11597948, 11597945, 11597986, 11597985, 11597983, 11597935, 10229554, 11597978, 11597974, 11597953, 11597958, 11597949, 11597947, 11597936, 11597956, 11597950, 11597943, 11597939, 11597937, 11597910, 11597912, 11597933, 11597931, 11597927, 11597925, 11597917, 11597919, 11597920, 11597918, 11597913, 11597896, 11597892, 11597888, 11597877, 11597921, 11597886, 11597914, 11597882, 11597879, 11597911, 11597909, 11597908, 11597904, 10225953, 11597903, 11597905, 11597902, 11597901, 11597893, 11597898, 11597846, 11597874, 11597849, 11597872, 11597869, 11597868, 11597866, 11597860, 11597864, 11597828, 11597863, 11597858, 11597850, 11597819, 11597862, 11597857, 11597854, 11597852, 11597842, 11597853, 11597856, 11597848, 11597839, 11597836, 11597837, 11597835, 11597817, 11597834, 11597832, 11597829, 11597821, 11597807, 11597803, 11597831, 11597833, 11597820, 11597805, 11597798, 11597818, 11597801, 11597816, 11597793, 11597813, 11597781, 11597778, 11597811, 11597806, 11597761, 11597799, 11597766, 11597804, 11597797, 10225733, 11597794, 11597768, 11597762, 11597759, 11597788, 11597789, 11597791, 11597790, 11597785, 11597786, 11597756, 11597784, 11597779, 11597782, 11597780, 11597775, 11597773, 11597739, 11597777, 10225943, 11597774, 11597767, 11597763, 11597765, 11597760, 11597736, 11597757, 11597755, 11597753, 11597751, 11597750, 11597734, 11597735, 11597733, 11597744, 11597730, 11597728, 11597715, 11597708, 11597742, 11597738, 11597723, 11597718, 11597713, 11597703, 11597749, 11597747, 11597741, 11597710, 11597699, 11597705, 11597732, 10231546, 11597709, 11597714, 11597727, 11597716, 11597711, 11597706, 11597694, 11597671, 11597704, 11597700, 11597698, 11597691, 11597690, 11597689, 11597687, 11597688, 11597684, 11597682, 11597676, 11597652, 11597685, 11597678, 11597681, 11597674, 11597672, 11597668, 11597666, 11597639, 11597680, 11597677, 11597644, 11597642, 11597638, 11597659, 11597657, 11597655, 11597620, 11597615, 11597656, 11597654, 11597649, 11597636, 11597640, 11597647, 11597648, 11597641, 11597637, 11597635, 11597631, 11597628, 11597626, 11597623, 11597621, 11597618, 11597592, 11597617, 11597614, 11597610, 11597607, 11597606, 11597579, 11597616, 11597613, 11597611, 11597609, 11597591, 11597581, 11597587, 11597602, 11597601, 11597594, 11597596, 11597598, 11597589, 11597599, 11597593, 11597557, 11597590, 11597553, 10225940, 11597582, 11597580, 11597586, 11597577, 11597578, 11597548, 11597545, 11597550, 11597549, 11597571, 11597574, 11597547, 11597544, 11597568, 11597564, 11597563, 11597561, 11597532, 11597567, 11597560, 10229535, 11597562, 11597559, 11597558, 11597529, 11597556, 11597554, 11597526, 11597522, 11597513, 11597518, 11597516, 11597503, 11597508, 11597542, 11597539, 11597534, 11597530, 11597533, 11597531, 11597491, 11597538, 11597504, 11597497, 11597499, 11597485, 11597490, 11597493, 11597528, 11597519, 11597523, 11597514, 11597512, 11597509, 11597510, 11597502, 11597506, 11597498, 11597488, 11597492, 11597483, 11597486, 11597505, 115...
  51. Mysql2::Error: MySQL server has gone away: SELECT COUNT(*) FROM `recipe_likes`

    WHERE `recipe_likes`.`recipe_id` = 4612396 AND `recipe_likes`.`user_id` IN (11526355, 10243093, 10233195, 11675568, 10241820, 11343209, 10239509, 10227928, 10226294, 10230604, 10225640, 10499978, 10223459, 10230036, 10228518, 10228876, 9111946, 11598081, 11598079, 11598074, 11598083, 10225955, 11598080, 11598070, 11598072, 11598061, 11598066, 11598058, 11598037, 11598068, 11598062, 11598043, 11598041, 11598038, 11598034, 11598046, 11598044, 11598042, 11598014, 11598009, 11598040, 11598036, 11598006, 11598003, 11597999, 11597996, 11598035, 11598031, 11598039, 11598030, 10229548, 11598025, 11598023, 11597989, 11598029, 11598024, 11597991, 11598020, 11597990, 11597987, 11598021, 11598019, 11598018, 11598013, 11598008, 11598012, 11598015, 11598011, 11597997, 11598001, 11597994, 11597993, 11597992, 11597988, 11597971, 11597967, 11597966, 11597960, 11597959, 11597954, 11597951, 11597948, 11597945, 11597986, 11597985, 11597983, 11597935, 10229554, 11597978, 11597974, 11597953, 11597958, 11597949, 11597947, 11597936, 11597956, 11597950, 11597943, 11597939, 11597937, 11597910, 11597912, 11597933, 11597931, 11597927, 11597925, 11597917, 11597919, 11597920, 11597918, 11597913, 11597896, 11597892, 11597888, 11597877, 11597921, 11597886, 11597914, 11597882, 11597879, 11597911, 11597909, 11597908, 11597904, 10225953, 11597903, 11597905, 11597902, 11597901, 11597893, 11597898, 11597846, 11597874, 11597849, 11597872, 11597869, 11597868, 11597866, 11597860, 11597864, 11597828, 11597863, 11597858, 11597850, 11597819, 11597862, 11597857, 11597854, 11597852, 11597842, 11597853, 11597856, 11597848, 11597839, 11597836, 11597837, 11597835, 11597817, 11597834, 11597832, 11597829, 11597821, 11597807, 11597803, 11597831, 11597833, 11597820, 11597805, 11597798, 11597818, 11597801, 11597816, 11597793, 11597813, 11597781, 11597778, 11597811, 11597806, 11597761, 11597799, 11597766, 11597804, 11597797, 10225733, 11597794, 11597768, 11597762, 11597759, 11597788, 11597789, 11597791, 11597790, 11597785, 11597786, 11597756, 11597784, 11597779, 11597782, 11597780, 11597775, 11597773, 11597739, 11597777, 10225943, 11597774, 11597767, 11597763, 11597765, 11597760, 11597736, 11597757, 11597755, 11597753, 11597751, 11597750, 11597734, 11597735, 11597733, 11597744, 11597730, 11597728, 11597715, 11597708, 11597742, 11597738, 11597723, 11597718, 11597713, 11597703, 11597749, 11597747, 11597741, 11597710, 11597699, 11597705, 11597732, 10231546, 11597709, 11597714, 11597727, 11597716, 11597711, 11597706, 11597694, 11597671, 11597704, 11597700, 11597698, 11597691, 11597690, 11597689, 11597687, 11597688, 11597684, 11597682, 11597676, 11597652, 11597685, 11597678, 11597681, 11597674, 11597672, 11597668, 11597666, 11597639, 11597680, 11597677, 11597644, 11597642, 11597638, 11597659, 11597657, 11597655, 11597620, 11597615, 11597656, 11597654, 11597649, 11597636, 11597640, 11597647, 11597648, 11597641, 11597637, 11597635, 11597631, 11597628, 11597626, 11597623, 11597621, 11597618, 11597592, 11597617, 11597614, 11597610, 11597607, 11597606, 11597579, 11597616, 11597613, 11597611, 11597609, 11597591, 11597581, 11597587, 11597602, 11597601, 11597594, 11597596, 11597598, 11597589, 11597599, 11597593, 11597557, 11597590, 11597553, 10225940, 11597582, 11597580, 11597586, 11597577, 11597578, 11597548, 11597545, 11597550, 11597549, 11597571, 11597574, 11597547, 11597544, 11597568, 11597564, 11597563, 11597561, 11597532, 11597567, 11597560, 10229535, 11597562, 11597559, 11597558, 11597529, 11597556, 11597554, 11597526, 11597522, 11597513, 11597518, 11597516, 11597503, 11597508, 11597542, 11597539, 11597534, 11597530, 11597533, 11597531, 11597491, 11597538, 11597504, 11597497, 11597499, 11597485, 11597490, 11597493, 11597528, 11597519, 11597523, 11597514, 11597512, 11597509, 11597510, 11597502, 11597506, 11597498, 11597488, 11597492, 11597483, 11597486, 11597505, 115... A Huge IN Clause
  52. def liked_by_multiple_followees? recipe.
 recipe_likes. where(user: subscriber_followee_ids). count >= LIKES_REQUIRED_FOR_PUBLISH end

    The source of the problem This wants to check if a Recipe is “liked” by users followed by the current user
  53. SELECT `users`.id FROM `users` INNER JOIN `follows` ON `users`.`id` =

    `follows`.`followee_id` WHERE `follows`.`follower_id` = ? ORDER BY follows.id DESC SELECT COUNT(*) FROM `recipe_likes` WHERE `recipe_likes`.`recipe_id` = ? AND `recipe_likes`.`user_id` IN (?+);
  54. What does the IN clause do? ✅ It’s kind of

    like an OR ✅ Every element is a separate index lookup
  55. What does the IN clause do? ✅ It’s kind of

    like an OR ✅ Every element is a separate index lookup ❌ MySQL doesn't really have a limit on number of elements in IN
  56. # Find the ids liker_ids = RecipeLike.where(recipe: recipe).pluck(:user_id) follower_ids =

    Follow.where(follower_id: subscriber).pluck(:followee_id)
  57. # Find the ids liker_ids = RecipeLike.where(recipe: recipe).pluck(:user_id) follower_ids =

    Follow.where(follower_id: subscriber).pluck(:followee_id) # Find the intersection liker_ids & follower_ids
  58. # Find the ids liker_ids = RecipeLike.where(recipe: recipe).pluck(:user_id) follower_ids =

    Follow.where(follower_id: subscriber).pluck(:followee_id) # Find the intersection liker_ids & follower_ids Now its two fast queries and an array operation