Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Real World Ruby Performance
Search
Aaron Quint
November 19, 2014
Programming
5
280
Real World Ruby Performance
My talk from RubyConf 2014 about Ruby Performance and the philosophy of performance.
Aaron Quint
November 19, 2014
Tweet
Share
More Decks by Aaron Quint
See All by Aaron Quint
Beyond JSON: Improving Inter-app Communication
aq
0
220
Fast Everything: Ruby Performance Tools and Understanding
aq
4
580
The Good, The Bad, The Ugly of Growth
aq
0
300
Chromium Embedded Framework - Go + JS
aq
0
1.5k
The Future of Ruby Performance Tooling
aq
2
740
Working with Rubyists
aq
1
150
Correlation: The Next Frontier
aq
0
410
DevStackup
aq
4
160
Paperless Ops Chef Workflow
aq
1
210
Other Decks in Programming
See All in Programming
Generative AI Use Cases JP (略称:GenU)奮闘記
hideg
0
150
Tuning GraphQL on Rails
pyama86
2
1k
Jakarta Concurrencyによる並行処理プログラミングの始め方 (JJUG CCC 2024 Fall)
tnagao7
1
230
cXML という電子商取引の トランザクションを支える プロトコルと向きあっている話
phigasui
3
2.3k
Java ジェネリクス入門 2024
nagise
0
600
CSC509 Lecture 08
javiergs
PRO
0
110
リリース8年目のサービスの1800個のERBファイルをViewComponentに移行した方法とその結果
katty0324
5
3.6k
C#/.NETのこれまでのふりかえり
tomokusaba
1
160
Server Driven Compose With Firebase
skydoves
0
400
Nuxtベースの「WXT」でChrome拡張を作成する | Vue Fes 2024 ランチセッション
moshi1121
1
510
Vue.js学習の振り返り
hiro_xre
2
130
カスタムしながら理解するGraphQL Connection
yanagii
1
1.2k
Featured
See All Featured
Making the Leap to Tech Lead
cromwellryan
132
8.9k
The Power of CSS Pseudo Elements
geoffreycrofte
72
5.3k
4 Signs Your Business is Dying
shpigford
180
21k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
27
4.2k
Scaling GitHub
holman
458
140k
Unsuck your backbone
ammeep
668
57k
The Art of Programming - Codeland 2020
erikaheidi
51
13k
A better future with KSS
kneath
238
17k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
159
15k
BBQ
matthewcrist
85
9.3k
Measuring & Analyzing Core Web Vitals
bluesmoon
1
40
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
126
18k
Transcript
Real WORLD RUBY PERFORMANCE Aaron Quint / @aq / Ruby
Conf 2014
@tmm1 @SamSaffron @_ko1 SHOUTOUT
We’ll come back to who I am later. It’s [relatively]
unimportant. SKIPPING THE INTRO
I’ve learned so much over the past 5 years, what
could I share? This TALK was HARD TO WRITE
It’s a ⌘+C ⌘+P culture. TIPS And tricks are the
CLIFF NOTES of tech learning
How to THINK about a problem is much more interesting
than how to solve it. As a mentor I want to teach philosophy not snippets
The tools and tricks will change over time. Today, Take
away the process
A multi-step process. Ruby Performance as therapy
It’s a multi-step process Relax, Open up We’re going to
go deep
Step 1: Acceptance
It’s your Fault.
Really?
Yes.
None
It’s not you, It’s me.
It’s not you, It’s me.
— George Costanza (Inventor of “It’s not you, it’s me”)
It’s not you, It’s me.
Performance is about context
Doesn’t scale for what? To what degree? With what hardware?
… “X Doesn’t SCALE” IS BS
So when we talk about our ruby being slow
None
Rails
Rails 10ms
Rails Your application 10ms
Rails Your application DB 10ms
Rails Your application DB 10ms 20ms
Rails Your application DB Cache 10ms 20ms
Rails Your application DB Cache 10ms 20ms 10ms
Rails Your application DB Cache 10ms 20ms 10ms 250ms
IT’s MY FAULT.
Step 2: Diagnosis
Where did I go wrong?
METRICS! Measurement! MMMNUMBERS! Milliseconds MATTER!
Use the right one for the job. Tools abound!
Step 3: Treatment
what are the steps to fix this problem?
How many strokes for the lowest #? Playing golf.
Two angles of optimization
Proxies/Balancers Application Datastores Filesystem/OS/Hardware Individual Request Path (Controller#action)
aka, speeding up a single query, controller action, or code
path Vertical: Fix individual Elements
aka, Adding more workers per-node, buying better hardware Horizontal: Address
hardware or software across a cluster
Important Themes:
Context is crucial to acceptance
Visibility and Introspect- ability are crucial to diagnosis
Knowing your tools is crucial to treatment
I’m Aaron Quint. I’m the chief Scientist at Paperless Post.
None
Opposing forces. Features vs. speed
We realized that being fast meant being stable
CASE STUDIES in performance therapy
None
Case 1: JSON FOR DAYS
None
None
None
package:7292:1123434234234
package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424
package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234
package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234
Uncached performance is still a problem
ppprofiler to the rescue
ppprofiler
ppprofiler • Auto-cache toggling • Benchmark • Rblineprof • As::Notification
Counts (SQL/Cache, etc) • MemoryProfiler (NEW!) • Gist-able (markdown) output
None
None
None
Rinse and Repeat Make the slowest lines faster
None
None
None
None
None
Case 2: FINGER IN THE SOCKET
Before Vday we were looking for any wins
IN BETWEEN THE LINES! stackprof + stackprof-remote
None
Ruby Process (Unicorn)
Ruby Process (Unicorn)
Ruby Process (Unicorn)
Ruby Process (Unicorn) AC::Dispatch
Ruby Process (Unicorn) AC::Dispatch MyController::Create
Ruby Process (Unicorn) AC::Dispatch MyController::Create Template::Render
Ruby Process (Unicorn) AC::Dispatch MyController::Create Template::Render Ar::Find
Ruby Process (Unicorn)
Ruby Process (Unicorn) StackProf.start rb_profile_frames() rb_profile_frames() rb_profile_frames() rb_profile_frames() StackProf.stop StackProf.dump
! [paperless@production-webapp10 current]$ stackprof tmp/stackprof-cpu-30715-1391204970.dump ================================== Mode: cpu(1000) Samples: 1761
(3.61% miss rate) GC: 128 (7.27%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 344 (19.5%) 342 (19.4%) Statsd#send_to_socket 393 (22.3%) 44 (2.5%) Statsd#sampled 44 (2.5%) 44 (2.5%) block in ActiveRecord::ConnectionAdapters::PostgreSQLPoolAdapter#execute 56 (3.2%) 29 (1.6%) block in ActiveSupport::Notifications::Fanout#listeners_for 29 (1.6%) 29 (1.6%) ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#extract_pg_identifier_from_name 26 (1.5%) 26 (1.5%) ActiveSupport::Notifications::Fanout::Subscribers::Evented#subscribed_to? 25 (1.4%) 25 (1.4%) String#blank? 25 (1.4%) 25 (1.4%) block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select 24 (1.4%) 24 (1.4%) ActiveRecord::Base.scoped_methods 22 (1.2%) 22 (1.2%) Dalli::Server::KSocket#kgio_wait_readable 21 (1.2%) 21 (1.2%) ActiveSupport::CoreExtensions::Hash::Keys#assert_valid_keys 42 (2.4%) 20 (1.1%) block in Dalli::Server::KSocket#readfull 28 (1.6%) 19 (1.1%) ActiveRecord::ConnectionAdapters::ConnectionHandler#retrieve_connection_pool 18 (1.0%) 18 (1.0%) #<Module:0x00000002004b08>.instrumenter 17 (1.0%) 16 (0.9%) Dalli::Server#deserialize 15 (0.9%) 15 (0.9%) block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select_raw 14 (0.8%) 14 (0.8%) #<Module:0x000000033e25d0>.decode_www_form_component 13 (0.7%) 13 (0.7%) Dalli::Server#write 15 (0.9%) 11 (0.6%) ActiveSupport::CoreExtensions::Time::Calculations#minus_with_coercion 10 (0.6%) 10 (0.6%) block in ActiveRecord::Base.with_scope 10 (0.6%) 10 (0.6%) block in ActiveRecord::ConnectionAdapters::QueryCache#cache_sql 21 (1.2%) 10 (0.6%) Yajl::Encoder.encode 10 (0.6%) 10 (0.6%) Set#add 10 (0.6%) 10 (0.6%) block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#result_as_array 10 (0.6%) 10 (0.6%) ActiveSupport::CoreExtensions::Time::Calculations::ClassMethods#time_with_datetime_fallback 9 (0.5%) 9 (0.5%) ActiveRecord::DynamicFinderMatch#initialize 9 (0.5%) 9 (0.5%) ActiveSupport::LogSubscriber.logger 9 (0.5%) 9 (0.5%) block in ActionController::Base.action_methods 9 (0.5%) 9 (0.5%) block in ActionController::Base.action_methods 9 (0.5%) 9 (0.5%) block (2 levels) in ActiveRecord::Base.connection_handler=
Hmm, why is statsd slow?
Pull out good old benchmark
$ ruby test/profile/statsd.rb user system total real udp with connect
0.010000 0.000000 0.010000 ( 0.074522) udp without connect 0.120000 0.530000 0.650000 ( 13.096515) statsd with connect 0.000000 0.090000 0.090000 ( 0.103520) statsd without connect 0.100000 0.620000 0.720000 ( 13.483539)
WIN!
None
Case 3: THE HOLIDAY SCALE
None
Some times you can throw money at the problem
None
Case 4: SHRINKING THE GAP
Start at the top, work your way down. Starting with
a HITLIST
Number of Requests x 90th Percentile Response Time Total Time
None
None
Using Stackprof flamegraphs on production.
Using Stackprof flamegraphs on production. SET IT ON FIRE!
None
None
None
None
None
Big wins are not the point
If you’re not failing you’re not being honest
Don’t just make tools, learn to use them
twitter: @aq github.com/quirkey github.com/paperlesspost Thanks!