Real World Ruby Performance
Aaron Quint
November 19, 2014
Programming
My talk from RubyConf 2014 about Ruby Performance and the philosophy of performance.
Transcript
REAL WORLD RUBY PERFORMANCE: Aaron Quint / @aq / RubyConf 2014
SHOUTOUT: @tmm1 @SamSaffron @_ko1
SKIPPING THE INTRO: We’ll come back to who I am later. It’s [relatively] unimportant.
This TALK was HARD TO WRITE: I’ve learned so much over the past 5 years, what could I share?
TIPS and tricks are the CLIFF NOTES of tech learning: It’s a ⌘+C ⌘+V culture.
As a mentor I want to teach philosophy not snippets: How to THINK about a problem is much more interesting than how to solve it.
Today, take away the process: The tools and tricks will change over time.
Ruby Performance as therapy: A multi-step process.
It’s a multi-step process. Relax, open up. We’re going to go deep.
Step 1: Acceptance
It’s your Fault.
Really?
Yes.
It’s not you, It’s me.
— George Costanza (Inventor of “It’s not you, it’s me”)
Performance is about context
“X Doesn’t SCALE” IS BS: Doesn’t scale for what? To what degree? With what hardware? …
So when we talk about our Ruby being slow
Rails Your application DB Cache 10ms 20ms 10ms 250ms
IT’S MY FAULT.
Step 2: Diagnosis
Where did I go wrong?
METRICS! Measurement! MMMNUMBERS! Milliseconds MATTER!
Tools abound! Use the right one for the job.
Step 3: Treatment
What are the steps to fix this problem?
Playing golf: How many strokes for the lowest #?
Two angles of optimization
Layers: Proxies/Balancers, Application, Datastores, Filesystem/OS/Hardware. Individual Request Path (Controller#action).
Vertical: Fix individual elements, aka speeding up a single query, controller action, or code path.
Horizontal: Address hardware or software across a cluster, aka adding more workers per node, buying better hardware.
Important Themes:
Context is crucial to acceptance
Visibility and introspectability are crucial to diagnosis
Knowing your tools is crucial to treatment
I’m Aaron Quint. I’m the Chief Scientist at Paperless Post.
Features vs. speed: Opposing forces.
We realized that being fast meant being stable
CASE STUDIES in performance therapy
Case 1: JSON FOR DAYS
package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234
Uncached performance is still a problem
ppprofiler to the rescue
ppprofiler
• Auto-cache toggling
• Benchmark
• rblineprof
• As::Notification Counts (SQL/Cache, etc.)
• MemoryProfiler (NEW!)
• Gist-able (markdown) output
Rinse and Repeat: Make the slowest lines faster
Case 2: FINGER IN THE SOCKET
Before Valentine’s Day we were looking for any wins
IN BETWEEN THE LINES! stackprof + stackprof-remote
Ruby Process (Unicorn): AC::Dispatch, MyController::Create, Template::Render, Ar::Find
Ruby Process (Unicorn): StackProf.start, rb_profile_frames() (repeated), StackProf.stop, StackProf.dump
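The start / sample / stop sequence above can be imitated in plain Ruby to get a feel for what a sampling profiler collects. This is a toy sketch, not how stackprof works internally (stackprof samples from C via rb_profile_frames()); the names `sample_top_frames` and `busy_work` are made up for illustration, and the frame labels printed will vary by Ruby version.

```ruby
# Toy sampling profiler: a background thread periodically records the top
# frame of the calling thread. Illustrative only, not stackprof's mechanism.
def sample_top_frames(interval: 0.001)
  target = Thread.current
  counts = Hash.new(0)
  running = true
  sampler = Thread.new do
    while running
      frame = target.backtrace_locations&.first
      counts[frame.label] += 1 if frame
      sleep interval
    end
  end
  yield
  counts
ensure
  # Stop the sampler whether or not the block raised.
  running = false
  sampler&.join
end

# Hypothetical workload: spin on floating-point math for about half a second.
def busy_work(seconds = 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  Math.sqrt(rand) while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
end

counts = sample_top_frames { busy_work }
total = counts.values.sum
counts.sort_by { |_, n| -n }.each do |label, n|
  puts format("%5d (%5.1f%%)  %s", n, 100.0 * n / total, label)
end
```

Because the sampler is an ordinary Ruby thread, it only gets to run when the busy thread gives up the interpreter lock, so the sample count is far lower than the requested interval suggests. That limitation is one reason a real profiler samples from outside Ruby code.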
[paperless@production-webapp10 current]$ stackprof tmp/stackprof-cpu-30715-1391204970.dump
==================================
  Mode: cpu(1000)
  Samples: 1761 (3.61% miss rate)
  GC: 128 (7.27%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       344  (19.5%)         342  (19.4%)     Statsd#send_to_socket
       393  (22.3%)          44   (2.5%)     Statsd#sampled
        44   (2.5%)          44   (2.5%)     block in ActiveRecord::ConnectionAdapters::PostgreSQLPoolAdapter#execute
        56   (3.2%)          29   (1.6%)     block in ActiveSupport::Notifications::Fanout#listeners_for
        29   (1.6%)          29   (1.6%)     ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#extract_pg_identifier_from_name
        26   (1.5%)          26   (1.5%)     ActiveSupport::Notifications::Fanout::Subscribers::Evented#subscribed_to?
        25   (1.4%)          25   (1.4%)     String#blank?
        25   (1.4%)          25   (1.4%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select
        24   (1.4%)          24   (1.4%)     ActiveRecord::Base.scoped_methods
        22   (1.2%)          22   (1.2%)     Dalli::Server::KSocket#kgio_wait_readable
        21   (1.2%)          21   (1.2%)     ActiveSupport::CoreExtensions::Hash::Keys#assert_valid_keys
        42   (2.4%)          20   (1.1%)     block in Dalli::Server::KSocket#readfull
        28   (1.6%)          19   (1.1%)     ActiveRecord::ConnectionAdapters::ConnectionHandler#retrieve_connection_pool
        18   (1.0%)          18   (1.0%)     #<Module:0x00000002004b08>.instrumenter
        17   (1.0%)          16   (0.9%)     Dalli::Server#deserialize
        15   (0.9%)          15   (0.9%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select_raw
        14   (0.8%)          14   (0.8%)     #<Module:0x000000033e25d0>.decode_www_form_component
        13   (0.7%)          13   (0.7%)     Dalli::Server#write
        15   (0.9%)          11   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations#minus_with_coercion
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::Base.with_scope
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::ConnectionAdapters::QueryCache#cache_sql
        21   (1.2%)          10   (0.6%)     Yajl::Encoder.encode
        10   (0.6%)          10   (0.6%)     Set#add
        10   (0.6%)          10   (0.6%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#result_as_array
        10   (0.6%)          10   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations::ClassMethods#time_with_datetime_fallback
         9   (0.5%)           9   (0.5%)     ActiveRecord::DynamicFinderMatch#initialize
         9   (0.5%)           9   (0.5%)     ActiveSupport::LogSubscriber.logger
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block (2 levels) in ActiveRecord::Base.connection_handler=
Hmm, why is statsd slow?
Pull out good old benchmark
$ ruby test/profile/statsd.rb
                             user     system      total        real
udp with connect         0.010000   0.000000   0.010000 (  0.074522)
udp without connect      0.120000   0.530000   0.650000 ( 13.096515)
statsd with connect      0.000000   0.090000   0.090000 (  0.103520)
statsd without connect   0.100000   0.620000   0.720000 ( 13.483539)
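The original test/profile/statsd.rb is not shown in the slides. A rough sketch of the "udp with connect" vs. "udp without connect" comparison, using only Ruby's standard library, might look like the following. The method name `compare_udp_sends`, the message, and the local receiver are assumptions made for this example. Against a numeric loopback address the gap will likely be small; the slides do not say what the production target was or why the unconnected case was so slow.

```ruby
require "benchmark"
require "socket"

# Time `count` UDP sends to host:port, once over a connected socket and
# once passing the destination on every call. Hypothetical helper.
def compare_udp_sends(host, port, count: 1_000, message: "example.counter:1|c")
  connected = UDPSocket.new
  connected.connect(host, port) # destination is fixed once, here
  unconnected = UDPSocket.new

  with = Benchmark.realtime do
    count.times { connected.send(message, 0) }
  end
  without = Benchmark.realtime do
    # destination is passed (and looked up) on every call
    count.times { unconnected.send(message, 0, host, port) }
  end
  { with_connect: with, without_connect: without }
ensure
  connected&.close
  unconnected&.close
end

# A throwaway local receiver so the sends have somewhere to go.
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0)
port = receiver.addr[1]

results = compare_udp_sends("127.0.0.1", port)
results.each { |label, seconds| puts format("%-16s %.6fs", label, seconds) }
receiver.close
```

To probe the difference seen on the slide, one could point the same helper at a hostname rather than an IP address and compare the two timings.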
WIN!
Case 3: THE HOLIDAY SCALE
Sometimes you can throw money at the problem
Case 4: SHRINKING THE GAP
Starting with a HITLIST: Start at the top, work your way down.
Number of Requests × 90th Percentile Response Time = Total Time
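As a sketch of how such a hitlist could be computed: group request timings by endpoint, take the 90th percentile, multiply by the request count, and sort descending. The endpoint names and timings below are invented, and the nearest-rank percentile is an assumption; the slides do not say how the percentile was calculated.

```ruby
# Made-up samples: [endpoint, response_time_ms]
SAMPLES = [40, 50, 45, 60, 55, 42, 48, 52, 47, 300].map { |ms| ["HomeController#index", ms] } +
          [900, 1100].map { |ms| ["ReportsController#show", ms] }

# Nearest-rank percentile: the smallest value with at least pct% of the
# samples at or below it.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank - 1, 0].max]
end

# One row per endpoint, ranked by requests x p90 (largest first).
def hitlist(samples)
  rows = samples.group_by(&:first).map do |endpoint, entries|
    times = entries.map(&:last)
    p90 = percentile(times, 90)
    { endpoint: endpoint, requests: times.length, p90_ms: p90, total_ms: times.length * p90 }
  end
  rows.sort_by { |row| -row[:total_ms] }
end

hitlist(SAMPLES).each do |row|
  puts format("%-26s %4d req  p90 %5d ms  total %6d ms",
              row[:endpoint], row[:requests], row[:p90_ms], row[:total_ms])
end
```

Ranking by the product rather than by response time alone keeps a slow but rarely hit endpoint from crowding out a moderately slow one that is hit constantly.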
SET IT ON FIRE! Using stackprof flamegraphs on production.
Big wins are not the point
If you’re not failing you’re not being honest
Don’t just make tools, learn to use them
Thanks! twitter: @aq, github.com/quirkey, github.com/paperlesspost