Ashish
November 20, 2014
Optimizing Go: From 3k req/s/core to 480k req/s/core
Transcript
Optimization 3k req/s/core to 480k req/s/core
DDoS
Layers
User Agents
537 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FREE; .NET CLR 1.1.4322)
272 Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]
269 Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.0) Opera 7.02 Bork-edition [en]
264 Opera/8.00 (Windows NT 5.1; U; en)
264 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)
261 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; .NET CLR 1.1.4322)
258 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
255 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
253 Opera/7.60 (Windows NT 5.2; U) [en] (IBM EVV/3.0/EAK01AG9/LE)
251 Opera/7.54 (Windows NT 5.1; U) [pl]
251 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; WOW64; SV1; .NET CLR 2.0.50727)
251 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461)
248 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; Win64; AMD64)
247 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
243 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Referer
172 http://pvppw.ru/
166 text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5
164 az-us
162 zh, en-us; q=0.8, en; q=0.6
161 http://zhyk.ru/
160 en-en,en;q=0.8,en-us;q=0.5,en;q=0.3
157 http://www.niagarastar.ru/
152 az-ua
150 http://kremlin.ru/
150 application/xml, image/png, text/html
149 text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
149 http://premier.gov.ru/
148 text/x-dvi; q=.8; mxb=100000; mxt=5.0, text/x-c
147 text/html, */*
147 en-us,en;q=0.5
Architecture
Architecture
Architecture
Manageable λ
• Reduce the set of clients that you coordinate the state of, say top k
Algorithms
• Space Saving algorithm (https://icmi.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf)
• An implementation in Go (https://github.com/cloudflare/golibs)
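For readers who have not met the Space Saving algorithm, the following is a minimal sketch of its core idea: monitor at most k keys and, when a new key arrives while the table is full, evict the key with the smallest count and let the newcomer inherit that count as its possible overestimate. The `SpaceSaving` and `Counter` names are hypothetical and chosen for illustration. This is not the code from the talk or from cloudflare/golibs, and it finds the minimum with a linear scan for clarity rather than with the constant-time structure described in the paper.

```go
package main

import (
	"fmt"
	"sort"
)

// Counter is one monitored key with its estimated count and the largest
// amount by which that estimate may overshoot the true count.
type Counter struct {
	Key   string
	Count uint64
	Err   uint64
}

// SpaceSaving monitors at most k keys (hypothetical type for illustration).
type SpaceSaving struct {
	k        int
	counters map[string]*Counter
}

func NewSpaceSaving(k int) *SpaceSaving {
	return &SpaceSaving{k: k, counters: make(map[string]*Counter, k)}
}

// Observe records one occurrence of key.
func (s *SpaceSaving) Observe(key string) {
	if c, ok := s.counters[key]; ok {
		c.Count++
		return
	}
	if len(s.counters) < s.k {
		s.counters[key] = &Counter{Key: key, Count: 1}
		return
	}
	// Table is full: evict the key with the smallest count.
	// A linear scan keeps the sketch short; it is O(k) per eviction.
	var min *Counter
	for _, c := range s.counters {
		if min == nil || c.Count < min.Count {
			min = c
		}
	}
	delete(s.counters, min.Key)
	// The new key inherits the evicted count, recorded as possible error.
	s.counters[key] = &Counter{Key: key, Count: min.Count + 1, Err: min.Count}
}

// Top returns the monitored keys sorted by descending estimated count.
func (s *SpaceSaving) Top() []Counter {
	out := make([]Counter, 0, len(s.counters))
	for _, c := range s.counters {
		out = append(out, *c)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Count > out[j].Count })
	return out
}

func main() {
	ss := NewSpaceSaving(2)
	for _, k := range []string{"a", "a", "a", "b", "c", "a"} {
		ss.Observe(k)
	}
	for _, c := range ss.Top() {
		fmt.Printf("%s count=%d err=%d\n", c.Key, c.Count, c.Err)
	}
}
```

The counts are estimates: a key that replaces an evicted one starts with an inflated count, which is why each counter carries an error bound.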
Perfect!
Benchmark Test
• func BenchmarkFoo(b *testing.B)
• b.N
• http://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
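As a rough illustration of the benchmark shape named on the slide, the sketch below times a hypothetical `countRequests` function, which stands in for whatever code is under test. A benchmark normally lives in a `_test.go` file and runs with `go test -bench=.`; here `testing.Benchmark` is used so the sketch runs as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

// countRequests is a hypothetical stand-in for the code under test.
func countRequests(clients []string) map[string]int {
	counts := make(map[string]int)
	for _, c := range clients {
		counts[c]++
	}
	return counts
}

// BenchmarkFoo has the signature from the slide. The testing package picks
// b.N so that the loop runs long enough to give a stable time per iteration.
func BenchmarkFoo(b *testing.B) {
	clients := []string{"192.0.2.1", "192.0.2.2", "192.0.2.1"}
	b.ResetTimer() // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		countRequests(clients)
	}
}

func main() {
	// Run the benchmark directly and print iterations and ns/op.
	res := testing.Benchmark(BenchmarkFoo)
	fmt.Println(res)
}
```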
Slow! (3k req/s/core)
Benchmark CPU Profile
• go test -bench=. -cpuprofile=cpu.out
Virtual CPU
Real CPU Profile
• Copying memory is expensive
• Copy pointers instead
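The point about copying can be sketched as follows, assuming a hypothetical `entry` record with a large payload. Both functions move one element toward the front of a slice ordered by count. The first swaps whole structs, so each swap copies the payload; the second swaps pointers, so each swap moves only a machine word. The field names and sizes are invented for illustration.

```go
package main

import "fmt"

// entry is a hypothetical record; the payload makes value copies costly.
type entry struct {
	count   uint64
	payload [256]byte
}

// bubbleUpValues moves s[i] toward the front while its count exceeds its
// predecessor's. Every swap copies two whole entry structs.
func bubbleUpValues(s []entry, i int) int {
	for i > 0 && s[i].count > s[i-1].count {
		s[i], s[i-1] = s[i-1], s[i]
		i--
	}
	return i
}

// bubbleUpPointers does the same on a slice of pointers. Every swap
// exchanges two pointers and leaves the structs where they are.
func bubbleUpPointers(s []*entry, i int) int {
	for i > 0 && s[i].count > s[i-1].count {
		s[i], s[i-1] = s[i-1], s[i]
		i--
	}
	return i
}

func main() {
	vals := []entry{{count: 5}, {count: 3}, {count: 1}}
	vals[2].count = 10
	fmt.Println(bubbleUpValues(vals, 2), vals[0].count)

	ptrs := []*entry{{count: 5}, {count: 3}, {count: 1}}
	ptrs[2].count = 10
	fmt.Println(bubbleUpPointers(ptrs, 2), ptrs[0].count)
}
```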
Keeping Elements Sorted
• Array worst case: O(n)
• Priority queue: O(log n)
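One way to get the O(log n) update mentioned above is the standard library's `container/heap`. The sketch below keeps a min-heap of counters and repairs the heap with `heap.Fix` after a count changes. The `item` and `minHeap` types and the helper functions are hypothetical; the slides do not say which priority queue the talk used.

```go
package main

import (
	"container/heap"
	"fmt"
)

// item is a hypothetical counter; index tracks its position in the heap
// so that it can be repaired in place after its count changes.
type item struct {
	key   string
	count uint64
	index int
}

// minHeap implements heap.Interface, ordered by ascending count.
type minHeap []*item

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].count < h[j].count }
func (h minHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index = i
	h[j].index = j
}
func (h *minHeap) Push(x interface{}) {
	it := x.(*item)
	it.index = len(*h)
	*h = append(*h, it)
}
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	it := old[n-1]
	old[n-1] = nil // drop the reference so it can be collected
	*h = old[:n-1]
	return it
}

// add inserts an item in O(log n).
func add(h *minHeap, it *item) { heap.Push(h, it) }

// bump increases an item's count and restores heap order in O(log n).
func bump(h *minHeap, it *item, delta uint64) {
	it.count += delta
	heap.Fix(h, it.index)
}

// popMin removes and returns the item with the smallest count.
func popMin(h *minHeap) *item { return heap.Pop(h).(*item) }

func main() {
	h := &minHeap{}
	a := &item{key: "a", count: 5}
	b := &item{key: "b", count: 2}
	c := &item{key: "c", count: 9}
	for _, it := range []*item{a, b, c} {
		add(h, it)
	}
	bump(h, b, 18) // b now has count 20 and is no longer the minimum
	for h.Len() > 0 {
		it := popMin(h)
		fmt.Println(it.key, it.count)
	}
}
```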
O(n), O(log n)
First Pass Optimization
• Reduce the size of memory needed to be copied
• Reduce the number of times copying is needed
75k req/s/core
Wrong Output!
Correctness
• Processing rate was approaching a workable level
• But the rate estimation was grossly inaccurate
The Distribution
[Bar chart: Requests per client for clients A through U, axis from 0 to 10]
Lesson
• Read the paper properly
• Streaming algorithms tend to output estimates
• Know in what scenarios they fail
Naïve Approach
• Use a map for counting
• There is no second bullet
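The map-based approach is about as small as it sounds. A minimal sketch, assuming clients are identified by a string such as an IP address (the `countByClient` name is hypothetical):

```go
package main

import "fmt"

// countByClient tallies exact request counts per client in a map.
// Unlike a streaming sketch, these counts are exact, at the cost of
// one map entry per distinct client.
func countByClient(clients []string) map[string]uint64 {
	counts := make(map[string]uint64)
	for _, c := range clients {
		counts[c]++
	}
	return counts
}

func main() {
	reqs := []string{"192.0.2.1", "192.0.2.7", "192.0.2.1", "192.0.2.1"}
	counts := countByClient(reqs)
	fmt.Println(counts["192.0.2.1"], counts["192.0.2.7"])
}
```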
Find Top k
• Quicksort: O(n log n)
• Quickselect: O(n)
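To make the complexity comparison concrete, here is a sketch of Quickselect that moves the k largest counts to the front of a slice without fully sorting it. The `topK` and `partition` names are hypothetical. The sketch uses a fixed last-element pivot, so it is O(n) on average but O(n²) in the worst case; a randomized pivot would avoid pathological inputs.

```go
package main

import "fmt"

// partition rearranges s[lo..hi] around the pivot s[hi] in descending
// order: larger values end up to the left of the pivot's final position,
// which is returned.
func partition(s []uint64, lo, hi int) int {
	pivot := s[hi]
	i := lo
	for j := lo; j < hi; j++ {
		if s[j] > pivot {
			s[i], s[j] = s[j], s[i]
			i++
		}
	}
	s[i], s[hi] = s[hi], s[i]
	return i
}

// topK reorders s in place so that its k largest values occupy s[:k]
// (in no particular order) and returns that prefix.
func topK(s []uint64, k int) []uint64 {
	if k <= 0 {
		return nil
	}
	if k >= len(s) {
		return s
	}
	lo, hi := 0, len(s)-1
	for lo < hi {
		p := partition(s, lo, hi)
		switch {
		case p == k-1:
			return s[:k]
		case p < k-1:
			lo = p + 1 // the boundary lies to the right
		default:
			hi = p - 1 // the boundary lies to the left
		}
	}
	return s[:k]
}

func main() {
	counts := []uint64{3, 9, 1, 7, 5, 8}
	fmt.Println(topK(counts, 3)) // the three largest values, unordered
}
```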
O(n log n), O(n)
Reduce Data Set
• Prune the map of ultra-low values
• Use sort from the standard library
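The two bullets above can be sketched together: delete entries below a threshold from the map, then sort what remains with the standard library. The `pruneAndSort` function, the `clientCount` type, and the `minCount` threshold are hypothetical; the slides do not say what cutoff was used. The sketch uses `sort.Slice`, which was added in Go 1.8 and so postdates the talk; `sort.Sort` with a small interface implementation would be the equivalent at the time.

```go
package main

import (
	"fmt"
	"sort"
)

// clientCount pairs a client with its request count.
type clientCount struct {
	client string
	count  uint64
}

// pruneAndSort deletes clients with fewer than minCount requests from
// counts, then returns up to k survivors in descending order of count.
func pruneAndSort(counts map[string]uint64, minCount uint64, k int) []clientCount {
	kept := make([]clientCount, 0, len(counts))
	for c, n := range counts {
		if n < minCount {
			delete(counts, c) // deleting during range is safe in Go
			continue
		}
		kept = append(kept, clientCount{client: c, count: n})
	}
	sort.Slice(kept, func(i, j int) bool {
		if kept[i].count != kept[j].count {
			return kept[i].count > kept[j].count
		}
		return kept[i].client < kept[j].client // stable order for ties
	})
	if k < len(kept) {
		kept = kept[:k]
	}
	return kept
}

func main() {
	counts := map[string]uint64{"a": 100, "b": 1, "c": 50, "d": 2}
	for _, cc := range pruneAndSort(counts, 10, 5) {
		fmt.Println(cc.client, cc.count)
	}
}
```

Pruning first means the sort only touches the handful of heavy clients, which is where the O(n log n) cost stops mattering.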
120k req/s/core
Test In Production
More Like
Test In Production
• Run in dark mode (no side effects)
• Needs to be faster to keep up with attacks during peak load
Hello perf top
Garbage
• String manipulation produces garbage
• Slices cannot be used as map keys; use arrays instead
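A short sketch of the array-as-key idea, assuming the key is a client IP address. Go slices are not comparable and so cannot be map keys, but fixed-size arrays are. The `ipKey` type and helper names are hypothetical, and the slides do not show what key type the production code used.

```go
package main

import (
	"fmt"
	"net"
)

// ipKey is a fixed-size array. Arrays are comparable in Go, so ipKey can
// be a map key, whereas a []byte slice (such as net.IP) cannot.
type ipKey [16]byte

// keyFor copies an IP into an array key without building a string.
// It reports false if ip is not a valid IPv4 or IPv6 address.
func keyFor(ip net.IP) (ipKey, bool) {
	var k ipKey
	ip16 := ip.To16()
	if ip16 == nil {
		return k, false
	}
	copy(k[:], ip16)
	return k, true
}

// keyForString parses a textual address and converts it to a key.
func keyForString(s string) (ipKey, bool) {
	return keyFor(net.ParseIP(s))
}

func main() {
	counts := make(map[ipKey]uint64)
	for _, s := range []string{"192.0.2.1", "2001:db8::1", "192.0.2.1"} {
		if k, ok := keyForString(s); ok {
			counts[k]++
		}
	}
	k, _ := keyForString("192.0.2.1")
	fmt.Println(counts[k], len(counts))
}
```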
480k req/s/core
Takeaway
• Start simple
• Benchmark
• Profile
• Verify correctness
• Back-of-the-envelope calculations are helpful
Future Work
• This was put into production
• Later, we switched to using lock-free algorithms where possible to reduce the load on CPU
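The slides do not say which lock-free algorithms were adopted. As one small illustration of the general idea, the sketch below updates a shared counter with `sync/atomic` instead of a mutex. The `hits` type and `countConcurrently` function are hypothetical and are not taken from the production system.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// hits is a shared request counter updated without a mutex.
type hits struct{ n uint64 }

// inc adds one to the counter with a single atomic operation.
func (h *hits) inc() { atomic.AddUint64(&h.n, 1) }

// load reads the current value atomically.
func (h *hits) load() uint64 { return atomic.LoadUint64(&h.n) }

// countConcurrently has several goroutines increment one counter and
// returns the final total once they have all finished.
func countConcurrently(workers, perWorker int) uint64 {
	var h hits
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				h.inc()
			}
		}()
	}
	wg.Wait()
	return h.load()
}

func main() {
	fmt.Println(countConcurrently(8, 1000))
}
```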
We Are Hiring
–Abraham Lincoln “Join CloudFlare.”
–Me “Special thanks to Albert Strasheim and Stephan Lachowsky.”
The End
• Ashish Gandhi
• @ashishgandhi_
• [email protected]
• Easy questions?