Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
A perfect Storm for legacy migration
Search
ryan lemmer
October 21, 2013
Programming
0
1.6k
A perfect Storm for legacy migration
EuroClojure 2013 - Berlin
ryan lemmer
October 21, 2013
Tweet
Share
More Decks by ryan lemmer
See All by ryan lemmer
Modern Haskell: making sense of the type system
ryanlemmer
1
590
Distributed Computation: dealing with Time and Failure in the wild
ryanlemmer
0
840
Other Decks in Programming
See All in Programming
Testing Trophyは叫ばない
toms74209200
0
870
そのAPI、誰のため? Androidライブラリ設計における利用者目線の実践テクニック
mkeeda
2
300
意外と簡単!?フロントエンドでパスキー認証を実現する WebAuthn
teamlab
PRO
2
740
プロポーザル駆動学習 / Proposal-Driven Learning
mackey0225
2
1.3k
Cache Me If You Can
ryunen344
2
720
Updates on MLS on Ruby (and maybe more)
sylph01
1
180
プロパティベーステストによるUIテスト: LLMによるプロパティ定義生成でエッジケースを捉える
tetta_pdnt
0
330
Navigating Dependency Injection with Metro
zacsweers
3
260
ソフトウェアテスト徹底指南書の紹介
goyoki
1
150
テストコードはもう書かない:JetBrains AI Assistantに委ねる非同期処理のテスト自動設計・生成
makun
0
270
Ruby Parser progress report 2025
yui_knk
1
440
概念モデル→論理モデルで気をつけていること
sunnyone
1
110
Featured
See All Featured
For a Future-Friendly Web
brad_frost
180
9.9k
GitHub's CSS Performance
jonrohan
1032
460k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Building Better People: How to give real-time feedback that sticks.
wjessup
368
19k
Code Reviewing Like a Champion
maltzj
525
40k
Building Adaptive Systems
keathley
43
2.7k
Large-scale JavaScript Application Architecture
addyosmani
512
110k
Designing for humans not robots
tammielis
253
25k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
252
21k
The Straight Up "How To Draw Better" Workshop
denniskardys
236
140k
A Modern Web Designer's Workflow
chriscoyier
696
190k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3k
Transcript
@ryanlemmer a perfect storm for legacy migration CAPE TOWN @clj_ug_ct
legacy monolith Customer Accounting Billing Product Catalog CRM ... MySQL
Ruby on Rails
legacy Billing Run Customer Accounting Billing Product Catalog CRM ...
Bank Recon MySQL Ruby Ruby
legacy backlog bugs
legacy replacement replace this
legacy replacement replace substitute something that is broken, old or
inoperative
the “legacy problem” can’t fix bugs can’t add features not
performant
a “legacy solution” immutable It’s just too risky to do
in-situ changes
a “legacy solution” vintage the grapes or wine produced in
a particular season
The situation It’s not broken, just Immutable It’s valuable vintage
- still generating revenue We don’t need to “replace” We need to “make the Legacy Problem go away”
vintage migration vintage ?
vintage migration vintage We chose to migrate “financial” parts first
because it posed the highest risk to the business ?
vintage migration vintage statements MySQL Mongo & Redis
feeding off vintage vintage clients invoices ... ...
feeding off vintage statements clients invoices ? ... ...
feeding off vintage clients invoices transform old client write new
client write new invoice transform old invoice ... ...
... ... migration bridge statemen tage Big Run every night
+ incremental run every 10 mins Bridge is one-directional, Statements is read-only Imperative, sequential code
... ... new migration ? full text search stateme vintage
bridge
migration bridge: search clients invoices index- entity index-field index-field index-field
index-field index-field contacts ... ... ...
migration bridge clients invoices index-field index-field index-field index-field index-field write
client write invoice contacts index- entity search statements transform client transform invoice ... ... ... clients invoices ... ... }
... ... ... statements age search statements (batched) bridge search
About 10 million rows several hours to migrate sequentially
first pass solution Batched data migration BUT WHAT NEXT? it
was the easiest thing to do it is not performant not fault tolerant fragile because of data dependencies go parallel and distributed have fault tolerance go real-time served as scaffolding for the next solution
storm Apache Thrift + Nimbus Ingredients: Zookeeper Clojure (> 50%)
* suitable for polyglots
... storm - spouts clients index-field index-field index-field index-field index-field
write client index- entity transform client ... clients
... storm - spout SPOUT TUPLE
storm - data model TUPLE named list of values [“seekoei”
7] [“panda” 10] [147 {:name ‘John’ ...}] [253 {:name ‘Mary’ ...}] word frequency ID client
... storm - spout a SPOUT emits TUPLES UNBOUNDED STREAM
of TUPLES continuously over time a SPOUT is an
... storm - client spout [“client” {:id 147, ...}] CLIENT
SPOUT CLIENT TUPLE periodically emits a entity values
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) creates a pulse
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id]))))
clojure spout [“client” {:id 147, ...}] CLIENT TUPLE (defspout client-‐spout
["entity" “values”] [conf context collector] (let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) TUPLE SCHEMA
... storm - spout [“client” {:id 147, ...}] [“client” {:id
201, ...}] [“client” {:id 407, ...}] [“client” {:id 101, ...}] The client SPOUT packages input and emits TUPLES continuously over time
... storm - bolts transform client CLIENT SPOUT BOLT
storm - bolts (defbolt transform-‐client-‐bolt ["client"]
{:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - bolts [{:id 147, ...}] OUTGOING TUPLE [“client” {:id
147, ...}] INCOMING TUPLE (defbolt transform-‐client-‐bolt ["client"] {:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
which task? (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ? ...
grouping - “shuffle” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] count invoices per client (in memory)
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] group by field “client-id”
grouping - “ field” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" [“client-‐id”]} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] 2 2 similar “client-id” vals go to the same Bolt Task
grouping - “ field” ... field compute aggregation
bridge - topology index-field write client write invoice index- fields
transform client transform invoice ... ... ... clients invoices contacts
storm - failure success! oops! a failure! ...
storm reliability Build a tree of tuples so that Storm
knows which tuples are related ack/fail Spouts + Bolts
storm guarantees Storm will re-process the entire tuple tree on
failure First attempt fails Storm retries the tuple tree until it succeeds
failure + idempotency write client transform client x2 x2 side-effects!
...
transactional topologies write client transform client x1 x1 run-once semantics
... strong ordering on data processing Storm Trident
search statements storm topologies real-time bridge age
topology design ... ... ...
topology design ... ... ... design the (directed) graph
grouping + parallelism index-field write client write invoice index- fields
transform client transform invoice :shuffle :shuffle :shuffle :shuffle :shuffle :shuffle :p 1 :p 1 :p 1 :p 10 :p 3 :p 3 ... ... ... tune the runtime by annotating the graph edges
topology - tuple schema [“client”] [“entity” “values”] [“invoice”] [“entity” “values”]
[“entity” “values”] [“client”] [“invoice”] [“key_val_pairs”] [“key_val”] We are actually processing streams of tuples continuously
ntage topology design clients context sales context billing context (queue)
(queue) .. .. .. .. .. ..
storm “real-time, distributed, fault-tolerant, computation system” stream processing realtime analytics
continuous computation distributed RPC ...
reflections
search statements age storm topologies vintage is first- class
search statements age storm topologies transform data
search statements age storm topologies not code refactor if you
can! (but only if it’s worth the effort)
search statements age storm topologies not a picnic because we’re
still replacing code and now we’ve added replication
but worth it Big Replace Smaller replacements In-situ changes Augment:
new alongside old Replace Evolve new Kill Starve (until irrelevant)
EUROCLOJURE Berlin 2013 thanks