Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
A perfect Storm for legacy migration
Search
ryan lemmer
October 21, 2013
Programming
0
1.6k
A perfect Storm for legacy migration
EuroClojure 2013 - Berlin
ryan lemmer
October 21, 2013
Tweet
Share
More Decks by ryan lemmer
See All by ryan lemmer
Modern Haskell: making sense of the type system
ryanlemmer
1
500
Distributed Computation: dealing with Time and Failure in the wild
ryanlemmer
0
770
Other Decks in Programming
See All in Programming
採用事例の少ないSvelteを選んだ理由と それを正解にするためにやっていること
oekazuma
2
1k
Effective Signals in Angular 19+: Rules and Helpers @ngbe2024
manfredsteyer
PRO
0
140
menu基盤チームによるGoogle Cloudの活用事例~Application Integration, Cloud Tasks編~
yoshifumi_ishikura
0
110
「Chatwork」Android版アプリを 支える単体テストの現在
okuzawats
0
180
SymfonyCon Vienna 2025: Twig, still relevant in 2025?
fabpot
3
1.2k
命名をリントする
chiroruxx
1
410
MCP with Cloudflare Workers
yusukebe
2
220
Kaigi on Railsに初参加したら、その日にLT登壇が決定した件について
tama50505
0
100
数十万行のプロジェクトを Scala 2から3に完全移行した
xuwei_k
0
280
ChatGPT とつくる PHP で OS 実装
memory1994
PRO
2
100
fs2-io を試してたらバグを見つけて直した話
chencmd
0
240
RWC 2024 DICOM & ISO/IEC 2022
m_seki
0
210
Featured
See All Featured
What's in a price? How to price your products and services
michaelherold
243
12k
Rebuilding a faster, lazier Slack
samanthasiow
79
8.7k
It's Worth the Effort
3n
183
28k
For a Future-Friendly Web
brad_frost
175
9.4k
Reflections from 52 weeks, 52 projects
jeffersonlam
347
20k
Side Projects
sachag
452
42k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
95
17k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
665
120k
4 Signs Your Business is Dying
shpigford
181
21k
Speed Design
sergeychernyshev
25
670
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
28
900
Building a Scalable Design System with Sketch
lauravandoore
460
33k
Transcript
@ryanlemmer a perfect storm for legacy migration CAPE TOWN @clj_ug_ct
legacy monolith Customer Accounting Billing Product Catalog CRM ... MySQL
Ruby on Rails
legacy Billing Run Customer Accounting Billing Product Catalog CRM ...
Bank Recon MySQL Ruby Ruby
legacy backlog bugs
legacy replacement replace this
legacy replacement replace substitute something that is broken, old or
inoperative
the “legacy problem” can’t fix bugs can’t add features not
performant
a “legacy solution” immutable It’s just too risky to do
in-situ changes
a “legacy solution” vintage the grapes or wine produced in
a particular season
The situation It’s not broken, just Immutable It’s valuable vintage
- still generating revenue We don’t need to “replace” We need to “make the Legacy Problem go away”
vintage migration vintage ?
vintage migration vintage We chose to migrate “financial” parts first
because it posed the highest risk to the business ?
vintage migration vintage statements MySQL Mongo & Redis
feeding off vintage vintage clients invoices ... ...
feeding off vintage statements clients invoices ? ... ...
feeding off vintage clients invoices transform old client write new
client write new invoice transform old invoice ... ...
... ... migration bridge statemen tage Big Run every night
+ incremental run every 10 mins Bridge is one-directional, Statements is read-only Imperative, sequential code
... ... new migration ? full text search stateme vintage
bridge
migration bridge: search clients invoices index- entity index-field index-field index-field
index-field index-field contacts ... ... ...
migration bridge clients invoices index-field index-field index-field index-field index-field write
client write invoice contacts index- entity search statements transform client transform invoice ... ... ... clients invoices ... ... }
... ... ... statements age search statements (batched) bridge search
About 10 million rows several hours to migrate sequentially
first pass solution Batched data migration BUT WHAT NEXT? it
was the easiest thing to do it is not performant not fault tolerant fragile because of data dependencies go parallel and distributed have fault tolerance go real-time served as scaffolding for the next solution
storm Apache Thrift + Nimbus Ingredients: Zookeeper Clojure (> 50%)
* suitable for polyglots
... storm - spouts clients index-field index-field index-field index-field index-field
write client index- entity transform client ... clients
... storm - spout SPOUT TUPLE
storm - data model TUPLE named list of values [“seekoei”
7] [“panda” 10] [147 {:name ‘John’ ...}] [253 {:name ‘Mary’ ...}] word frequency ID client
... storm - spout a SPOUT emits TUPLES UNBOUNDED STREAM
of TUPLES continuously over time a SPOUT is an
... storm - client spout [“client” {:id 147, ...}] CLIENT
SPOUT CLIENT TUPLE periodically emits a entity values
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) creates a pulse
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id]))))
clojure spout [“client” {:id 147, ...}] CLIENT TUPLE (defspout client-‐spout
["entity" “values”] [conf context collector] (let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) TUPLE SCHEMA
... storm - spout [“client” {:id 147, ...}] [“client” {:id
201, ...}] [“client” {:id 407, ...}] [“client” {:id 101, ...}] The client SPOUT packages input and emits TUPLES continuously over time
... storm - bolts transform client CLIENT SPOUT BOLT
storm - bolts (defbolt transform-‐client-‐bolt ["client"]
{:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - bolts [{:id 147, ...}] OUTGOING TUPLE [“client” {:id
147, ...}] INCOMING TUPLE (defbolt transform-‐client-‐bolt ["client"] {:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
which task? (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ? ...
grouping - “shuffle” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] count invoices per client (in memory)
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] group by field “client-id”
grouping - “ field” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" [“client-‐id”]} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] 2 2 similar “client-id” vals go to the same Bolt Task
grouping - “ field” ... field compute aggregation
bridge - topology index-field write client write invoice index- fields
transform client transform invoice ... ... ... clients invoices contacts
storm - failure success! oops! a failure! ...
storm reliability Build a tree of tuples so that Storm
knows which tuples are related ack/fail Spouts + Bolts
storm guarantees Storm will re-process the entire tuple tree on
failure First attempt fails Storm retries the tuple tree until it succeeds
failure + idempotency write client transform client x2 x2 side-effects!
...
transactional topologies write client transform client x1 x1 run-once semantics
... strong ordering on data processing Storm Trident
search statements storm topologies real-time bridge age
topology design ... ... ...
topology design ... ... ... design the (directed) graph
grouping + parallelism index-field write client write invoice index- fields
transform client transform invoice :shuffle :shuffle :shuffle :shuffle :shuffle :shuffle :p 1 :p 1 :p 1 :p 10 :p 3 :p 3 ... ... ... tune the runtime by annotating the graph edges
topology - tuple schema [“client”] [“entity” “values”] [“invoice”] [“entity” “values”]
[“entity” “values”] [“client”] [“invoice”] [“key_val_pairs”] [“key_val”] We are actually processing streams of tuples continuously
ntage topology design clients context sales context billing context (queue)
(queue) .. .. .. .. .. ..
storm “real-time, distributed, fault-tolerant, computation system” stream processing realtime analytics
continuous computation distributed RPC ...
reflections
search statements age storm topologies vintage is first- class
search statements age storm topologies transform data
search statements age storm topologies not code refactor if you
can! (but only if it’s worth the effort)
search statements age storm topologies not a picnic because we’re
still replacing code and now we’ve added replication
but worth it Big Replace Smaller replacements In-situ changes Augment:
new alongside old Replace Evolve new Kill Starve (until irrelevant)
EUROCLOJURE Berlin 2013 thanks