lpw-2012
Oleg Komarov
November 24, 2012
Reliable Cron Jobs in Distributed Environment
Transcript
Reliable Cron Jobs in Distributed Environment Oleg Komarov 2012-11-24 1/26
Presentation available at https://speakerdeck.com/komarov/lpw-2012 http://bit.ly/VLuT6g 2/26
Context
3 independent projects with shared infrastructure • over 30 boxes • over 200 scripts, 30K+ SLOC • packaged in approx. 20 deb-packages 3/26
TL;DR
Reliable Cron Jobs in Distributed Environment ... are HARD to get right 4/26
Cron Jobs in a Vacuum
• locks • logging and output • monitoring • profiling 5/26
Logging and Output
• log START and FINISH • log enough details • use log + STDERR for important things • use MAILTO to catch that output 6/26
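
One way to apply the bullets above is a small shell wrapper around each job. This is only a sketch: the function names and the default `LOG_FILE` path are made up for illustration and do not come from the talk. It relies on standard cron behaviour, where anything a job prints is mailed to the `MAILTO` address.

```shell
#!/bin/sh
# Hypothetical wrapper: names and paths are assumptions, not from the talk.
LOG_FILE="${LOG_FILE:-/tmp/my-job.log}"

# Append a timestamped line (with the wrapper's PID) to the log file.
log() {
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$*" >> "$LOG_FILE"
}

# Important messages go to the log AND to STDERR; cron mails STDERR to MAILTO.
log_important() {
    log "$*"
    printf '%s\n' "$*" >&2
}

# Run a command, logging START and FINISH with exit status and duration.
run_logged() {
    start=$(date +%s)
    log "START $*"
    status=0
    "$@" || status=$?
    log "FINISH $* status=$status duration=$(( $(date +%s) - start ))s"
    if [ "$status" -ne 0 ]; then
        log_important "FAILED $* (exit $status), see $LOG_FILE"
    fi
    return "$status"
}
```

A cron entry would then source this file and call `run_logged /path/to/script`; with `MAILTO` set in the crontab, only failures produce mail because a successful run prints nothing.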
Monitoring
• be confident that it actually works • it must not fail when your system fails • have a plan of action 7/26
What to monitor
• hardware errors • free disk space • load • crond is alive • age of generated file, queue size, etc. 8/26
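
The "age of generated file" check can be sketched in shell as follows. It assumes GNU `date`, whose `-r FILE` option prints a file's modification time (the same option the bonus slide uses). The function names are illustrative, and the exit codes loosely follow the common monitoring-plugin convention of 0 = OK, 1 = warning, 2 = critical.

```shell
#!/bin/sh
# Age of a file in seconds. Assumes GNU date (-r FILE prints the file's mtime).
file_age_seconds() {
    now=$(date +%s)
    mtime=$(date -r "$1" +%s 2>/dev/null) || return 2
    echo $(( now - mtime ))
}

# Returns 0 if FILE was modified within MAX_AGE seconds,
# 1 if it is stale, 2 if it is missing or unreadable.
check_fresh() {
    file=$1 max_age=$2
    age=$(file_age_seconds "$file") || {
        echo "CRITICAL: $file is missing" >&2
        return 2
    }
    if [ "$age" -gt "$max_age" ]; then
        echo "WARNING: $file is ${age}s old (limit ${max_age}s)" >&2
        return 1
    fi
    echo "OK: $file is ${age}s old"
}
```

For example, `check_fresh /path/to/file 3600` would flag an hourly export that has not been regenerated. Running the check from a different box than the job itself fits the point that monitoring must not fail together with the system it watches.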
Profiling
• Does it need 1GB or 10GB? • Why does it take so long to complete? • How many db queries does it run? Measure and improve 9/26
More to consider
• crash-safe • documentation • parallel execution • resource limits (ulimit/cgroups) 10/26
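
For the resource-limits bullet, a minimal `ulimit` sketch might look like this. `run_limited` is a hypothetical helper name, and the example assumes a shell whose `ulimit` builtin supports `-t` and `-v` (bash and dash do; POSIX itself only requires `-f`). cgroups are not covered here.

```shell
#!/bin/sh
# run_limited CPU_SECONDS MEM_KB CMD [ARGS...]
# The function body is a subshell, so the limits apply only to CMD
# and never leak into the calling shell.
run_limited() (
    cpu_seconds=$1 mem_kb=$2
    shift 2
    ulimit -t "$cpu_seconds" || exit 125   # max CPU time, in seconds
    ulimit -v "$mem_kb"      || exit 125   # max virtual memory, in KiB
    exec "$@"
)
```

With `run_limited 600 1048576 /path/to/script`, a runaway job is stopped by the kernel after 10 minutes of CPU time or when it tries to grow past 1 GiB of address space, instead of taking the whole box down.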
Deployment Packages 11/26
Deployment Boxes 12/26
Cron Package
Just populate my-project-scriptsN.cron.d file. Don’t write it by hand, do it automatically. Put some METADATA in your scripts. 13/26
Metadata
=head1 METADATA
<crontab>
package: scriptsN
params: --mod 2 --rem 0
time: */2 * * * *
</crontab>
<crontab>
package: scriptsN
params: --mod 2 --rem 1
time: */2 * * * *
</crontab>
=cut
14/26
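
Generating the cron.d file from such metadata could be sketched like this. It is not the author's generator: `gen_cron_d` and `CRON_USER` are hypothetical names, the user field is an assumption (files in `/etc/cron.d` need one, and the slide does not name it), and for brevity the parser picks up every `<crontab>` block in a file rather than only those inside the METADATA section.

```shell
#!/bin/sh
# gen_cron_d PACKAGE SCRIPT...
# Prints one cron.d line ("time user command") per matching <crontab> block.
gen_cron_d() {
    pkg=$1
    shift
    awk -v pkg="$pkg" -v user="${CRON_USER:-root}" '
        { sub(/^[ \t]+/, "") }                      # tolerate indented POD
        /^<crontab>/   { in_block = 1; p_name = ""; p_args = ""; p_time = ""; next }
        /^<\/crontab>/ {
            if (in_block && p_name == pkg && p_time != "") {
                cmd = FILENAME                      # script path as passed in
                if (p_args != "") cmd = cmd " " p_args
                print p_time, user, cmd
            }
            in_block = 0
            next
        }
        in_block && /^package:/ { sub(/^package:[ \t]*/, ""); p_name = $0 }
        in_block && /^params:/  { sub(/^params:[ \t]*/, "");  p_args = $0 }
        in_block && /^time:/    { sub(/^time:[ \t]*/, "");    p_time = $0 }
    ' "$@"
}
```

A package build step could then run something like `gen_cron_d scriptsN /path/to/scripts/* > my-project-scriptsN.cron.d`, so the schedule lives next to the code it runs and the file is never edited by hand.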
Simple Setup
As simple as possible: one box per package. apt-get purge && kill (or wait) && apt-get install 15/26
!%*#$ Back to Earth Network 16/26
With Extra Boxes
Now you have some problems to solve: • locks • logs • load 17/26
Net::ZooKeeper::Lock
Apache ZooKeeper™ is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination. Net::ZooKeeper::Lock implements distributed locks via ZooKeeper. 18/26
Introducing Switchman https://github.com/komarov/switchman 19/26
Overview 20/26
Configuration
Crontabs are installed everywhere; switchman consults the config in ZooKeeper:
{
  "groups": {
    "scripts1": "box1",
    "scripts2": "box1",
    "scripts3": ["box1", "box2"]
  }
}
21/26
Description
switchman --config /how/to/connect/to/zk --group scriptsN -- CMD ARGS
• checks configuration • acquires a lock • watches configuration for changes • stops execution when it is not allowed anymore
Easy to adopt with METADATA 22/26
One Problem Solved • locks • logs • load 23/26
Further Steps
• see Facebook’s Scribe for collecting decentralized logs • resource reservation and management • a good monitoring system 24/26
Thanks! Questions? https://speakerdeck.com/komarov/lpw-2012 http://bit.ly/VLuT6g http://about.me/komarov 25/26
Bonus Slide
Get file age:
# in days
perl -E 'say -M $ARGV[0]' /path/to/file
# in seconds
expr `date +%s` - `date +%s -r /path/to/file`
Simple local locks:
use Pid::File::Flock qw/:auto/;
26/26
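
For jobs that are not written in Perl, a rough shell counterpart to the local lock above is the `flock(1)` utility. This sketch assumes `flock` from util-linux is installed; `run_exclusive` is a hypothetical helper name. Like any local lock, it only protects against overlapping runs on one box, not across boxes.

```shell
#!/bin/sh
# run_exclusive LOCK_FILE CMD [ARGS...]
# Runs CMD only if no other process holds LOCK_FILE. With -n, flock does not
# wait: it fails immediately (exit status 1 by default) when the lock is taken.
# The kernel releases the lock when the process exits, even after a crash,
# so no stale lock file has to be cleaned up.
run_exclusive() {
    lock_file=$1
    shift
    flock -n "$lock_file" "$@"
}
```

In a crontab the same idea is often written inline, for example `flock -n /path/to/job.lock /path/to/script`, so that a slow run is skipped rather than stacked on top of the previous one.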