Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
lpw-2012
Search
Oleg Komarov
November 24, 2012
Programming
1
350
lpw-2012
Reliable Cron Jobs in Distributed Environment
Oleg Komarov
November 24, 2012
Tweet
Share
More Decks by Oleg Komarov
See All by Oleg Komarov
yapc::eu 2013
komarov
0
200
Exploring Plack Middlewares
komarov
0
180
yapc_eu_2012
komarov
2
550
Other Decks in Programming
See All in Programming
C-Shared Buildで突破するAI Agent バックテストの壁
po3rin
0
380
モデル駆動設計をやってみようワークショップ開催報告(Modeling Forum2025) / model driven design workshop report
haru860
0
260
なあ兄弟、 余白の意味を考えてから UI実装してくれ!
ktcryomm
11
11k
TestingOsaka6_Ozono
o3
0
140
Microservices rules: What good looks like
cer
PRO
0
1.3k
堅牢なフロントエンドテスト基盤を構築するために行った取り組み
shogo4131
8
2.3k
Building AI Agents with TypeScript #TSKaigiHokuriku
izumin5210
6
1.3k
WebRTC、 綺麗に見るか滑らかに見るか
sublimer
1
160
Flutter On-device AI로 완성하는 오프라인 앱, 박제창 @DevFest INCHEON 2025
itsmedreamwalker
1
110
バックエンドエンジニアによる Amebaブログ K8s 基盤への CronJobの導入・運用経験
sunabig
0
150
令和最新版Android Studioで化石デバイス向けアプリを作る
arkw
0
400
組み合わせ爆発にのまれない - 責務分割 x テスト
halhorn
1
150
Featured
See All Featured
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
Documentation Writing (for coders)
carmenintech
76
5.2k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.1k
Building a Scalable Design System with Sketch
lauravandoore
463
34k
Unsuck your backbone
ammeep
671
58k
Designing for Performance
lara
610
69k
The Cost Of JavaScript in 2023
addyosmani
55
9.3k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Code Review Best Practice
trishagee
74
19k
Faster Mobile Websites
deanohume
310
31k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.8k
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Transcript
Reliable Cron Jobs in Distributed Environment Oleg Komarov 2012-11-24 1/26
Presentation available at https://speakerdeck.com/komarov/lpw-2012 http://bit.ly/VLuT6g 2/26
Context 3 independent projects with shared infrastructure • over 30
boxes • over 200 scripts, 30K+ SLOC • packaged in appr. 20 deb-packages 3/26
TL;DR Reliable Cron Jobs in Distributed Environment 4/26
TL;DR Reliable Cron Jobs in Distributed Environment ... are HARD
to get right 4/26
Cron Jobs in a Vacuum • locks • logging and
output • monitoring • profiling 5/26
Logging and Output • log START and FINISH • log
enough details 6/26
Logging and Output • log START and FINISH • log
enough details • use log + STDERR for important things • use MAILTO to catch that output 6/26
Logging and Output • log START and FINISH • log
enough details • use log + STDERR for important things • use MAILTO to catch that output 6/26
Monitoring • be confident that it actually works • it
must not fail when you system fails • have a plan of action 7/26
Monitoring • be confident that it actually works • it
must not fail when you system fails • have a plan of action 7/26
What to monitor • hardware errors • free disk space
• load • crond is alive • age of generated file, queue size, etc. 8/26
Profiling • Does it need 1GB or 10GB? • What
does it take so long to complete? • How many db queries does it run? 9/26
Profiling • Does it need 1GB or 10GB? • What
does it take so long to complete? • How many db queries does it run? Measure and improve 9/26
More to consider • crash-safe • documentation • parallel execution
• resource limits (ulimit/cgroups) 10/26
Deployment Packages 11/26
Deployment Boxes 12/26
Cron Package Just populate my-project-scriptsN.cron.d file 13/26
Cron Package Just populate my-project-scriptsN.cron.d file Don’t write it by
hand, do it automatically 13/26
Cron Package Just populate my-project-scriptsN.cron.d file Don’t write it by
hand, do it automatically Put some METADATA in your scripts 13/26
Metadata =head1 METADATA <crontab> package: scriptsN params: --mod 2 --rem
0 time: */2 * * * * </crontab> <crontab> package: scriptsN params: --mod 2 --rem 1 time: */2 * * * * </crontab> =cut 14/26
Simple Setup As simple as possible: one box per package
15/26
Simple Setup As simple as possible: one box per package
apt-get purge && kill (or wait) && apt-get install 15/26
!%*#$ Back to Earth Network 16/26
With Extra Boxes Now you have some promblems to solve:
• locks • logs • load 17/26
Net::ZooKeeper::Lock Apache ZooKeeperTM is an effort to develop and maintain
an open-source server which enables highly reliable distributed coordination. Net::ZooKeeper::Lock implements distributed locks via ZooKeeper. 18/26
Introducing Switchman https://github.com/komarov/switchman 19/26
Overview 20/26
Configuration Crontabs are installed everywhere, switchman consults with config in
ZooKeeper: { "groups": { "scripts1": "box1", "scripts2": "box1", "scripts3": ["box1", "box2"] } } 21/26
Description switchman --config /how/to/connect/to/zk --group scriptsN -- CMD ARGS 22/26
Description switchman --config /how/to/connect/to/zk --group scriptsN -- CMD ARGS •
checks configuration • acquires a lock • watches configuration for changes • stops execution when it is not allowed anymore 22/26
Description switchman --config /how/to/connect/to/zk --group scriptsN -- CMD ARGS •
checks configuration • acquires a lock • watches configuration for changes • stops execution when it is not allowed anymore Easy to adopt with METADATA 22/26
One Problem Solved • locks • logs • load 23/26
Further Steps See facebook’s Scribe for collecting decentralized logs Resources
reservation and management A good monitoring system 24/26
Thanks! Questions? https://speakerdeck.com/komarov/lpw-2012 http://bit.ly/VLuT6g http://about.me/komarov om 25/26
Bonus Slide Get file age: # in days perl -E
’say -M $ARGV[0]’ /path/to/file # in seconds expr ‘date +%s‘ - ‘date +%s -r /path/to/file‘ Simple local locks: use Pid::File::Flock qw/:auto/; 26/26