Peaceful Consul Cluster Operations / consul-casual-1
FUJIWARA Shunichiro
August 01, 2016
Consul Casual Talks #1
http://connpass.com/event/35836/
Transcript
Peaceful Consul cluster operations
Consul Casual Talks #1 @fujiwara
FUJIWARA Shunichiro @fujiwara
github.com/fujiwara
sfujiwara.hatenablog.com
Engineering Department, Game & Community
Agenda
• Consul use cases
• Points for operating it peacefully
Consul use cases
1. Internal DNS (node, service)
2. Maintenance with maint
3. Deployment and Chef runs with Stretcher
4. Updating the nginx configuration with consul-template
5. Exclusive startup of a daemon that should run on only one host
Internal DNS
Internal DNS (node, service)
node names (e.g. kayac-web-i-1234567...)
service names (examples)
• log-aggregator : Fluentd aggregation servers
• log-analyzer : Norikra
• internal-proxy : Squid for outbound traffic
• internal-mta : Postfix for outbound mail
Internal DNS (node, service)
Run dnsmasq on every host
Name resolution for the .consul domain is forwarded to the consul agent
dnsmasq listens on 127.0.0.1:53
# dnsmasq.conf
server=/consul/127.0.0.1#8600
bind-interfaces
listen-address=127.0.0.1
Internal DNS (node, service)
Set (node|service).consul as search domains in resolv.conf
→ hosts can be reached by node name or service name alone
# /etc/resolv.conf
search node.consul service.consul
nameserver 127.0.0.1 # dnsmasq
nameserver 172.16.0.2 # VPC resolver
nameserver 172.16.0.254 # Unbound on EC2
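To illustrate why a bare node or service name resolves with the resolv.conf above, here is a minimal Python sketch of how a stub resolver expands a short name through the search list. The function name `candidate_names` and the `ndots=1` default are assumptions for illustration; real resolvers have more options than this models.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the names a stub resolver would try, in order, for `name`."""
    if name.endswith("."):
        # Already fully qualified: the search list is not applied.
        return [name]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search list.
        return [name + "."] + expanded
    # Too few dots: try the search list first, then the name as-is.
    return expanded + [name + "."]


search = ["node.consul", "service.consul"]
for candidate in candidate_names("log-aggregator", search):
    print(candidate)
```

With the search list from the slide, `log-aggregator` is tried as `log-aggregator.node.consul` first and then as `log-aggregator.service.consul`, which is the name Consul answers for the service.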
Completing ssh host names with bash-completion
~/.bash_profile
_known_hosts_real() {
    # Collect the names of alive members, skipping the header line
    local members=$(consul members -status=alive | awk '!/Node/{printf("%s ", $1)}')
    COMPREPLY=( $( compgen -W "$members" -- "${COMP_WORDS[COMP_CWORD]}" ) )
    return 0
}
Only hosts that are alive become completion candidates
http://qiita.com/sfujiwara/items/f4fa907ead53ed104e1a
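Configuration for sending to the Fluentd aggregation servers
Send round-robin to the DNS name provided by Consul
<match **>
  type forward
  expire_dns_cache 15
  dns_round_robin true
  heartbeat_type tcp
  <server>
    host log-aggregator.service.consul
  </server>
</match>
No need to list destinations, and failed servers are detached automatically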
Maintenance with maint
consul maint
consul maint -enable [-reason "..."]
Puts a node into maintenance mode
→ the node drops out of service name resolution
→ the node name still resolves (for ssh and so on)
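consul maint usage examples
To do maintenance on a Fluentd aggregation server, put it in maint
→ it drops out of DNS, so senders stop sending to it (requires the expire_dns_cache setting)
To replace a Norikra instance
• Set up the new host with maint -enable
• Run maint -enable on the old host and maint -disable on the new one
• They swap in DNS, so the destination switches over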
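Using maint when building an AMI with Packer
1. Join the consul cluster
2. maint -enable (so the host is not put into service while it is being built)
3. Build with Chef
4. Create the AMI while still in the maint state
5. Instances launched from the AMI start in maint
6. Once the post-boot tasks are done, maint -disable → in service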
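A daemontools run script that does not start the daemon while in maint
#!/bin/bash
# consul maint prints the maintenance status; empty output means not in maint
maint=$(consul maint)
if [[ $maint != "" ]]; then
    echo "$maint"
    sleep 10
    exit 1
fi
exec ...
Controls daemons that should not start during maintenance
(it does not stop a running daemon when maint -enable is set)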
Deployment with Stretcher
Deployment with Stretcher
github.com/fujiwara/stretcher
A deployment tool that works together with Consul / Serf
Running Chef with Stretcher
Chef-Server → Stretcher + Chef-Solo
• Chef-Server no longer becomes a SPOF / bottleneck
• Every host receives the same tar and event → each node decides which JSON to apply
# /etc/sysconfig/hostname-prefix
HOSTNAME_PREFIX="xxx-app"
→ applies nodes/xxx-app.json
Replacing Chef role search with a service definition
/etc/consul.d/role.json
{
  "service": {
    "name": "role",
    "tags": [ "batch-server", "db-client", ... ]
  }
}
Defining roles as a Service makes them searchable
http://localhost:8500/v1/catalog/service/role?tag=db-client
http://localhost:8500/v1/catalog/service/role?tag=internal-proxy
[
  { "Node": "xxx-i-10bf0fe2", "Address": "10.0.0.123", "ServiceID": "role", "ServiceName": "role", ... },
  { "Node": "xxx-i-3c1b72b3", "Address": "10.0.1.234", "ServiceID": "role", "ServiceName": "role", ... }
]
Daemons managed by daemontools also get a service definition
{
  "service": {
    "name": "daemontools",
    "tags": [ "app", "stretcher", "gunfish", ... ]
  }
}
Restarting a particular daemontools-managed process
curl http://localhost:8500/v1/catalog/service/daemontools?tag=gunfish | jq -r ".[].Node"
xxx-admin-i-0391d6162be552655
xxx-app-i-01a7ff42f4796be4f
xxx-app-i-05bd652734828b522
xxx-batch-i-0095ac858fe87d8e5
Turn the names into an optimized regular expression with Regexp::Trie and pass it to consul exec
consul exec -node '(?:xxx\-(?:a(?:dmin|pp)|batch))' "svc -h /service/gunfish"
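The step from the catalog response to a `-node` pattern can be sketched in Python. This is not the Regexp::Trie approach from the slide: it builds a plain alternation of escaped node names instead of a trie-optimized expression. The sample response is trimmed to the `Node` field, the function name `node_pattern` is hypothetical, and the `^...$` anchors are an assumption about how you want the pattern to match.

```python
import json
import re

# Sample catalog response, trimmed to the one field used here.
# Node names are the ones shown on the slide.
CATALOG_JSON = json.dumps([
    {"Node": "xxx-admin-i-0391d6162be552655"},
    {"Node": "xxx-app-i-01a7ff42f4796be4f"},
    {"Node": "xxx-app-i-05bd652734828b522"},
    {"Node": "xxx-batch-i-0095ac858fe87d8e5"},
])


def node_pattern(catalog_json):
    """Build one regex that matches exactly the nodes in the response."""
    nodes = sorted({entry["Node"] for entry in json.loads(catalog_json)})
    alternatives = "|".join(re.escape(node) for node in nodes)
    return "^(?:" + alternatives + ")$"


pattern = node_pattern(CATALOG_JSON)
print(pattern)
```

A plain alternation is longer than the trie form but selects the same hosts, which may be good enough when the node list is short.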
Updating the nginx configuration with consul-template
consul-template
https://github.com/hashicorp/consul-template
• Triggered by Consul KV values, service resolution results, and so on
• Can update templates and kick arbitrary scripts
Updating the nginx configuration
# config.hcl
template {
  source = "/etc/nginx/spam.ip.conf.ctmpl"
  destination = "/etc/nginx/spam.ip.conf"
  command = "service nginx reload"
  perms = 0644
  backup = true
}
# spam.ip.conf.ctmpl
{{key "spam_ips"}}
A PUT to localhost:8500/v1/kv/spam_ips is all it takes to update the configuration
Exclusive startup of a daemon that should run on only one host
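Exclusive startup of a daemon that should run on only one host
A Slack bot driven by WebSocket messages → it misbehaves when it runs on two or more hosts
But we still want it to be highly available…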
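Exclusive startup of a daemon that should run on only one host
Use consul lock
A wrapper that runs the given command once the lock has been acquired
consul lock -n 1 nuko "/path/to/run-nuko.sh"
Note that the lock is released when the Consul leader changes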
Points for operating it peacefully
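Points for operating it peacefully
Get to know Raft (a rough understanding is fine)
http://thesecretlivesofdata.com/raft/
A consensus algorithm for distributed environments
• Electing a leader requires agreement from a majority
• 2 servers = if 1 goes down, a majority (=2) cannot be reached
• 3 servers = even if 1 goes down, a majority (=2) can be reached
• 4 servers = if 2 go down, a majority (=3) cannot be reached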
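The majority arithmetic in the bullets above can be written out as a short Python sketch. The function names `quorum` and `fault_tolerance` are illustrative, not part of Consul.

```python
def quorum(servers):
    """Smallest number of servers that forms a majority."""
    return servers // 2 + 1


def fault_tolerance(servers):
    """How many servers can fail while a majority is still reachable."""
    return servers - quorum(servers)


print("servers quorum tolerated_failures")
for n in range(1, 8):
    print(f"{n:7d} {quorum(n):6d} {fault_tolerance(n):18d}")
```

The table shows why even server counts buy nothing: 4 servers tolerate the same single failure as 3, and 6 tolerate the same two failures as 5.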
Deployment Table
At least 3 servers in production; 3 or 5 are recommended
consul.io/docs/internals/consensus.html
Resources a server needs
• CPU: 2 CPUs are enough
• Memory: 20 MB and up
• Disk: 2 MB and up
Memory and disk depend on how the KV store is used
KV dump JSON 10 MB, data_dir/raft 120 MB → consul agent RSS 250 MB
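Do servers need dedicated hosts?
The consul agent itself does not use many resources
Under heavy disk IO load, Raft heartbeats tend to fail
• Timeout is 500 ms
• A failed heartbeat triggers a leader election
• The election normally completes in 2 to 3 seconds
• Using tmpfs as the Consul server's data store avoids the disk IO impact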
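For high availability
The number of nodes that can fail at the same time without trouble depends on the number of servers
• 3 nodes → 1
• 5 nodes → 2
In a 3-node setup, if 2 go down and only 1 remains, no leader can be elected
For maintenance that takes a node out for a long time, temporarily adding server nodes is an option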
For high availability
Server node failover is automatic
Users only ever need to talk to the agent on localhost
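Impact of a node failure
Not the leader → no impact on other nodes
Leader → a new leader is elected
By default the leader handles all reads and writes (strong consistency)
No access (DNS, HTTP) until a leader has been decided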
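Stale mode (DNS)
Leader re-election normally completes in 2 to 3 seconds
Want node and service names to keep resolving over DNS in the meantime?
→ Stale mode: answers are possible even while no leader is elected
"dns_config":{
  "allow_stale": true, // default false
  "max_stale": "10s" // default 5s
}
Results may be out of date (eventual consistency)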
DNS TTL
The default TTL is 0 → responses are not cached
TTLs can be set separately for nodes and services
Put a DNS cache (e.g. dnsmasq) in front to cache responses
"dns_config":{
  "node_ttl": "60s",
  "service_ttl": { "*": "15s" }
}
Stale mode (HTTP API)
To use stale mode with the HTTP API, pass the stale parameter
$ curl "http://127.0.0.1:8500/v1/kv/web/key1?stale"
Access without the stale parameter during a leader election → 500 Internal Server Error
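Upgrading a running cluster
consul.io/docs/upgrading.html
consul.io/docs/upgrade-specific.html
There are version-specific caveats, so read the documentation
A rolling upgrade is possible by replacing agents one after another
(replacing the leader node triggers a re-election)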
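Stability
In operation for more than 2 years, since the v0.2 days
An agent process has crashed only once (on 0.4.1)
EBS (gp2) credits exhausted → heavy IO wait →
panic: Timeout starting MDB transaction
If an operator error takes down too many servers, the cluster cannot recover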
Backing up the KV store
To fetch everything under a prefix recursively, pass the recurse parameter
$ curl -s "http://127.0.0.1:8500/v1/kv/?recurse"
[
  {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0, "Key":"key1","Flags":123,"Value":"dGVzdA=="},
  {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0, "Key":"key2","Flags":0,"Value":"dGVzdDI="},
  {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0, "Key":"test/1","Flags":0,"Value":"dGVzdDM="}
]
Restore by PUTting Key, Flags and Value again
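As a rough illustration of the restore step, the Python sketch below decodes the Base64 `Value` fields from the dump above and builds the PUT requests that would write them back. The helper name `restore_requests` is hypothetical, and it only prepares URL and body pairs rather than sending anything; it assumes the KV endpoint's `flags` query parameter is used to restore `Flags`.

```python
import base64
import json
from urllib.parse import quote

# The dump shown on the slide, as returned by /v1/kv/?recurse.
BACKUP = json.loads("""
[
  {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0,
   "Key":"key1","Flags":123,"Value":"dGVzdA=="},
  {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0,
   "Key":"key2","Flags":0,"Value":"dGVzdDI="},
  {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0,
   "Key":"test/1","Flags":0,"Value":"dGVzdDM="}
]
""")


def restore_requests(entries, base_url="http://127.0.0.1:8500/v1/kv/"):
    """Yield (url, body) pairs for the PUT requests that restore a dump."""
    for entry in entries:
        # Values are Base64-encoded in the dump; a null value means empty.
        raw = entry["Value"]
        body = base64.b64decode(raw) if raw is not None else b""
        url = f"{base_url}{quote(entry['Key'])}?flags={entry['Flags']}"
        yield url, body


for url, body in restore_requests(BACKUP):
    print("PUT", url, body)
```

The index fields (CreateIndex, ModifyIndex, LockIndex) are assigned by the cluster and are not restored, which matches the slide's note that only Key, Flags and Value are PUT back.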
Logs from an actual leader change
2016/07/30 10:07:28 [WARN] raft: Heartbeat timeout reached, starting election
2016/07/30 10:07:28 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
2016/07/30 10:07:30 [WARN] raft: Election timeout reached, restarting election
2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
2016/07/30 10:07:30 [INFO] raft: Election won. Tally: 3
2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Leader] entering Leader state
2016/07/30 10:07:30 [INFO] consul: cluster leadership acquired
2016/07/30 10:07:30 [INFO] consul: New leader elected: xxx-consul-i-ff26ca5a
Recovered in about 2 seconds
No service impact thanks to the DNS cache and stale mode
HTTP API calls without an explicit stale return 500 → so is hammering the API all the time really a good idea…?
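The "about 2 seconds" figure can be read straight off the timestamps. The Python sketch below measures the gap between the first heartbeat timeout and the "New leader elected" line in the log excerpt above; the function name `election_seconds` and the choice of those two messages as start and end markers are assumptions for illustration.

```python
import re
from datetime import datetime

LOG = """\
2016/07/30 10:07:28 [WARN] raft: Heartbeat timeout reached, starting election
2016/07/30 10:07:28 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
2016/07/30 10:07:30 [WARN] raft: Election timeout reached, restarting election
2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
2016/07/30 10:07:30 [INFO] raft: Election won. Tally: 3
2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Leader] entering Leader state
2016/07/30 10:07:30 [INFO] consul: cluster leadership acquired
2016/07/30 10:07:30 [INFO] consul: New leader elected: xxx-consul-i-ff26ca5a
"""

STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def election_seconds(log_text):
    """Seconds from the first heartbeat timeout to the new leader line."""
    start = end = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        if start is None and "Heartbeat timeout reached" in line:
            start = stamp
        if "New leader elected" in line:
            end = stamp
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


print(election_seconds(LOG))
```

The log has one-second resolution, so the result is only accurate to within a second.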
Scary stories that actually happened
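Fetching a large amount of output with consul exec
consul exec "cat /var/log/foo.log" | grep ...
Tried to collect each host's logs through consul exec
→ consul exec stores results in the KV store temporarily, so memory and the DB ballooned
Recovered by restarting the servers one at a time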
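Cluster collapse caused by an operator error
Wanted to upgrade
Took down 1 of the 3 servers and started it with the new binary (or so I thought)
Took down the second server even though the first had not started properly
→ collapse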
What to do when the cluster has collapsed
1. Calm down
2. Stop all servers
3. Delete all data (data_dir)
4. Start the servers with -bootstrap-expect N
• join via start_join or manually
5. (If needed) restore the KV store from a backup