Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
はじめるCassandra
Search
kakerukaeru
September 08, 2015
1
290
はじめるCassandra
Cassandra Meetup in Tokyo, Summer 2015でお話してきた奴
kakerukaeru
September 08, 2015
Tweet
Share
More Decks by kakerukaeru
See All by kakerukaeru
大規模ImageOptimizer利用案件から学ぶ 安心安全のCDN移行 / Fastly yamagoya 2022
kakerukaeru
1
1.2k
事業と歩む Ameba システム刷新の道 / the-road-to-ameba-system-renovation-aws-summit-online
kakerukaeru
0
1.7k
事業と歩むAmebaシステム刷新の道 / the-road-to-ameba-system-renovation-cadc
kakerukaeru
0
530
The Shining / ~all work and no play makes jack a dull boy~
kakerukaeru
0
330
AmebaとCDNのお付き合いの歴史 / ameba cdn waiwai
kakerukaeru
0
100
fastlyでええかんじにサイトリニューアル @ Yamagoya Meetup 2018 / e-kanzi Website renewal with fastly
kakerukaeru
0
500
ghe_ameba_arekore
kakerukaeru
2
2.1k
20160907_Akamai_Tech_Deep_Dive
kakerukaeru
0
2.1k
ansible is nani
kakerukaeru
1
350
Featured
See All Featured
Designing Dashboards & Data Visualisations in Web Apps
destraynor
229
52k
4 Signs Your Business is Dying
shpigford
180
21k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
44
2.2k
For a Future-Friendly Web
brad_frost
175
9.4k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
27
840
Embracing the Ebb and Flow
colly
84
4.5k
Agile that works and the tools we love
rasmusluckow
327
21k
What’s in a name? Adding method to the madness
productmarketing
PRO
22
3.1k
Building a Modern Day E-commerce SEO Strategy
aleyda
38
6.9k
GraphQLの誤解/rethinking-graphql
sonatard
67
10k
5 minutes of I Can Smell Your CMS
philhawksworth
202
19k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
364
24k
Transcript
͡ΊΔCassandra
who are you
͍Θͳ͕͔͚Δ twitter@kakerukaeru ΏͱΓΠϯϑϥԂࣇʹ͋ @CyberAgent Amebaͷೝূɺ՝ۚɺը૾৴ج൫ͷ໘ݟΔϚϯ Cassandraྺɿ̍ऑ... ۓுͯ͠·͢ ΦΤʔʔ!!!!ɹʊʊ_ ɹɹɹ ʊʊ_ʗɹɹ
ʋ ɹɹ ʗɹ ʗɹʗ˶ʋ| ɹɹ/ (ƅ)/ɹʗ / ɹ /ɹɹ ŪŊ/Ň˶ʋɻ
࠷ॳʹ
ࠓճͷࢿྉ
For administrator
ͩ
Α
͍
ͱ͍͏͜ͱͰ ࠓճͷ͓ͷ༰ͱͯ͠ ҎԼͷࣄΛॏతʹ͓͠Α͏͔ͱࢥ͍·͢ • ӡ༻্͍ͯ͘͠ͰɺݟΔ͖ • ఆৗతʹߦ͏͖ΦϖϨʔγϣϯ • ॳΊͯCassandraΛӡ༻ͯ͠Έͯͷॴײ
agenda
agenda • about CyberAgent&Service • why Cassandra • Operation •
Build,Monitoring,Backup etc... • about Troubleshooting • ·ͱΊ
about CyberAgent
blog
pigg
game
Entertainment
Entertainment
about Cassandra in CyberAgent
about Cassandra in CyberAgent • Cassandra Versionɿ1.1.5, 1.2.13 • Production
Clusterɿ3 • Production nodesɿ about 150node • Total about qps Read&Writeɿ 50000qps • Total about data sizeɿ15TB
about Service
use at Cyberagent Smartphone Platform1 1 ϒϥβͷPlatformͳ
ͳͷ
Ͱ͢
͕
ࠓճ͓͢Δͷɺ
طଘͰͳ͘ ৽͘͠࡞ͬͨClusterͷ͓
Ͱ͢
For Native App iOS & Android auth payment logging
For Native App • ωΟςΟϒΞϓϦ༻ج൫ • ੜ·Εͯ̍ऑͷج൫ • ೝূɺ՝ۚɺloggingͷapiΛఏڙ •
Cassandraͷ༻ओʹidཧͷ෦ • idʹඥ͚ͯɺ՝ۚˍloggingͳͲͷbackendͷSystemʹܨ ͍͛ͯΔ
why Cassandra
why Cassandra • ୲ʹͳͬͨΒطʹ͋ͬͨ • SPOF͕ͳ͍ • ٸܹͳσʔλ૿ʹ͑ΒΕΔscalability • ϊʔυՃʹΑΔɺεέʔϧΞτ
• ฐࣾSmartphone PlatformͰͷӡ༻࣮
about System
Cassandra setting • Versionɿ2.0.8 • Replication Factorɿ3 • Consistency LevelɿQUORUM
• use vnodeɿ256 • use CQL,nodejs༻ಠࣗυϥΠό࣮ • https://github.com/suguru/cql-client
Request • Peak Request • Readɿabout 9,000 qps • Writeɿabout
3,000 qps • Data size • Totalɿ600GB • 1node avgɿ50GB
Latency • Readɿavg 2ms • Writeɿavg 0.1ms
HW Spec • private Cloud Instance • CPUɿ24core • Memoryɿ94GBɺheap
8GB • Diskʀ1TB • 12node • 1 Cluster
HW Spec • ૬ʹ᩵ͳαʔό • ج൫ͱͯ͜͠Ε͔ΒσΧ͘ͳΔ͜ͱΛݟӽͯ͠ͷαʔό • Resourceతʹ·ͩ·ͩ༨༟͕͋Δ • nodeݮΒͯ͠େৎͦ
• privateCloudͷInstance typeͷϥΠϯφοϓʹΑΓɺ͜ͷ Specʹͳͬͨ
about Operation
Cassandraશܠ
None
Build
Build • Cassandraαʔόͷߏங • Jenkins & ansible • ख࡞ۀCluster join࣌ͷCassandra
ϓϩηεͷىಈͷΈ • vnode(Cassandra ver1.2~)Λ༻ ͍ͯ͠ΔͨΊɺख࡞ۀͰͷtokenͷ ܭࢉˍׂΓ͕ͯඞཁͳ͘୯७ͳ ىಈͰOK
Monitoring
Monitoring • threshold • use sensu • how to check
• Community&Original sensu plugin • how to notify • mail & hipchat
Monitoring • trend • use OpsCenter • data size&latency •
use sensu & influxdb & grafana • how to check • Community&Original sensu plugin
Monitoring • check • OS Resource • cpu,memory,disk&nw latency,fd •
JVM • heap,gc
Monitoring • check • Cassandra • read&write_qps,latency • thread pool
• ReadStage • FlushWriter • Compaction • HintedHandoff...
Monitoring Ͳ͏ͬͯCassandraͷಈΛ͏ͷʁʊʁ • CassandraStatusread&writeͷಈΛ͏ • Write&ReadStageɺMutationStageɺFlushWrite • Compaction_status • Cassandra_Clusterͷhealth֬ೝ͍ͨ͠
• Gossiptimeowtɺhintedhandoff
Operation
Operation • repair & cleanup • about 20h / weekly
• backup & restore • snapshot & sstableloader • restore CI • ?? h / weekly
Operation • repair & cleanup • ϨϓϦΧͷෆ߹Λ͙ͨΊʹఆظతͳrepairΛ࣮ߦ • σʔλͷ෮׆Λ͙ͨΊʹಉ࣌ʹcleanup࣮ߦ •
࣮ߦपظ 7days ʻ gc grace seconds(default:10days)2 2 gc grace secondsɿTombstoneͷGarbageCollection࣮ߦ·Ͱͷ࣌ؒ
Operation • backup • 2hຖʹ֤nodeͰsnapshotΛ࡞͠Swiftʹอଘɻ • restore • test-clusterʹͯఆظతʹrestore͕ग़དྷ͍ͯΔ͔֬ೝ •
sstableloaderΛ͍ۭClusterʹdataΛྲྀ͠ࠐ͜Ή3 3 ͨͩsstableloaderΊͬͪΌ͔͔࣌ؒΔ͔Βɺ࣮ࡍͷrestore࡞ۀsnapshotஔ͖ͷ෮چʹͳΔ͔
about Troubleshooting
Կ͔͋ͬͨ࣌ʹΑ͘͏nodetool • nodetool status • nodeͷঢ়ଶΛͬ͞͞ͱݟ͍ͨ • nodetool tpstats •
࣮ߦதͷthreadͷࢹ • nodetool netstats • streamͷใΛݟΔ
Կ͔͋ͬͨ࣌ʹΑ͘͏nodetool • nodetool cfstats • cfຖʹใΛݟ͍ͨ4 • nodetool disablegossip,disablethrift,disablebinary,flush •
disable* : ֤protocolແޮԽ • flushɿmemtable͔Βflushͤ͞Δ 4 CassandraશମͰSlowdownͯ͠Δͷ͔ɺಛఆcfͰ٧·ͬͯΔͷ͔֬ೝ͍ͨ͠ΑͶ
Կ͔͋ͬͨ࣌ʹΑ͘͏nodetool ͳͷͰ༧ఆ͍ͯͨ͠nodeͷ࠶ىಈͳͲԼهΛͬͨΓ͢Δ $ nodetool disablegossip && \ nodetool disablethrift &&
\ nodetool disablebinary && \ nodetool flush && \ /etc/init.d/cassandra restart
͔࣮͠͠ࡍʹ ಥൃతʹnodeʹԿ͔͕ൃੜͨ͠߹ɺ nodetoolͷ݁Ռ͕ฦͬͯ͜ͳ͍ࣄ͕΄ͱΜͲ ͦͷ߹Ͳ͏͢Δ͔
ఘΊͯ࠶ىಈ5 /etc/init.d/cassandra restart 5 ༻๏༻ྔΛकͬͯਖ਼͓͍͍ͩ͘͘͠͞ɻͪΌΜͱlogɺmetricsΛΈͯஅͯ͠·͢Αɺɺɺ
ͦͷଞɺࠔͬͨ͜ͱ
NWো6 6 ͏طʹා͍Ͱ͢Ͷ
۩ମతʹʁ
None
͓͖ͨ͜ͱɺରԠͨ͜͠ͱ • ॠஅ͕ଓ͖L2ϨϕϧͰͷશͳΔஅʹͳΔ • Clusterతʹશnode͕ಠཱͨ͠ঢ়ଶʹɻ • max hint window ms
(default:3h)Λ͑ͨ(!!)ͷͰhint7ͷใ શͯഁغ͞ΕϦηοτ͞ΕΔܗʹɻ • NW෮چޙʹશnodeͰrepairΛ͔͚ͯɺClusterͷ෮چʹɻ 7 hint:ଞnode͕μϯͨ͠ࡍʹॻ͖ࠐ·ΕΔͣͩͬͨσʔλΛଞͷϨϓϦΧ͕อ࣋͢Δ max_hint_window_ms:↑ ͷhintΛഁغ͢Δ·Ͱͷ࣌ؒ
;Γ͔͑Δͱ • σʔλϩετແ͠ • ॠஅ͕ଓ͘ܗͰhintΛอ࣋ͯ͠ΔݶΓࣗಈతʹϨϓϦΧͷ ߹ੑΛ͑Δ͜ͱ͕Մೳ • hint͕ͳͯ͘node͑͞௵Εͳ͚ΕClusterͷ෮چ͕Մೳ8 • NWஅʹ͑ΒΕͨ
8 hintͳ͍nodeμϯͱͳΔͱɺٹ͑ͳ͍σʔλग़ͯ͘ΔɻͨΓલ͔
ͦͷଞઌਓͷݟ
ͦͷଞઌਓͷݟ9 • slow queryΛݟΔ͜ͱ͕ग़དྷͳ͍ͷͰɺࠔΔલʹΞϓϦଆʹ slow logΛ࣮͢Δ • εΩʔϚઃܭେࣄ • wide
rowΛආ͚Δɻࣄલׂग़དྷΔͳΒͪΌΜͱ͠Α͏ • CassandraʹݶͬͨͰͳ͍͚ΕͲ... 9 ઌਓͷي http://www.slideshare.net/oranie/cassandra-summit-jpn-2014-100-node-cluster-admin-operation
·ͱΊ • ࠷ݶͷࣄΛ͓͚͑ͯӡ༻ָ • CassandraɺŠƂŞūŘż • ઌਓͷݟΛ͋Γ͕͍ͨͨͩ͘͜͏ • ͦͯࣗͨͪ͠ੵͯ͠ڞ༗͠Α͏ɻ •
Cassandra CommunityʹߩݙతͳͶ • 1.xxܥͱ͓ผΕΛ͠Α͏ɻ͍ͨ͠ʢന
͜Ε͔Βͷ͜ͱ • ͜Ε͔ΒͷClusterઃܭ • ͦͷ··Ծʁཧʁʊʁ • PITRʹ͍ۙ͜͠ͱɺ͍ͨ͠ɺɺɺ • σʔληϯλʔػೳΛͬͯɺBackupઐ༻ͷCluster࡞ •
Backup͚࣌ͩɺσʔληϯλʔؒͷϨϓϦΛࢭΊͯɺॻ͖ ࠐΈͷͳ͍ঢ়ଶͰBackupΛऔΔͱ͍͏ໝΛ͍ͯ͠Δɻ
͝ਗ਼ௌ͋Γ͕ͱ͏͍͟͝·ͨ͠ ͳʹ͔͋Ε࠙ձͷ࣌ʹੋඇʂʢʈТʈʣ