Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
LINE ShopチームでのSREの取り組み / SRE in LINE Shop team
Search
LINE Developers
PRO
November 07, 2020
Programming
6
3.5k
LINE ShopチームでのSREの取り組み / SRE in LINE Shop team
2020/11/7に行われたJJUG CCC 2020 Fallでのスポンサーセッションの登壇資料です。
https://ccc2020fall.java-users.jp/
LINE Developers
PRO
November 07, 2020
Tweet
Share
More Decks by LINE Developers
See All by LINE Developers
LINEスタンプのSREing事例集:大きなスパイクアクセスを捌くためのSREing
line_developers
PRO
1
1.9k
Java 21 Overview
line_developers
PRO
6
950
Code Review Challenge: An example of a solution
line_developers
PRO
1
1k
KARTEのAPIサーバ化
line_developers
PRO
1
420
著作権とは何か?〜初歩的概念から権利利用法、侵害要件まで
line_developers
PRO
5
1.9k
生成AIと著作権 〜生成AIによって生じる著作権関連の課題と対処
line_developers
PRO
3
1.9k
マイクロサービスにおけるBFFアーキテクチャでのモジュラモノリスの導入
line_developers
PRO
9
2.9k
A/B Testing at LINE NEWS
line_developers
PRO
2
770
LINEのサポートバージョンの考え方
line_developers
PRO
2
990
Other Decks in Programming
See All in Programming
2024 컴포즈 정원사
jisungbin
0
150
Securify_エンジニア採用資料
3shake
0
100
事業フェーズの変化に対応する 開発生産性向上のゼロイチ
masaygggg
0
200
『ドメイン駆動設計をはじめよう』中核の業務領域
masuda220
PRO
5
1k
connect-go で面倒くささと戦う / 2024-08-27 #newmo_layerx_go
izumin5210
2
650
React + TextAliveでカッコいいLyric Applicatioinを作ろう!!
tosuri13
0
400
Prompt Cachingは本当に効果的なのか検証してみた.pdf
ttnyt8701
0
530
What is Parser
yui_knk
9
4.1k
GenU導入でCDKに初挑戦し、悪戦苦闘した話
hideg
0
170
rbs-inlineを導入してYARDからRBSに移行する
euglena1215
1
290
Architecture Decision Record (ADR)
nearme_tech
PRO
1
690
LangGraphでのHuman-in-the-Loopの実装
os1ma
3
1.1k
Featured
See All Featured
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
0
120
Build The Right Thing And Hit Your Dates
maggiecrowley
30
2.3k
Art, The Web, and Tiny UX
lynnandtonic
294
20k
Music & Morning Musume
bryan
46
6k
Docker and Python
trallard
39
3k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
326
21k
Principles of Awesome APIs and How to Build Them.
keavy
125
16k
The Brand Is Dead. Long Live the Brand.
mthomps
53
38k
Embracing the Ebb and Flow
colly
83
4.4k
Java REST API Framework Comparison - PWX 2021
mraible
PRO
27
7.4k
How GitHub (no longer) Works
holman
310
140k
The Invisible Customer
myddelton
119
13k
Transcript
LINE ShopνʔϜͰͷSREͷऔΓΈ 2020/11/07 JJUG CCC 2020 Fall ( https://ccc2020fall.java-users.jp )
LINE Fukuokaגࣜձࣾ ։ൃ1ࣨ দ࡚ ֶ
ࣗݾհ @matsumana LINE Fukuokaגࣜձࣾ ։ൃ1ࣨ SRE/Server Side Engineer https://github.com/matsumana Manabu
Matsuzaki
• LINE ShopαʔϏεհ • LINE ShopαʔϏεΞʔΩςΫνϟհ • LINE ShopνʔϜͰͷSREͷऔΓΈ Agenda
LINE ShopαʔϏεհ
LINE Shopͱʁ •LINEͷίϯςϯπൢചϓϥοτϑΥʔϜʢLINEͷελϯϓɾ ֆจࣈɾண͔ͤ͑ػೳʣͷࣾͰͷ௨শ • LINEΞϓϦͷελϯϓγϣοϓɺண͔ͤ͑γϣοϓ • WebͷLINE STORE (https://store.line.me/)
ΧελϜελϯϓ • ελϯϓͷςΩετͷҰ෦ΛมߋՄೳͳελϯϓ • https://linecorp.com/ja/pr/news/ja/2019/2664
ϝοηʔδελϯϓ • ΧελϜελϯϓΑΓจΛೖྗՄೳͳελϯϓ • https://linecorp.com/ja/pr/news/ja/2020/3127
LINEελϯϓ ϓϨϛΞϜ • ΫϦΤΠλʔζελϯϓͷαϒεΫϦϓγϣϯ • https://store.line.me/stickers-premium/landing/ja • http://creator-mag.line.me/ja/archives/1075007192.html
αʔϏεن • ελϯϓʹؔ͢Δࣈ *1 • ൢചதͷελϯϓ: 855ສηοτ (20203݄࣌) • 1͋ͨΓͷελϯϓૹ৴:
ฏۉ4ԯ3,300ສճ (2019݄̐࣌) •RPS(requests/sec) *2 • ීஈͷϐʔΫ: ~ 80K RPS (2020/10࣌) • ࢝ͷϐʔΫ: ~ 120K RPS (2020/01࣌) *1 https://linecorp.com/ja/pr/news/ja/2020/3127 *2 https://logmi.jp/tech/articles/322924
LINEελϯϓͷLINEެࣜΞΧϯτͷ ༑ͩͪͷਪҠ • 2018/12: 39,000,000 • 2019/12: 55,000,000 • 2020/10:
63,000,000
LINE ShopαʔϏεΞʔΩςΫνϟհ
LINEʹ͓͚ΔϚΠΫϩαʔϏε •LINE ShopϚΠΫϩαʔϏεͰߏங͞Ε͍ͯΔ •LINEϝοηʔδϯάϓϥοτϑΥʔϜશମ͔ΒݟΔͱɺ LINE Shopࣗମ̍ͭͷϚΠΫϩαʔϏε *1 *1: LINEͷϝοηʔδϯάϓϥοτϑΥʔϜʹ͓͚ΔϚΠΫϩαʔϏεԽͷ͍ಓͷΓ https://linedevday.linecorp.com/jp/2019/sessions/D1-6
LINE Shopʹ͓͚ΔϚΠΫϩαʔϏεʢҰ෦ʣ
ϑϨʔϜϫʔΫɾϞχλϦϯά
Armeria ػೳ֓ཁ •Asynchronous and reactive (like Spring WebFlux) •HTTP/2 •REST
API͚ͩͰͳ͘ɺgRPCͱThriftαϙʔτ •Client side load balancing • https://armeria.dev/docs/client-service-discovery •ϚΠΫϩαʔϏεͰඞཁͳػೳΛఏڙ • Circuit breaker, Service discovery(DNS etc),Distributed tracing(Zipkin integration), etc
Armeria ػೳ֓ཁ •SwaggerͷΑ͏ͳdebug console •Integration • Spring Boot integration •
طଘͷJava webΞϓϦʹΈࠐΜͰ͏ࣄ͕ग़དྷΔ •etc
Armeria ࢀߟࢿྉ •Official site: https://armeria.dev •GitHub repo: https://github.com/line/armeria •LINE DEVELOPER
DAY 2019 ʮArmeriaɿͲ͜ͰཱͭϚΠΫϩαʔϏεϑϨʔϜϫʔΫʯ • https://linedevday.linecorp.com/jp/2019/sessions/D2-2 • https://youtu.be/lii7oNzAOx0 • https://speakerdeck.com/line_devday2019/armeria-a- microservice-framework-well-suited-everywhere
Armeria ࢀߟࢿྉ •JSUGษڧձͰʮSpring BootϢʔβͷͨΊͷArmeriaೖʯͱ ͍͏λΠτϧͰLT͠·ͨ͠ • https://matsumana.info/blog/2020/07/30/introduce-to-armeria- for-spring-users/
LINE ShopͷϚΠΫϩαʔϏεͱ։ൃνʔϜ •౦ژͱԬ߹ΘͤͯνʔϜϝϯόʔ25ਓʢαʔόαΠυ+SREʣ •ϓϩδΣΫτ୯ҐͰνʔϜ͕࡞ΒΕɺෳͷϚΠΫϩαʔϏεΛඞཁʹ Ԡͯ͡ػೳՃɾमਖ਼͍ͯ͠Δ
ϚΠΫϩαʔϏεʹ͓͍ͯݕ౼͕ඞཁͳࣄ •Distributed Tracing •Cascading FailureΛ͙ͨΊͷCircuit Breaker •Graceful DegradationΛߟྀͨ͠αʔϏεׂ •Service Discovery
https://employment.en-japan.com/engineerhub/entry/2018/10/09/110000
Distributed Tracing • APIͷݺͼग़͠ͱɺͦΕʹ͔͔ͬͨ࣌ؒΛՄࢹԽ͢Δ • ϨΠςϯγʹ͕͋Δ߹ͷϘτϧωοΫௐࠪ • LINE ShopͰɺZipkinΛ༻
Circuit Breaker https://armeria.dev/docs/client-circuit-breaker • ݺͼग़͠ઌʹো͕ൃੜͨ͠߹ɺͦΕ͕ղফ͞ΕΔ·Ͱ௨৴ ΛߦΘͳ͍Α͏ʹ͢Δ
Circuit Breaker
Circuit Breaker ※Cascading Failure͕ൃੜ
Circuit Breaker
Circuit Breaker (ArmeriaͷFailFastException)
Graceful Degradation •ো͕ൃੜͨ͠߹ʹɺϨεϙϯεͷҰ෦ͷ࣭ΛԼʢe.g. Ωϟο γϡ͞Εͨݹ͍σʔλΛ͏ʣͤ͞Δ͜ͱͰɺຊʹେࣄͳ෦ΛकΔ •ϚΠΫϩαʔϏεͷ߹ҎԼΛҙࣝͯ͠αʔϏεΛׂ • αʔϏεʹো͕ൃੜͨ͠߹Ͱܧଓ͍ͨ͠ػೳԿ͔ʁ • Ұ෦ͷػೳϨεϙϯεͷ࣭ΛԼͤͯ͞ܧଓ͕Մೳ͔ʁ
ࢀߟࢿྉ: SRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά P281. 22.2.2 ϩʔυγΣσΟϯάͱάϨʔεϑϧσάϥϨʔγϣϯ
Service DiscoveryʢLINE Shopͷ߹ʣ
Central Dogma ػೳ֓ཁ •ઃఆϦϙδτϦαʔϏε •watch͓͚ͯ͠มߋ௨Λड͚औΕΔ •όοΫΤϯυʹGitΛ༻ •ΫϥΠΞϯτϥΠϒϥϦ (Java, Go) •etc
Central Dogma Ϣʔεέʔε •ಈతʹઃఆ͍ͨ͠ͷΛCentral DogmaͰཧ • e.g. Service discovery, Rate
limitઃఆ, A/Bςετ, etc
Central Dogma ࢀߟࢿྉ •Official site: https://line.github.io/centraldogma/ •GitHub repo: https://github.com/line/centraldogma/ •LINE
DEVELOPER DAY 2017 Central DogmaɿLINE ͷ GitΛϕʔεʹͨ͠ߴՄ༻ੑαʔϏεߏϨϙ δτϦ • https://www.slideshare.net/linecorp/central-dogma-lines-gitbacked- highlyavailable-service-configuration-repository • https://www.youtube.com/watch?v=BmgizIFwMq4
LINE ShopνʔϜͰͷSREͷऔΓΈ
SLI
SLI • SLI (Service Level Indicator) • API Availability (ϦΫΤετޭ:
ޭ/τʔλϧϦΫΤετ) • ϨΠςϯγ • etc • SLO (Service Level Objective) • SLIΛϕʔεʹͨ͠αʔϏεͷ৴པੑͷඪ • SLO 100%ؒҧͬͨඪ • ػೳվળɺ৽ػೳՃɺϝϯςφϯε͕ߦ͑ͳ͘ͳΔ • ͍ͬͯΔϓϥοτϑΥʔϜͷSLA͕100%Ͱͳ͍߹͋Δ
SLI • LINE ShopͰAPI availability(ޭ), API latencyΛSLIͱͯ͠༻ • ʮThe Site
Reliability Workbookʢຊޠ൛ɿαΠτϦϥΠΞϏϦ ςΟϫʔΫϒοΫʣʯʹܝࡌ͞Ε͍ͯΔࣄྫΛࢀߟʹͨ͠ • Prometheus+GrafanaͰՄࢹԽ • αʔϏεো͕ൃੜͨ࣌͠ʹϢʔβͷӨڹΛ֬ೝ͍ͯ͠Δ • SREͷϓϥΫςΟεͰɺSLOεςʔΫϗϧμʔͱ ߹ҙ͢Δඞཁ͕͋Δ͕·ͩग़དྷ͍ͯͳ͍ʢࠓޙͷ՝ʣ
SLI Dashboard • ݱࡏͷঢ়ଶΛදࣔ͢ΔμογϡϘʔυ • ͖͍͠ͱഎܠ৭Λઃఆ͍ͯͯ͠ɺ ϝτϦΫεʹΑͬͯഎܠ৭͕มΘΔ
SLI Dashboard
ʮαʔϏεͷ৴པੑͷ֊ʯʹ͍ͭͯ
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
ϞχλϦϯά - Alerting • ϢʔβʹతͳӨڹ͕͋ΔAlertͳͷ͔Ͳ͏͔ʹج͍ͮͯ AlertϨϕϧͱ௨͢ΔChannelΛ͚͍ͯΔ
ϞχλϦϯά - Alerting • ErrorʢϢʔβʹతͳӨڹ͋Γʣ • LatencyͷѱԽ • Error response/secͷ૿Ճ
• etc • WarnʢͷݪҼͱͳΔͷ or αʔϏεӨڹ͕͍ʣ • CPU usage • JVM GC • ΞϓϦέʔγϣϯαʔό͕མͪͨʢͳΒαʔϏεӨڹແ͍ʣ • etc
ϞχλϦϯά - ऩू͍ͯ͠ΔϝτϦΫε • API͝ͱͷϝτϦΫε • Server/Client latency (50th, 90th,
99th percentile, etc) • Requests/sec • Error responses/sec • ϩάͷྔ (Warn, Error) • JVM (GC, Heap, etc) • DB client metrics (HikariCP, etc) • Server load (CPU, Memory, Network Traffic, etc) • etc…
Armeria͕export͢ΔϝτϦΫεͷྫʢҰ෦ʣ • Server/Client latency (50th, 90th, 99th percentile, etc) •
Requests/sec • Error response/sec • Circuit breaker(CLOSED, OPEN, HALF_OPEN, etc)
Armeria͕export͢ΔϝτϦΫεͷྫʢҰ෦ʣ • Request/Response size • ݺͼग़͠ଆͷͰRequest size͕૿͑ͯαʔόͷෛՙ্͕͕ ΔՄೳੑΛϞχλϦϯά • αʔόଆͷͰෆਖ਼ͳʢۃʹখ͞ͳʣϨεϙϯεΛฦͯ͠
͠·͏ՄೳੑΛϞχλϦϯά • Armeria client͕DNS໊લղܾʹ͔͔ͬͨ࣌ؒ • DNS͕ݪҼͰ໊લղܾʹ͕͔͔࣌ؒΓϨΠςϯγ͕ѱԽͯ͠͠ ·͏ՄೳੑΛϞχλϦϯά
ΞϓϦέʔγϣϯݻ༗ͷϝτϦΫεͷྫ • όʔδϣϯʢ͍ͭɺͲͷόʔδϣϯ͕σϓϩΠ͞Εͨͷ͔ʁʣ • ΞϓϦέʔγϣϯ • ϑϨʔϜϫʔΫ • Armeria •
Spring Boot • JVM • etc…
ΞϓϦέʔγϣϯݻ༗ͷϝτϦΫεͷྫ https://github.com/line/armeria/blob/armeria-1.1.0/core/src/main/java/com/linecorp/armeria/server/Server.java#L375
ΞϓϦέʔγϣϯݻ༗ͷϝτϦΫεͷྫ Metrics: armeria_build_info{version=“1.1.0", …} 1.0 PromQL: sum(armeria_build_info{project=~”$project”, …}) by (project,
version)
ϞχλϦϯά - Batch job Metrics: shop_batch_successful_time_seconds{job=“foo”, period="10min"} 1601313019 ※”1601313019”ͷ෦job͕ਖ਼ৗʹྃͨ࣌͠ͷUNIX time
Alert rule: time() - shop_batch_successful_time_seconds{period="10min"} > 60 * 10 * 3 ※”3”Ұ࣌తͳΤϥʔͰΞϥʔτΛ্͛ͳ͍ͨΊͷόοϑΝɻ͓ΈͰɻ ※͜ͷྫͰɺperiod=“10min”ϥϕϧΛ͚ͨbatch job͕લճऴྃҎ߱ɺ30Ҏʹਖ਼ৗऴ ྃ͠ͳ͚ΕΞϥʔτ্͕͕Δ https://www.robustperception.io/monitoring-batch-jobs-in-python
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
on-call • on-call୲Λຖि2ਓͰ࣋ͪճΓ • νʔϜͰ࡞ͨ͠ΨΠυΛݩʹΞϥʔτͷରԠΛߦ͏ • ൃੜͨ͠ΞϥʔτΛνΣοΫ͠ɺϝϯόʔʹΤεΧϨʔγϣϯ • on-call୲Ҏ֎ͷϝϯόʔΞϥʔτରԠʹࢀՃ •
֤छμϯϓɾϩάͳͲΛऔಘ • JVM Thread dump • JVM Heap dump • JFRϩά • etc
on-call • αʔϏεӨڹ͕͋Δ߹ؔ͢ΔChannelʹΤεΧϨʔγϣϯ • ൃੜͨ͠issueͷνέοτొ • αʔϏεো͕ൃੜͨ͠߹ϨϙʔτΛ࡞ • ͍ΘΏΔSREϓϥΫςΟεʹ͓͚ΔϙετϞʔςϜ •
etc
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
ϙετϞʔςϜ • ϙετϞʔςϜͱɺαʔϏεো͕ൃੜͨ͠߹ʹॻ͘ ϨϙʔτɺͦͷऔΓΈͷࣄ • LINEͰҎલ͔ΒϙετϞʔςϜͷจԽ͕͋Δ • LINE Shopͷ߹ɺϙετϞʔςϜΛॻ͍ͯؔνʔϜͱ ϛʔςΟϯάΛ։࠵͍ͯ͠Δ
ϙετϞʔςϜ • ϙετϞʔςϜʹ·ͱΊΔ߲ • Өڹൣғ • োͷݪҼ • ঢ়گͷ࣌ܥྻ·ͱΊ •
࠶ൃࢭࡦͷݕ౼ • োݕʹ͕ͳ͔͔ͬͨʁͲ͏վળ͢Δ͔ʁ • োͷϋϯυϦϯάʹ͕ͳ͔͔ͬͨʁͲ͏վળ͢Δ͔ʁ
ʮSRE αΠτϦϥΠΞϏϦςΟ ΤϯδχΞϦϯά - ୈᶙ෦ ࣮ફʯΑΓ αʔϏεͷ৴པੑͷ֊ • 7. ϓϩμΫτ
• 6. ։ൃ • 5. ΩϟύγςΟϓϥϯχϯά • 4. ςετٴͼϦϦʔεखॱ • 3. ϙετϞʔςϜ/ࠜຊݪҼੳ • 2. ΠϯγσϯτରԠ • 1. ϞχλϦϯά
ΩϟύγςΟϓϥϯχϯά • ݩ୴ͷ0:00ޙ࠷τϥϑΟοΫ͕૿͑ΔΠϕϯτͷ̍ͭ • ຖࣄલ४උΛߦ͍ͬͯΔ • ڈͷݩ୴ɺͦͷଞͷΠϕϯτͷϝτϦΫεΛݩʹαʔόͷ εέʔϧΞοϓεέʔϧΞτΛߦ͏ • LINE
Developer MeetupͰൃදͨ࣌͠ͷॻ͖ى͕͋͜͠Γ·͢ • https://logmi.jp/tech/articles/322924
ΩϟύγςΟϓϥϯχϯά • ػೳՃɺLINEελϯϓͷLINEެࣜΞΧϯτͷ༑ͩͪͷ ૿ՃʹΑͬͯɺීஈͷϦΫΤετ૿͑ଓ͚͍ͯΔ • ݩ୴Ҏ֎ͷλΠϛϯάͰඞཁʹԠͯ͡ਵ࣌εέʔϧΞτ
ͦͷଞͷτϐοΫ
ͦͷଞͷτϐοΫ • ಥൃతͳաෛՙͷରॲ • k8sΛ͏࣌ʹݕ౼ɾ४උͨ͠ࣄ
ಥൃతͳաෛՙͷରॲ • ಥൃతʹൃੜͨ͠աෛՙ͕ݪҼͰαʔϏεো͕ൃੜ͢ΔՄೳੑ • ߟ͑ΒΕΔ͍͔ͭ͘ͷཁҼ • TVͰऔΓ্͛ΒΕͯಥൃతͳεύΠΫ͕ൃੜ • ෆ۩߹ΛؚΉόʔδϣϯͷσϓϩΠʹΑΔϦΫΤετ૿Ճ •
etc • Ͳ͏ରॲ͢Δ͔ʁ • Rate limit ʢҰఆͷϦΫΤετΛrejectͯ͠աෛՙঢ়ଶʹͳΔࣄΛ͙ʣ • αʔόͷεέʔϧΞτ
LINE ShopͰͷRate limit • Rate limitͷઃఆCentral DogmaͰཧ • Rate limitॲཧArmeriaͷThrottlingServiceΛͬͯΞϓϦʹ࣮
https://armeria.dev/docs/advanced-production-checklist/
k8sΛ͏࣌ʹݕ౼ɾ४උͨ͠ࣄ
ͳͥLINE ShopͰk8sΛ͍͍ͨͷ͔ʁ • ΄ͱΜͲͷϚΠΫϩαʔϏεVerdaʢPrivate CloudʣͷVMͱ ཧαʔόΛ༻த • AnsibleͰϓϩϏδϣχϯά • εέʔϧΞτʹ͕͔͔࣌ؒΔ
• ಥൃతͳϦΫΤετ૿Ճ࣌ʹૉૣ͘εέʔϧΞτ͍ͨ͠ • VerdaͰk8sͷαʔϏεఏڙ͞Ε͍ͯΔ
ݕ౼ɾ४උͨ͠ࣄʢϞχλϦϯάʣ • k8sڥʹ͓͚ΔPrometheusϞχλϦϯάͷҰൠతͳύλʔϯ → ΫϥελʹPrometheusΛσϓϩΠ͢Δ
ݕ౼ɾ४උͨ͠ࣄʢϞχλϦϯάʣ • PrometheusαʔόΛࣗୡͰӡ༻͢ΔࣄΛආ͚͍ͨ • LINE ShopαʔϏεશମͰऩू͞Ε͍ͯΔϝτϦΫεͷ૯ (scrape_samples_scraped)6,000,000Ҏ্ • PrometheusαʔόΛ҆ఆՔಇͤ͞ΔͨΊʹPrometheusͷ ݟɾϊϋ͕ඞཁ
• ࠓͬͯΔPrometheus/Alertmanager/alert rule/Grafanaμο γϡϘʔυΛk8sڥͰͦͷ··͍͍ͨ
ݕ౼ɾ४උͨ͠ࣄʢϞχλϦϯάʣ • k8s API serverܦ༝ͰΫϥελ֎ͷPrometheusαʔό͔ΒϝτϦ ΫεΛऩू͢ΔࣄՄೳ
ݕ౼ɾ४උͨ͠ࣄʢϞχλϦϯάʣ • σϓϩΠ͢ΔPodͷϝτϦΫεΛk8s APIαʔόܦ༝Ͱऩू͢Δͷ ආ͚͍ͨ • Pod͕૿͑ͨ߹Ͱɺk8s APIαʔόʹͰ͖Δ͚ͩෛՙΛ͔ ͚ͨ͘ͳ͍
ݕ౼ɾ४උͨ͠ࣄʢϞχλϦϯάʣ • Reverse ProxyΞϓϦΛ։ൃͯ͠ɺk8sΫϥελʹσϓϩΠ • PodͷϝτϦΫεऩूReverse ProxyΞϓϦܦ༝Ͱߦ͏
·ͱΊ • LINE ShopαʔϏεΛ͝հ͠·ͨ͠ • LINE ShopαʔϏεΞʔΩςΫνϟΛ͝հ͠·ͨ͠ • ArmeriaͱCentral Dogma
• ϚΠΫϩαʔϏεʹ͓͍ͯݕ౼͕ඞཁͳࣄ • Distributed Tracing • Cascading FailureΛ͙ͨΊͷCircuit Breaker • Graceful DegradationΛߟྀͨ͠αʔϏεׂ • Service Discovery
·ͱΊ • LINE ShopνʔϜͰͷSREͷऔΓΈΛ͝հ͠·ͨ͠ • αʔϏεͷ৴པੑͷ֊ • ϞχλϦϯά • ΠϯγσϯτରԠ
• ϙετϞʔςϜ • ΩϟύγςΟϓϥϯχϯά
LINE DEVELOPER DAY 2020ͷ͝Ҋ • ެࣜαΠτ: https://linedevday.linecorp.com/2020/ja • ఔ: 2020/11/25~27ʢΦϯϥΠϯ։࠵ʣ
• ߹ܭ150Ҏ্ͷηογϣϯ • શηογϣϯӳ௨༁ରԠ
We are hiring • LINE Fukuokaגࣜձࣾ • αʔόʔαΠυΤϯδχΞ https://linefukuoka.co.jp/ja/career/list/engineer/ development_engineer_server-side
• LINEגࣜձࣾ • γχΞαʔόʔαΠυΤϯδχΞ/ίϯςϯπൢചϓϥοτϑΥʔϜ https://linecorp.com/ja/career/position/665 • Site Reliability Engineer/ίϯςϯπൢചϓϥοτϑΥʔϜ https://linecorp.com/ja/career/position/1535
͝ਗ਼ௌ͋Γ͕ͱ͏͍͟͝·ͨ͠ʂ