IVRy Engineer Year-End LT Meetup 2024: The Front Line of LLM Monitoring
Hiroyuki Moriya
December 11, 2024
Transcript
The Front Line of LLM Monitoring. IVRy Engineer LT Meetup, 2024/12/11. Moriya Hiroyuki
My name is Kudo Shinichi, high school detective.
Having started work as an AI engineer, I witnessed a startup making a lot of money with LLMs.
"A little bit of development and I can make a killing!" I realized, so I founded a company and set about delivering one PoC project after another to clients.
I had not noticed the rate limits creeping up behind me, or the worsening latency.
By the time I noticed, customers were cancelling my product one after another, and the company had gone bankrupt...
Today I will talk about what Kudo Shinichi can do to avoid an ending like this.
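Self-introduction: Moriya Hiroyuki, AI engineer. Joined in 2024/08. Previously worked as an SWE and a machine learning engineer, among other roles. Joined IVRy because it looked like a service where LLMs would become the core.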
AI dialogue using LLMs at IVRy: end users and the LLM exchange messages in real time over WebSocket.
LLM Fallback: We built a fallback mechanism on the premise that multiple LLMs are used. Requests are routed based on API status and Ratelimit data constraints (logical constraints).
"Then you just have to monitor it!"
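The slides do not show how the fallback mechanism is implemented, so the following is only a minimal sketch of the idea of routing on API status and rate limits. The `Provider` fields, `complete_with_fallback`, and `AllProvidersFailed` are hypothetical names made up for illustration; they are not IVRy's code or any library's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Provider:
    """One LLM backend. Every field here is an assumption for illustration."""
    name: str
    call: Callable[[str], str]      # sends a prompt, returns the completion text
    healthy: bool = True            # last known API status
    remaining_requests: int = 1     # requests left in the current rate-limit window


class AllProvidersFailed(Exception):
    """Raised when no provider could serve the request."""


def complete_with_fallback(providers: List[Provider], prompt: str) -> str:
    """Try providers in priority order, skipping unhealthy or rate-limited ones."""
    errors = []
    for p in providers:
        if not p.healthy:
            errors.append(f"{p.name}: unhealthy")
            continue
        if p.remaining_requests <= 0:
            errors.append(f"{p.name}: rate limited")
            continue
        p.remaining_requests -= 1   # consume one request from the local budget
        try:
            return p.call(prompt)
        except Exception as exc:
            p.healthy = False       # mark as down so later requests skip it
            errors.append(f"{p.name}: {exc!r}")
    raise AllProvidersFailed("; ".join(errors))
```

A real system would also need to restore `healthy` and refill `remaining_requests` over time; that part is left out to keep the sketch short.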
Method 1: Datadog LLM observability. A feature specialized for LLM monitoring that Datadog is actively developing. It can capture latency, tokens, prompts, and more.
"Now I can monitor OpenAI's latency!"
Huh? That's odd... Our product implements a fallback mechanism, yet we can only monitor OpenAI's latency.
Method 2: OpenLIT (OpenTelemetry). A tool specialized for LLM monitoring that follows the OpenTelemetry standard. It can monitor a variety of LLMs.
"Now I can monitor the latency of all kinds of models!"
Huh? That's odd... We use all kinds of models, so we need to measure latency per provider, yet we only get the latency of LiteLLM as a whole.
Method 3: Datadog Inferred services. A mechanism built into Datadog that monitors requests going out of the app.
"Now I can monitor every model we use through LiteLLM!"
Huh? That's odd... We don't necessarily use just one model on Gemini or on OpenAI, yet we can't get the latency of each individual model.
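The gap described across the three methods is one of granularity: latency has to be attributable to a provider and to an individual model. As a rough sketch of what that tagging could look like in application code, assuming a hypothetical in-process `LatencyRecorder` (not a Datadog, OpenLIT, or LiteLLM API), each call can be timed under a `(provider, model)` key:

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import median


class LatencyRecorder:
    """Collects call durations keyed by (provider, model). Hypothetical helper."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                   # injectable so tests can fake time
        self._samples = defaultdict(list)     # (provider, model) -> [seconds, ...]

    @contextmanager
    def measure(self, provider: str, model: str):
        """Time the enclosed block, recording even if it raises."""
        start = self._clock()
        try:
            yield
        finally:
            self._samples[(provider, model)].append(self._clock() - start)

    def median_by_model(self):
        """Median latency for each individual (provider, model) pair."""
        return {key: median(vals) for key, vals in self._samples.items()}

    def median_by_provider(self):
        """Median latency per provider, pooling all of its models."""
        grouped = defaultdict(list)
        for (provider, _model), vals in self._samples.items():
            grouped[provider].extend(vals)
        return {p: median(v) for p, v in grouped.items()}
```

Usage would wrap each outgoing request, for example `with recorder.measure("openai", "some-model"): ...`, where the model name is a placeholder. In practice the same two tags would be attached to whatever spans or metrics the monitoring tool already emits.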
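Summary: LLM monitoring is still very much a work in progress, and there is little accumulated know-how! In the process of mastering AI and LLMs and building them into products, we need to blaze the trail ourselves. Let's work on AI monitoring together!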