IVRy Engineer Year-End LT Event 2024: The Front Line of LLM Monitoring
Hiroyuki Moriya
December 11, 2024
Transcript
Confidential. The Front Line of LLM Monitoring. IVRy Engineer LT Event, 2024/12/11. Moriya Hiroyuki
My name is Kudo Shinichi, high school detective.
Having just started work as an AI engineer, I witnessed a startup making a great deal of money with LLMs.
"A little bit of development and I can make a killing!" I realized, so I founded a company and threw myself into delivering PoC projects to clients.
I failed to notice the rate limits and the worsening latency creeping up on me from behind.
Before I knew it, cancellations of my product had piled up and the company had gone bankrupt...
Today I will talk about what Kudo Shinichi can do to avoid an ending like this.
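Self-introduction: Moriya Hiroyuki, AI engineer. Joined IVRy in 2024/08. Previously worked as a SWE and as a machine learning engineer, among other roles. Joined IVRy because it looked like a service where LLMs would become core.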
AI conversation using LLMs at IVRy: end users and the LLM exchange messages in real time over WebSocket.
LLM Fallback: we built a fallback mechanism on the premise that multiple LLMs are used. Requests are routed based on API status, rate limits, and data constraints ([illegible] constraints).
So all that is left is to monitor it!
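The fallback routing described above could be sketched roughly as follows. This is a minimal illustration, not IVRy's implementation: the `Provider` class, its `healthy` and `remaining_requests` fields, and `complete_with_fallback` are hypothetical names, and each provider is represented by a plain callable rather than a real SDK client.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


class ProviderError(Exception):
    """Raised by a provider call on an API error or rate-limit rejection."""


@dataclass
class Provider:
    name: str                    # label only, e.g. "primary"
    call: Callable[[str], str]   # hypothetical: prompt in, completion out
    healthy: bool = True         # last known API status
    remaining_requests: int = 1  # last known rate-limit headroom


def complete_with_fallback(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, completion)."""
    last_error: Optional[Exception] = None
    for p in providers:
        # Skip providers known to be down or out of rate-limit budget.
        if not p.healthy or p.remaining_requests <= 0:
            continue
        try:
            result = p.call(prompt)
            p.remaining_requests -= 1
            return p.name, result
        except ProviderError as exc:
            p.healthy = False    # route around this provider on later requests
            last_error = exc
    raise RuntimeError("all providers unavailable") from last_error
```

Because the router returns the name of the provider that actually answered, a caller can attach that name to whatever latency metric it records. A real system would also need to refresh `healthy` and `remaining_requests` over time, which this sketch leaves out.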
Method 1: Datadog LLM Observability. A feature specialized for LLM monitoring that Datadog is actively developing. It can capture latency, tokens, prompts, and more.
"Now I can monitor OpenAI's latency!"
Huh? That's strange... Our product implements a fallback mechanism, yet we can only monitor OpenAI's latency.
Method 2: OpenLIT (OpenTelemetry). A tool specialized for LLM monitoring that follows the OpenTelemetry standard. It can monitor a variety of LLMs.
"Now I can monitor the latency of all sorts of models!"
Huh? That's strange... We use a variety of models, so we need to measure latency per provider, yet we only get the latency of LiteLLM as a whole.
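One way to see the difference between timing the routing layer as a whole and timing each provider is to put the timer around the individual provider call. The sketch below uses only the Python standard library; `provider_latencies`, `timed`, and `call_with_timing` are hypothetical names, and it does not use the LiteLLM, OpenLIT, or Datadog APIs.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# provider name -> list of observed latencies in seconds
provider_latencies = defaultdict(list)


@contextmanager
def timed(provider):
    """Record the wall-clock duration of the wrapped block under `provider`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the call raises, so failed attempts stay visible.
        provider_latencies[provider].append(time.perf_counter() - start)


def call_with_timing(provider, fn, prompt):
    """Invoke one provider's callable and time only that call."""
    with timed(provider):
        return fn(prompt)
```

If every attempt inside a fallback chain goes through `call_with_timing`, a slow first provider and a fast second one show up as two separate samples instead of one combined duration.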
Method 3: Datadog Inferred services. A mechanism built into Datadog that monitors requests going outside the app.
"Now I can monitor every model we use through LiteLLM!"
Huh? That's strange... We do not necessarily use just one model each on Gemini and OpenAI, yet we cannot get the latency of individual models.
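To illustrate why provider-level latency is not enough, the following sketch groups latency samples by a (provider, model) pair. The function name `summarize_by_model` and the sample values are made up for illustration and are not measurements from the talk.

```python
from collections import defaultdict
from statistics import median


def summarize_by_model(samples):
    """Group (provider, model, latency_seconds) samples by (provider, model)."""
    by_model = defaultdict(list)
    for provider, model, latency in samples:
        by_model[(provider, model)].append(latency)
    return {
        key: {"count": len(values), "median": median(values), "max": max(values)}
        for key, values in by_model.items()
    }


# Made-up samples: two models behind the same provider behave very differently.
samples = [
    ("provider-x", "small-model", 0.25),
    ("provider-x", "small-model", 0.5),
    ("provider-x", "small-model", 0.75),
    ("provider-x", "large-model", 2.0),
]
summary = summarize_by_model(samples)
```

Here a single figure for `provider-x` would blend the two models together, while the per-model summary keeps `small-model` and `large-model` apart.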
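Summary: LLM monitoring is still very much a developing field, and there is little accumulated knowledge. As we learn to make full use of AI and LLMs and build them into products, we have to blaze the trail ourselves. Let's work on AI monitoring together!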