Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
34
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
LINE 實習分享 & 國際黑客松參賽分享
line_developers_tw
PRO
0
19
在 GCP 運用 Parse 全家餐管理那堆 AI 應用的資料
line_developers_tw
PRO
0
22
40歲的我會給20歲的自己,關於軟體開發的7個建議
line_developers_tw
PRO
0
7.6k
從零到一:轉碼仔的實習攻略
line_developers_tw
PRO
0
34
如何在團隊發揮數據影響力: 以電商資料科學家為例
line_developers_tw
PRO
1
45
做Data超讚的 誰懂?
line_developers_tw
PRO
0
33
iOS Live Activity: Opportunities & Challenges
line_developers_tw
PRO
1
120
掌握 Feature Toggle 與 OpenFeature 規範
line_developers_tw
PRO
0
240
用 AI 和 LINE Bot 簡化生活:讓圖片告訴你何時該忙!-- LINE 工作坊
line_developers_tw
PRO
0
770
Other Decks in Technology
See All in Technology
明日からできる!技術的負債の返済を加速するための実践ガイド~『ホットペッパービューティー』の事例をもとに~
recruitengineers
PRO
3
400
インフラをつくるとはどういうことなのか、 あるいはPlatform Engineeringについて
nwiizo
5
2.6k
バックエンドエンジニアのためのフロントエンド入門 #devsumiC
panda_program
18
7.5k
あれは良かった、あれは苦労したB2B2C型SaaSの新規開発におけるCloud Spanner
hirohito1108
2
610
2025-02-21 ゆるSRE勉強会 Enhancing SRE Using AI
yoshiiryo1
1
360
エンジニアが加速させるプロダクトディスカバリー 〜最速で価値ある機能を見つける方法〜 / product discovery accelerated by engineers
rince
4
360
RSNA2024振り返り
nanachi
0
580
個人開発から公式機能へ: PlaywrightとRailsをつなげた3年の軌跡
yusukeiwaki
11
3k
開発スピードは上がっている…品質はどうする? スピードと品質を両立させるためのプロダクト開発の進め方とは #DevSumi #DevSumiB / Agile And Quality
nihonbuson
2
3k
Larkご案内資料
customercloud
PRO
0
650
分解して理解する Aspire
nenonaninu
1
180
「海外登壇」という 選択肢を与えるために 〜Gophers EX
logica0419
0
710
Featured
See All Featured
For a Future-Friendly Web
brad_frost
176
9.5k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
49k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
174
51k
Designing on Purpose - Digital PM Summit 2013
jponch
117
7.1k
Product Roadmaps are Hard
iamctodd
PRO
50
11k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.1k
RailsConf 2023
tenderlove
29
1k
Visualization
eitanlees
146
15k
Java REST API Framework Comparison - PWX 2021
mraible
28
8.4k
Docker and Python
trallard
44
3.3k
Navigating Team Friction
lara
183
15k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
21
2.5k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None