Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
12
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
從零到一:轉碼仔的實習攻略
line_developers_tw
PRO
0
4
如何在團隊發揮數據影響力: 以電商資料科學家為例
line_developers_tw
PRO
1
27
做Data超讚的 誰懂?
line_developers_tw
PRO
0
15
iOS Live Activity: Opportunities & Challenges
line_developers_tw
PRO
1
87
掌握 Feature Toggle 與 OpenFeature 規範
line_developers_tw
PRO
0
160
用 AI 和 LINE Bot 簡化生活:讓圖片告訴你何時該忙!-- LINE 工作坊
line_developers_tw
PRO
0
620
Scaling The E-Commerce Recommendation System
line_developers_tw
PRO
0
28
揭秘LLMOps: 讓LLM服務像火箭 般穩定高效的祕密!
line_developers_tw
PRO
0
69
ML Life Cycle for LINE SHOPPING Recommender
line_developers_tw
PRO
0
17
Other Decks in Technology
See All in Technology
アジャイルでの品質の進化 Agile in Motion vol.1/20241118 Hiroyuki Sato
shift_evolve
0
170
誰も全体を知らない ~ ロールの垣根を超えて引き上げる開発生産性 / Boosting Development Productivity Across Roles
kakehashi
1
230
飲食店データの分析事例とそれを支えるデータ基盤
kimujun
0
110
The Role of Developer Relations in AI Product Success.
giftojabu1
1
130
なぜ今 AI Agent なのか _近藤憲児
kenjikondobai
4
1.4k
OS 標準のデザインシステムを超えて - より柔軟な Flutter テーマ管理 | FlutterKaigi 2024
ronnnnn
0
160
Security-JAWS【第35回】勉強会クラウドにおけるマルウェアやコンテンツ改ざんへの対策
4su_para
0
180
OCI Vault 概要
oracle4engineer
PRO
0
9.7k
AGIについてChatGPTに聞いてみた
blueb
0
130
OCI Security サービス 概要
oracle4engineer
PRO
0
6.5k
Amazon Personalizeのレコメンドシステム構築、実際何するの?〜大体10分で具体的なイメージをつかむ〜
kniino
1
100
Terraform CI/CD パイプラインにおける AWS CodeCommit の代替手段
hiyanger
1
240
Featured
See All Featured
Building an army of robots
kneath
302
43k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
131
33k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
4
370
Side Projects
sachag
452
42k
How STYLIGHT went responsive
nonsquared
95
5.2k
Put a Button on it: Removing Barriers to Going Fast.
kastner
59
3.5k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
232
17k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
226
22k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
31
2.7k
Fontdeck: Realign not Redesign
paulrobertlloyd
82
5.2k
Teambox: Starting and Learning
jrom
133
8.8k
Rails Girls Zürich Keynote
gr2m
94
13k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None