LINE Developers Taiwan
September 23, 2024
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
Transcript
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
EC Data Dev / Data Scientists, Dan Chen
Dan
LINE Taiwan EC Dev - Data Scientist
Work Experience
Side Project
CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A
01 What is Trustworthy: Why it's so important
Elements of Trustworthy
Four Perspectives of Trustworthy Recommendation
Data Preparation, Data Representation, Recommendation Generation, Performance Evaluation
02 Evaluation Framework: How to Correctly Evaluate AI
Two-Stage Recommendation System: Brickmaster
Scalable, Scenario-wise, KPI-Oriented, Trustworthy
Evaluation Framework (1/2): How to truly and comprehensively understand performance
03 Offline & Online Evaluation: How to Correctly Evaluate AI
Offline Evaluation: Key points to show how your algorithms can contribute to your business
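The slide does not list which offline metrics were used. As an assumption, the sketch below shows two common offline ranking metrics, Recall@K and NDCG@K with binary relevance. The function names and the sample item IDs are hypothetical and do not come from the talk.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """NDCG@K with binary relevance: a hit at 0-based rank i is worth 1 / log2(i + 2)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal DCG: every relevant item is ranked first, capped at k positions.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg


if __name__ == "__main__":
    # Hypothetical ranked shop IDs and the shops the user actually interacted with.
    ranked = ["shop_a", "shop_b", "shop_c", "shop_d"]
    clicked = {"shop_a", "shop_d"}
    print(recall_at_k(ranked, clicked, k=3))
    print(ndcg_at_k(ranked, clicked, k=3))
```

Recall@K ignores position inside the top K, while NDCG@K rewards placing relevant items earlier, so reporting both gives a fuller offline picture.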
Online Evaluation: Key points to show how your algorithms can contribute to your business
A/B Test: Key points to show how your algorithms can contribute to your business
Avoid pitfalls in practice:
- What if the experiment isn't significant?
- Sample ratio mismatch?
- Novelty effect?
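Two of the pitfalls above can be checked with standard statistics. The sketch below is a minimal illustration, not the speaker's implementation: a chi-square goodness-of-fit test for sample ratio mismatch and a pooled two-proportion z-test for significance. The function names, the 50/50 expected split, and the 0.001 SRM threshold are assumptions.

```python
import math


def srm_check(n_control, n_treatment, expected_treatment_share=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample ratio mismatch.

    Returns (chi2, p_value, mismatch_flag). The alpha default is an assumed,
    deliberately strict threshold, not a value taken from the talk.
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Chi-square survival function with 1 dof: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


def two_proportion_z_test(conv_control, n_control, conv_treatment, n_treatment):
    """Two-sided pooled z-test for a difference in conversion rate.

    Assumes at least one conversion and one non-conversion overall,
    otherwise the standard error is zero.
    """
    rate_control = conv_control / n_control
    rate_treatment = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (rate_treatment - rate_control) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical traffic counts: a 5000 / 5500 split under an intended 50/50 design.
    print(srm_check(5000, 5500))
    # Hypothetical conversions: 500 of 5000 (control) vs 600 of 5000 (treatment).
    print(two_proportion_z_test(500, 5000, 600, 5000))
```

Running the SRM check before reading any KPI is a common ordering: if the traffic split itself is off, the significance test on conversions is not trustworthy. The novelty effect is not covered here, since it needs a comparison over time rather than a single test.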
Case – EC Shop recommendation
04 LLM on Recommendation
Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
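The slide names RankMe and α-ReQ without giving formulas. The sketch below follows their commonly published definitions and is an assumption about what the talk used. RankMe is the exponential of the entropy of the normalized singular values, an effective rank. α-ReQ is the power-law decay exponent of the covariance eigenspectrum, fitted here by a plain log-log least-squares line. The function names and the random test matrix are hypothetical.

```python
import numpy as np


def rankme(embeddings, eps=1e-7):
    """Effective rank: exp of the Shannon entropy of the normalized singular values."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    p = singular_values / singular_values.sum() + eps
    return float(np.exp(-np.sum(p * np.log(p))))


def powerlaw_alpha(eigenvalues):
    """Fit eigenvalue_i ~ i^(-alpha) by least squares in log-log space; return alpha."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    eigenvalues = eigenvalues[eigenvalues > 1e-12]  # drop numerically zero modes
    ranks = np.arange(1, len(eigenvalues) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigenvalues), 1)
    return float(-slope)


def alpha_req(embeddings):
    """Decay exponent of the eigenspectrum of the embedding covariance matrix."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2 / (len(embeddings) - 1)
    return powerlaw_alpha(eigenvalues)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for text embeddings: 200 items, 16 dimensions.
    embeddings = rng.normal(size=(200, 16))
    print(rankme(embeddings))
    print(alpha_req(embeddings))
```

Both scores need only the embedding matrix and no labels, which makes them suitable for comparing candidate text-embedding models before they are fed into a recommender. The fit above uses the whole spectrum; published variants may restrict the fitting range.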
05 Conclusion: Evaluate & Challenge
Conclusion: Business Value
OpenAI, Claude, Gemini
XGBoost or open source
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge
1. Data Quality
2. Multiple-Metrics Evaluation
3. Conduct A/B Test Experiment
4. Human Perception Evaluation Challenge
Q&A
Contact information (LinkedIn: Dan Chen)