Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model

Dan Chen, EC Data Dev / Data Scientist

Dan, LINE Taiwan EC Dev, Data Scientist
Work Experience
Side Project

CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A

01 What is Trustworthy: Why it’s so important

Elements of Trustworthy

Four Perspectives of Trustworthy Recommendation: Data Preparation, Data Representation, Recommendation Generation, Performance Evaluation

02 Evaluation Framework: How to Correctly Evaluate AI

Two-Stage Recommendation System: Brickmaster
Scalable, Scenario-wise, KPI-Oriented, Trustworthy

Evaluation Framework (1/2): How to understand performance comprehensively

03 Offline & Online Evaluation: How to Correctly Evaluate AI

Offline Evaluation: the key to showing how your algorithms can contribute to your business
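The slide does not name specific offline metrics. As an illustration only, the sketch below assumes two ranking metrics that are commonly used for recommendation, Recall@K and NDCG@K with binary relevance. The function names and the sample item IDs are hypothetical, and the recommended list is assumed to contain no duplicates.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """NDCG@K with binary relevance: DCG of the list divided by the ideal DCG."""
    # A hit at 0-based position `rank` is discounted by log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # The ideal ranking places every relevant item (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: shops recommended to one user vs. shops the user engaged with.
recommended_shops = ["a", "b", "c", "d"]
engaged_shops = {"a", "c", "x"}
print(recall_at_k(recommended_shops, engaged_shops, k=3))
print(ndcg_at_k(recommended_shops, engaged_shops, k=3))
```

In practice these per-user values would be averaged over a held-out set of users, and which metric to report depends on the business KPI the scenario targets.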

Online Evaluation: the key to showing how your algorithms can contribute to your business

A/B test: the key to showing how your algorithms can contribute to your business
Avoid pitfalls in practice: What if the experiment isn’t significant? Sample ratio mismatch? Novelty effect?
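Of the pitfalls listed, sample ratio mismatch (SRM) is the easiest to check mechanically. The sketch below is one common way to do it, not necessarily the method used in the talk: a chi-square goodness-of-fit test on the observed group sizes. The function name, the 50/50 default split, and the 0.001 alert threshold are assumptions chosen for illustration.

```python
import math


def srm_check(n_control, n_treatment, expected_control_share=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test for sample ratio mismatch.

    Returns (chi2, p_value, mismatch_flag). A small p-value means the observed
    split is unlikely under the planned allocation, so the experiment results
    should not be trusted until the assignment pipeline is checked.
    """
    total = n_control + n_treatment
    expected_control = total * expected_control_share
    expected_treatment = total * (1.0 - expected_control_share)
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # With two groups there is 1 degree of freedom, where the chi-square
    # survival function has the closed form erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


# Hypothetical counts: a planned 50/50 split that drifted to 5000 vs. 5500 users.
chi2, p_value, mismatch = srm_check(5000, 5500)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, mismatch={mismatch}")
```

A strict threshold such as 0.001 is a frequent choice here because the check runs on every experiment and a false alarm halts analysis; the right value is a team decision.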

Case – EC Shop recommendation

04 LLM On Recommendation

Recommendation with LLM
- Feature Engineering: text embedding generation
- How to evaluate embeddings (probing): RankMe / α-ReQ metrics
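Both metrics named above score an embedding matrix from its spectrum, without labels. The sketch below follows the definitions as commonly stated: RankMe as the exponential of the Shannon entropy of the normalized singular values, and α-ReQ as the power-law decay coefficient of the covariance eigenvalues. The function names are hypothetical, the embedding matrix is assumed to have shape (n_samples, dim) with more samples than dimensions, and the simple log-log least-squares fit is an assumption rather than the exact fitting procedure from the original papers.

```python
import numpy as np


def rankme(embeddings, eps=1e-7):
    """Effective rank: exp of the entropy of the normalized singular values."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    p = singular_values / singular_values.sum() + eps
    return float(np.exp(-np.sum(p * np.log(p))))


def power_law_alpha(eigenvalues):
    """Fit lambda_i ~ i^(-alpha) by least squares in log-log space."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    lam = lam[lam > 0]  # drop non-positive values before taking logs
    ranks = np.arange(1, len(lam) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(lam), 1)
    return float(-slope)


def alpha_req(embeddings):
    """Decay coefficient of the eigenspectrum of the embedding covariance."""
    covariance = np.cov(embeddings, rowvar=False)
    return power_law_alpha(np.linalg.eigvalsh(covariance))


# Hypothetical stand-in for text embeddings: 1000 items, 64 dimensions.
rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(1000, 64))
print(rankme(item_embeddings), alpha_req(item_embeddings))
```

A RankMe value close to the embedding dimension suggests the dimensions are used evenly, while a value near 1 suggests collapse onto a single direction. Near-zero eigenvalues from a rank-deficient matrix can distort the log-log fit, so truncating the tail before fitting may be needed on real data.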

05 Conclusion: Evaluate & Challenge

Conclusion: Business Value
OpenAI, Claude, Gemini
XGBoost or open source
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6

Conclusion & Challenge
1. Data Quality
2. Multiple-Metrics Evaluation
3. Conduct A/B Test Experiments
4. Human Perception Evaluation Challenge

Q&A
Contact: LinkedIn, Dan Chen
