Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
23
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
在 GCP 運用 Parse 全家餐管理那堆 AI 應用的資料
line_developers_tw
PRO
0
12
40歲的我會給20歲的自己,關於軟體開發的7個建議
line_developers_tw
PRO
0
4.2k
從零到一:轉碼仔的實習攻略
line_developers_tw
PRO
0
11
如何在團隊發揮數據影響力: 以電商資料科學家為例
line_developers_tw
PRO
1
32
做Data超讚的 誰懂?
line_developers_tw
PRO
0
17
iOS Live Activity: Opportunities & Challenges
line_developers_tw
PRO
1
97
掌握 Feature Toggle 與 OpenFeature 規範
line_developers_tw
PRO
0
190
用 AI 和 LINE Bot 簡化生活:讓圖片告訴你何時該忙!-- LINE 工作坊
line_developers_tw
PRO
0
700
Scaling The E-Commerce Recommendation System
line_developers_tw
PRO
0
58
Other Decks in Technology
See All in Technology
大幅アップデートされたRagas v0.2をキャッチアップ
os1ma
2
530
ガバメントクラウドのセキュリティ対策事例について
fujisawaryohei
0
530
ブラックフライデーで購入したPixel9で、Gemini Nanoを動かしてみた
marchin1989
1
530
サービスでLLMを採用したばっかりに振り回され続けたこの一年のあれやこれや
segavvy
2
410
社内イベント管理システムを1週間でAKSからACAに移行した話し
shingo_kawahara
0
180
How to be an AWS Community Builder | 君もAWS Community Builderになろう!〜2024 冬 CB募集直前対策編?!〜
coosuke
PRO
2
2.8k
NilAway による静的解析で「10 億ドル」を節約する #kyotogo / Kyoto Go 56th
ytaka23
3
380
1等無人航空機操縦士一発試験 合格までの道のり ドローンミートアップ@大阪 2024/12/18
excdinc
0
160
re:Invent 2024 Innovation Talks(NET201)で語られた大切なこと
shotashiratori
0
310
Wvlet: A New Flow-Style Query Language For Functional Data Modeling and Interactive Data Analysis - Trino Summit 2024
xerial
1
120
新機能VPCリソースエンドポイント機能検証から得られた考察
duelist2020jp
0
220
サイボウズフロントエンドエキスパートチームについて / FrontendExpert Team
cybozuinsideout
PRO
5
38k
Featured
See All Featured
Side Projects
sachag
452
42k
Building Applications with DynamoDB
mza
91
6.1k
Build your cross-platform service in a week with App Engine
jlugia
229
18k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Imperfection Machines: The Place of Print at Facebook
scottboms
266
13k
Testing 201, or: Great Expectations
jmmastey
40
7.1k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
159
15k
Designing Experiences People Love
moore
138
23k
Building Better People: How to give real-time feedback that sticks.
wjessup
365
19k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
48
2.2k
Optimizing for Happiness
mojombo
376
70k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
17
2.3k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None