Hiroyuki Deguchi
February 15, 2023
Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
Transcript
(Bryan Eikema and Wilker Aziz, EMNLP2022)
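⚫ MAP decoding: $\mathbf{y}^{\mathrm{MAP}} = \operatorname*{argmax}_{\mathbf{h} \in \mathcal{Y}} \log p(\mathbf{h} \mid \mathbf{x}, \theta)$
▶ $\mathcal{Y}$: the hypothesis space
⚫ MBR decoding: $\mathbf{y}^{\mathrm{MBR}} = \operatorname*{argmax}_{\mathbf{h} \in \mathcal{Y}} \mathbb{E}[u(\mathbf{y}^{*}, \mathbf{h}) \mid \mathbf{x}, \theta] = \operatorname*{argmax}_{\mathbf{h} \in \mathcal{Y}} \mu_u(\mathbf{h}; \mathbf{x}, \theta)$
▶ $u$: utility of a hypothesis $\mathbf{h} \in \mathcal{Y}$ measured against $\mathbf{y}^{*} \in \mathcal{Y}$
◼ Neither the search over $\mathcal{Y}$ nor the expected utility $\mu_u$ can be computed exactly, so both have to be approximated.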
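The difference between the two decision rules can be seen on a toy example. The sketch below is only an illustration: it assumes a tiny, fully enumerable hypothesis space with made-up probabilities (`toy_distribution`) and a hypothetical unigram-F1 utility (`unigram_f1`), neither of which comes from the paper. Under these assumptions the expectation $\mu_u$ can be computed exactly, which is not possible for a real NMT model.

```python
from collections import Counter
from typing import Callable, Dict


def unigram_f1(ref: str, hyp: str) -> float:
    """Toy utility u(ref, hyp): F1 over whitespace-token overlap (an assumption)."""
    ref_tokens, hyp_tokens = ref.split(), hyp.split()
    overlap = sum((Counter(ref_tokens) & Counter(hyp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical p(y | x) over a three-element hypothesis space (made-up numbers).
toy_distribution: Dict[str, float] = {
    "the cat sat": 0.35,
    "the cat sat down": 0.25,
    "a dog ran": 0.40,
}


def map_decode(dist: Dict[str, float]) -> str:
    """Return the mode of the distribution."""
    return max(dist, key=dist.get)


def mbr_decode(dist: Dict[str, float], utility: Callable[[str, str], float]) -> str:
    """Return the hypothesis with the highest exact expected utility."""
    def expected_utility(hyp: str) -> float:
        return sum(p * utility(ref, hyp) for ref, p in dist.items())
    return max(dist, key=expected_utility)


print(map_decode(toy_distribution))              # a dog ran
print(mbr_decode(toy_distribution, unigram_f1))  # the cat sat
```

In this toy case the mode is an outlier, while MBR prefers a hypothesis that agrees with most of the probability mass.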
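(Eikema & Aziz, COLING2020)
◼ Draw $N$ samples from the model: $\bar{\mathcal{H}}(\mathbf{x}) = \{\mathbf{y}^{(1)}, \ldots, \mathbf{y}^{(N)}\}$
◼ Monte Carlo estimate of $\mu_u(\mathbf{h}; \mathbf{x}, \theta)$
⚫ $\hat{\mu}_u(\mathbf{h}; \mathbf{x}, N) := \frac{1}{N} \sum_{n=1}^{N} u(\mathbf{y}^{(n)}, \mathbf{h})$
⚫ $\mathbf{y}^{\mathrm{NbyN}} := \operatorname*{argmax}_{\mathbf{h} \in \bar{\mathcal{H}}(\mathbf{x})} \hat{\mu}_u(\mathbf{h}; \mathbf{x}, N)$
◼ Cost
⚫ $N^2$ utility evaluations
▶ $\mathcal{O}(N^2 \times U)$, where $U$ is the upper bound on the cost of evaluating the utility function once.
“Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation”, Eikema & Aziz, COLING2020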
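The N-by-N estimator above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`estimate_expected_utility`, `mbr_n_by_n`) and the toy unigram-F1 utility are assumptions made for the example, and the sample list stands in for translations sampled from a model.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(ref: str, hyp: str) -> float:
    """Toy utility u(ref, hyp): F1 over whitespace-token overlap (an assumption)."""
    ref_tokens, hyp_tokens = ref.split(), hyp.split()
    overlap = sum((Counter(ref_tokens) & Counter(hyp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def estimate_expected_utility(
    hyp: str, pseudo_refs: Sequence[str], utility: Callable[[str, str], float]
) -> float:
    """Monte Carlo estimate: mean utility of `hyp` against the pseudo-references."""
    return sum(utility(ref, hyp) for ref in pseudo_refs) / len(pseudo_refs)


def mbr_n_by_n(samples: Sequence[str], utility: Callable[[str, str], float]) -> str:
    """Score each of the N samples against all N samples (N^2 utility calls)."""
    return max(samples, key=lambda h: estimate_expected_utility(h, samples, utility))


# Stand-in for N = 4 model samples; duplicates are expected when sampling.
samples = ["the cat sat", "the cat sat", "the cat sat down", "a dog ran"]
print(mbr_n_by_n(samples, unigram_f1))  # the cat sat
```

The same list serves as both candidate set and pseudo-reference set, which is where the quadratic number of utility calls comes from.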
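◼ N-by-S: estimating $\hat{\mu}_u$ with $S < N$ samples reduces the cost from $\mathcal{O}(N^2 \times U)$ to $\mathcal{O}(N \times S \times U)$
◼ Coarse-to-fine (C2F): keep the top $T$ hypotheses under $\hat{\mu}_{u_{\mathrm{proxy}}}$
⚫ $\bar{\mathcal{H}}_T(\mathbf{x}) := \underset{\mathbf{h} \in \bar{\mathcal{H}}(\mathbf{x})}{\operatorname{top}_T}\, \hat{\mu}_{u_{\mathrm{proxy}}}(\mathbf{h}; \mathbf{x}, S)$
⚫ $\mathbf{y}^{\mathrm{C2F}} := \operatorname*{argmax}_{\mathbf{h} \in \bar{\mathcal{H}}_T(\mathbf{x})} \hat{\mu}_{u_{\mathrm{target}}}(\mathbf{h}; \mathbf{x}, L)$
▶ $\mathcal{O}(N \times S \times U_{\mathrm{proxy}} + T \times L \times U_{\mathrm{target}})$
▶ $S = 5$, $S = 50$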
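The coarse-to-fine procedure can be sketched as two passes over the candidates. The code below is a sketch under stated assumptions: `mbr_coarse_to_fine` and `ngram_f1` are hypothetical names, unigram F1 plays the role of the cheap proxy utility and bigram F1 the role of the expensive target utility, and the first $S$ (respectively $L$) samples are used as pseudo-references, which is a valid subset only if the samples are i.i.d.

```python
from collections import Counter
from typing import Callable, List, Sequence

Utility = Callable[[str, str], float]


def ngram_f1(ref: str, hyp: str, n: int) -> float:
    """Toy utility: F1 over n-gram overlap (a stand-in for a real MT metric)."""
    def ngrams(text: str) -> Counter:
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ref_ngrams, hyp_ngrams = ngrams(ref), ngrams(hyp)
    overlap = sum((ref_ngrams & hyp_ngrams).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    return 2 * precision * recall / (precision + recall)


def mean_utility(hyp: str, refs: Sequence[str], utility: Utility) -> float:
    """Monte Carlo estimate of the expected utility of `hyp`."""
    return sum(utility(ref, hyp) for ref in refs) / len(refs)


def mbr_coarse_to_fine(
    samples: List[str], proxy: Utility, target: Utility, S: int, T: int, L: int
) -> str:
    # Coarse step: rank all N candidates with the proxy against S pseudo-references.
    coarse_refs = samples[:S]
    ranked = sorted(
        samples, key=lambda h: mean_utility(h, coarse_refs, proxy), reverse=True
    )
    shortlist = ranked[:T]
    # Fine step: re-score only the top-T candidates with the target utility.
    fine_refs = samples[:L]
    return max(shortlist, key=lambda h: mean_utility(h, fine_refs, target))


samples = [
    "the cat sat", "the cat sat", "the cat sat down",
    "a dog ran", "the cat sat on the mat",
]
best = mbr_coarse_to_fine(
    samples,
    proxy=lambda r, h: ngram_f1(r, h, 1),
    target=lambda r, h: ngram_f1(r, h, 2),
    S=2, T=3, L=4,
)
print(best)  # the cat sat
```

The proxy is called $N \times S$ times and the target $T \times L$ times, matching the two terms of the cost expression.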
◼ (Stanojević & Sima’an, WMT2014)
“BEER: BEtter Evaluation as Ranking”, Stanojević & Sima’an, WMT2014
◼ $\mathbf{y}^{\mathrm{NbyS}} := \operatorname*{argmax}_{\mathbf{h} \in \{\mathbf{y}^{(k)}\}_{k=1}^{N}} \hat{\mu}_u(\mathbf{h}; \mathbf{x}, S)$
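A minimal sketch of the N-by-S rule follows: all $N$ samples remain candidates, but each is scored against only $S$ pseudo-references. The function name `mbr_n_by_s`, the seeded random subsampling, and the toy Jaccard utility are assumptions made for this illustration rather than details taken from the paper.

```python
import random
from typing import Callable, List


def jaccard(ref: str, hyp: str) -> float:
    """Toy utility: Jaccard similarity of the two token sets (an assumption)."""
    ref_set, hyp_set = set(ref.split()), set(hyp.split())
    union = ref_set | hyp_set
    return len(ref_set & hyp_set) / len(union) if union else 0.0


def mbr_n_by_s(
    samples: List[str],
    utility: Callable[[str, str], float],
    S: int,
    seed: int = 0,
) -> str:
    """Pick the best of N candidates using S <= N pseudo-references (N*S calls)."""
    # Subsample the pseudo-references without replacement; seeded for repeatability.
    pseudo_refs = random.Random(seed).sample(samples, S)

    def score(hyp: str) -> float:
        return sum(utility(ref, hyp) for ref in pseudo_refs) / S

    return max(samples, key=score)


samples = ["the cat sat", "the cat sat", "the cat sat", "a dog ran"]
print(mbr_n_by_s(samples, jaccard, S=2))  # the cat sat
```

With a smaller $S$ the estimate of each candidate's expected utility is noisier, so the choice of $S$ trades cost against estimation quality.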
⚫ $N = 405$, $S = 13$
⚫ top-$T$ with $T = 50$, $L = 100$
⚫ $N = 405$
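To get a feel for these numbers, the cost expressions from the earlier slides can be evaluated directly. The sketch below counts utility calls only; the helper names are hypothetical, and it assumes that the coarse step of C2F uses the same $N = 405$ and $S = 13$ as N-by-S, which the transcript does not state explicitly. Proxy and target calls are reported separately because their unit costs $U_{\mathrm{proxy}}$ and $U_{\mathrm{target}}$ differ.

```python
from typing import Tuple


def calls_n_by_n(N: int) -> int:
    """Utility calls for N-by-N: every candidate against every sample."""
    return N * N


def calls_n_by_s(N: int, S: int) -> int:
    """Utility calls for N-by-S: every candidate against S pseudo-references."""
    return N * S


def calls_c2f(N: int, S: int, T: int, L: int) -> Tuple[int, int]:
    """(proxy calls, target calls) for coarse-to-fine."""
    return N * S, T * L


N, S, T, L = 405, 13, 50, 100  # values listed on the slide
print(calls_n_by_n(N))        # 164025
print(calls_n_by_s(N, S))     # 5265
print(calls_c2f(N, S, T, L))  # (5265, 5000)
```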
⚫ $N = 405$, $S = 13$, $S_{\mathrm{large}} = 100$