Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
Hiroyuki Deguchi
February 15, 2023
Transcript
(Bryan Eikema and Wilker Aziz, EMNLP2022)
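◼ MAP decoding
⚫ $\boldsymbol{y}^{\mathrm{MAP}} = \operatorname*{argmax}_{\boldsymbol{h} \in \mathcal{Y}} \log p(\boldsymbol{h} \mid \boldsymbol{x}, \theta)$
▶ $\mathcal{Y}$: the hypothesis space
◼ MBR decoding
⚫ $\boldsymbol{y}^{\mathrm{MBR}} = \operatorname*{argmax}_{\boldsymbol{h} \in \mathcal{Y}} \mathbb{E}\left[u(\boldsymbol{y}^{*}, \boldsymbol{h}) \mid \boldsymbol{x}, \theta\right] = \operatorname*{argmax}_{\boldsymbol{h} \in \mathcal{Y}} \mu_u(\boldsymbol{h}; \boldsymbol{x}, \theta)$
▶ $u$: utility function; $\boldsymbol{h} \in \mathcal{Y}$: hypothesis; $\boldsymbol{y}^{*} \in \mathcal{Y}$: reference
◼ Both the search over $\mathcal{Y}$ and the expected utility $\mu_u$ have to be approximated in practice.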
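Sampling-based MBR (Eikema & Aziz, COLING 2020)
◼ Draw $N$ samples: $\bar{\mathcal{H}}(\boldsymbol{x}) = \{\boldsymbol{y}^{(1)}, \ldots, \boldsymbol{y}^{(N)}\}$
◼ Estimate $\mu_u(\boldsymbol{h}; \boldsymbol{x}, \theta)$ from the samples
⚫ $\hat{\mu}_u(\boldsymbol{h}; \boldsymbol{x}, N) := \frac{1}{N} \sum_{n=1}^{N} u(\boldsymbol{y}^{(n)}, \boldsymbol{h})$
⚫ $\boldsymbol{y}^{\mathrm{NbyN}} := \operatorname*{argmax}_{\boldsymbol{h} \in \bar{\mathcal{H}}(\boldsymbol{x})} \hat{\mu}_u(\boldsymbol{h}; \boldsymbol{x}, N)$
◼ Cost
⚫ $N^2$ utility evaluations
▶ $\mathcal{O}(N^2 \times U)$, where $U$ is the upper bound on the cost of evaluating the utility function once.
⚫ "Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation", Eikema & Aziz, COLING 2020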
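The N-by-N estimator above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `unigram_f1` is a hypothetical toy utility standing in for a real metric, and the sample strings are made up.

```python
from typing import Callable, Sequence


def unigram_f1(reference: str, hypothesis: str) -> float:
    """Toy utility (assumption): F1 over the sets of whitespace tokens."""
    ref_tokens, hyp_tokens = set(reference.split()), set(hypothesis.split())
    overlap = len(ref_tokens & hyp_tokens)
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_n_by_n(samples: Sequence[str], utility: Callable[[str, str], float]) -> str:
    """Return the sample with the highest estimated expected utility.

    Every sample acts both as a candidate and as a pseudo-reference,
    so the utility is evaluated N * N times.
    """
    n = len(samples)

    def mu_hat(hypothesis: str) -> float:
        # Monte Carlo estimate: average utility against all N samples.
        return sum(utility(y, hypothesis) for y in samples) / n

    return max(samples, key=mu_hat)


# Made-up samples standing in for draws from a translation model.
samples = ["the cat sat", "the cat sat down", "a dog ran", "the cat sat"]
best = mbr_n_by_n(samples, unigram_f1)
print(best)
```

With these toy samples the winner is "the cat sat", since it agrees most with the rest of the pool on average.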
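◼ Use $S < N$ samples to compute $\hat{\mu}_u$: $\mathcal{O}(N^2 \times U) \rightarrow \mathcal{O}(N \times S \times U)$
◼ Coarse-to-fine (C2F): keep the top $T$ hypotheses under $\hat{\mu}_{u_{\mathrm{proxy}}}$
⚫ $\bar{\mathcal{H}}_T(\boldsymbol{x}) := \operatorname*{top\text{-}T}_{\boldsymbol{h} \in \bar{\mathcal{H}}(\boldsymbol{x})} \hat{\mu}_{u_{\mathrm{proxy}}}(\boldsymbol{h}; \boldsymbol{x}, S)$
⚫ $\boldsymbol{y}^{\mathrm{C2F}} := \operatorname*{argmax}_{\boldsymbol{h} \in \bar{\mathcal{H}}_T(\boldsymbol{x})} \hat{\mu}_{u_{\mathrm{target}}}(\boldsymbol{h}; \boldsymbol{x}, L)$
▶ $\mathcal{O}(N \times S \times U_{\mathrm{proxy}} + T \times L \times U_{\mathrm{target}})$
▶ $S = 5$, $S = 50$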
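A rough sketch of the coarse-to-fine procedure may help make the two stages concrete. Everything named here is an assumption for illustration: `unigram_f1` plays the cheap proxy utility, `char_ratio` (built on the standard library's `difflib.SequenceMatcher`) plays the more expensive target utility, and the pseudo-references are subsampled from the candidate pool itself, which the slides do not specify.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, Sequence

Utility = Callable[[str, str], float]


def unigram_f1(reference: str, hypothesis: str) -> float:
    """Cheap proxy utility (assumption): F1 over sets of whitespace tokens."""
    ref_tokens, hyp_tokens = set(reference.split()), set(hypothesis.split())
    overlap = len(ref_tokens & hyp_tokens)
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def char_ratio(reference: str, hypothesis: str) -> float:
    """Stand-in for a costlier target utility (assumption)."""
    return SequenceMatcher(None, reference, hypothesis).ratio()


def mu_hat(hypothesis: str, refs: Sequence[str], utility: Utility) -> float:
    """Average utility of one hypothesis against a set of pseudo-references."""
    return sum(utility(y, hypothesis) for y in refs) / len(refs)


def mbr_c2f(candidates: Sequence[str], proxy: Utility, target: Utility,
            num_proxy_refs: int, top_t: int, num_target_refs: int,
            rng: random.Random) -> str:
    """Coarse-to-fine MBR: proxy ranking with S refs, target rerank with L refs."""
    pool = list(candidates)
    # Coarse step: score all N candidates with the proxy against S pseudo-references.
    proxy_refs = rng.sample(pool, min(num_proxy_refs, len(pool)))
    ranked = sorted(pool, key=lambda h: mu_hat(h, proxy_refs, proxy), reverse=True)
    shortlist = ranked[:top_t]  # the top-T set
    # Fine step: rescore only the T survivors with the target against L pseudo-references.
    target_refs = rng.sample(pool, min(num_target_refs, len(pool)))
    return max(shortlist, key=lambda h: mu_hat(h, target_refs, target))


candidates = ["the cat sat", "the cat sat down", "a dog ran",
              "the cat sits", "the cat sat"]
best = mbr_c2f(candidates, unigram_f1, char_ratio,
               num_proxy_refs=3, top_t=2, num_target_refs=5,
               rng=random.Random(0))
print(best)
```

The expensive utility is only ever called on the shortlist, which is where the $T \times L \times U_{\mathrm{target}}$ term in the cost comes from.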
"BEER: BEtter Evaluation as Ranking", Stanojević & Sima'an, WMT 2014
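◼ $\boldsymbol{y}^{\mathrm{NbyS}} := \operatorname*{argmax}_{\boldsymbol{h} \in \{\boldsymbol{y}^{(k)}\}_{k=1}^{N}} \hat{\mu}_u(\boldsymbol{h}; \boldsymbol{x}, S)$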
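As a small sketch of the N-by-S variant, the snippet below scores all $N$ candidates against only $S$ pseudo-references and counts utility calls to confirm the $N \times S$ budget. The `jaccard` utility, the call counter, and the choice to subsample the pseudo-references from the candidate pool are all assumptions made for illustration.

```python
import random
from typing import Callable, Sequence


def jaccard(reference: str, hypothesis: str) -> float:
    """Toy utility (assumption): Jaccard overlap of whitespace token sets."""
    ref_tokens, hyp_tokens = set(reference.split()), set(hypothesis.split())
    union = ref_tokens | hyp_tokens
    return len(ref_tokens & hyp_tokens) / len(union) if union else 0.0


def mbr_n_by_s(candidates: Sequence[str], utility: Callable[[str, str], float],
               s: int, rng: random.Random) -> str:
    """Pick the best of N candidates using only S pseudo-references each."""
    pool = list(candidates)
    refs = rng.sample(pool, min(s, len(pool)))
    return max(pool, key=lambda h: sum(utility(y, h) for y in refs) / len(refs))


# Wrap the utility to count how often it is evaluated.
counter = {"calls": 0}


def counted_jaccard(reference: str, hypothesis: str) -> float:
    counter["calls"] += 1
    return jaccard(reference, hypothesis)


candidates = ["the cat sat", "the cat sat down", "a dog ran",
              "the cat sits", "a cat sat", "the cat sat"]
best = mbr_n_by_s(candidates, counted_jaccard, s=2, rng=random.Random(0))
print(best, counter["calls"])  # the call count is N * S = 6 * 2
```

Because `max` evaluates its key once per candidate, the counter ends at exactly $N \times S$, compared with $N^2$ for the N-by-N estimator.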
⚫ $N = 405$, $S = 13$
⚫ $T = 50$ (top-$T$), $L = 100$
⚫ $N = 405$
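To get a feel for these numbers, one can plug them into the cost expressions from the earlier slides. The sketch below only counts utility evaluations; which values belong to which configuration is an assumption here, since the slide labels did not survive, and the helper names are hypothetical.

```python
def calls_n_by_n(n: int) -> int:
    """Utility evaluations for the N-by-N estimator: N^2."""
    return n * n


def calls_n_by_s(n: int, s: int) -> int:
    """Utility evaluations for the N-by-S estimator: N * S."""
    return n * s


def calls_c2f(n: int, s: int, t: int, l_refs: int) -> tuple:
    """(proxy calls, target calls) for coarse-to-fine: (N * S, T * L)."""
    return n * s, t * l_refs


N, S, T, L = 405, 13, 50, 100  # values quoted on the slide

print(calls_n_by_n(N))        # 164025
print(calls_n_by_s(N, S))     # 5265
print(calls_c2f(N, S, T, L))  # (5265, 5000)
```

The proxy and target calls are reported separately because they carry different per-call costs ($U_{\mathrm{proxy}}$ versus $U_{\mathrm{target}}$), so adding the two counts would not give a meaningful total.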
⚫ $N = 405$, $S = 13$, $S_{\mathrm{large}} = 100$