20240226_AAMT-Japio
Hiroyuki Deguchi
February 26, 2024
Transcript
(Lee+, ACL2021), (Fernandes+, NAACL2022)
Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".
Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".
$k$NN-MT (Khandelwal+, ICLR2021)
$P = \frac{1 - \lambda}{7} \left( p_{\mathrm{MT}_1} + \cdots + p_{\mathrm{MT}_7} \right) + \lambda \, p_{k\mathrm{NN}}$
$k = 64$, $\lambda = 0.1$, $\tau = 100$
Khandelwal+, ICLR2021, "Nearest Neighbor Machine Translation".
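The interpolation of the averaged MT ensemble with the $k$NN distribution can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `interpolate` and the dictionary representation of next-token distributions are assumptions made for the example, and only the weighting scheme and the default $\lambda = 0.1$ come from the slide.

```python
from typing import Dict, List


def interpolate(
    mt_probs: List[Dict[str, float]],
    knn_probs: Dict[str, float],
    lam: float = 0.1,
) -> Dict[str, float]:
    """Return P = (1 - lam) / M * sum_m p_MT_m + lam * p_kNN.

    Each distribution maps a target token to its probability; tokens
    missing from a distribution are treated as having probability 0.
    """
    if not mt_probs:
        raise ValueError("at least one MT distribution is required")
    vocab = set(knn_probs)
    for dist in mt_probs:
        vocab |= set(dist)
    num_models = len(mt_probs)  # 7 on the slide
    return {
        tok: (1.0 - lam) / num_models * sum(d.get(tok, 0.0) for d in mt_probs)
        + lam * knn_probs.get(tok, 0.0)
        for tok in vocab
    }


# Hypothetical two-model ensemble over a two-token vocabulary.
mixed = interpolate(
    [{"a": 0.6, "b": 0.4}, {"a": 0.8, "b": 0.2}],
    {"b": 1.0},
    lam=0.1,
)
```

Because every input is a probability distribution and the weights sum to one, the result is again a distribution.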
(Khandelwal+, ICLR2021), (Deguchi+, ACL2023)
Deguchi+, ACL2023, "Subset Retrieval Nearest Neighbor Machine Translation".
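$k$NN-MT datastore
Key: $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$
Value: $y_t \in \mathcal{V}_Y$
Datastore: $\mathcal{M} \subseteq \mathbb{R}^D \times \mathcal{V}_Y$, built from source sentences $\boldsymbol{x}$ and target sentences $\boldsymbol{y}$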
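$k$NN-MT retrieval
Query: $\boldsymbol{q} \in \mathbb{R}^D$; retrieve the $k$ nearest neighbors of $\boldsymbol{q}$ from the datastore
$p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left( \frac{-\lVert \boldsymbol{q} - \boldsymbol{k}_i \rVert_2^2}{\tau} \right)$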
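The $k$NN probability can be computed directly from its definition: find the $k$ keys closest to the query, weight each by $\exp(-d/\tau)$ with $d$ the squared L2 distance, and sum the weights per value token. The sketch below assumes a small in-memory datastore and exhaustive search. The name `knn_probs` and the list-of-pairs datastore layout are hypothetical; a real system would use an approximate nearest-neighbor index.

```python
import math
from typing import Dict, List, Sequence, Tuple

Datastore = List[Tuple[Sequence[float], str]]  # (key vector, value token)


def knn_probs(
    query: Sequence[float],
    datastore: Datastore,
    k: int = 64,
    tau: float = 100.0,
) -> Dict[str, float]:
    """p_kNN(y_t) proportional to sum_i 1[y_t = v_i] * exp(-||q - k_i||^2 / tau)."""
    # Squared L2 distance from the query to every key (exhaustive search).
    scored = sorted(
        (
            (sum((q - x) ** 2 for q, x in zip(query, key)), value)
            for key, value in datastore
        ),
        key=lambda pair: pair[0],
    )[:k]
    if not scored:
        return {}
    # Subtracting the smallest distance avoids underflow and cancels
    # out in the normalization.
    d_min = scored[0][0]
    weights: Dict[str, float] = {}
    for dist, value in scored:
        weights[value] = weights.get(value, 0.0) + math.exp(-(dist - d_min) / tau)
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


# Hypothetical 2-dimensional datastore with three entries.
store = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"), ((0.0, 0.0), "a")]
probs = knn_probs((0.0, 0.0), store, k=3, tau=1.0)
```

A larger $\tau$ flattens the distribution over the retrieved tokens, while a smaller $\tau$ concentrates it on the nearest neighbors.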
$p = 0.5 \sim 0.7$
Discriminative reranking (Lee+, ACL2021)
Quality-aware decoding (Fernandes+, NAACL2022)
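Discriminative reranking (Lee+, ACL2021)
$\mathcal{L}(\theta) = -\sum_{j=1}^{n} p_T(u_j) \log p_M(u_j \mid x; \theta)$
Target distribution, from a metric $\mu(\cdot, \cdot) \in [0, 1]$: $p_T(u_i) \propto \exp\left( \frac{\mu(u_i, r)}{T} \right)$
Model distribution: $p_M(u_i \mid x; \theta) \propto \exp\left( o(u_i \mid x; \theta) \right)$
$r$: reference translation; $u_i$: candidate translation; $\mu(u_i, r)$: metric score of $u_i$ against $r$
Training moves $p_M$ toward $p_T$.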
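The reranker objective is a cross-entropy between two softmax distributions over the same $n$ candidates: a target built from metric scores $\mu(u_i, r)$ with temperature $T$, and a model distribution built from the reranker outputs $o$. The sketch below computes that loss for a single source sentence. The function names and the plain-list inputs are assumptions for illustration, and no gradient computation or model is included.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reranker_loss(
    logits: List[float],
    metric_scores: List[float],
    temperature: float = 0.5,
) -> float:
    """L = -sum_j p_T(u_j) * log p_M(u_j | x).

    logits:        reranker outputs o(u_j | x), one per candidate
    metric_scores: mu(u_j, r) in [0, 1], one per candidate
    """
    p_target = softmax([mu / temperature for mu in metric_scores])
    # log p_M(u_j | x) = o_j - logsumexp(o)
    top = max(logits)
    log_z = top + math.log(sum(math.exp(o - top) for o in logits))
    return -sum(pt * (o - log_z) for pt, o in zip(p_target, logits))


# Hypothetical scores for two candidates: the first is the better translation.
aligned = reranker_loss([2.0, 0.0], [0.9, 0.1])
misaligned = reranker_loss([0.0, 2.0], [0.9, 0.1])
```

The loss is smaller when the reranker ranks the candidates in the same order as the metric, which is the behavior the training objective rewards.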
Discriminative reranking (Lee+, ACL2021): settings
$T = 0.5$ in $p_T(u_i) \propto \exp\left( \frac{\mu(u_i, r)}{T} \right)$
$\beta_1 = 0.9$, $\beta_2 = 0.98$
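Quality-aware decoding (Fernandes+, NAACL2022)
MAP decoding: $y^*_{\mathrm{MAP}} = \operatorname{argmax}_{y \in \mathcal{Y}} \log p_\theta(y \mid x)$
MBR decoding: $y^*_{\mathrm{MBR}} = \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)} \left[ u(h, \hat{y}) \right] \approx \operatorname{argmax}_{h \in \mathcal{H}} \frac{1}{N} \sum_{i=1}^{N} u(h, \hat{y}_i)$
$u(\cdot, \cdot)$: utility function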
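MBR decoding (Goel & Byrne, CS&L 2000; Kumar & Byrne, NAACL2004)
$y^*_{\mathrm{MBR}} := \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)} \left[ u(h, \hat{y}) \right]$
$\mathcal{H} \subset \mathcal{Y}$: hypothesis set; $u: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$: utility function
$P(y \mid x)$: probability of output $y$ given input $x$
Approximation with a finite set of samples $\hat{\mathcal{Y}}$: $y^*_{\mathrm{MBR}} \approx \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}} \left[ u(h, \hat{y}) \right]$
With $\hat{\mathcal{Y}} := \mathcal{H}$ and $N := |\mathcal{H}|$, the cost is $\mathcal{O}(N^2)$.
Goel & Byrne, CS&L Vol. 14, 2000, "Minimum Bayes-risk automatic speech recognition".
Kumar & Byrne, NAACL2004, "Minimum Bayes-Risk Decoding for Statistical Machine Translation".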
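When the hypothesis set doubles as the sample set, MBR decoding reduces to scoring every hypothesis against every other one and keeping the hypothesis with the highest average utility, which is where the $\mathcal{O}(N^2)$ cost comes from. The sketch below illustrates this. The utility `unigram_f1` is a toy stand-in chosen so the example runs without external models; it is not a metric used in the slides.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility: F1 over whitespace-separated tokens."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    hypotheses: List[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return argmax_h (1/N) * sum_y u(h, y), using the hypotheses as samples."""
    if not hypotheses:
        raise ValueError("hypotheses must not be empty")

    def expected_utility(h: str) -> float:
        # N utility calls per hypothesis, so N^2 calls in total.
        return sum(utility(h, y) for y in hypotheses) / len(hypotheses)

    return max(hypotheses, key=expected_utility)


# Hypothetical candidate list: three similar outputs and one outlier.
candidates = ["the cat sat", "the cat sat down", "the cat", "a dog ran"]
best = mbr_decode(candidates)
```

The outlier shares no tokens with the other candidates, so its expected utility is low, and the candidate closest to the consensus is selected.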
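COMET-MBR (Fernandes+, NAACL2022)
$f: \mathcal{X} \cup \mathcal{Y} \to \mathbb{R}^D$: encoder into $D$-dimensional vectors, applied to the source $x \in \mathcal{X}$, the hypothesis $h \in \mathcal{Y}$, and the sample $\hat{y} \in \mathcal{Y}$
$s: \mathbb{R}^D \times \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$: scoring function
$y^*_{\mathrm{COMET\_MBR}} = \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}} \left[ s(f(x), f(h), f(\hat{y})) \right]$
Cost: $\mathcal{O}(N^2)$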
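Splitting the utility into an encoder $f$ and a scorer $s$ means each sentence needs to be encoded only once, while the scorer still runs for every hypothesis-sample pair. The sketch below shows that structure. Both `encode` and `score` are caller-supplied placeholders, not the COMET API, and caching the embeddings is an assumption of this example.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def comet_style_mbr(
    source: str,
    hypotheses: List[str],
    encode: Callable[[str], Vector],
    score: Callable[[Vector, Vector, Vector], float],
) -> str:
    """Return argmax_h mean_y s(f(x), f(h), f(y)), using the hypotheses as samples."""
    if not hypotheses:
        raise ValueError("hypotheses must not be empty")
    src_vec = encode(source)                    # 1 encoder call
    hyp_vecs = [encode(h) for h in hypotheses]  # N encoder calls
    best_index, best_value = 0, float("-inf")
    for i, h_vec in enumerate(hyp_vecs):
        # N scorer calls per hypothesis, N^2 in total.
        value = sum(score(src_vec, h_vec, y_vec) for y_vec in hyp_vecs) / len(hyp_vecs)
        if value > best_value:
            best_index, best_value = i, value
    return hypotheses[best_index]


# Toy stand-ins: embed a sentence as its length, score by negative distance.
def toy_encode(sentence: str) -> Vector:
    return [float(len(sentence))]


def toy_score(x: Vector, h: Vector, y: Vector) -> float:
    return -abs(h[0] - y[0])


choice = comet_style_mbr("src", ["aa", "aaa", "aaaaaa"], toy_encode, toy_score)
```

Encoding costs $N + 1$ calls, but the pairwise scoring loop remains quadratic in the number of hypotheses.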
+, NLP2024, "".
Deguchi+, arXiv, "Centroid-Based Efficient Minimum Bayes Risk Decoding". https://arxiv.org/abs/2402.11197