20240226_AAMT-Japio

⚫ $k$NN-MT
⚫ $p$
⚫ (Lee+, ACL2021)
⚫ (Fernandes+, NAACL2022)

Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".
Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".

◼ $k$NN-MT (Khandelwal+, ICLR2021)
▶ $P = \frac{1-\lambda}{7}\left(p_{\mathrm{MT}_1} + \cdots + p_{\mathrm{MT}_7}\right) + \lambda\, p_{k\mathrm{NN}}$
▶ $k = 64$, $\lambda = 0.1$, $\tau = 100$

Khandelwal+, ICLR2021, "Nearest Neighbor Machine Translation".
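The interpolation on this slide averages seven MT distributions and mixes the result with the $k$NN distribution using weight $\lambda$. A minimal sketch, assuming each distribution is a plain list of per-token probabilities over the same vocabulary; the function name `interpolate` and the toy numbers are illustrative and do not come from the slides.

```python
def interpolate(mt_probs, knn_probs, lam=0.1):
    """Return (1 - lam) * mean(mt_probs) + lam * knn_probs, token by token."""
    n_models = len(mt_probs)
    mixed = []
    for v in range(len(knn_probs)):
        # Uniform average over the MT models (the 1/7 factor when n_models == 7).
        ensemble = sum(p[v] for p in mt_probs) / n_models
        mixed.append((1.0 - lam) * ensemble + lam * knn_probs[v])
    return mixed


# Toy example: 7 identical MT models over a 3-token vocabulary (made-up numbers).
mt_probs = [[0.7, 0.2, 0.1] for _ in range(7)]
knn_probs = [0.0, 1.0, 0.0]
mixed = interpolate(mt_probs, knn_probs, lam=0.1)
```

Because both inputs are normalized and the weights sum to one, the mixture is again a probability distribution.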

$k$NN-MT (Khandelwal+, ICLR2021)
◼ (Deguchi+, ACL2023)

Khandelwal+, ICLR2021, "Nearest Neighbor Machine Translation".
Deguchi+, ACL2023, "Subset Retrieval Nearest Neighbor Machine Translation".

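$k$NN-MT
⚫ Datastore $\mathcal{M} \subseteq \mathbb{R}^D \times \mathcal{V}_Y$, for source $\boldsymbol{x}$ and target $\boldsymbol{y}$
▶ Key: $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$
▶ Value: $y_t \in \mathcal{V}_Y$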
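The datastore pairs a $D$-dimensional key for each target prefix with the target token that follows it. A minimal sketch of building such a datastore, assuming a parallel corpus of (source, target tokens) pairs; `toy_hidden_state` is a hypothetical stand-in for the decoder representation $f$ and has nothing to do with a real NMT model.

```python
def build_datastore(pairs, hidden_state):
    """Collect (key, value) entries: key = f(x, y_<t), value = y_t."""
    datastore = []
    for source, target_tokens in pairs:
        for t, token in enumerate(target_tokens):
            key = hidden_state(source, target_tokens[:t])
            datastore.append((key, token))
    return datastore


def toy_hidden_state(source, prefix):
    # Hypothetical 2-dimensional "representation"; a real system would use
    # the decoder's intermediate vector here.
    return [float(len(source.split())), float(len(prefix))]


pairs = [("a b", ["x", "y"]), ("c", ["z"])]
datastore = build_datastore(pairs, toy_hidden_state)
```

The datastore holds one entry per target token, so its size grows with the total number of target tokens in the corpus.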

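$k$NN-MT
◼ Query: $\boldsymbol{q} \in \mathbb{R}^D$
◼ $k$ nearest neighbors of $\boldsymbol{q}$: $(\boldsymbol{k}_i, v_i)$, $i = 1, \dots, k$
◼ $p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left(\frac{-\lVert \boldsymbol{q} - \boldsymbol{k}_i \rVert_2^2}{\tau}\right)$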
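The $k$NN distribution gives each retrieved neighbor a weight that decays with its squared L2 distance from the query, then sums the weights per target token. A minimal sketch, assuming token ids as values and an exact brute-force search in place of the approximate index a real system would likely use; the function name `knn_distribution` and the toy data are illustrative.

```python
import math


def knn_distribution(query, datastore, k, tau, vocab_size):
    """p_kNN(v) proportional to the sum over the k nearest of exp(-||q - k_i||^2 / tau)."""
    # Squared L2 distance from the query to every key (brute force).
    scored = sorted(
        (sum((q - c) ** 2 for q, c in zip(query, key)), value)
        for key, value in datastore
    )[:k]
    weights = [0.0] * vocab_size
    for dist, value in scored:
        weights[value] += math.exp(-dist / tau)
    total = sum(weights)
    return [w / total for w in weights]


# Toy datastore: (key vector, target token id), made-up values.
toy_store = [([0.0, 0.0], 0), ([1.0, 0.0], 1), ([5.0, 5.0], 2)]
probs = knn_distribution([0.0, 0.0], toy_store, k=2, tau=1.0, vocab_size=3)
```

Tokens that do not appear among the $k$ neighbors receive probability zero, which is why the result is interpolated with the MT distribution rather than used alone.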

⚫ $p$: $p = 0.5$ to $0.7$
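The surviving text gives only the range of $p$. Assuming $p$ is the threshold of top-$p$ (nucleus) sampling, which the slide text does not confirm, a minimal sketch of one sampling step looks like this; the function names and toy probabilities are illustrative.

```python
import random


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize the kept tokens so they form a distribution again.
    return {i: probs[i] / total for i in kept}


def sample_top_p(probs, p, rng=random):
    """Draw one token id from the renormalized nucleus."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = [nucleus[i] for i in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.7)
token = sample_top_p([0.5, 0.3, 0.15, 0.05], p=0.7, rng=random.Random(0))
```

A smaller $p$ keeps fewer tokens and gives less diverse samples; a larger $p$ admits more of the tail.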

◼ (Lee+, ACL2021)
◼ (Fernandes+, NAACL2022)

Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".
Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".

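(Lee+, ACL2021)
◼ $\mathcal{L}(\theta) = -\sum_{j=1}^{n} p_T(u_j) \log p_M(u_j \mid x; \theta)$
⚫ $\mu(\cdot, \cdot) \in [0, 1]$, $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$
⚫ $p_M(u_i \mid x; \theta) \propto \exp\left(o_i(u_i \mid x; \theta)\right)$
◼ $\mu(u_i, r)$: score of $u_i$ with respect to $r$
▶ $p_M$ is trained toward $p_T$

Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".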
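The loss on this slide is a cross-entropy between a target distribution $p_T$, built from metric scores with temperature $T$, and the reranker's distribution $p_M$ over the same $n$ candidates. A minimal sketch, assuming the metric scores $\mu(u_i, r)$ and the reranker outputs $o_i$ are already available as lists; the names `softmax` and `reranker_loss` and the toy numbers are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reranker_loss(model_scores, metric_scores, temperature=0.5):
    """L = -sum_j p_T(u_j) * log p_M(u_j | x)."""
    # Target distribution: p_T(u_i) proportional to exp(mu(u_i, r) / T).
    p_target = softmax([mu / temperature for mu in metric_scores])
    # Model distribution: p_M(u_i | x) proportional to exp(o_i).
    p_model = softmax(model_scores)
    return -sum(pt * math.log(pm) for pt, pm in zip(p_target, p_model))


metric_scores = [0.9, 0.5, 0.1]  # made-up metric values in [0, 1]
loss_aligned = reranker_loss([1.8, 1.0, 0.2], metric_scores)
loss_reversed = reranker_loss([0.2, 1.0, 1.8], metric_scores)
```

The loss is smallest when $p_M$ equals $p_T$, so a reranker whose scores rank candidates in the same order as the metric gets a lower loss than one that ranks them in reverse.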

(Lee+, ACL2021)
⚫ $T = 0.5$
▶ $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$
⚫ $\beta_1 = 0.9$, $\beta_2 = 0.98$

Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".

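(Fernandes+, NAACL2022)
⚫ $y^*_{\mathrm{MAP}} = \operatorname*{argmax}_{y \in \mathcal{Y}} \log p_\theta(y \mid x)$
⚫ $y^*_{\mathrm{MBR}} = \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right] \approx \operatorname*{argmax}_{h \in \mathcal{H}} \frac{1}{N} \sum_{i=1}^{N} u(h, \hat{y}_i)$
▶ $u(\cdot, \cdot)$
⚫ Fernandes+ (NAACL2022): $p$

Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".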

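(Goel & Byrne, CS&L 2000; Kumar & Byrne, NAACL2004)
◼ $y^*_{\mathrm{MBR}} := \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right]$
⚫ $\mathcal{H} \subset \mathcal{Y}$, $u: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$
⚫ $P(y \mid x)$: probability of $y$ given $x$
◼ With $\hat{\mathcal{Y}} \subset \mathcal{Y}$: $y^*_{\mathrm{MBR}} \approx \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[u(h, \hat{y})\right]$
▶ $\hat{\mathcal{Y}} := \mathcal{H}$
⚫ $N := |\mathcal{H}|$, $\mathcal{O}(N^2)$

Goel & Byrne, CS&L Vol. 14, 2000, "Minimum Bayes-risk automatic speech recognition".
Kumar & Byrne, NAACL2004, "Minimum Bayes-Risk Decoding for Statistical Machine Translation".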
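When the hypothesis set also serves as the pseudo-reference set, MBR decoding scores every hypothesis against every other one, which is where the quadratic cost in $N$ comes from. A minimal sketch, assuming hypotheses are whitespace-tokenized strings; `unigram_f1` is a hypothetical toy utility standing in for $u$, not a metric used in the slides.

```python
from collections import Counter


def unigram_f1(hyp, ref):
    """Toy utility u(h, y): F1 over unigram overlap."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(hypotheses, utility):
    """Pick the hypothesis with the highest average utility against all samples."""
    best, best_score = None, float("-inf")
    for h in hypotheses:  # N candidates ...
        expected = sum(utility(h, y) for y in hypotheses) / len(hypotheses)  # ... times N references
        if expected > best_score:
            best, best_score = h, expected
    return best


# Samples may repeat; a repeated sample counts more than once in the average.
samples = ["the cat sat", "the cat sat down", "the cat sat", "a dog ran"]
best = mbr_decode(samples, unigram_f1)
```

The selected output is the sample that agrees most with the rest, not necessarily the one with the highest model probability.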

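(Fernandes+, NAACL2022)
⚫ $f: \mathcal{X} \cup \mathcal{Y} \to \mathbb{R}^D$ ($D$ dimensions)
▶ $x \in \mathcal{X}$
▶ $h \in \mathcal{Y}$
▶ $\hat{y} \in \mathcal{Y}$
⚫ $s: \mathbb{R}^D \times \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$
◼ $y^*_{\mathrm{COMET\_MBR}} = \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[s(f(x), f(h), f(\hat{y}))\right]$
⚫ (Fernandes+, NAACL2022)
⚫ $\mathcal{O}(N^2)$

Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".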
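Because the utility here factors into an encoder $f$ and a scorer $s$, each sentence needs to be encoded only once, while $s$ is still evaluated for every (hypothesis, pseudo-reference) pair. A minimal sketch of that split; `toy_encode` and `toy_score` are hypothetical stand-ins and are not the COMET model.

```python
def embedding_mbr(source, hypotheses, encode, score):
    """MBR with a utility of the form s(f(x), f(h), f(y_hat))."""
    src_vec = encode(source)
    hyp_vecs = [encode(h) for h in hypotheses]  # N encoder calls, not N^2
    best_index, best_value = 0, float("-inf")
    for i, h_vec in enumerate(hyp_vecs):
        # The scorer still runs for every (hypothesis, pseudo-reference) pair.
        expected = sum(score(src_vec, h_vec, y_vec) for y_vec in hyp_vecs)
        expected /= len(hyp_vecs)
        if expected > best_value:
            best_index, best_value = i, expected
    return hypotheses[best_index]


def toy_encode(text):
    # Hypothetical 2-dimensional "sentence embedding": word and character counts.
    return [float(len(text.split())), float(len(text))]


def toy_score(src_vec, hyp_vec, ref_vec):
    # Hypothetical scorer: higher when the hypothesis is close to both the
    # pseudo-reference and the source in embedding space.
    to_ref = sum((a - b) ** 2 for a, b in zip(hyp_vec, ref_vec))
    to_src = sum((a - b) ** 2 for a, b in zip(hyp_vec, src_vec))
    return -(to_ref + to_src)


hyps = ["a b", "a b c", "a b c d e f"]
choice = embedding_mbr("x y z", hyps, toy_encode, toy_score)
```

Caching the embeddings removes repeated encoder work, but the number of scorer calls remains quadratic in the number of hypotheses.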

+, NLP2024, "".
Deguchi+, arXiv, "Centroid-Based Efficient Minimum Bayes Risk Decoding". https://arxiv.org/abs/2402.11197

Hiroyuki Deguchi
