Fast kNN Neural Machine Translation Using Subset Search
Hiroyuki Deguchi
March 22, 2024
Research
8th AAMT Seminar
AAMT Young Researchers' Translation Study Group (AAMT若手翻訳研究会)
Best Award
Transcript
[Title and background slides: body text not recovered in the transcript. Works cited on these slides:]
Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL 2018)
Search Engine Guided Neural Machine Translation (Gu+, AAAI 2018)
Nearest Neighbor Machine Translation (Khandelwal+, ICLR 2021)
A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
kNN-MT (Khandelwal+, ICLR 2021)
[Slide body not recovered. Surviving symbols: source sentence $\boldsymbol{x}$, target sentence $\boldsymbol{y}$.]
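kNN-MT (Khandelwal+, ICLR 2021)
Keys: $\boldsymbol{k}_i \in \mathbb{R}^D$. Query: $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$.
$$p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left(\frac{-\lVert \boldsymbol{k}_i - f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \rVert_2^2}{\tau}\right)$$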
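The kNN probability on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the talk. It uses exhaustive search instead of an approximate nearest-neighbor index, and the function names (`squared_l2`, `knn_probs`) and the toy datastore are hypothetical.

```python
import math
from collections import defaultdict


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_probs(query, keys, values, k, tau):
    """Return {token: p_kNN(token)} computed from the k keys nearest to the query.

    query  : the vector f(x, y_<t)
    keys   : datastore key vectors k_i
    values : target tokens v_i paired with the keys
    tau    : softmax temperature
    """
    # Distance from the query to every key (exhaustive search, an assumption
    # made for brevity; real systems typically use an approximate index).
    dists = [(squared_l2(key, query), v) for key, v in zip(keys, values)]
    nearest = sorted(dists, key=lambda d: d[0])[:k]

    # Sum exp(-distance / tau) over neighbors that share the same value token.
    scores = defaultdict(float)
    for d, v in nearest:
        scores[v] += math.exp(-d / tau)

    # Normalize so the scores form a probability distribution.
    z = sum(scores.values())
    return {tok: s / z for tok, s in scores.items()}


# Toy datastore (illustrative values only).
keys = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
values = ["cat", "cat", "dog", "fish"]
probs = knn_probs([0.0, 0.0], keys, values, k=3, tau=1.0)
```

With `k=3` the farthest entry ("fish") is never retrieved, so it receives no kNN probability mass, and the two "cat" neighbors pool their scores.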
[Slide body not recovered. Works cited on this slide:]
Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP 2022)
Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings 2022)
Settings shown: $\lambda = 0.5$, $k = 16$
[Slide body not recovered. Work cited on this slide:]
Reconfigurable Inverted Index (Matsui+, ACMMM 2018)
[Figure, repeated over three slides: diagram labeled with $n$ and $k$; contents not recovered.]
Settings shown: $\lambda = 0.5$, $k = 16$, $n = 56$
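The slide gives only the value of $\lambda$. In the kNN-MT formulation of Khandelwal+ (ICLR 2021), $\lambda$ is the weight that interpolates the kNN distribution with the base translation model's distribution: $p = \lambda\, p_{k\mathrm{NN}} + (1 - \lambda)\, p_{\mathrm{MT}}$. The sketch below assumes that this is the role $\lambda$ plays here. The function name `interpolate` and the toy distributions are hypothetical.

```python
def interpolate(p_knn, p_mt, lam=0.5):
    """Mix two token distributions: lam * p_kNN + (1 - lam) * p_MT.

    Tokens missing from one distribution are treated as probability 0 there.
    """
    vocab = set(p_knn) | set(p_mt)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_mt.get(tok, 0.0)
        for tok in vocab
    }


# Toy distributions (illustrative values only).
p_mt = {"cat": 0.6, "dog": 0.3, "fish": 0.1}
p_knn = {"cat": 0.9, "dog": 0.1}
p_final = interpolate(p_knn, p_mt, lam=0.5)
```

With $\lambda = 0.5$ both distributions contribute equally, so a token the kNN search never retrieves ("fish" here) keeps half of its model probability.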