Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
サブセット探索を用いた高速なkNNニューラル機械翻訳
Search
Hiroyuki Deguchi
March 22, 2024
Research
0
150
サブセット探索を用いた高速なkNNニューラル機械翻訳
第8回AAMTセミナー
AAMT若手翻訳研究会
最優秀賞
Hiroyuki Deguchi
March 22, 2024
Tweet
Share
More Decks by Hiroyuki Deguchi
See All by Hiroyuki Deguchi
20250226 NLP colloquium: "SoftMatcha: 10億単語規模コーパス検索のための柔らかくも高速なパターンマッチャー"
de9uch1
0
610
20240820: Minimum Bayes Risk Decoding for High-Quality Text Generation Beyond High-Probability Text
de9uch1
0
300
20240226_AAMT-Japio
de9uch1
0
170
Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM’s Translation Capability
de9uch1
0
140
Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
de9uch1
0
190
My Research Environmental Setup
de9uch1
0
310
Nearest Neighbor Machine Translation
de9uch1
0
270
Paper Reading - Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
de9uch1
0
290
paper reading - Tree Transformer
de9uch1
0
260
Other Decks in Research
See All in Research
空間音響処理における物理法則に基づく機械学習
skoyamalab
0
120
Remote sensing × Multi-modal meta survey
satai
4
640
EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observation and Wikipedia
satai
3
430
超高速データサイエンス
matsui_528
1
310
AI in Enterprises - Java and Open Source to the Rescue
ivargrimstad
0
1k
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
satai
3
540
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues
satai
3
410
第二言語習得研究における 明示的・暗示的知識の再検討:この分類は何に役に立つか,何に役に立たないか
tam07pb915
0
400
地域丸ごとデイサービス「Go トレ」の紹介
smartfukushilab1
0
630
Nullspace MPC
mizuhoaoki
1
490
J-RAGBench: 日本語RAGにおける Generator評価ベンチマークの構築
koki_itai
0
1.1k
20250725-bet-ai-day
cipepser
3
550
Featured
See All Featured
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
21
1.3k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
132
19k
How Fast Is Fast Enough? [PerfNow 2025]
tammyeverts
3
390
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Building Adaptive Systems
keathley
44
2.9k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.6k
How to Ace a Technical Interview
jacobian
281
24k
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Being A Developer After 40
akosma
91
590k
Visualization
eitanlees
150
16k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
286
14k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
Transcript
𝒌
◼ ⚫ ⚫ ◼ ⚫ (Zhang+, NAACL2018; Gu+, AAAI2018; Khandelwal+,
ICLR2021) ▶ (Nagao, 1984) ▶ ⚫ 𝑘 (Khandelwal+, ICLR2021) ▶ ▶ ▶ Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL2018) Search Engine Guided Neural Machine Translation (Gu+, AAAI2018) Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
◼ ◼ ⚫ ⚫
𝒌 (Khandelwal+, ICLR2021) ◼ ⚫ ⚫ ⚫ ◼ ⚫ ▶
⚫ ▶ ≈ Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) 𝒙 𝒚
𝒌 (Khandelwal+, ICLR2021) 𝒌𝑖 ∈ ℝ𝐷 𝑓 𝒙, 𝒚<𝑡 ∈
ℝ𝐷 Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) ◼ 𝑘 ◼ ⚫ ⚫ 𝑝𝑘NN 𝑦𝑡 𝒙, 𝒚<𝑡 ∝ 𝑖=1 𝑘 𝟙𝑦𝑡=𝑣𝑖 exp − 𝒌𝑖 − 𝑓 𝒙, 𝒚<𝑡 2 2 𝜏 ◼ 𝑘
𝒌 ◼ (Martins+, EMNLP2022) ◼ (Meng+, ACLFindings2022) ⚫ 𝑘 𝑘
𝜆 = 0.5 𝑘 = 16 Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP2022) Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings2022)
𝒌 ◼ 𝑘 ◼ ⚫ 𝑘 (Matsui+, ACMMM2018) ⚫ 𝑘
𝑘 𝑘 Reconfigurable Inverted Index (Matsui+, ACMMM2018) 𝒌
◼ ⚫ 𝑘 ⚫ 𝑘 ◼ ◼ 𝑘
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
⚫ ⚫ ⚫ ⚫ ⚫ 𝑘 𝜆 = 0.5 𝑘
= 16 𝑛 = 56
𝑘 𝑘 ◼ 𝑘 ⚫ ▶ ⚫ ▶
◼ 𝑘 𝒌 𝒌
◼ ⚫ 𝑘
𝒌 𝒌 ◼ ⚫ ⚫ ◼ 𝑘 ⚫ ⚫ ◼
⚫
⚫ ⚫ ▶ ⚫ ▶