Unified Language Model Pre-training for Natural Language Understanding and Generation
Scatter Lab Inc.
April 10, 2020
Transcript
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong et al., NeurIPS 2019 (Microsoft)
Presenter: ML Research Scientist, Pingpong
Contents
1. Overview of Pre-training Language Models
2. Unified Language Model
   1. Method
   2. Pre-training step
   3. Fine-tuning step
3. Experiments
   1. NLG tasks
   2. NLU tasks
Overview of Pre-training Language Models
• BERT, GPT, and ELMo each achieve strong results with their own approach, but each also has a drawback.
• For example, BERT owes its high performance to bidirectionality, but for the same reason it cannot be applied directly to NLG tasks.
• Each LM objective serves a different purpose:
  • Bidirectional => NLU
  • Unidirectional => NLG
  • Seq-to-Seq => summarization, generative question answering
Unified Language Model
• Unified pre-training: because the parameters are shared across several types of LM, only a single Transformer is needed and the LMs do not have to be trained separately.
• Parameter sharing lets the model learn more general text representations. Because the objectives are optimized jointly, it overfits less than a model trained on a single LM objective.
• The same model can be used for both NLU and NLG.
• UniLM unifies the existing LM objectives.
• Each LM objective comes with its own task, so they are trained jointly through multi-task learning.
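• To train the different LMs, the parameters are shared and only the self-attention mask differs.
• A specific mask pattern is used so that seq-to-seq can be implemented inside a single Transformer.
• In actual training, tokens are replaced with [MASK] and the model predicts the originals; this task is run for each LM objective.
• For the bidirectional LM, next sentence prediction (NSP) is also used.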
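The mask patterns described above can be sketched in a few lines. This is a minimal pure-Python illustration, not the authors' implementation: the function name `build_mask`, the mode labels, and the 0/1 convention (1 means "may attend") are assumptions made for this example. In the seq-to-seq case, source tokens attend to the whole source, while each target token attends to the whole source plus the target tokens up to and including itself.

```python
def build_mask(mode, seq_len, src_len=0):
    """Return a seq_len x seq_len matrix where mask[i][j] == 1 means
    token i may attend to token j.

    mode: "bidirectional", "left-to-right", "right-to-left", or "seq-to-seq".
    src_len: length of the source segment (used only for "seq-to-seq").
    """
    mask = [[0] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            if mode == "bidirectional":
                allowed = True                 # every token sees every token
            elif mode == "left-to-right":
                allowed = j <= i               # only itself and the left context
            elif mode == "right-to-left":
                allowed = j >= i               # only itself and the right context
            elif mode == "seq-to-seq":
                if i < src_len:
                    allowed = j < src_len      # source attends within the source
                else:
                    allowed = j <= i           # target sees source + left target
            else:
                raise ValueError(f"unknown mode: {mode}")
            mask[i][j] = 1 if allowed else 0
    return mask


# Example: two source tokens followed by two target tokens.
for row in build_mask("seq-to-seq", seq_len=4, src_len=2):
    print(row)
```

Because only the mask changes, the same Transformer weights can serve all four objectives.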
• [SOS] is the special start-of-sequence token.
• [EOS] is the special end-of-sequence token; in NLU tasks it marks the sentence boundary.
• Embeddings follow BERT, and text is tokenized with WordPiece.
• A different segment embedding is used for each LM task.
Viewed through the attention equation, the only thing that changes between objectives is the value of the mask matrix M.
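For reference, masked self-attention for one head in layer l can be written as below, in the standard Transformer form that the UniLM paper uses. The symbols are assumed as follows: H^{l-1} is the output of the previous layer, W_l^Q, W_l^K, W_l^V are the projection matrices, and d_k is the key dimension. Only M depends on the LM objective.

```latex
\begin{aligned}
Q &= H^{l-1} W_l^{Q}, \qquad K = H^{l-1} W_l^{K}, \qquad V = H^{l-1} W_l^{V} \\
M_{ij} &=
\begin{cases}
0, & \text{if token } i \text{ may attend to token } j \\
-\infty, & \text{otherwise}
\end{cases} \\
A_l &= \operatorname{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} + M \right) V
\end{aligned}
```

An entry of negative infinity drives the corresponding softmax weight to zero, so a blocked position contributes nothing to the output.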
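Pre-training Setup
• The training objective is the sum of the individual LM objectives.
• Within one batch, the bidirectional LM objective is sampled 1/3 of the time, the sequence-to-sequence LM objective 1/3 of the time, and the left-to-right and right-to-left LM objectives 1/6 of the time each.
• Parameters are initialized from BERT_large.
• English Wikipedia and BookCorpus are used for pre-training.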
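The mixing ratios above can be illustrated with a small sampler. This is only a sketch: drawing one objective per example is an assumption about how the ratios are realized, and the names `OBJECTIVE_WEIGHTS` and `sample_objectives` are hypothetical.

```python
import random

# Mixing ratios as stated on the slide; the keys are illustrative labels.
OBJECTIVE_WEIGHTS = {
    "bidirectional": 1 / 3,
    "seq-to-seq": 1 / 3,
    "left-to-right": 1 / 6,
    "right-to-left": 1 / 6,
}


def sample_objectives(batch_size, rng):
    """Draw one LM objective per example according to the mixing ratios."""
    names = list(OBJECTIVE_WEIGHTS)
    weights = [OBJECTIVE_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=batch_size)


rng = random.Random(0)
batch = sample_objectives(330, rng)
# Count how often each objective was drawn in this batch.
print({name: batch.count(name) for name in OBJECTIVE_WEIGHTS})
```

Each drawn objective would then select the matching self-attention mask for that example.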
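• The vocabulary size is 28,996, the maximum input sequence length is 512, and the batch size is 330.
• 15% of the tokens are selected, and each is replaced in one of three ways:
  • 80% of the time: replace the token with [MASK]
  • 10% of the time: replace the token with a random word
  • 10% of the time: keep the original token
• The masking procedure is almost identical to BERT's, with one addition: 80% of the time a single token is masked, and 20% of the time a bigram or trigram is masked.
• Training ran for 770,000 steps; 10,000 steps take about 7 hours on 8 V100 GPUs.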
Fine-tuning on Downstream NLU and NLG Tasks
• For NLU fine-tuning, the [SOS] token is used as the input representation (the same role as [CLS] in BERT).
• For NLG fine-tuning, tokens in the target sequence are masked and the model is trained to predict them.
• Because [EOS] can also be masked during this process, the model can learn when to predict [EOS], i.e. when to stop generating.
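The NLG fine-tuning step can be illustrated with a small input-packing sketch. The packing order "[SOS] source [EOS] target [EOS]" follows the token roles described earlier in the deck. The function name `mask_target_only` and the default `mask_rate` are hypothetical values chosen for this example.

```python
import random


def mask_target_only(source, target, rng, mask_rate=0.5):
    """Pack "[SOS] source [EOS] target [EOS]" and mask target-side tokens only.

    Returns the corrupted token list and a dict {position: original_token}
    of the positions the model has to predict.
    """
    tokens = ["[SOS]"] + list(source) + ["[EOS]"] + list(target) + ["[EOS]"]
    target_start = len(source) + 2  # index of the first target token
    labels = {}
    # The range includes the final [EOS], so it can be masked as well.
    for pos in range(target_start, len(tokens)):
        if rng.random() < mask_rate:
            labels[pos] = tokens[pos]
            tokens[pos] = "[MASK]"
    return tokens, labels


rng = random.Random(0)
tokens, labels = mask_target_only(
    ["the", "article", "text"], ["a", "summary"], rng
)
print(tokens)
print(labels)
```

Masking the closing [EOS] is what gives the model a training signal for ending its output.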
Experiments
Experiments: Abstractive Summarization
• CNN/DailyMail: the task is to read a news article and summarize it.
• RG-N is the N-gram F1 score (ROUGE-N).
• Fine-tuned as a seq-to-seq model (mask the target, then predict the masked tokens).
• Decoding uses beam search, with duplicated trigrams removed during the search.
• With only 10K training samples, the margin over MASS is larger.
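As a rough illustration of the RG-N metric mentioned above, an N-gram F1 score can be computed as below. This is a simplified sketch: official ROUGE implementations add steps such as stemming and their own tokenization, and the function names `ngrams` and `rouge_n_f1` are assumptions for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n):
    """F1 over n-grams shared by a candidate and a reference summary."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


candidate = "the cat sat".split()
reference = "the cat sat down".split()
print(rouge_n_f1(candidate, reference, 1))
print(rouge_n_f1(candidate, reference, 2))
```

The `&` operator on `Counter` objects takes the minimum count per n-gram, so a repeated n-gram in the candidate is not credited more often than it appears in the reference.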
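Experiments: QA
• The first two settings are span prediction, done the same way as with BERT.
• The third is free-form: the answer is generated with the seq-to-seq model.
• The input is built by concatenating the conversation history, the question, and the passage into the first segment; the answer is predicted in the second segment.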
Experiments: Question/Response Generation
• Question generation: given an answer and a passage from the SQuAD dataset, generate the question.
• The second result is performance on the DSTC7 dataset.
Experiments: GLUE
• UniLM outperforms BERT_large on GLUE.
Thank you ✌
If you have further questions or anything is unclear, feel free to reach out at the contact below.
ML Research Scientist, Pingpong
[email protected]