Frontiers of Natural Language Processing
Mamoru Komachi
April 23, 2015
These are the slides I used for my talk at Recruit Technologies Open Lab #01 (theme: natural language processing).
https://atnd.org/events/64383
Transcript
Frontiers of Natural Language Processing
April 21, 2015
Mamoru Komachi, Faculty of System Design, Tokyo Metropolitan University
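Self-introduction: Mamoru Komachi
• 2005.03 Graduated from the University of Tokyo, College of Arts and Sciences, Department of Basic Science (History and Philosophy of Science)
• 2010.03 Completed the doctoral program at Nara Institute of Science and Technology (NAIST); Doctor of Engineering; specialty: natural language processing
• 2010.04 to 2013.03 Assistant Professor, NAIST (Yuji Matsumoto Laboratory)
• 2013.04 to present Associate Professor, Tokyo Metropolitan University (Natural Language Processing Laboratory)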
Today's agenda
• The impact of deep learning on natural language processing
• New developments in natural language processing
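Deep learning
• A framework for learning complex models with multi-layer neural networks
• Has achieved large performance gains on a variety of pattern recognition tasks; many companies, including Google, Facebook, Microsoft, and Baidu, are racing to research it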
Lee et al., ICML 2009.
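Strengths of deep learning
• No feature engineering is required: effective feature combinations can be learned automatically from unlabeled data.
  → Hyperparameters still exist
• Global representations (distributed representations) can be learned from data
  → Clustering, by contrast, is local representation learning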
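The neural network breakthrough
• Hinton et al., A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, 2006.
• Neural networks have existed since the 1950s, but their representational power was too high (relative to the amount of data), so they tended to overfit.
  → Training one layer at a time and stacking multiple layers solved the overfitting problem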
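Image recognition and syntactic parsing with recursive neural networks
• Parsing Natural Scenes and Natural Language with Recursive Neural Networks, Socher et al., ICML 2011.
• Recursively recognizes structure from adjacent image regions or adjacent words
  → Integrated into the Stanford Parser (ACL 2013)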
Phrase-level sentiment polarity classification with recursive neural networks
• Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Socher et al., EMNLP 2013.
Socher et al. (NIPS 2011): recursively compute the meaning of a sentence from word vectors
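The recursive computation on this slide can be sketched roughly as follows. This is a minimal illustration, not the model from the paper: it assumes a binary parse tree given as nested tuples and the common composition form tanh(W [left; right] + b). The names `compose` and `encode_tree`, the 2-dimensional word vectors, and the weights are all hypothetical and untrained.

```python
import math

def compose(left, right, W, b):
    """Combine two child vectors into one parent vector: tanh(W [left; right] + b)."""
    children = left + right  # list concatenation of the two child vectors
    return [math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
            for row, bias in zip(W, b)]

def encode_tree(tree, vectors, W, b):
    """Recursively compute a vector for a binary tree written as nested tuples of words."""
    if isinstance(tree, str):      # leaf: look up the word vector
        return vectors[tree]
    left, right = tree             # internal node: compose the two children
    return compose(encode_tree(left, vectors, W, b),
                   encode_tree(right, vectors, W, b), W, b)

# Toy word vectors and weights (hypothetical values, not learned from data).
vectors = {"very": [0.1, 0.3], "good": [0.8, 0.2], "movie": [0.4, 0.9]}
W = [[0.5, -0.2, 0.1, 0.3],
     [0.0, 0.4, -0.3, 0.2]]
b = [0.0, 0.1]

sentence_vector = encode_tree((("very", "good"), "movie"), vectors, W, b)
print(sentence_vector)
```

Because the parent vector has the same dimensionality as each child, the same `compose` function can be applied at every level of the tree, which is what makes the computation recursive.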
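Recurrent neural networks can take context of unbounded length into account
• Recurrent Neural Network based Language Model, Mikolov et al., InterSpeech 2010.
  → A model that predicts the current word while taking the past history into account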
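As a rough sketch of how a recurrent language model carries history forward, the following shows one Elman-style step: the hidden state is updated from the current word and the previous hidden state, and a softmax over the vocabulary gives the next-word distribution. The 3-word vocabulary, the hidden size of 2, and all weight values are hypothetical and untrained; `rnn_step` is an illustrative name, not an API from any toolkit.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_step(word_id, hidden, U, W, V):
    """One step: hidden' = sigmoid(U[:, word] + W hidden); output = softmax(V hidden')."""
    recurrent = matvec(W, hidden)
    new_hidden = [1.0 / (1.0 + math.exp(-(U[i][word_id] + recurrent[i])))
                  for i in range(len(hidden))]
    return new_hidden, softmax(matvec(V, new_hidden))

# Hypothetical, untrained parameters.
U = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.4]]     # hidden x vocabulary
W = [[0.2, -0.1], [0.4, 0.3]]                # hidden x hidden
V = [[1.0, -1.0], [0.5, 0.5], [-0.7, 0.9]]   # vocabulary x hidden

def predict_next(history):
    """Run the RNN over a word-id history and return the next-word distribution."""
    hidden, probs = [0.0, 0.0], None
    for word_id in history:
        hidden, probs = rnn_step(word_id, hidden, U, W, V)
    return probs

probs_a = predict_next([0, 1])
probs_b = predict_next([1, 1])
print(probs_a, probs_b)
```

The two histories end in the same word, yet the predictions differ because the hidden state still reflects the earlier word. An n-gram model with a window of one word could not distinguish them.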
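Machine translation, too, can be handled by deep learning as a model that generates a sequence from a sequence
• Sequence to Sequence Learning with Neural Networks, Sutskever et al., NIPS 2014.
  → Uses two LSTMs (Long Short-Term Memory): one converts the input sequence into a fixed-length vector, and the other generates the output sequence from that vector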
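The encoder-decoder idea can be illustrated with a small sketch. To keep it short, it swaps the LSTM cells for plain tanh RNN cells, uses greedy decoding, and uses hypothetical untrained weights, so the generated symbols are meaningless. The point is only that inputs of any length are compressed into one fixed-length vector from which the decoder generates.

```python
import math

def rnn_cell(x, h, Wx, Wh):
    """Plain tanh RNN cell (a simplification; the paper uses LSTM cells)."""
    return [math.tanh(sum(a * b for a, b in zip(Wx[i], x)) +
                      sum(a * b for a, b in zip(Wh[i], h)))
            for i in range(len(h))]

def encode_sequence(inputs, Wx, Wh, size):
    """Read the whole input sequence and return the final hidden state."""
    h = [0.0] * size
    for x in inputs:
        h = rnn_cell(x, h, Wx, Wh)
    return h

def decode(context, Wx, Wh, Wout, embeddings, eos_id, max_len=10):
    """Greedily generate symbol ids starting from the encoder's vector."""
    h, prev, output = list(context), eos_id, []
    for _ in range(max_len):
        h = rnn_cell(embeddings[prev], h, Wx, Wh)
        scores = [sum(a * b for a, b in zip(row, h)) for row in Wout]
        prev = max(range(len(scores)), key=scores.__getitem__)
        if prev == eos_id:      # stop when end-of-sequence is predicted
            break
        output.append(prev)
    return output

# Hypothetical, untrained parameters; symbol 2 plays the role of end-of-sequence.
EMB = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
ENC_WX, ENC_WH = [[0.6, -0.4], [0.3, 0.8]], [[0.5, 0.1], [-0.2, 0.7]]
DEC_WX, DEC_WH = [[0.2, 0.9], [-0.5, 0.4]], [[0.3, -0.6], [0.8, 0.1]]
W_OUT = [[1.0, -0.5], [-0.3, 0.9], [0.2, 0.2]]

ctx_short = encode_sequence([EMB[0]], ENC_WX, ENC_WH, 2)
ctx_long = encode_sequence([EMB[0], EMB[1], EMB[0], EMB[1]], ENC_WX, ENC_WH, 2)
translation = decode(ctx_long, DEC_WX, DEC_WH, W_OUT, EMB, eos_id=2)
print(len(ctx_short), len(ctx_long), translation)
```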
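From characters alone, deep learning can handle both text classification and programs
• Text Understanding from Scratch, Zhang and LeCun, arXiv 2015.
  → Learns Chinese and English text classifiers from characters alone
• Learning to Execute, Zaremba and Sutskever, arXiv 2015.
  → "Learns" to execute Python programs using only RNNs and LSTMs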
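Working "from characters alone" means the input is a sequence of one-hot character vectors rather than words. A minimal sketch of that encoding step follows. The alphabet, the fixed length of 16, and the name `quantize` are assumptions for illustration and do not reproduce the settings of the paper; the convolutional classifier that would consume this matrix is omitted.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,!?"  # assumed alphabet

def quantize(text, length=16):
    """Encode text as a fixed-length list of one-hot character vectors.

    Characters outside the alphabet become all-zero rows; short texts are
    padded with all-zero rows and long texts are truncated.
    """
    index = {ch: i for i, ch in enumerate(ALPHABET)}
    rows = []
    for ch in text.lower()[:length]:
        row = [0] * len(ALPHABET)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    rows += [[0] * len(ALPHABET) for _ in range(length - len(rows))]
    return rows

matrix = quantize("Hello, NLP#")
print(len(matrix), len(matrix[0]))
```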
Naturally integrating multimodal input and output with deep learning
• Generating captions from images alone
  http://deeplearning.cs.toronto.edu/i2t
  http://googleresearch.blogspot.jp/2014/11/a-picture-is-worth-thousand-coherent.html
Today's agenda
• The impact of deep learning on natural language processing
• New developments in natural language processing
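Successes of natural language processing
• Discriminative models
  - Prepare a tagged corpus and apply supervised learning
  - Morphological analysis, named entity recognition, syntactic parsing, etc.
• Optimization
  - Formulate tasks as ranking or combinatorial optimization
  - Web search, machine translation, document summarization, etc.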
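Worldwide research and development of component technologies for multilingual processing
• Shared tasks of CoNLL (Conference on Computational Natural Language Learning), held every year
  - 2012: multilingual discourse analysis
  - 2009: multilingual syntactic and semantic analysis
  - 2006, 2007: multilingual syntactic parsing
• Apply the same algorithm to multiple languages and pursue language-independent analysis methods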
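Multilingual processing tools in Java (model licenses for commercial use must be negotiated)
• Stanford CoreNLP (Java)
  - Morphological analysis, named entity recognition, syntactic parsing, and discourse analysis tools for English, Spanish, and Chinese
• Apache OpenNLP (Java)
  - Supports Danish, German, English, Spanish, Dutch, Portuguese, and Swedish
• LingPipe (Java)
  - Models for English (part-of-speech tagging, named entity extraction) and Chinese (word segmentation)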
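Tag specifications and corpora for multilingual morphological analysis
• A Universal Part-of-Speech Tagset, Petrov et al., LREC 2012.
  - 22 languages: English, Chinese, Japanese, Korean, etc.
  - For research and development on multilingual and cross-lingual parsing, the first goal is to assign parts of speech consistently
  - For Japanese, word segmentation conforms to the word units of the Balanced Corpus of Contemporary Written Japanese (BCCWJ)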
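In practice, a universal tagset is applied by mapping each language-specific tag to a coarse tag. The sketch below maps a few Penn Treebank tags to the 12 universal categories (NOUN, VERB, ADJ, ADV, PRON, DET, ADP, NUM, CONJ, PRT, ".", X). The table is partial and written from memory, so it should be checked against the published mapping files before use. Sending unlisted tags to X is a simplification made here, not part of the published mapping.

```python
# Partial Penn Treebank -> universal tag table (illustrative, not exhaustive).
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN", "NNPS": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB", "VBN": "VERB",
    "VBP": "VERB", "VBZ": "VERB", "MD": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
    "PRP": "PRON", "PRP$": "PRON",
    "DT": "DET",
    "IN": "ADP",
    "CD": "NUM",
    "CC": "CONJ",
    "RP": "PRT",
    ".": ".", ",": ".", ":": ".",
}

def to_universal(tagged_tokens):
    """Map (word, PTB tag) pairs to (word, universal tag) pairs.

    Tags missing from the partial table fall back to "X" (assumption).
    """
    return [(word, PTB_TO_UNIVERSAL.get(tag, "X")) for word, tag in tagged_tokens]

sentence = [("The", "DT"), ("cats", "NNS"), ("sleep", "VBP"), (".", ".")]
print(to_universal(sentence))
```

With tags normalized this way, a parser trained on one language can consume part-of-speech features from another, which is the motivation given on the slide.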
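Tag specifications and corpora for multilingual dependency parsing
• Universal Dependency Annotation for Multilingual Parsing, McDonald et al., ACL 2013.
  - German, English, Swedish, Spanish, French, Korean, etc.
  - A draft proposal for Japanese Universal Dependencies, Kanayama et al., Annual Meeting of the Association for Natural Language Processing, 2015.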
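The component technologies of natural language processing have reached maturity

Analysis pipeline (top to bottom):
Component technology                                 Accuracy
Morphological analysis (word segmentation)           99%
Syntactic parsing (dependency)                       90%
Semantic analysis (predicate-argument structure)     60%
Discourse analysis (relations beyond the sentence)   30%

• Measured as whole-sentence accuracy, the figure is about 50%
• Accuracy gains for individual component technologies have plateaued
  ① Performance needs to be evaluated in the context of applications
  ② Aspects other than accuracy need to be emphasized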
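One way to read the gap between 90% per-dependency accuracy and roughly 50% whole-sentence accuracy is a back-of-the-envelope calculation. It assumes that errors on individual dependencies are independent (a simplification) and that a sentence has about 7 dependencies (a hypothetical figure chosen for illustration, not taken from the slides).

```python
def sentence_accuracy(unit_accuracy, units_per_sentence):
    """Probability that every unit in a sentence is correct, assuming independent errors."""
    return unit_accuracy ** units_per_sentence

estimate = sentence_accuracy(0.9, 7)
print(round(estimate, 3))  # 0.478
```

Under these assumptions even a component that is 90% accurate per decision gets fewer than half of the sentences entirely right, and errors compound further when later pipeline stages depend on earlier ones.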
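English language analysis: from newspaper articles to web text
• Workshop on Syntactic Analysis of Non-Canonical Language (SANCL 2012)
• Google English Web Treebank (2012)
  - Web text (blogs, newsgroups, email, reviews, Q&A) annotated with morphological and syntactic (dependency) information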
Beyond web text, toward analysis of harder user-generated text
• Tweet NLP (English only) http://www.ark.cs.cmu.edu/TweetNLP/
  - Twokenizer: morphological analysis
  - Tweeboparser: dependency parsing
  - Tweebank: Twitter corpus
  - Twitter Word Clusters: word clusters
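From grammatically correct text written by native speakers to text written by language learners
• Since around 2011, shared tasks on grammatical error correction of English learners' essays have been held almost every year
  - Helping Our Own (HOO) 2011, 2012
  - CoNLL 2013, 2014
• Many English learner corpora have also been released
  - NUS Corpus of Learner English
  - Lang-8 Learner Corpora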
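From named entity recognition and word sense disambiguation to entity linking
• Named entity recognition
  - Identify where named entities occur
• Entity linking
  - Disambiguate what each named entity refers to
  - Wikify (Wikification)
Example: "Prime Minister Abe admitted a factual misunderstanding and expressed regret."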
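The difference between recognizing a mention and linking it can be sketched with a toy linker. It assumes the mention has already been found by a named entity recognizer and picks the knowledge-base candidate whose keywords overlap most with the sentence. The knowledge base `KB`, its entries, and the keyword sets are hypothetical. Practical systems typically draw on much richer evidence, such as link statistics and article text from Wikipedia.

```python
# Hypothetical miniature knowledge base: mention string -> candidate entities.
KB = {
    "Abe": [
        {"entity": "Shinzo_Abe", "keywords": {"prime", "minister", "government"}},
        {"entity": "Hiroshi_Abe", "keywords": {"actor", "film", "drama"}},
    ],
}

def link(mention, sentence):
    """Return the candidate entity whose keywords best match the sentence context."""
    context = set(sentence.lower().replace(".", " ").split())
    candidates = KB.get(mention, [])
    if not candidates:
        return None  # mention not covered by the knowledge base
    best = max(candidates, key=lambda c: len(c["keywords"] & context))
    return best["entity"]

print(link("Abe", "Prime Minister Abe admitted a factual misunderstanding and expressed regret."))
```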
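Today's summary
• The impact of deep learning on natural language processing
  - End-to-end processing, from syntactic parsing through semantic analysis
  - Multimodal processing (images, speech, language)
  - Text generation looks likely to spread explosively in the near future
• New developments in natural language processing
  - Investigation and analysis of language-independent methods
  - The search for robust analysis methods
  - Old-yet-new problem settings brought about by the emergence of the web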