uchi_k
September 06, 2020
Applying hetero graphs to extractive document summarization: Heterogeneous Graph Neural Networks for Extractive Document Summarization
A read-through of Heterogeneous Graph Neural Networks for Extractive Document Summarization, a paper accepted at ACL 2020.
Transcript
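Heterogeneous Graph Neural Networks for Extractive Document Summarization
About me
uchi_k (@__uchi_k__)
• Representative, yuni, inc.
• Organizer, nlpaper.challenge
• Freelance Machine Learning Engineer / Researcher
• Formerly: Kyoto University Graduate School of Informatics, MITOU 16, Machine Learning Engineer at FreakOut
nlpaper.challenge
A community of working professionals, students, and researchers who take on all kinds of natural language processing activities (run mainly by volunteers).
Aiming to cover ACL comprehensively, we set up categories following the official ACL classification, split into teams, and surveyed each one. We read the papers and held discussions, lightning-talk (LT) sessions, and the like.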
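ACL2020
• Papers on generation and on graphs seem to have increased considerably.
• Almost every paper mentions pretrained language models such as BERT and RoBERTa.
• Driven by attention to reproducibility and to real-world applications, evaluation metrics are being reconsidered.
• The best paper proposed defining something like test cases for NLP tasks and checking the pass rate.
• There is a return to knowledge graphs, with more papers doing computation, structure building, and learning on graphs.
These are my personal impressions.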
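Heterogeneous Graph Neural Networks for Extractive Document Summarization
Danqing Wang (Shanghai Key Laboratory of Intelligent Information Processing, Fudan University) et al., ACL 2020
#abstract
In short: to represent the relations between sentences in extractive document summarization, the paper introduces a heterogeneous graph, achieves SoTA, and also examines extensibility.
• Modeling the relations between sentences is very important in document summarization. Previously these were modeled as sequences with RNN-based methods.
• Recent research has shown that a graph structure suits the semantic structure of a document better than a sequence, but a good graph structure had not yet been proposed.
• The paper proposes a hetero graph structure with word nodes and sentence nodes, achieves SoTA on both single-document and multi-document summarization, and discusses extensibility.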
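#abstract #extractive document summarization
Extractive document summarization
• The task of taking relevant text out of the source document and reassembling it as a summary.
• Besides the extractive type, there is the generative type, which abstracts the content and writes the summary from scratch, and hybrids of the two.
SummaRuNNer (early work)
• Each sentence of the document is vectorized with a Bidirectional LSTM, which produces a vector capturing the meaning of the sentence (word layer).
• The relations among these vectors are then learned with a Bidirectional LSTM (sentence layer).
• The output is the probability of extracting each sentence.
• The proposed approach defines a hetero graph that represents the relations between sentences via words.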
#abstract #heterogeneous graph
Heterogeneous Graph
• Many real-world graphs are heterogeneous.
• Real-world graphs consist of nodes and edges of various types that live in different feature spaces.
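#model overview
Proposed Graph
• Rather than building a graph with only sentences as nodes, add nodes that act as intermediaries connecting the sentences.
• Define a hetero graph that represents the relations between sentences via words.
• Benefits include that word nodes can be updated with sentence information, and that the graph is extensible, for example by adding other node types.
• This paper takes the word as the smallest semantic unit. It would also be interesting to abstract further and use, for example, the semantic concepts of words as a node type.
• Procedure: graph initialize → update with GAT → solve a classification problem, from the sentence features, of whether or not to add each sentence to the summary.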
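#model overview #learning step
Proposed Graph: training procedure and model overview
• In the graph initializer, CNNs with different kernel sizes are applied to each sentence to extract n-gram features (local features), and then a BiLSTM extracts sentence-level features (global features).
• TF-IDF is used as the edge feature, carrying information about the relation between a word node and a sentence node.
• The graph features are updated with a Graph Attention Network.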
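The word-sentence edges with TF-IDF features can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes whitespace tokenization, a tiny hypothetical stopword list, each sentence treated as one "document" for the IDF term, and a smoothed IDF of the form log((1 + N) / (1 + df)) + 1. The function name `build_word_sentence_edges` is made up for this sketch, and the slides do not say which TF-IDF variant the paper uses.

```python
import math
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "in", "and", "is"}

def build_word_sentence_edges(sentences):
    """Return {(word, sentence_index): tfidf} for a word-sentence bipartite graph."""
    tokenized = [[w for w in s.lower().split() if w not in STOPWORDS]
                 for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences does each word appear?
    df = Counter(w for toks in tokenized for w in set(toks))
    edges = {}
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        for w, count in tf.items():
            idf = math.log((1 + n) / (1 + df[w])) + 1.0  # smoothed IDF (assumption)
            edges[(w, i)] = (count / len(toks)) * idf
    return edges

edges = build_word_sentence_edges([
    "graph neural networks",
    "graph attention",
    "summarization of documents",
])
# Sentences 0 and 1 are connected through the shared word node "graph".
print(sorted(i for (w, i) in edges if w == "graph"))
```

Two sentences are related in this graph only when they share at least one word node, which is what removes the all-pairs connections of a fully connected sentence graph.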
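#model overview #graph attention network
Graph Attention Network
• Defines attention on a graph.
• Attention is computed from the weighted vectors of the node itself and of its neighbors, and is used when aggregating from the neighboring nodes.
• Equation labels: node features; adjacent nodes; the function that computes attention; aggregation that takes attention into account.
• Roughly: the distance relation used for graph aggregation is defined as an attention that does not depend on the graph structure and is obtained by learning.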
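A single-head attention update in the style of a Graph Attention Network can be sketched in plain Python as follows. This is a simplified sketch under stated assumptions: the weight matrices `W_q`, `W_k`, `W_v` and the attention vector `a` are hypothetical parameters passed in by hand, and the sketch leaves out multi-head attention, the output nonlinearity, residual connections, and the TF-IDF edge features used in the paper.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_update(h_i, neighbors, W_q, W_k, W_v, a):
    """Return (attention weights, aggregated vector) for node i.

    h_i: feature vector of the node being updated
    neighbors: feature vectors of its adjacent nodes
    """
    q = matvec(W_q, h_i)
    scores = []
    for h_j in neighbors:
        k = matvec(W_k, h_j)
        # a^T [W_q h_i ; W_k h_j]; q + k concatenates the two lists
        z = sum(ai * v for ai, v in zip(a, q + k))
        scores.append(leaky_relu(z))
    # Softmax over the neighbors (shifted by the max for numerical stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the transformed neighbor features
    values = [matvec(W_v, h_j) for h_j in neighbors]
    agg = [sum(al * v[d] for al, v in zip(alphas, values))
           for d in range(len(values[0]))]
    return alphas, agg

identity = [[1.0, 0.0], [0.0, 1.0]]
alphas, agg = gat_update([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]],
                         identity, identity, identity, [0.0, 0.0, 1.0, 0.0])
print(alphas, agg)
```

In the heterogeneous setting the same update is applied in alternation: sentence nodes aggregate from their word neighbors, then word nodes aggregate from their sentence neighbors.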
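#dataset #train test split
Dataset
Experiments use two datasets for single-document summarization and one for multi-document summarization.
CNN/DailyMail (QA data)
• The most widely used benchmark dataset for single-document summarization
• Split into train / valid / test
NYT50
• A single-document summarization dataset collected from the New York Times Annotated Corpus (Sandhaus)
• Split into train / valid / test
Multi-News
• A multi-document summarization dataset
• Each example has several source documents and a human-written summary
• Split into train / valid / test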
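#experiment #setting #hyper-parameter #preprocessing
Setting / Hyper-parameters
Preprocessing
• Remove stopwords and punctuation
• Cap the maximum length of an input document at a fixed number of sentences
Graph
• Remove the words with the lowest TF-IDF values
• Limit the vocabulary size
• Embed words with GloVe
• Initialize sentence vectors at a fixed size
• Initialize edge features at a fixed dimensionality
• Multiple attention heads
Experiment settings
• Fixed batch size and learning rate, Adam optimizer
• Early stopping if the loss does not decrease for a set number of epochs
• Select the top-scoring sentences, with separate cutoffs for single-document and multi-document summarization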
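#methods #extractor
Methods
• Ext-BiLSTM
◦ CNN + BiLSTM
◦ Treats the document as a sequence of sentences and learns the relations between sentences
• Ext-Transformer
◦ Transformer
◦ Learns pairwise interactions between all sentences
◦ Can be viewed as a fully connected sentence-level graph
• HSG (HeterSumGraph)
◦ The proposed method. Models sentence-word-sentence relations as a graph
◦ HSG selects summary sentences by node classification. A variant that additionally applies trigram blocking, which excludes sentences whose trigrams resemble those already chosen in order to reduce redundancy, is also tested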
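Trigram blocking at selection time can be sketched as below. This is a minimal sketch assuming whitespace tokenization and exact trigram matches; the function names and the example scores are hypothetical, and the paper's own selection code may differ in detail.

```python
def trigrams(sentence):
    """Set of word trigrams in a sentence (whitespace tokenization)."""
    toks = sentence.lower().split()
    return {tuple(toks[i:i + 3]) for i in range(len(toks) - 2)}

def select_with_trigram_blocking(sentences, scores, k):
    """Pick up to k sentences by descending score, skipping any sentence
    that shares a trigram with a sentence already selected."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, seen = [], set()
    for i in order:
        if len(chosen) >= k:
            break
        tri = trigrams(sentences[i])
        if tri & seen:
            continue  # overlaps an already chosen sentence: blocked
        chosen.append(i)
        seen |= tri
    return sorted(chosen)  # restore document order

sentences = [
    "the cat sat on the mat",
    "the cat sat quietly today",
    "dogs bark loudly at night",
]
print(select_with_trigram_blocking(sentences, [0.9, 0.8, 0.5], k=2))
```

The second sentence is skipped because it repeats the trigram "the cat sat", so the lower-scoring third sentence is taken instead.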
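#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Single-document summarization results on CNN/DailyMail. The proposed method scores above every existing method.
• LEAD is the baseline and ORACLE is the upper bound (from the labels).
• The table groups rows into previous studies and the proposed method.
• HER, which formulates the task as a contextual bandit, was compared both with and without its policy, and the proposed method wins in both cases.
• The proposed method scores higher than all existing methods (that do not use BERT).
• Evaluation uses ROUGE-1, ROUGE-2, and ROUGE-L: the rates of matching 1-grams and 2-grams, and a similarity score based on the longest common subsequence.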
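The three ROUGE variants used for evaluation can be illustrated with a simplified recall-only sketch. It assumes whitespace tokenization and no stemming, and it is not the official ROUGE scorer, which also reports precision and F-measure; the function names are made up for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[g]) for g, count in ref.items())
    return overlap / total

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    c, r = candidate.split(), reference.split()
    return lcs_length(c, r) / len(r) if r else 0.0

cand = "the cat sat on the mat"
ref = "the cat lay on the mat"
print(rouge_n_recall(cand, ref, 1), rouge_n_recall(cand, ref, 2), rouge_l_recall(cand, ref))
```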
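#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Comparison with methods that use a sentence sequence or a fully connected graph shows the usefulness of the hetero graph structure.
• The proposed method scores higher than Ext-BiLSTM and Ext-Transformer, which use a sentence sequence and a fully connected graph respectively.
• Using a hetero graph effectively removes unnecessary connections between sentences.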
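#result #NYT50
Result (single-document summarization: NYT50)
Single-document summarization results on NYT50. The trend is basically the same as on CNN/DailyMail: the proposed method outperforms the existing methods.
• Why is the variant with trigram blocking not ranked first?
→ CNN/DailyMail summaries concatenate bullet points with little overlap, whereas NYT50 summaries contain overlap, such as key phrases appearing several times. Trigram blocking therefore probably makes it hard to score well on NYT50.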
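#ablation #CNN/DailyMail
Ablation
An ablation on CNN/DailyMail examines the contribution of each module.
• Removing word filtering lowers R-1 and R-L and raises R-2.
• Word filtering lets the model focus on especially important word nodes, and this benefit probably outweighs the drawback of losing bigram information.
• Removing the residual connections between GAT layers lowers the scores substantially.
• The residual connection in GAT matters in principle for aggregating from nodes of a different type in a hetero graph, so it cannot be replaced by simple concatenation.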
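#result #multidocument
Result (multi-document summarization)
For multi-document summarization, a version of the proposed method with added document nodes (HDSG) is also evaluated.
• Both HSG and HDSG score above the existing methods, and the gain is especially large for HDSG.
• This suggests that adding document nodes is effective for multi-document summarization.
• Trigram blocking is not effective here, probably for the same reason as before.
• The proposed method can be applied to another task just by adding a node type, so it has strong potential for extension.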
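#qualitative analysis #degree
Qualitative Analysis
Investigates the effect of the degree of word nodes.
• A high word-node degree means the word occurs often, so the degree reflects (to some extent) the redundancy of the document.
• Word-node degree and ROUGE are proportional → the more redundant a document, the easier it is to summarize.
• With a high degree, information from multiple sentences can be aggregated, so such documents are thought to benefit more strongly from the model.
• This suggests that word nodes are what aggregate sentence information and form a global representation.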
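The degree of a word node is the number of sentence nodes it is linked to, and the average over all word nodes gives a rough redundancy measure for a document. A minimal sketch, assuming whitespace tokenization and no stopword or TF-IDF filtering (the function name is hypothetical):

```python
from collections import defaultdict

def average_word_degree(sentences):
    """Mean number of distinct sentences each word node is linked to."""
    linked = defaultdict(set)
    for i, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            linked[word].add(i)
    if not linked:
        return 0.0
    return sum(len(ids) for ids in linked.values()) / len(linked)

# A repetitive document has a higher average degree than a varied one.
print(average_word_degree(["a b", "a b"]))
print(average_word_degree(["a b", "c d"]))
```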
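#qualitative analysis #source
Qualitative Analysis
Investigates the effect of the number of source documents in multi-document summarization.
• As the number of documents grows, the baseline improves while the proposed method declines, until the two draw level.
• The performance gap between HETERSUMGRAPH and HETERDOCSUMGRAPH widens as the number of documents grows. The more complex the relations between documents become, the greater the benefit of document nodes.
• The First baseline can forcibly extract text from every document and so secure coverage.
• As the number of documents grows, it becomes harder to extract a limited number of sentences that cover the main points of the whole text.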
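#key points
Summary
• Using a hetero graph introduces fine-grained semantic units into document summarization, and the effectiveness of modeling the relations between sentences and passages was confirmed.
• The method is highly extensible: it moves from single-document to multi-document summarization just by adding a node type.
• It might be interesting to try techniques specialized for hetero graphs (defining subgraphs with metapaths, attention designed for hetero graphs).
• The authors say they want to explore pretrained models such as BERT in future work.
• As the authors briefly mention, the method's advantage would probably be put to better use if the part corresponding to word nodes were abstracted up to semantic nodes. Even without that, many kinds of additional node types look worth trying.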