NLP2025 Participation Report (NLP2025参加報告)
Yano
April 11, 2025
These are the slides used for a lightning talk (LT) at this NLP retrospective event (https://moneyforward.connpass.com/event/344276/).
Transcript
NLP Attendance Report
April 11
Nagoya University, […] Lab, D1, Chihiro Yano
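Self-introduction
Chihiro Yano
• Background: Nagoya University, […] Lab (master's program) → PKSHA (machine learning engineer) → Nagoya University, […] Lab (doctoral program)
• Interests: meaning, embedding representations
  • Lately, text embeddings have been fun
  • Please try our retrieval-specialized model: pkshatech/GLuCoSE-base-ja-v2
• X: yano_c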
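What is a text embedding model?
• A model that encodes natural-language sentences or passages into a representation a computer can work with (generally a vector)
• Measuring the similarity between vectors lets us measure the similarity between texts

Expected behavior on a QA task (figure):
• Query: "What is the tallest mountain in Japan?"
• Candidate: "Where will the next NLP be held?" → model → similarity: low
• Candidate: "Mt. Fuji is Japan's highest peak and a free-standing mountain. It straddles Yamanashi and Shizuoka prefectures." → model → similarity: high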
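The similarity comparison on this slide can be sketched with cosine similarity, one common choice for comparing embedding vectors. This is a minimal sketch only: the three-dimensional vectors below are made-up values, not outputs of any real embedding model, and the names `query`, `fuji`, and `venue` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (real models output hundreds of dimensions).
query = [0.9, 0.1, 0.2]  # "What is the tallest mountain in Japan?"
candidates = {
    "fuji": [0.8, 0.2, 0.1],   # passage about Mt. Fuji
    "venue": [0.1, 0.9, 0.3],  # question about the next conference venue
}

# Score every candidate against the query and sort best-first.
scores = {name: cosine_similarity(query, vec) for name, vec in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In a real QA or retrieval setup the vectors would come from an embedding model, and the highest-scoring passage would be returned as the answer candidate.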
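This year's NLP
• The "embedding representations" theme has disappeared from the program 😢
• Are presentations related to embeddings decreasing?
• Presentations with "embedding" (埋め込み) in the title: 17/499 → 26/777 😊
  • (As a proportion, almost constant)
• Presentations with "text" or "sentence" + "embedding" in the title: 6 → 6 😊
• Presentations with "text embedding" in the title: 0 → 5
※ These statistics are based on titles only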
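The "almost constant as a proportion" remark can be checked with a few lines of arithmetic. This is a small sketch: the counts are the ones on the slide, while the labels `before` and `after` and the helper `count_titles` are illustrative names, and the three sample titles are shortened stand-ins rather than the real program.

```python
def count_titles(titles, keyword):
    """Count titles that contain the keyword as a substring."""
    return sum(1 for title in titles if keyword in title)


# Counts from the slide: (titles containing "埋め込み", total presentations).
counts = {"before": (17, 499), "after": (26, 777)}
shares = {label: hits / total for label, (hits, total) in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")

# Illustrative use of the counting helper on a tiny made-up title list.
sample_titles = [
    "埋め込み表現の内在次元を測る",
    "訓練不要な条件付きテキスト埋め込み",
    "大規模言語モデルの評価",
]
print(count_titles(sample_titles, "埋め込み"))
```

Both shares come out at roughly 3.4% and 3.3%, which matches the slide's note that the proportion barely changed even though the absolute count grew.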
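List of presentations with "embedding" in the title
• Modeling the temporal transition of embedding point sets with Gaussian processes
• Analysis of the intra- and inter-lingual consistency of independent components of embedding representations
• Granularity analysis of verb meaning using embedding vectors, and co-occurrence relations
• Domain adaptation of embedding representations based on k-nearest-neighbor examples and its application to retrieval
• Measuring the intrinsic dimension of embedding representations
• Estimating honkadori (allusive variation) based on waka embeddings
• Ruri: a general-purpose text embedding model specialized for Japanese
• Acquiring static word vectors that are effective for sentence embeddings
• Style control for speech synthesis using LLM embeddings of dialogue history
• Training-free conditional text embeddings
• Differences in redundancy across tasks in prompt-based text embeddings
• Creating speaker embeddings using back-translation, toward translating dialogue in novels
• At what granularity can the axes of independent component analysis on word embeddings be interpreted?
• Search query embeddings for query understanding based on user behavior logs
• Detecting Japanese sarcasm on X (formerly Twitter) using data augmentation based on text embedding representations
• Verifying the effect of applying predictive control to text reconstruction from text embeddings
• From literary criticism to large language models: an attempt at interpreting literary texts by recombining word embeddings
• Simulating customer behavior in physical stores using LLM embeddings and transition probability prediction
• Analysis of word embedding representations across the layers of pretrained multilingual models using independent component analysis
• A study of the effect of orthographic variation on sentence embedding models
• Verifying document-image text embeddings of Large Vision Language Models
• Proposal of a topic analysis method combining document embeddings and clustering
• Making LLM pretraining more efficient and improving its properties: reuse by freezing the parameters of the embedding and output layers
• Improving extraction accuracy on long texts in embedding-model-based unsupervised keyphrase extraction
• "Collapse" in text generation with diffusion models and the effect of time-step embeddings
• Building an embedding model that predicts the final stage of a conversation, for adaptive dialogue systems
The targets are varied, too 👀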
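Tendencies of the presentations
• I turned the abstracts into a wordcloud
  • Embeddings of words and of text both seem common
  • Studies whose main focus is analysis also seem common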
͍͔ͭ͘հ • ϞσϧΛ܇࿅͍ͯ͠Δจ • Ruri: ຊޠʹಛԽͨ͠൚༻ςΩετຒΊࠐΈϞσϧ [௩ӽΒ] • ΠϯετϥΫγϣϯͱෳλεΫΛར༻ͨ͠ຊޠ͚ࢄදݱϞσϧͷߏ ங
[উຢΒ] • Ϣʔβߦಈϩάʹجͮ͘ΫΤϦཧղͷͨΊͷݕࡧΫΤϦຒΊࠐΈ [Β] • ຒΊࠐΈΛੳ͍ͯ͠Δจ • ಠཱੳʹΑΔࣄલֶशࡁΈଟݴޠϞσϧͷΛԣஅͨ͠୯ޠຒΊࠐΈ දݱͷੳ [Β] • ϓϩϯϓτʹجͮ͘ςΩετຒΊࠐΈͷλεΫʹΑΔੑͷҧ͍ [௩ӽΒ] 7 ˞˞ͱͯओ؍ͰબΜͰ͍·͢˞˞ଞʹ͜Μͳ͓Ζ͍จ͋ͬͨΑʂͳͲͷίϝϯτܴͰ͢ʂ
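Paper introduction (model building): Ruri: a general-purpose text embedding model specialized for Japanese
• Development and release of Ruri, a general-purpose Japanese text embedding model
  • Multiple model sizes (small, base, large)
  • The intermediate models Ruri-PT and Ruri-Reranker are also released
• Preparation of training datasets
  • Created synthetic datasets
  • Arranged multiple public datasets into a single common format
  • Nearly everything is gathered here, which is much appreciated
[Diagram (rough image): Japanese BERT, Ruri-PT, Ruri-Reranker, and Ruri, connected by contrastive pre-training, fine-tuning, and distillation]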
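Paper introduction (model building): Building a distributed representation model for Japanese using instructions and multiple tasks
• Analyzes how training on data from multiple tasks and languages affects results on JMTEB (a comprehensive evaluation benchmark for text embedding models)
  • Training on mixed Japanese and English data gives higher performance than training on Japanese alone
  • Which training tasks help depends on the evaluation task
• The models trained on the mixed Japanese-English data are released
  • retrieva-jp/amber-base, retrieva-jp/amber-large
[Figure, quoted from the paper: tasks removed from training vs. change in performance on each evaluation task. Example: training on NLI raises STS performance and lowers clustering performance]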
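Paper introduction (model building): Search query embeddings for query understanding based on user behavior logs
• Compared with the natural sentences that ordinary text embeddings target (e.g., "What is next week's weather in […]saki?"), search queries are short and lack context
• Proposes UBIQUE, a method that uses user behavior logs to extract query pairs with similar intent and uses them for training
  • Click logs: queries for which the same URL in the search results was clicked
  • Session logs: queries entered within a fixed time window in the same session
  • → Queries with the same intent are extracted regardless of their surface form
• Builds a model that is especially robust to variation in notation
[Figure quoted from the paper]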
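The two pair-extraction criteria described above can be sketched as follows. This is an illustrative reading of the slide, not UBIQUE's actual implementation: the log formats, the function names `pairs_from_clicks` and `pairs_from_sessions`, the 60-second window, and the sample queries and URLs are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def pairs_from_clicks(click_log):
    """Pair up distinct queries that led to a click on the same URL.

    click_log: iterable of (query, clicked_url) tuples (assumed format).
    """
    queries_by_url = defaultdict(set)
    for query, url in click_log:
        queries_by_url[url].add(query)
    pairs = set()
    for queries in queries_by_url.values():
        pairs.update(combinations(sorted(queries), 2))
    return pairs


def pairs_from_sessions(session_log, window_seconds):
    """Pair up distinct queries issued in one session within a time window.

    session_log: iterable of (session_id, timestamp_seconds, query) tuples
    (assumed format).
    """
    events_by_session = defaultdict(list)
    for session_id, timestamp, query in session_log:
        events_by_session[session_id].append((timestamp, query))
    pairs = set()
    for events in events_by_session.values():
        events.sort()  # chronological order, so t2 >= t1 below
        for (t1, q1), (t2, q2) in combinations(events, 2):
            if q1 != q2 and t2 - t1 <= window_seconds:
                pairs.add(tuple(sorted((q1, q2))))
    return pairs


# Made-up logs for illustration only.
click_log = [
    ("fuji height", "https://example.com/fuji"),
    ("how tall is mt fuji", "https://example.com/fuji"),
    ("tokyo weather", "https://example.com/weather"),
]
session_log = [
    ("s1", 0, "mt fuji"),
    ("s1", 20, "mount fuji height"),
    ("s1", 900, "ramen near me"),
    ("s2", 5, "mt fuji"),
]

click_pairs = pairs_from_clicks(click_log)
session_pairs = pairs_from_sessions(session_log, window_seconds=60)
print(click_pairs)
print(session_pairs)
```

Pairs collected this way could then serve as positive examples for contrastive training of a query embedding model. Because the pairing depends only on behavior and not on string overlap, differently worded queries with the same intent end up together.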
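Paper introduction (analysis): Differences in redundancy across tasks in prompt-based text embeddings
• Prompt-based text embeddings:
  • Embeddings are produced by attaching a task-specific instruction to the input
• Shows that these embeddings have task-dependent redundancy
[Figures quoted from the paper]
• Classification tasks: no performance degradation even when 4096 dimensions are reduced to 16
• Retrieval tasks: little degradation down to about 512 dimensions; beyond that, degradation is pronounced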
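Paper introduction (analysis): Analysis of word embedding representations across the layers of pretrained multilingual models using independent component analysis
• Uses independent component analysis (ICA) to decompose the features held by each layer of a multilingual model into individual axes
• Shows that the closer a layer is to the output, the more the axes are organized by meaning
  • Layer 1: axes follow the writing system
  • Layers 7 to 12: axes follow meaning
[Figure quoted from the paper]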
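Something more like a participation report
• I ended up only introducing papers, so…
• What I felt at my third NLP
  • The hot topics in the domestic research community become clear, which is educational
    • For example, on model analysis I was completely out of touch, like Urashima Taro
  • I want to stop looking only at posters
    • I often just headed to the poster hall by default, then regretted it after getting home: there were interesting oral presentations too…
  • My learning broadens, and there are more familiar faces each time, which makes it fun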
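In closing
• Please tell me about the NLP papers you recommend 🥺
• I will be in Tokyo for a while, so feel free to invite me to anything!
• (Please also let me know about any interesting internships)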