What I Learned from the CommonLit Competition
nogawanogawa
October 12, 2023
Transcript
©2023 Wantedly, Inc.
What I Learned from the CommonLit Competition
Improving Transformer Performance with a Custom Header
みんなのPython勉強会 #98 (Python study group meetup), Oct. 12, 2023 - @nogawanogawa

What is the CommonLit competition?
CommonLit - Evaluate Student Summaries

CommonLit - Evaluate Student Summaries
A competition held on Kaggle (a data science competition platform)
• Period: 7/13 ~ 10/12 (three months; it ended this morning!)
• Participants: 2106 teams
• My final rank was 25th
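
CommonLit - Evaluate Student Summaries
• The theme is summaries of passages written by students from the third year of elementary school through the third year of high school
• The quality of each student's summary is scored on two criteria: content (what is described) and wording (vocabulary and grammar)
◦ Whoever builds the machine learning model whose scores are closest to the actual human scores wins
[Figure: a student summarizes a passage; a teacher scores the summary for wording and content; a machine learning model predicts wording and content. Participants were building this model, aiming to get as close as possible to the human scores.]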
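
The slides do not name the metric used to measure "closeness" to the human scores. As a minimal sketch, assuming a mean column-wise RMSE over the two targets (an assumption here, as are the made-up scores), the comparison could look like this:

```python
import numpy as np


def mcrmse(y_true, y_pred):
    """Mean column-wise RMSE: one RMSE per target column, then averaged."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    per_column_rmse = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return float(np.mean(per_column_rmse))


# Made-up scores for illustration: columns are (content, wording),
# one row per student summary.
human = [[0.5, -0.2], [1.0, 0.3], [-0.8, -1.1]]
model = [[0.4, 0.0], [0.7, 0.5], [-0.6, -1.0]]
print(mcrmse(human, model))
```

A lower value means the model's scores are closer to the human scores; a perfect match gives 0.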

Transformer and Custom Header
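
Basic approach using a Transformer
• Transformers are commonly used in competitions that deal with natural-language text
• Currently the mainstream neural network architecture
• Its structure is as shown in the figure on the right (figure not included in the transcript)
◦ ChatGPT is also a relative of this under the hood
• Actually using one is surprisingly easy
◦ With HuggingFace's Transformers library, a few dozen lines of code are enough to get going
◦ People around the world publish the data for their pretrained models, so you just download and use them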
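
Transformers
• If you only want to use a model, it can be written in a few lines
[Code figure with annotations: choose the type of model you want to use; download the pretrained model. Not hard!]
Even if you add code to fine-tune the model for the specific problem you want to solve, it is only a few dozen more lines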
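
The code shown on this slide is not in the transcript. As a rough illustration of the "few lines" claim, here is a minimal sketch using the `AutoTokenizer` and `AutoModel` classes from the Transformers library. The function name `embed_texts` and the default checkpoint `bert-base-uncased` are choices made for this example, not taken from the talk.

```python
def embed_texts(texts, model_name="bert-base-uncased"):
    """Return per-token outputs and the attention mask for a list of strings."""
    # Imported lazily so that defining this function does not need the libraries.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Choose the model type; the pretrained weights download on first use.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)

    # last_hidden_state has shape (batch, sequence_length, hidden_size).
    return outputs.last_hidden_state, batch["attention_mask"]
```

Calling `embed_texts(["A short student summary."])` would download the checkpoint and return one vector per token, which is the input to the pooling step discussed later in the deck.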
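
Ways to improve accuracy
• Preprocessing
◦ Data is often not very clean (much of it acts as noise), so remove the noisy parts
• Changing the pretrained model
◦ There are many kinds of pretrained models to choose from (Bert, Roberta, Deberta, etc…)
• Augmenting the training data
◦ Models trained on more, and more diverse, data tend to be stronger
• Ensembling
◦ A majority vote over multiple machine learning models
• Reworking the Transformer structure itself
◦ Custom Header (Pooler) (this is what I want to talk about today!)
• etc…
Transformers is easy to use and gives you access to high-accuracy models. However, in competitions such as Kaggle, using it is only the starting line; climbing the rankings takes further ingenuity from there.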
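
The ensembling item above is described as a majority vote. For continuous scores such as content and wording, a common analogue is to average the models' predictions. The sketch below shows that substitution; the function name, weights, and numbers are all illustrative assumptions rather than the author's setup.

```python
import numpy as np


def average_ensemble(predictions, weights=None):
    """Blend predictions shaped (n_models, n_samples, n_targets) into one array."""
    preds = np.asarray(predictions, dtype=float)
    # With weights=None this is a plain mean over the model axis.
    return np.average(preds, axis=0, weights=weights)


# Two hypothetical models, one summary, two targets (content, wording).
model_a = [[0.0, 1.0]]
model_b = [[1.0, 3.0]]
print(average_ensemble([model_a, model_b]))
print(average_ensemble([model_a, model_b], weights=[3, 1]))
```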
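
Improving accuracy with a Custom Header (Pooler)
• By default, a Transformer makes its prediction from the final output of the first input token (the CLS token)
Only the first token's output is used; the outputs of the other tokens are thrown away
→ Intuitively, this feels very wasteful
(Using the outputs for the whole text seems likely to give more expressive power)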
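
To make the default behaviour concrete, here is a minimal sketch of CLS pooling on a toy array. NumPy stands in for the model's tensor library so that the example is self-contained, and the values are made up.

```python
import numpy as np


def cls_pooling(last_hidden_state):
    """Keep only the first token's vector; every other token is ignored."""
    return last_hidden_state[:, 0, :]


# Toy "model output": batch of 2 texts, 3 tokens each, hidden size 4.
hidden = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
pooled = cls_pooling(hidden)
print(pooled.shape)
```

The result has one vector per text, and the vectors at token positions 1 and 2 never influence it, which is the waste the slide points out.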
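
Improving accuracy with a Custom Header (Pooler)
• Custom Header (Pooler): pool over the outputs for the whole text and feed the result into the final layer
• Pooling examples: mean, max, concatenating the outputs of the last 4 layers, etc…
Reference: https://www.kaggle.com/code/rhtsingh/utilizing-transformer-representations-efficiently
It looks difficult, but it can be written in surprisingly few lines
Example: Average pooling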
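
The average pooling code from this slide did not survive in the transcript. The sketch below shows mean and max pooling over real (non-padding) tokens on toy values. It uses NumPy for self-containedness, on the assumption that an actual model head would apply the same operations to the framework's tensors; the function names are illustrative.

```python
import numpy as np


def mean_pooling(last_hidden_state, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    mask = attention_mask[:, :, None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


def max_pooling(last_hidden_state, attention_mask):
    """Element-wise max over tokens; padding is set to -inf so it never wins."""
    mask = attention_mask[:, :, None].astype(bool)
    masked = np.where(mask, last_hidden_state, -np.inf)
    return masked.max(axis=1)


# One text with two real tokens and one padding token (hidden size 2).
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pooling(hidden, mask))
print(max_pooling(hidden, mask))
```

Without the mask, the padding vector would dominate both results, which is why the attention mask is part of the pooling step.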
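
Improving accuracy with an original Custom Header
• The method that worked best of the ones I tried this time
• Average-pool only over the portion of the input that corresponds to the summary under evaluation
• Interpretation
◦ The goal this time is to evaluate the quality of the students' summaries
◦ Using only the part being evaluated reduces the noise reaching the final layer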
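
The author's implementation is not shown in the transcript. One way to sketch the idea is to multiply the attention mask by a second mask that marks the span to keep. In the example below, the name `span_mask`, the way it would be built (for instance from token offsets), and the toy token layout are all assumptions.

```python
import numpy as np


def span_mean_pooling(last_hidden_state, attention_mask, span_mask):
    """Average only the tokens that are both real and inside the chosen span."""
    mask = (attention_mask * span_mask)[:, :, None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


# Hypothetical layout: [CLS], two summary tokens, [SEP], one other-text token.
hidden = np.array([[[9.0, 9.0], [1.0, 2.0], [3.0, 4.0], [9.0, 9.0], [50.0, 60.0]]])
attention_mask = np.array([[1, 1, 1, 1, 1]])
span_mask = np.array([[0, 1, 1, 0, 0]])  # 1 only on the span being evaluated
print(span_mean_pooling(hidden, attention_mask, span_mask))
```

The other tokens still take part in the Transformer's attention, so they can inform the span's vectors; they are only excluded from the final average.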
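
Summary: What I learned from the CommonLit competition
• Transformer models can be used without much struggle and perform well. But there is a lot of depth to them.
• With Transformers, you can use high-performance pretrained models built by those who came before
◦ Just using them is not that hard
• However, in recent NLP competitions such as those on Kaggle, using a Transformer is only the starting line
◦ As a data scientist, you want to go one step further
• One approach to further improving Transformer performance: Custom Header (Pooler)
◦ Adjusting the final pooling so that the Transformer can solve the task more easily sometimes works well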