Upgrade to Pro — share decks privately, control downloads, hide ads and more …

資源として見る実験プログラム

 資源として見る実験プログラム

言語処理学会第29回年次大会 併設ワークショップ JLR2023 『日本語言語資源の構築と利用性の向上』での口頭発表「資源として見る実験プログラム」の資料版スライドです。

URL: https://jedworkshop.github.io/JLR2023/program

Hayato Tsukagoshi

March 20, 2023
Tweet

More Decks by Hayato Tsukagoshi

Other Decks in Research

Transcript

  1. •໊લ: ௩ӽ ॣ / TSUKAGOSHI, Hayato •ॴଐ: ໊େ ෢ా࡫໺ݚ M2

    ݚڀ: •ఆٛจΛ༻͍ͨจຒΊࠐΈߏ੒๏
 (NLP 2021, ACL-IJCNLP 2021, ࣗવݴޠॲཧ Vol. 30) •ࣗવݴޠਪ࿦ͱ࠶ݱثΛ༻͍ͨSplit and Rephrase
 ʹ͓͚Δੜ੒จͷ඼࣭޲্ (NLP 2022) •ҟͳΔڭࢣ৴߸͔Βߏஙͨ͠จϕΫτϧͷൺֱ
 ͱ౷߹ (म࿦, *SEM 2022) •(ڞஶ) Ψ΢ε෼෍ʹجͮ͘จදݱੜ੒ (NLP2023) •ֶৼಛผݚڀһ(DC1)࠾༻಺ఆɾത࢜՝ఔਐֶ༧ఆ ࣗݾ঺հ 3 ϓϩϑΟʔϧαΠτ: https://hpprc.dev/ @γΞτϧ
  2. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ

    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 5
  3. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ

    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 6
  4. •࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ • ʮԿ΋͔΋μϝͩͬͨʯ •ؒҧ࣮ͬͨݧ݁Ռ͔Β͸ؒҧͬͨߟ࡯͔͠ੜ·Εͳ͍ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 12

    ద੾ͳ
 Ծઆ ؒҧͬͨ
 ࣮૷ͱධՁ ؒҧͬͨ
 ߟ࡯ ਖ਼͍͠ʮ࣮૷ͱධՁʯ͸
 ख໭ΓΛݮΒ͠ݚڀΛਝ଎Խ͢Δ ʮ࣮૷ͱධՁʯͷ
 best practice΋
 ࢿݯͱͯ͠ॏཁ
  5. •ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py

    (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) ࣮૷ͱධՁͷޡΓΛ๷͙ 16 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ
  6. •ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py

    (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ) ࣮૷ͱධՁͷޡΓΛ๷͙ 17 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ
  7. •ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py

    (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ) ࣮૷ͱධՁͷޡΓΛ๷͙ 18 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ ʰϦʔμϒϧίʔυʱ
 ʰGoogleͷιϑτ΢ΣΞΤϯδχΞϦϯάʱ
 Λಡ΋͏ʂ
  8. •Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 20
  9. •HuggingFaceͷTransformers͕୆಄🤗 •Ͱ͖Δ͜ͱ͸େ͖͘෼͚ͯ3ͭ • ஶ໊ͳਂ૚ֶशϞσϧɾΞʔΩςΫνϟ࣮૷ͷར༻ • ࣄલ܇࿅ࡁΈϞσϧύϥϝʔλͷڞ༗ɾμ΢ϯϩʔυ • ࣄલఆٛɾࣄલ܇࿅͞ΕͨϞσϧΛ༻ֶ͍ͨशɾਪ࿦ͷ؆ུԽ •PyTorchɾTensorFlowɾJax /

    FlaxʹରԠ •NLPઐ໳ϥΠϒϥϦ͕ͩͬͨɺը૾ɾԻ੠ܥͷϞσϧ΋ೖ͖͍ͬͯͯΔ • ը૾෼໺ͷಉ༷ͷϥΠϒϥϦtimm΋࠷ۙ HuggingFace ؅ཧԼʹ ਂ૚ֶश༻ϥΠϒϥϦ: Transformers 22
  10. •ۙ೥ͷਂ૚ֶशͰ͸΄ͱΜͲͷίʔυ͕PythonͰهड़͞ΕΔ •͔͠͠ɺPython͸ͦΕ΄Ͳ଎͍ݴޠͰ͸ͳ͍ • Global Interpreter Lock (GIL) ͷଘࡏʹΑͬͯฒྻॲཧ͕໘౗ • ͦ΋ͦ΋શମతʹͳΜͱͳ͘஗͍

    •ʮPyTorch΋C++Ͱهड़͞ΕͯΔ͠C++Λ࢖͑͹ʁʯ • ΋ɺ΋͏ͪΐͬͱϞμϯͳݴޠΛ࢖͍͍ͨؾ͕࣋ͪ… •࠷ۙʹͳͬͯ Rust ͕༷ʑͳ৔ॴʹಋೖ͞Ε͍ͯΔ • PythonͱҟͳΓ੩తܕ෇͚ɾίϯύΠϧ͞ΕͯػցޠΛੜ੒ ϓϩάϥϛϯάݴޠͷมભ 26 Skipped
  11. •ϑϩϯτΤϯυ։ൃ͔ΒγεςϜϓϩάϥϛϯά·Ͱ༷ʑͳ৔ॴͰར༻ •ण໋ (lifetime)΍ॴ༗ݖͱ͍ͬͨ֓೦ͷಋೖʹΑΓthread safe & null safe ࣮ࡍʹRustΛར༻͍ͯ͠ΔϥΠϒϥϦ •huggingface/tokenizers •

    transformers ಺Ͱར༻͞Ε͍ͯΔτʔΫφΠβ༻ϥΠϒϥϦ •google-research/deduplicate-text-datasets • େྔͷςΩετ͔ΒॏෳΛ࡟আ (ֶशޮ཰ͷվળ) •Rust+ػցֶशͷ·ͱΊ: vaaaaanquish/Awesome-Rust-MachineLearning ϓϩάϥϛϯάݴޠͷมભ: Rustͷಋೖ 27 Skipped
  12. •ෳ਺ͷ࣮ݧઃఆΛࢼ͢৔߹͸
 ίϚϯυϥΠϯҾ਺ͷར༻͕ศར •͔͠͠argparse͸ܕ͕෇͔ͳ͍ Typed Argument Parser (Tap) •PythonͷdataclassͷΑ͏ʹ
 ίϚϯυϥΠϯύʔαΛهड़Մೳ •

    αϒίϚϯυͷఆٛ΍ܧঝ΋ •ద੾ͳܕ෇͚ʹΑͬͯิ׬ਫ਼౓޲্ •ଐੑ໊ͷtypo΍ܕͷؒҧ͍͕ܹݮ ิ׬ͷޮ͘argparseͷ୅ସ: Tap 30 https://github.com/swansonk14/typed-argument-parser
  13. A100: VRAM 80GB •T5-3B (30ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ •ຊൃදͷ޻෉ΛೖΕΕ͹΋ͬͱେ͖ͳϞσϧ΋܇࿅Ͱ͖Δ͸ͣ A6000:

    VRAM 48GB •BERT-large (3.3ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ GTX2080 ti: VRAM 11GB •BERT-base (1.1ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ GPUͱ܇࿅ՄೳͳϞσϧαΠζͷഽײ 33 ໔੹ࣄ߲: ೖྗܥྻ௕΍ͦͷଞ͞·͟·ͳཁҼʹӨڹΛड͚Δײ֮஋ͳͷͰ͝ঝ஌͓͖͍ͩ͘͞
  14. •ਂ૚ֶशϞσϧͷύϥϝʔλ͸௨ৗ 32 bit ͷ ුಈখ਺఺਺ Ͱදݱ • ࣮͸ͦΜͳʹࡉ͔͘਺஋Λදݱ͠ͳͯ͘΋ྑ͍ •਺஋දݱͷ bit

    ਺ΛݮΒͤΔͱলϝϞϦɾ௿ܭࢉίετʹͳ͓ͬͯಘ • 16 bitͰ਺஋Λදݱͨ͠ͷ͕൒ਫ਼౓ුಈখ਺఺਺ •16 bitͷ࢖͍ํʹΑ༷ͬͯʑͳ࢓༷͕ଘࡏ • FP16: traditionalͳ൒ਫ਼౓ුಈখ਺఺਺ • BF16 (b fl oat16): Google͕ఏҊɺA100ͳͲ࠷ۙͷGPU΍TPUͰར༻Մೳ • B͸BrainͷBΒ͍͠ ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16 34 https://cloud.google.com/tpu/docs/b fl oat16?hl=ja
  15. •࣮༻తʹ͸ɺAutomatic Mixed Precision (AMP) ͕༗༻ • FP16 / BF16 ͩͱϚζ͍෦෼͸ࣗಈతʹFP32ʹͯ͘͠ΕΔ

    •AMP͸PyTorchͳΒࣗಈతʹ࣮ߦͯ͘͠ΕΔΠϯλϑΣʔε͕ଘࡏ • AMP & BF16ͷར༻͸ҎԼͷΑ͏ʹ࣮ݱՄೳ • ͜ΕͱGradScalerͱ͍͏ػߏΛ࢖͏ඞཁ͕͋Δ ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16 36 ը૾͸ https://carbon.now.sh/ Ͱੜ੒ forwardΛwithͷதͰ࣮ߦ
  16. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 38 ଛࣦ u2 u3 u4

    u5 u1 u2 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4 ͷޯ഑ܭࢉʹͦΕҎલͷ৘ใ͕ඞཁ u3 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.
  17. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 39 ଛࣦ u2 u3 u4

    u5 u1 u2 u3 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻ https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.
  18. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 40 ଛࣦ u2 u3 u4

    u5 u1 u2 u3 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻ https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays. ॱ఻೻ͷܭࢉ݁ՌΛ͢΂ͯهԱ
  19. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 41 ଛࣦ u2 u3 u4

    u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.
  20. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 42 ଛࣦ u2 u3 u4

    u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.
  21. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 43 ଛࣦ u2 u3 u4

    u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint ܭࢉάϥϑͷҰ෦ͷ݁ՌͷΈอଘ u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.
  22. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 44 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks

    into memory. Backprop and systolic arrays. ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint ඞཁͳ෼Λܭࢉ͠௚͠ u4
  23. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 45 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks

    into memory. Backprop and systolic arrays. ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint هԱ͢Δܭࢉ݁Ռ͕গͳ͍ʂ u4
  24. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ

    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 49
  25. •ࣄલֶशࡁΈϞσϧΛऔΓר͘؀ڥ͸ۃΊͯٸ଎ʹ੔උ͞Ε͍ͯΔ • BERTΛ༻͍ͨߴ඼࣭ͳςϯϓϨʔτ͸΄ͱΜͲଘࡏ͠ͳ͍ • ಛʹ࠷৽ͷPython, PyTorch, TransformersʹରԠͰ͖͍ͯͳ͍ •ࣗવݴޠॲཧͷॳֶऀʹͱͬͯ͸͍ۤ͠ঢ়گ • ʮݚڀ΍࣮ݧΛͲͷΑ͏ʹ։࢝ͨ͠ΒΑ͍͔Θ͔Βͳ͍ʯ

    • ʮΑ͍ઃܭɺ࣮ݧ؅ཧΛͲͷΑ͏ʹߦ͑͹ྑ͍͔Θ͔Βͳ͍ʯ •ʮϞμϯͰߴ඼࣭ͳ࣮ݧϓϩάϥϜʯͷݟຊ͕ඞཁ • ࣗ෼ͳΓͷ࣮૷ํ਑ɾࢦ਑Λදݱͨ͠Θ͔Γ΍͍࣮͢૷͕͋Δͱ༗༻ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ 53 https://github.com/hppRC/bert-classi fi cation-tutorial
  26. •ςΩετ෼ྨͷೖ໳ͱͯ͠༗໊ͳʮϥΠϒυΞχϡʔείʔύεʯ͕୊ࡐ •BERTΛ fi ne-tuning͢ΔྲྀΕΛग़དྷΔ͚ͩγϯϓϧʹ࣮૷ •࣮૷: hppRC/bert-classi fi cation-tutorial ߩݙ •Python

    3.10, PyTorch 1.13, Transformers 4.25 Ҏ্ʹରԠ •Type Hintsͷ׆༻ͱݟ௨͠ͷྑ͍ઃܭ •ʮσʔλ४උʯ → ʮ܇࿅ & ධՁʯ ͱ͍͏୯ํ޲తͳ࣮ݧϓϩηεͷ࣮ྫ •࣮ݧςϯϓϨʔτͱͯͦ͠ͷଞͷλεΫ΁ͷస༻͕༰қͳ࣮૷ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ 55 https://github.com/hppRC/bert-classi fi cation-tutorial
  27. •࣮ݧ݁Ռͱ࣮ͯ͠ݧઃఆɾධՁࢦඪɾֶशϩάΛอଘ͢Δ •࣮ݧ݁Ռ͸࣮ݧઃఆɾ೔෇͝ͱʹσΟϨΫτϦΛ੾Δ • ྫ: outputs/[Ϟσϧ໊]/[೥݄೔]/[࣌෼ඵ] • ಉ࣮͡ݧઃఆͰ΋݁Ռ্͕ॻ͖͞ΕͨΓ͠ͳ͍ •࣮ݧ݁Ռͷूܭ͸ 1. ධՁࢦඪϑΝΠϧ

    (metrics.json) Λ࠶ؼతʹऩू 2. ࣮ݧઃఆϑΝΠϧ (con fi g.json) ΛಡΈࠐΈ 3. PandasͰ࣮ݧઃఆͱධՁࢦඪͷσʔλϑϨʔϜΛ࡞੒ 4. ࣮ݧઃఆ͝ͱʹgroupby͢ΔͳͲ͓޷ΈͰ ࣮ݧ݁Ռͷอଘํ๏: betterͳ࣮ݧ؅ཧΛ໨ࢦͯ͠ 62 ͱʹ͔͘
 σΟϨΫτϦΛ੾Δ
  28. ࣗવݴޠਪ࿦ (Natural Language Inference; NLI) •จϖΞ (લఏจɾԾઆจ) ʹϥϕϧ (ؚҙɾໃ६ɾதཱ) ͕෇༩

    •จϖΞͷҙຯؔ܎Λ༧ଌ͢ΔλεΫ NLIσʔληοτ 64 લఏจ Ծઆจ ϥϕϧ A man playing an electric guitar on stage. A man playing guitar on stage. ؚҙ A man playing an electric guitar on stage. A man playing banjo on the fl oor. ໃ६ A man playing an electric guitar on stage. A man is performing for cash. தཱ
  29. •ӳޠͷNLIσʔληοτ͸͔ͳΓ੔උ͞Ε͍ͯΔ • Stanford NLI (SNLI; Bowman et al., 2015): ໿57ສจϖΞ

    • Multi-Genre NLI (MNLI; Williams et al., 2018): ໿41ສจϖΞ •೔ຊޠͷNLIσʔληοτ͸ӳޠͱൺֱͯ͠ݶఆత • JSNLI (٢ӽΒ, 2020): Stanford NLIσʔληοτΛ೔ຊޠʹػց຋༁ • JNLI (܀ݪΒ, 2022): JGLUE ʹಉࠝɺΩϟϓγϣϯΛར༻ • JaNLI (୩தΒ, 2021): ݴޠֶత஌ݟʹجͮ͘ఢରతσʔληοτ ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁ 65
  30. 0. OpenAIͷAPI KeyΛൃߦ 1. pip install openai 2. promptΛઃܭ 3.

    ࣙॻํʹpromptΛೖΕͯJSONͱͯ͠APIʹ౤͛Δ (উखʹ΍ͬͯ͘ΕΔ) ChatGPTʹΑΔ຋༁ͷखॱ 67
  31. ຋༁ର৅ •Stanford NLI: ໿57ສจϖΞ •Multi-Genre NLI: ໿41ສจϖΞ ຋༁ख๏ •ϥϕϧͷҙຯؔ܎Λյ͞ͳ͍Α͏ʹ຋༁͢ΔΑ͏promptͰࢦࣔ •OpenAIͷChatGPT

    API (gpt-3.5-turbo, $0.002/1K tokens) Λར༻ •໿100ສจϖΞͷ຋༁ʹ5ສԁఔ౓ (DeepLͷAPIͩͱ17ສԁҎ্) ੒Ռ෺ •೔ӳର༁෇͖ͷ೔ຊޠNLIσʔληοτ ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁ 68 ൃදऀ஫: promptΛ؆ૉʹ͢ΔͳͲͰ΋ͬͱ҆ՁʹͰ͖ͦ͏Ͱ͢
  32. •6-shot learningΛ࣮ࢪ • NLIσʔληοτͷ೔ӳର༁
 ͱͯ͠JSICK͔ΒࣄྫΛഈआ •গ਺ࣄྫͷޙʹ຋༁͍ͨ͠ࣄྫΛ౤ೖ •Batch Prompting (Cheng et

    al., 2023)
 ΋ར༻ ମײ •zero-shotΑΓfew-shotͷ΄͏͕
 ֨ஈʹ຋༁඼࣭͕ߴ͍ •promptΤϯδχΞϦϯάͰ͞Βʹ
 ຋༁඼࣭Λ޲্ͤ͞ΒΕͦ͏ ࣮ࡍͷPrompt 70
  33. •ରরֶश(Contrastive Learning)Λ༻͍ͯࣄલֶशࡁΈϞσϧΛ fi ne-tuning • Unsupervised SimCSE:ʮಉ͡จΛ2ճຒΊࠐΜͰରরֶशʯ • Supervised SimCSE:

    ʮؚҙؔ܎ʹ͋ΔจΛਖ਼ྫͱͯ͠ରরֶशʯ Gao+: SimCSE: Simple Contrastive Learning of Sentence Embeddings, EMNLP ’21 SimCSE: ରরֶशʹجͮ͘จຒΊࠐΈख๏ 73 ਤ͸࿦จΑΓҾ༻ɻҎલ࣮ࢪͨ͠SimCSEͷྠߨࢿྉ͸ͪ͜Β
  34. •ެ࣮ࣜ૷͸ଟ༷ͳந৅Խ͕ࢪ͞Ε͍ͯͯॳֶऀʹ͸௥͍ͮΒ͍ • จຒΊࠐΈͷݚڀΛଅਐ͍ͨ͠ •ग़དྷΔ͚ͩந৅ԽΛݮΒͨ͠γϯϓϧͳ࠶ݱ࣮૷Λެ։ • hppRC/simple-simcse •γϯϓϧͳ PyTorch + transformers

    ͷߏ੒ɾશମͰ250ߦ • + ࿦จ΁ͷ֘౰Օॴ΁ͷݴٴɾࢲݟΛؚΉίϝϯτ107ߦ • σʔλͷલॲཧΛআ͘ •࠶ݱ࣮૷Λ༻͍ͨ࠶ݱ࣮ݧ΋࣮ࢪ • ࿦จͷϋΠύϥͰ4Ϟσϧɾ50ཚ਺γʔυͰ࣮ݧ (=200ճ) SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE 75
  35. •ӳޠͷࣄલֶशࡁΈจຒΊࠐΈϞσϧ͸ଟ਺ଘࡏ •ҰํͰ೔ຊޠจຒΊࠐΈϞσϧͷܾఆ൛͸ଘࡏ͠ͳ͍ • ิ଍: ࠷ۙ PKSHA͔ࣾΒ೔ຊޠSimCSEϞσϧ ͕ެ։ • ೔ຊޠจຒΊࠐΈք۾΋੝Γ্͕Γͭͭ͋ΔΧϞ…ʂ •೔ຊޠจຒΊࠐΈϞσϧͷแׅతͳධՁ͕ଘࡏ͠ͳ͍

    •೔ຊޠจຒΊࠐΈϞσϧͷߏஙͱแׅతͳධՁΛ࣮ࢪ • ۙ೥ͷจຒΊࠐΈख๏ͱͯ͠୅දతͳSimCSEΛϕʔεʹ • ڭࢣ͋Γɾڭࢣͳ͠ͷ྆ํΛ࣮ݧ • ෳ਺ͷσʔληοτɾϋΠύϥͰ࣮ݧ WIP: ೔ຊޠSimCSEϞσϧͷߏங 79
  36. ܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI,

    MNLI) WIP: ೔ຊޠSimCSEϞσϧͷߏங 80 WikipediaܥΛ2ͭ WebܥΛ2ͭ
  37. ܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI,

    MNLI) ࣮ݧઃఆ: •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ) •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5} •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ WIP: ೔ຊޠSimCSEϞσϧͷߏங 81 WikipediaܥΛ2ͭ WebܥΛ2ͭ
  38. ܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI,

    MNLI) ࣮ݧઃఆ: •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ) •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5} •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ ݱঢ়ͷ࣮ݧ݁Ռ •ڭࢣ͋Γ/ͳ͠ڞʹૣҴాେRoBERTa-large͕࠷ߴੑೳ •Studio Ousia ೔ຊޠLUKE-largeͱXLM-RoBERTa-large͕͍࣍Ͱߴੑೳ WIP: ೔ຊޠSimCSEϞσϧͷߏங 82 ݱࡏ·Ͱʹ… ڭࢣͳ͠: 1559ճ ڭࢣ͋Γ: 3172ճ
  39. •ۙ೔ެ։༧ఆ • σʔληοτલॲཧ༻ͷϓϩάϥϜ • ࣮ݧίʔυ (ֶशɾධՁ) • ࣮ݧ݁Ռ (ϋΠύϥ୳ࡧ࣌ͷ݁Ռ΋) •

    ࣄલ܇࿅ࡁΈϞσϧ •ݱࡏ࣮ݧɾධՁத… WIP: ೔ຊޠSimCSEϞσϧͷߏங 83 BCCWJͷXMLΛ
 Ϛϧνϓϩηεʹલॲཧͯ͠ ςΩετϑΝΠϧʹ
 ม׵͢ΔϓϩάϥϜͳͲ ஶ໊ͳ೔ຊޠσʔληοτ
 લॲཧ༻ͷϓϩάϥϜηοτͱͯ͠΋