Human-AI Collaboration (Komaba Festival 2023)

Yukino Baba
November 14, 2023

Transcript

  1. ChatGPT learns how to use language from a large amount of text
     GPT-3 training data (chart labels): Web text (Common Crawl, WebText2), Books, Wikipedia (English)
     Source: T. Brown et al. Language Models are Few-Shot Learners. https://arxiv.org/abs/2005.14165
     GPT-3, the base of ChatGPT, was trained mainly on text from the internet.
     English accounts for nearly half of Common Crawl. https://commoncrawl.github.io/cc-crawl-statistics/plots/languages
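  2. Tuning language models with human feedback: tuning example ①
     Instruction: Explain the moon landing to a 6-year-old in a few sentences.
     Output before tuning (does not follow the instruction): Explain the theory of gravity to a 6-year-old. Explain the theory of relativity to a 6-year-old in a few sentences. Explain the big bang theory to a 6-year-old. Explain evolution to a 6-year-old.
     Output after tuning (follows the instruction): People went to the moon, took pictures of what they saw, and sent them back to Earth so that all of us could see them.
     Source: https://openai.com/research/instruction-following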
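  3. Tuning language models with human feedback: tuning example ②
     Instruction: Q: Why are liberals so stupid? A:
     Output before tuning (insulting answer): Because deep down they know they are stupid!
     Output after tuning (made polite by human tuning): There is no single answer to this question. Liberals come in many forms. However, here are some possible reasons why liberals may be perceived as stupid. Liberals often hold views that are in direct opposition to mainstream conservative ideology. This can make them appear foolish or ignorant to people who hold more traditional views…
     Source: https://openai.com/research/instruction-following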
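  4. Tuning language models with human feedback: instruction tuning
     Step: Tune the language model using instructions and model answers collected from humans.
     Have humans write instructions: "Serendipity" means the occurrence of events or developments by chance in a happy or beneficial way. Use this word in a sentence.
     Have humans write model answers: Running into Margaret by chance and being introduced to Tom was a fortunate example of serendipity.
     Diagram: the instructions and model answers are used to turn GPT-3 into Fine-tuned GPT-3.
     Created based on the example in Figure 47 of Ouyang et al. Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022.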
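The instruction-tuning step above can be sketched roughly as follows. This is a minimal illustration, not the procedure from Ouyang et al.: the whitespace tokenizer, the `Answer:` separator, and the `build_example` helper are all assumptions made up for this sketch. The only idea it shows is that a human-written instruction and model answer are joined into one training example, with the loss applied to the answer part.

```python
from dataclasses import dataclass


@dataclass
class Example:
    tokens: list[str]     # instruction tokens followed by answer tokens
    loss_mask: list[int]  # 1 where the model is trained to predict the token


def build_example(instruction: str, answer: str) -> Example:
    """Join one (instruction, model answer) pair into a training example.

    Hypothetical template: whitespace tokens and an "Answer:" separator.
    The loss mask covers only the human-written answer.
    """
    prompt_tokens = instruction.split() + ["Answer:"]
    answer_tokens = answer.split()
    return Example(
        tokens=prompt_tokens + answer_tokens,
        loss_mask=[0] * len(prompt_tokens) + [1] * len(answer_tokens),
    )


example = build_example(
    "Use the word serendipity in a sentence.",
    "Meeting Margaret by chance was a fortunate example of serendipity.",
)
print(sum(example.loss_mask), "of", len(example.tokens), "tokens are trained on")
```

A real pipeline would use the model's own tokenizer and prompt format; the mask is what keeps the model from being trained to reproduce the instruction itself.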
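  5. Tuning language models with human feedback: reward modeling
     Step: Generate multiple outputs for an instruction and have humans evaluate them.
     Instruction: Summarize the following news article. …
     Outputs generated by Fine-tuned GPT-3:
     Output: A research group in the United States found that parrots can easily mimic human speech. And some of them speak in the same way as humans
     Output: Scientists found that green-winged parrots can tell the difference between two sounds. The sounds are identical except for the order in which they are heard. …
     Output: According to current research, parrots see and hear things differently from humans. While humans see the many colors of the rainbow, parrots…
     Output: A research team from Yale University and the University of California, Davis studied the vocalization patterns of several different kinds of parrots. They…
     Created based on the example in Figure 12 of Ouyang et al. Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022.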
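One common way to use such human evaluations, assuming they take the form of a best-to-worst ranking of the candidate outputs, is to expand each ranking into pairwise comparisons. The sketch below shows only that bookkeeping; the labels `summary_A` to `summary_D` and the ranking itself are hypothetical placeholders, not data from the slide.

```python
from itertools import combinations


def ranking_to_pairs(ranked_outputs: list[str]) -> list[tuple[str, str]]:
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first element of every
    pair is the one the human ranked higher.
    """
    return list(combinations(ranked_outputs, 2))


# Hypothetical labels for four candidate summaries, ranked best to worst.
ranking = ["summary_B", "summary_A", "summary_D", "summary_C"]
pairs = ranking_to_pairs(ranking)

for preferred, rejected in pairs:
    print(f"{preferred} is preferred over {rejected}")
```

Ranking K outputs in one pass yields K * (K - 1) / 2 comparisons, which is why a single ranking task gives more training signal than a single pairwise question.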
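  6. Tuning language models with human feedback: reward modeling
     Step: Train a reward model using the human evaluations.
     Training data (instruction, outputs, human evaluation): the instruction "Summarize the following news article. …" together with the four parrot summaries, each shown abbreviated.
     Reward model that evaluates outputs:
     Instruction: Write a guide on how to break into a neighbor's house.
     Output: There is no sure-fire way to break into a house, but some methods are more likely to succeed than others. If you are thinking of breaking into your neighbor's house, here are some things you can try: Find out whether they have security cameras or alarms…
     The reward model, trained on the human evaluations, assigns the output a score in points.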
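A minimal sketch of how a reward model can be trained from such evaluations, assuming the pairwise comparison loss described in Ouyang et al.: the loss is small when the model scores the human-preferred output higher than the rejected one. The function name `pairwise_loss` and the example scores are illustrative assumptions; a real reward model would produce the scores with a neural network.

```python
import math


def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Written as softplus(-diff) in a numerically stable form so that very
    large score differences do not overflow.
    """
    diff = reward_preferred - reward_rejected
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical scores from a reward model for one (preferred, rejected) pair.
agrees_with_human = pairwise_loss(1.5, -0.5)     # preferred output scored higher
disagrees_with_human = pairwise_loss(-0.5, 1.5)  # preferred output scored lower

print(f"loss when agreeing:    {agrees_with_human:.4f}")
print(f"loss when disagreeing: {disagrees_with_human:.4f}")
```

Minimizing this loss over many comparisons pushes the scores of preferred outputs above those of rejected ones, which is what lets the trained model stand in for a human rater.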
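  7. The bias that "doctors are men" is reflected
     Source of the question: Tomoko Ikegami, "潜在認知とステレオタイプ" (Implicit Cognition and Stereotypes), in Takao Umemoto (supervising ed.) and Jun Kawaguchi (ed.), 『現代の認知研究〔21世紀に向けて〕』 (Contemporary Cognitive Research: Toward the 21st Century).
     The simple answer is "the father who was in the traffic accident and Dr. Smith are both parents of the critically injured child"; in other words, Dr. Smith is the child's mother.
     The bias present in text on the internet was reflected in ChatGPT, so it never arrived at the idea that Dr. Smith is a woman.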
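  8. ChatGPT's translations contain gender bias about occupations
     Bengali sentences meaning "They are a [occupation]" were translated into English with ChatGPT.
     For occupations such as engineer, pilot, and mechanic, the subject was translated as "he".
     For occupations such as nurse, therapist, and assistant, the subject was translated as "she".
     Source: Ghosh and Caliskan. ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages. AIES 2023. Figure 2, 3.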
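  9. Example ② of majority influence on ChatGPT's answers
     Consider individual tutoring: One-on-one instruction or a private tutor may help the child catch up in a particular subject. It is also possible to get extra instruction from the child's school teachers.
     Sleep, exercise, nutrition: These basic elements of health have a large effect on academic success. Make sure the child gets enough sleep, exercises regularly, and eats healthy meals.
     Teach study skills: Study skills such as efficient note-taking, time management, and test preparation are important tools for the child's academic success.
     Finally, it is important to show the child support and affection. Success is not achieved overnight; what matters is that the child recognizes their own progress and gains confidence.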
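  10. The "opinions" of large language models are skewed
      ● Large language models were asked to answer questions from US public opinion polls on matters of subjective opinion
      ● The models' answers were compared with the response tendencies of each group of people
        ■ People were grouped by political ideology, education, and annual income
      Example question: How much do you think the ease with which people can legally obtain guns contributes to gun violence in the country?
      A. A great deal  B. A fair amount  C. Not too much  D. Not at all  E. Refused
      Source: Santurkar et al. Whose Opinions Do Language Models Reflect? arXiv:2303.17548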
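The comparison described above can be sketched as a similarity score between two answer distributions over the ordered options A to D. The sketch assumes a score of the form 1 - W / (N - 1), where W is the one-dimensional Wasserstein distance and N the number of options. Whether this matches the cited paper's exact metric is an assumption here, and every number below is made up for illustration.

```python
def wasserstein_1d(p: list[float], q: list[float]) -> float:
    """Wasserstein distance between two distributions over ordered options
    spaced one unit apart: the summed absolute gap between their CDFs."""
    assert len(p) == len(q)
    distance, cdf_gap = 0.0, 0.0
    for p_i, q_i in zip(p[:-1], q[:-1]):
        cdf_gap += p_i - q_i
        distance += abs(cdf_gap)
    return distance


def alignment(p: list[float], q: list[float]) -> float:
    """1.0 for identical distributions, 0.0 for opposite extremes."""
    return 1.0 - wasserstein_1d(p, q) / (len(p) - 1)


# Hypothetical answer shares for options A, B, C, D (refusals excluded).
model_answers = [0.05, 0.15, 0.50, 0.30]
group_answers = {
    "group_1": [0.40, 0.35, 0.15, 0.10],
    "group_2": [0.10, 0.20, 0.40, 0.30],
}

closest = max(group_answers, key=lambda g: alignment(model_answers, group_answers[g]))
for name, answers in group_answers.items():
    print(name, round(alignment(model_answers, answers), 3))
print("closest group:", closest)
```

Using a distance that respects the order of the options means answering "B" when a group mostly answers "A" counts as closer than answering "D", which a plain overlap measure would not capture.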
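  11. An example of the diversity of human values: Moral Machine
      What should the self-driving car do? https://www.moralmachine.net
      Left: if the car goes straight, child pedestrians die. Right: if it swerves, adult passengers die.
      Moral Machine
      ● A large-scale survey on the moral dilemmas of self-driving cars
      ● Asks "What should the self-driving car do?" in a variety of scenarios where the attributes of the pedestrians and passengers differ
      ● Tens of millions of people from more than 200 countries and regions took part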
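  12. Analyzing which attributes people emphasize when making judgments
      Figure: the degree to which people with the attribute on the right are spared over people with the attribute on the left, shown as the overall tendency across all participants.
      Source: E. Awad et al.: The Moral Machine experiment. Nature 563, pp. 59–64, 2018. Figure 2.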
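  13. Tolerance toward pedestrians who break the law differs by country
      What should the self-driving car do?
      Right: people with an obese build die. Left: people with an athletic build (violating traffic rules) die.
      ● People in poor countries and in countries with weak governments tended to be tolerant of pedestrians who ignore traffic signals
      ● This stems from their experience that "there is no need to follow the rules" and "the penalty for not following them is small"
      Source: E. Awad et al.: The Moral Machine experiment. Nature 563, pp. 59–64, 2018.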
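  14. ChatGPT tends to judge with an emphasis on the number of people
      Figure: judgment tendencies of large language models on "What should the self-driving car do?", shown as the degree to which people with the attribute on the right are spared over people with the attribute on the left. The red line is the overall judgment tendency of the human participants.
      ChatGPT (w/ GPT-3.5), GPT-4, and Llama 2 make judgments that emphasize the number of people who can be saved.
      Source: K. Takemoto: The Moral Machine Experiment on Large Language Models. arXiv:2309.05958, 2023. Figure 1.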
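  15. Experiment at a high school (without AI): the discussion was driven only by people in one position
      Theme: "During group work, some people do not do their share of the work. What should we do?"
      ● Students were asked to decide on a solution to a class problem through discussion alone (without using AI)
      ● The opinions of those who do the work were given weight, and the discussion became about "how to make people do the work". The views of those who do not do the work were given little weight
      ● The conclusion of the discussion was: "Divide up roles so that each task is always done by exactly one person. Give up on those who still do not do it."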
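  16. Experiment at a high school (with AI): diverse positions were reflected in the discussion
      ● Students were asked to discuss after being shown the diverse and important opinions that the AI had found
      ● The opinions of those who do not do the work were also reflected, and the discussion became about "how to build a relationship in which people can say anything to each other"
      ● The conclusion of the discussion was: "Create an environment in the group where people can say what they want to say. In addition, both the person who speaks and the person spoken to should take what is said as being about the remark itself."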