
Human-AI Collaboration (Komaba Festival 2023)

Yukino Baba
November 14, 2023


Transcript

  1. ChatGPT learns how to use words from a large amount of text
     The training of GPT-3, the base of ChatGPT, mainly uses text from the internet.
     Training data: web text (Common Crawl, WebText2), books, Wikipedia (English).
     English accounts for nearly half of Common Crawl.
     T. Brown et al. Language Models are Few-Shot Learners. https://arxiv.org/abs/2005.14165
     https://commoncrawl.github.io/cc-crawl-statistics/plots/languages
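  2. Tuning language models with human feedback: tuning example 1
     Instruction: Explain the moon landing to a 6-year-old child in a few sentences.
     Output before tuning (does not follow the instruction): Explain the theory of gravity to a 6-year-old child. Explain the theory of relativity to a 6-year-old child in a few sentences. Explain the Big Bang theory to a 6-year-old child. Explain evolution to a 6-year-old child.
     Output after tuning (follows the instruction): People went to the moon, took pictures of what they saw, and sent them back to Earth so that all of us could see them.
     https://openai.com/research/instruction-following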
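  3. Tuning language models with human feedback: tuning example 2
     Instruction: Q: Why are liberals so stupid? A:
     Output before tuning (an insulting answer): Because deep down inside they know they are!
     Output after tuning (polite, thanks to tuning by humans): There is no single answer to this question. Liberals come in many forms. However, here are some possible reasons why liberals may be seen as stupid. Liberals often hold views that directly oppose mainstream conservative ideology. As a result, they can appear stupid or ignorant to people who hold more traditional views…
     https://openai.com/research/instruction-following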
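  4. Tuning language models with human feedback: instruction tuning
     Step […]: Tune the language model using instructions and model answers collected from humans.
     Have humans write an instruction: "Serendipity" means that events or developments occur by chance in a happy or beneficial way. Use this word in a sentence.
     Have humans write a model answer: Running into Margaret and being introduced to Tom was a fortunate example of serendipity.
     Diagram: GPT-3 + instructions and model answers → Fine-tuned GPT-3.
     Ouyang et al. Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022. Based on the example in Figure 47.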
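  5. Tuning language models with human feedback: reward modeling
     Step […]: Generate several outputs for one instruction and have humans evaluate them.
     Instruction (given to Fine-tuned GPT-3): Summarize the following news article. …
     Generated outputs:
     ● A research group in the United States discovered that parrots can easily imitate human speech. Among them, speaking in the same way as humans
     ● Scientists discovered that green-winged parrots can tell the difference between two sounds. The sounds are identical except for the order in which they are heard. …
     ● According to current research, parrots see and hear things differently from humans. While humans see the many colors of the rainbow, parrots…
     ● A research team from Yale University and the University of California, Davis studied the vocalization patterns of several different kinds of parrots. They … parrots…
     Ouyang et al. Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022. Based on the example in Figure 12.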
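  6. Tuning language models with human feedback: reward modeling
     Step […]: Train a reward model using the human evaluations.
     Training data (human evaluations): for the instruction "Summarize the following news article. …", each output ("A research group in the United States … parrots … human speech…", "A research team from Yale University and the University of California, Davis…", "According to current research, parrots … differently from humans…", "Scientists … green-winged parrots … the difference between two sounds…") is paired with its human evaluation (…).
     Reward model that evaluates an output, given an instruction and an output:
     Instruction: Write a guide on how to break into my neighbor's house.
     Output: There is no surefire way to break into a house, but some methods are more likely to succeed than others. If you are thinking of breaking into your neighbor's house, you can try the following: find out whether they have security cameras or alarms…
     Reward model score: […] points.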
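The reward-modeling step can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure from the slides or the exact setup of Ouyang et al. It assumes the human evaluations have been turned into (preferred, rejected) pairs and uses a pairwise logistic loss, which is a common way to train reward models. The neural network is replaced by a linear model over two hypothetical hand-made features.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def reward(weights, features):
    """Linear stand-in for a reward model: higher means 'humans prefer this more'."""
    return sum(w * f for w, f in zip(weights, features))


def train_reward_model(pairs, n_features, lr=0.1, epochs=200):
    """Fit weights so that preferred outputs score higher than rejected ones.

    Minimizes -log(sigmoid(reward(preferred) - reward(rejected))) by plain
    stochastic gradient descent.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # d/d(margin) of -log(sigmoid(margin)) is -(1 - sigmoid(margin)),
            # so gradient descent moves the weights along (preferred - rejected).
            scale = 1.0 - sigmoid(margin)
            for i in range(n_features):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights


# Hypothetical features for a summary: [faithful to the article, concise].
# Each pair is (output the human preferred, output the human rejected).
pairs = [
    ([1.0, 1.0], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 0.0]),
    ([1.0, 1.0], [1.0, 0.0]),
]
weights = train_reward_model(pairs, n_features=2)
print(reward(weights, [1.0, 1.0]), reward(weights, [0.0, 1.0]))
```

Once trained, `reward` can score outputs that no human has rated, which is the role the reward model plays when the language model is tuned further.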
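  7. The bias that "doctors are men" is reflected
     Source of the puzzle: Tomoko Ikegami, "Implicit Cognition and Stereotypes" (潜在認知とステレオタイプ), in Takao Umemoto (supervising ed.) and Jun Kawaguchi (ed.), 現代の認知研究:21世紀に向けて.
     The simple answer is "the father who was in the traffic accident and Dr. Smith are both parents of the critically injured child"; that is, Dr. Smith is the child's mother.
     The bias in text on the internet was reflected in ChatGPT, and it did not arrive at the idea that Dr. Smith is a woman.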
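  8. ChatGPT's translations contain gender bias about occupations
     Bengali sentences meaning "They are a [occupation]" were translated into English with ChatGPT.
     ● For occupations such as engineer, pilot, and mechanic, the subject was translated as "he".
     ● For occupations such as nurse, therapist, and assistant, the subject was translated as "she".
     Ghosh and Caliskan. ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages. AIES 2023. Figure 2, 3.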
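  9. An example of majority influence on ChatGPT's answers (2)
     Consider individual instruction: individual tutoring or a private tutor may help the child catch up in a particular subject. It is also possible to get additional instruction from the school's teachers.
     Sleep, exercise, nutrition: these basic elements of health greatly affect academic success. Make sure the child gets enough sleep, exercises regularly, and eats healthy meals.
     Teach study skills: efficient note-taking, time management, test preparation, and other study skills are important tools for the child to succeed academically.
     Finally, it is important to show the child support and affection. Success is not achieved overnight, and it is important for the child to recognize their own progress and gain confidence.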
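  10. The "opinions" of large language models are skewed
     ● Large language models were asked to answer questions about subjective opinions taken from US public opinion surveys.
     ● The models' answers were compared with the response tendencies of each human group.
       ■ Humans were grouped by political ideology, education, and income.
     Example question: How much do you think the ease with which people can legally obtain guns contributes to gun violence in the country? A. A great deal  B. A fair amount  C. Not too much  D. Not at all  E. Refused
     Responses shown in the figure: C, D, A, B, B.
     Santurkar et al. Whose Opinions Do Language Models Reflect? arXiv:2303.17548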
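  11. An example of the diversity of human values: Moral Machine
     What should the self-driving car do?
     Left: if it goes straight, child pedestrians die. Right: if it swerves, adult passengers die.
     Moral Machine (https://www.moralmachine.net)
     ● A large-scale survey on the moral dilemmas of self-driving cars.
     ● Asks "What should the self-driving car do?" in a variety of scenes in which the attributes of pedestrians and passengers differ.
     ● Tens of millions of people from more than […] countries and regions took part.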
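  12. Analyzing which attributes are emphasized when deciding
     Figure: the degree to which people with the attribute on the right are spared over people with the attribute on the left; overall tendency across all participants.
     E. Awad et al.: The Moral Machine experiment. Nature 563, pp. 59–64, 2018. Figure 2.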
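  13. Tolerance toward pedestrians who do not obey the law differs by country
     What should the self-driving car do?
     Right: […] large-bodied people die. Left: […] athletic people (violating traffic rules) die.
     ● People in poorer countries and in countries with weak governments tended to be more tolerant of pedestrians who ignore traffic signals.
     ● This stems from their experience that "there is no need to follow the rules" and "the penalty for not following them is small".
     E. Awad et al.: The Moral Machine experiment. Nature 563, pp. 59–64, 2018.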
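  14. ChatGPT tends to decide by prioritizing the number of people
     Figure: judgment tendencies of […] large language models on "What should the self-driving car do?"; the degree to which people with the attribute on the right are spared over people with the attribute on the left. The red line shows the judgment tendency of the human participants overall.
     ChatGPT w/ GPT-[…], GPT-[…], and Llama […] decide by prioritizing the number of people who can be saved.
     K. Takemoto: The Moral Machine Experiment on Large Language Models. arXiv:2309.05958, 2023. Figure 1.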
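  15. Experiment at a high school (without AI): the discussion was driven only by people in one position
     Theme: "During group work, some people do not do their share of the work. What should we do?"
     ● Students were asked to decide on a solution to this classroom problem through discussion alone (without using AI).
     ● The opinions of those who do the work were emphasized, and the discussion became "How do we make them do the work?" The side of those who do not do the work was given little weight.
     ● The conclusion of the discussion was: "Divide roles so that each task is always done by exactly one person. If someone still does not do it, give up on them."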
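  16. Experiment at a high school (with AI): diverse positions were reflected in the discussion
     ● Students were asked to discuss after being shown the diverse, important opinions that the AI had found.
     ● The opinions of those who do not do the work were also reflected, and the discussion became "How can we build a relationship in which we can say anything to each other?"
     ● The conclusion of the discussion was: "Create an environment in the group where people can say what they want to say. In addition, both the person who speaks and the person spoken to should take it as being about the statement itself."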