
NLP for HCI: Introducing NLP-Based Prompts to Encourage Human Behavior Change


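Changing people's everyday behavior is one of the major themes of HCI research aimed at improving well-being, for example in learning and health. Viewing advances in NLP technology, including large-scale models, as an innovation in how computers intervene in human behavior, we have built applications that incorporate prompts for behavior change and evaluated them. In this talk, we present concrete examples of applying NLP technology to the HCI field, in particular AI design that lets people trust a system and keep using it, and how such designs are evaluated.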

https://nlp-colloquium-jp.github.io/schedule/2023-04-19_riku-arakawa_hiromu-yakura/

Hiromu Yakura

April 19, 2023


Transcript

  1. NLP for HCI: Introducing NLP-Based Prompts to Encourage Human Behavior Change
    Riku Arakawa * (Carnegie Mellon University), Hiromu Yakura * (University of Tsukuba)
    NLP Colloquium, 2023/4
  2. What's HCI?
    Human-Computer Interaction: an interdisciplinary field that discovers new ways of using computers from a human-centered perspective.
    • Research that discovers new grassroots uses. Examples: how Japanese idols use online meeting tools; how AI has changed professional Go players' practice and moves since AlphaGo.
    • Research that proposes new uses and verifies their potential. Examples: what new interactions can be realized by attaching a 360-degree camera to the hand; what new low-cost IoT can be realized by detecting the deformation of origami with millimeter-wave radar.
    H. Yakura. No More Handshaking: How have COVID-19 pushed the expansion of computer-mediated communication in Japanese idol culture? ACM CHI '21.
    J. Kang, et al. How AI-Based Training Affected the Performance of Professional Go Players. ACM CHI '22.
    R. Arakawa, et al. Hand with Sensing Sphere: Body-Centered Spatial Interactions with a Hand-Worn Spherical Camera. ACM SUI '20.
    R. Arakawa and Y. Zhang. Low-Cost Millimeter-Wave Interactive Sensing through Origami Reflectors. CHIIoT Workshop '21.
  3. Our focus
    Behavior change model by BJ Fogg: Motivation x Ability x Prompt
    We design systems that leverage machine learning techniques such as NLP and contribute to well-being by bringing about changes in human behavior.
  4. CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models
    Riku Arakawa*1, Hiromu Yakura*2,3, Masataka Goto3 (*: equal contribution)
    1: Carnegie Mellon University; 2: University of Tsukuba; 3: National Institute of Advanced Industrial Science and Technology (AIST)
  5. While large generative models show surprising performance, they are not always good enough to take over our intellectual tasks, such as writing domain-specific documents or providing novel ideas.
    Is it possible to benefit from them in various tasks?
  6. We hypothesized that such models can help us avoid procrastination.
    Assumption: even imperfect generated content can be used to guide users' interest back to their tasks.
    Conventional approaches: visual feedback of task progress [40], site blockers [33]
    Our approach: [figure]
    [33] G. Kovacs, et al. 2018. Rotating Online Behavior Change Interventions Increases Effectiveness But Also Increases Attrition. ACM CSCW.
    [40] Y. Liu, et al. 2014. Supporting Task Resumption Using Visual Feedback. ACM CSCW.
  7. 

  8. CatAlyst: Overview
    The pipeline design is independent of the task domain. We developed two prototypes: writing and slide-editing.
  9. CatAlyst: Implementation of prototype for slide-editing
    • GPT-3 generates the continuation of the text.
    • It also generates a caption for an image to be used, which is passed to a diffusion model for image generation.
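The two steps on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the authors' implementation: `TextGenerator`, `ImageGenerator`, `SlideProposal`, `propose_slide_continuation`, and the prompt wording are all hypothetical, and the two callables merely stand in for GPT-3 and a diffusion model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: any callable that maps a prompt to text / image bytes.
TextGenerator = Callable[[str], str]
ImageGenerator = Callable[[str], bytes]


@dataclass
class SlideProposal:
    text: str      # generated continuation of the slide text
    caption: str   # generated caption describing an image to use
    image: bytes   # image produced from the caption


def propose_slide_continuation(
    slide_text: str,
    generate_text: TextGenerator,
    generate_image: ImageGenerator,
) -> SlideProposal:
    """Build a continuation of interrupted slide work (prompt wording is assumed)."""
    # Step 1: continue the text the worker has written so far.
    continuation = generate_text(f"Continue the following slide text:\n{slide_text}")
    # Step 2: ask the same text model for an image caption ...
    caption = generate_text(
        "Write a short caption for an image that fits this slide:\n"
        f"{slide_text}\n{continuation}"
    )
    # ... and hand that caption to the image model.
    image = generate_image(caption)
    return SlideProposal(text=continuation, caption=caption, image=image)
```

Passing the generators in as parameters keeps the sketch independent of any particular model API, which also mirrors the domain-independent pipeline design mentioned in the overview.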
  10. CatAlyst: Strategy
    💡 Prompt workers to face the task, even for a short time.
    Baseline: an encouraging message. Can we improve the effectiveness?
  11. CatAlyst: Strategy
    [21] J. Clear. 2015. The Chemistry of Building Better Habits. https://jamesclear.com/chemistry-habits.
  12. User Study: Comparison to conventional site blockers (Writing, Slide-editing)
    • H1: CatAlyst is an effective means of continuing to attract the interest of workers who are away from the task, by presenting the continuation of interrupted work as an intervention.
    • H2: CatAlyst can effectively induce workers to resume the original task through the intervention.
    • H3: CatAlyst can improve worker productivity by helping them avoid procrastination while performing tasks.
    • H4: CatAlyst can lower the cognitive load imposed on workers while performing a task, and is thereby favorably accepted by them.
  13. Measures & Results (Writing, Slide-editing)
    • Ignorance rate: the rate of notifications ignored
    • Interest retrieval time: the time elapsed before resumption
    • Progress after resumption: the progress made within T seconds after resumption
    • Total time: the time spent completing the assigned task
    • Subjective quality: product quality rated by crowdworkers
    • Cognitive load: NASA-TLX score reported by participants
    • System usability: SUS score reported by participants
    While total time and subjective quality did not change significantly for writing, CatAlyst exhibited its efficacy overall.
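As a rough illustration of the first two measures, the sketch below computes an ignorance rate and a mean interest retrieval time from a notification log. The log format is an assumption made for this example (the `Notification` record, its field names, and the rule that a missing `resumed_at` means "ignored" are hypothetical); the slides do not specify how the study logged these events.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class Notification:
    sent_at: float               # seconds since session start (assumed unit)
    resumed_at: Optional[float]  # None means the worker ignored the notification


def ignorance_rate(log: Sequence[Notification]) -> float:
    """Fraction of notifications that were ignored."""
    if not log:
        raise ValueError("log must contain at least one notification")
    ignored = sum(1 for n in log if n.resumed_at is None)
    return ignored / len(log)


def mean_interest_retrieval_time(log: Sequence[Notification]) -> Optional[float]:
    """Mean time from notification to resumption, over notifications that were not ignored."""
    durations = [n.resumed_at - n.sent_at for n in log if n.resumed_at is not None]
    return mean(durations) if durations else None
```

For a log with four notifications of which two are ignored and the other two are followed by resumption after 30 and 90 seconds, this gives an ignorance rate of 0.5 and a mean interest retrieval time of 60 seconds.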
  14. User Study: Long-term effect
    • 5-day use of CatAlyst in participants' own writing tasks
    • Uncontrolled setting
    • Semi-structured interviews
  15. User Study: Long-term effect
    Interview result:
    • Effects on behavior
    • Feelings about AI's accuracy
    • Role of CatAlyst: as a reminder, as an ideator, as a peer
    • Room for further improvements
    Usage result:
    • No significant difference in the interest retrieval time across the five days
    • Continued use of CatAlyst over the days
    These results suggest the long-term efficacy of CatAlyst. Please refer to the paper!
  16. CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models
    Domains: Writing, Slide-Editing, Composition, ...
    Instead of pursuing accuracy via tuning for a higher level of task delegation, CatAlyst utilizes large generative models that are publicly available but imperfect for each individual domain, and contributes to workers' digital well-being by influencing their behavior.
  17. VocabEncounter: NMT-powered Vocabulary Learning by Presenting Computer-Generated Usages of Foreign Words into Users' Daily Lives
    Riku Arakawa * (Carnegie Mellon University), Hiromu Yakura * (University of Tsukuba), Sosuke Kobayashi (Tohoku University). * Equal contribution
    Repeated exposure to word usages is crucial in vocabulary learning. VocabEncounter achieves it by encapsulating foreign words in materials the user is reading in their native language.
    Example: for the word to remember, "bogus (adj.) 虚偽の、いんちきの" (false, fake), NLP techniques (NMT with constrained decoding, etc.) turn part of a Japanese text into English: "……政府の給付金ですが、its applicants were mostly occupied by bogus companies. これを受けて……" ("... as for the government subsidy, its applicants were mostly occupied by bogus companies. In response to this ...").
    Various daily activities can be transformed into a field of learning: working, commuting, strolling, watching movies.
  18. 

  19. 

  20. VocabEncounter: NMT-powered Vocabulary Learning by Presenting Computer-Generated Usages of Foreign Words into Users' Daily Lives
    Repeated exposure to word usages is crucial in vocabulary learning. VocabEncounter achieves it by encapsulating foreign words in materials the user is reading in their native language.
    Example: for the word to remember, "bogus (adj.) 虚偽の、いんちきの" (false, fake), NLP techniques (NMT with constrained decoding, etc.) turn part of a Japanese text into English: "……政府の給付金ですが、its applicants were mostly occupied by bogus companies. これを受けて……" ("... as for the government subsidy, its applicants were mostly occupied by bogus companies. In response to this ...").
    Various daily activities can be transformed into a field of learning: working, commuting, strolling, watching movies.
  21. 

  22. Implementation Challenges
    • How to choose an appropriate phrase to translate from web pages
    • How to avoid presenting unnatural or mistranslated phrases
    • How to obtain translations that contain the words to remember
  23. Implementation Challenges
    • How to choose an appropriate phrase to translate from web pages
    • How to avoid presenting unnatural or mistranslated phrases
    • How to obtain translations that contain the words to remember
    Techniques used: NMT with constrained decoding; multilingual word embedding; backtranslation + Sentence-BERT
  24. Key Feature of VocabEncounter: Encapsulation
    Search the web pages (materials in the native language) for words whose meaning is similar to the foreign word to remember, using multilingual word embeddings (MUSE [Conneau and Lample+, ICLR’18]).
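The search step can be illustrated with cosine similarity in a shared cross-lingual embedding space. The sketch below assumes that aligned vectors for both languages are already available as plain Python sequences; `find_similar_word`, the `threshold` value, and the dictionary layout are hypothetical and are not taken from the talk or from the MUSE library.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_similar_word(
    target_vec: Vector,
    page_words: List[str],
    native_vecs: Dict[str, Vector],
    threshold: float = 0.7,  # assumed cut-off, not a value from the talk
) -> Optional[Tuple[str, float]]:
    """Return the page word closest to the foreign word, or None if none is close enough."""
    best: Optional[Tuple[str, float]] = None
    for word in page_words:
        vec = native_vecs.get(word)
        if vec is None:
            continue  # skip words without an embedding
        score = cosine(target_vec, vec)
        if score >= threshold and (best is None or score > best[1]):
            best = (word, score)
    return best
```

The threshold matters for the design: returning `None` when nothing on the page is close enough lets the system skip a page rather than force an unrelated word into the learner's reading.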
  25. Key Feature of VocabEncounter: Encapsulation
    Extract a phrase of an appropriate length around the detected word in the materials (native language), using dependency structure analysis.
  26. Key Feature of VocabEncounter: Encapsulation
    Extract a phrase of an appropriate length around the detected word in the materials (native language), using dependency structure analysis.
  27. Key Feature of VocabEncounter: Encapsulation
    Extract a phrase of an appropriate length around the detected word in the materials (native language).
  28. Key Feature of VocabEncounter: Encapsulation
    Generate a translated phrase containing the word to remember from the materials (native language) by neural machine translation (NMT) with constrained decoding [Hu+, NAACL’19].
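To give a feel for what "constrained decoding" means, here is a deliberately simplified toy: a beam search that only lets a hypothesis finish once it contains the required word, and that reserves part of the beam for hypotheses that already satisfy the constraint. It is not the algorithm of [Hu+, NAACL’19] and uses no real NMT model; `constrained_beam_search`, `toy_model`, and the probability table are all made up for illustration.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Hypothesis = Tuple[float, List[str]]  # (cumulative log-probability, tokens)
EOS = "</s>"


def constrained_beam_search(
    next_logprobs: Callable[[List[str]], Dict[str, float]],
    required: str,
    beam_size: int = 4,
    max_len: int = 10,
) -> Optional[List[str]]:
    """Return the best finished token sequence that contains `required`, if any."""
    beams: List[Hypothesis] = [(0.0, [])]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        met: List[Hypothesis] = []    # hypotheses already containing the word
        unmet: List[Hypothesis] = []  # hypotheses still missing it
        for score, tokens in beams:
            for token, logp in next_logprobs(tokens).items():
                if token == EOS:
                    # A hypothesis may only end once the constraint is satisfied.
                    if required in tokens:
                        finished.append((score + logp, tokens))
                    continue
                cand = (score + logp, tokens + [token])
                (met if required in cand[1] else unmet).append(cand)
        met.sort(key=lambda h: h[0], reverse=True)
        unmet.sort(key=lambda h: h[0], reverse=True)
        # Reserve up to half of the beam for constraint-satisfying hypotheses so
        # they are not pruned by higher-scoring unconstrained ones.
        kept_met = met[: max(1, beam_size // 2)]
        beams = kept_met + unmet[: beam_size - len(kept_met)]
        if not beams:
            break
    return max(finished, key=lambda h: h[0])[1] if finished else None


# Toy bigram "model" (made-up probabilities): it prefers "fake" over "bogus".
TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"fake": math.log(0.7), "bogus": math.log(0.3)},
    "fake": {"companies": 0.0},
    "bogus": {"companies": 0.0},
    "companies": {EOS: 0.0},
}


def toy_model(tokens: List[str]) -> Dict[str, float]:
    return TABLE[tokens[-1] if tokens else "<s>"]
```

With this toy table, `constrained_beam_search(toy_model, required="bogus")` returns `['bogus', 'companies']` even though the unconstrained model would prefer "fake companies", which is the behavior VocabEncounter needs when the translation must contain the word to remember.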
  29. Key Feature of VocabEncounter: Encapsulation
    Generate a translated phrase containing the word to remember by neural machine translation (NMT) with constrained decoding [Hu+, NAACL’19]: materials (native language) → foreign language.
  30. Key Feature of VocabEncounter: Encapsulation
    Translate the generated phrase back to confirm that it has not lost its original meaning: materials (native language) → foreign language (NMT with constrained decoding [Hu+, NAACL’19]) → original language.
  31. Key Feature of VocabEncounter: Encapsulation
    Translate the generated phrase back to confirm that it has not lost its original meaning: materials (native language) → foreign language → original language, compared using Sentence-BERT [Reimers and Gurevych, EMNLP’19].
  32. Key Feature of VocabEncounter: Encapsulation
    Replace the original phrase in the material with the generated phrase if it has a certain "quality": materials (native language) → foreign language → original language, compared using Sentence-BERT [Reimers and Gurevych, EMNLP’19].
  33. Key Feature of VocabEncounter: Encapsulation
    Replace the original phrase in the material (native language) with the generated phrase if it has a certain "quality".
  34. Quality of the Generated Translations
    • Similarity between an original phrase and its backtranslated phrase
    • Likelihood of the (back)translated phrases (grammatically broken phrases exhibit low scores)
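The two quality criteria can be combined into a simple filter, sketched below. Everything specific here is an assumption for illustration: the `Candidate` fields, the threshold values, and the idea of requiring both criteria at once are hypothetical, and `jaccard` is only a word-overlap stand-in for a Sentence-BERT similarity so that the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    original: str                   # phrase extracted from the native-language material
    translated: str                 # generated phrase in the foreign language
    backtranslated: str             # translated phrase translated back
    translation_logprob: float      # assumed: length-normalized log-likelihood
    backtranslation_logprob: float  # assumed: same, for the backtranslation


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for a sentence-embedding similarity: word-set overlap."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def passes_quality_check(
    cand: Candidate,
    similarity: Callable[[str, str], float] = jaccard,
    min_similarity: float = 0.8,  # assumed threshold
    min_logprob: float = -1.0,    # assumed threshold
) -> bool:
    """Accept a candidate only if its meaning is preserved and both outputs look fluent."""
    meaning_kept = similarity(cand.original, cand.backtranslated) >= min_similarity
    fluent = min(cand.translation_logprob, cand.backtranslation_logprob) >= min_logprob
    return meaning_kept and fluent
```

Only candidates that pass such a check would replace the original phrase in the material; the rest are dropped, which trades coverage for a lower risk of showing unnatural or mistranslated phrases.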
  35. Questions
    • It is unclear whether our approach really helps learners memorize new words effectively.
    • We also need to examine the experience of learning with VocabEncounter in learners' daily lives.
    • There is a risk of presenting unnatural or mistranslated phrases.
  36. Evaluation 1: Human-Compatible Quality of Translation
    • 60 crowd workers rated the translations (naturalness and meaning preservation).
    • Human-compatible quality of translation
    • Meaning preservation correlates with a score designed using Sentence-BERT.
    • Filtering is possible.
  37. Evaluation 2: Significant Learning Effects
    • 10 participants compared their correct-answer rates between pre- and post-vocabulary tests.
    • VocabEncounter helped them memorize the words to learn.
    • The effect of presenting generated usages was much stronger than that of presenting only the words.
  38. Evaluation 3: Preferable Experience in 1-week Use
    Semi-structured interviews with 5 participants
    • Benefit of Micro Learning
    • Benefit of Usage-Based Learning
    Please refer to the paper!
  39. The news article is distributed under Creative Commons 2.1 by NHN Japan according to https://www.rondhuit.com/download.html#ldcc
  40. The movie is distributed under Creative Commons 3.0 by WebTB ASO at https://www.youtube.com/watch?v=Wxh5-NRLxi4
  41. 

  42. Summary
    • We introduce a new paradigm of vocabulary learning by leveraging ML-based generation techniques.
    • We show its feasibility and effectiveness by implementing VocabEncounter, which encapsulates the words to remember into materials a user is reading.
    • We believe that VocabEncounter provides a good example of how new ML technology expands the ways in which people interact with computers.