
Graduation Research Final Presentation

Qiushi Pan
January 08, 2020


Improving Deep Knowledge Tracing: pre-training and encoder-decoder architecture
—Towards analysis of hidden vector representation—



Transcript

  1. Improving Deep Knowledge Tracing: pre-training and encoder-decoder architecture
     —Towards analysis of hidden vector representation—
     University of Tsukuba, School of Informatics, College of Knowledge and Library Sciences, 201511548, Qiushi Pan
  2. Dataset: Assistments 2009-2010 corrected

     • Students: 4,417
     • Answered questions: 328,291
     • Skills: 124 kinds
     • As with skill 13 (understanding square roots), skill 41 (comparing square roots and integers), and skill 4673 (understanding perfect squares), the relations among the required skills range from strong to weak.
  3. Deep Knowledge Tracing [2] (2)

     • The input x_t carries the answered skill ID q_t ∈ {0, …, m} and the correctness of that answer a_t ∈ {0, 1}; m is the number of skills.
     • The loss compares the predicted correctness for the skill answered at time t+1, \tilde{y}^\top \delta_m(q_{t+1}), with the actual result a_{t+1}:

       \mathcal{L} = \sum_t \ell\left(\tilde{y}^\top \delta_m(q_{t+1}),\; a_{t+1}\right)
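The encoding of x_t and the form of ℓ are left implicit on the slide; below is a minimal PyTorch sketch assuming the usual DKT formulation from [2]: x_t is a one-hot vector over 2m slots for the (skill, correctness) pair, the network outputs m sigmoid probabilities per step, and ℓ is binary cross-entropy. The names encode_step and dkt_loss are ours, not the author's.

```python
import torch
import torch.nn.functional as F

m = 124  # number of skills in Assistments 2009-2010 corrected

def encode_step(q_t: int, a_t: int) -> torch.Tensor:
    """x_t: one-hot over 2*m slots, indexing the (skill, correctness) pair."""
    x = torch.zeros(2 * m)
    x[q_t + a_t * m] = 1.0
    return x

def dkt_loss(y_tilde: torch.Tensor, q_next: torch.Tensor, a_next: torch.Tensor) -> torch.Tensor:
    """L = sum_t ell( y_tilde_t^T delta_m(q_{t+1}), a_{t+1} ) with ell = binary cross-entropy.

    y_tilde: (T, m) sigmoid outputs of the RNN; q_next, a_next: (T,) next skill ids and labels.
    """
    p = y_tilde.gather(1, q_next.unsqueeze(1)).squeeze(1)  # probability for the skill answered at t+1
    return F.binary_cross_entropy(p, a_next.float(), reduction="sum")
```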
  4. Wavy transition problem [3]

     • In DKT, the predicted values swing up and down between time steps.
     • It is more natural to assume that proficiency changes gradually.
     • From the difference between the predictions y_{t+1} and y_t, Yeung et al. (2018) define the L1-norm and L2-norm evaluation metrics w_1 and w_2:

       w_1 = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T_i - 1} \left\| y^i_{t+1} - y^i_t \right\|_1}{M \sum_{i=1}^{n} (T_i - 1)}, \qquad
       w_2^2 = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T_i - 1} \left\| y^i_{t+1} - y^i_t \right\|_2^2}{M \sum_{i=1}^{n} (T_i - 1)}, \qquad
       \mathcal{L}' = \mathcal{L} + \lambda_{w_1} w_1 + \lambda_{w_2} w_2^2
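To make w_1 and w_2 concrete, here is a small sketch that computes them from a list of per-student prediction matrices y^i of shape (T_i, M); the helper name waviness and the list-of-tensors interface are our assumptions. The regularized objective on the slide then adds λ_{w1} w_1 + λ_{w2} w_2^2 to ℒ.

```python
import torch

def waviness(preds: list[torch.Tensor]) -> tuple[float, float]:
    """Compute (w1, w2) over per-student prediction matrices y^i of shape (T_i, M)."""
    M = preds[0].shape[1]
    denom = M * sum(y.shape[0] - 1 for y in preds)                   # M * sum_i (T_i - 1)
    l1 = sum(torch.abs(y[1:] - y[:-1]).sum().item() for y in preds)  # summed L1 differences
    l2 = sum(((y[1:] - y[:-1]) ** 2).sum().item() for y in preds)    # summed squared L2 differences
    return l1 / denom, (l2 / denom) ** 0.5                           # w1 and w2 (w2^2 is the averaged value)
```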
  5. Correct-answer-rate reproduction problem

     1. The prediction \tilde{y} does not necessarily reflect the correct-answer rate of the input sequence.
     2. When dummy sequences of consecutive correct answers and consecutive wrong answers are fed in, the prediction for the former, \hat{y}'_{cor}, sometimes falls below that for the latter, \hat{y}'_{wro}, i.e. s = \hat{y}'_{cor} - \hat{y}'_{wro} < 0 (see the sketch after this slide).
     → A counterintuitive result: the estimated proficiency drops even after a run of correct answers.
     Figure: the vertical axis is the predicted value; the horizontal axis is the number of correct answers contained in the dummy input.
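A hypothetical sketch of the check behind item 2: feed one skill's all-correct and all-wrong dummy runs to a trained model and compute s. The callable model (mapping a (T, 2m) input to (T, m) probabilities) and the run length are assumptions, not part of the slide.

```python
import torch

def reproduction_gap(model, q_i: int, m: int, length: int = 20) -> float:
    """s = y'_cor - y'_wro for skill q_i; s < 0 flags the counterintuitive case."""
    def run(a: int) -> float:
        x = torch.zeros(length, 2 * m)
        x[:, q_i + a * m] = 1.0      # the same (q_i, a) pair repeated `length` times
        y = model(x)                 # assumed to return (length, m) predicted probabilities
        return y[-1, q_i].item()     # prediction for skill q_i after the whole run
    return run(1) - run(0)
```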
  6. Knowledge State Vector loss

     • Compute a loss between the Hadamard product of the prediction and each skill's occurrence frequency, \tilde{y}_t \circ \sum_s \delta_m(q_s), and each skill's correct-answer frequency, \sum_s a_s \delta_m(q_s).
     • Instead of estimating the probability of a correct answer for binary correct/wrong classification (as existing work does), treat it as a regression onto the correct-answer rate of the skills that appear in the input (a sketch follows this slide):

       L_{\mathrm{ksv}} = \sum_{t=1}^{T} \ell\left( \tilde{y}_t \circ \sum_{s=2}^{t+1} \delta_m(q_s),\; \sum_{s=2}^{t+1} a_s \delta_m(q_s) \right), \qquad
       \mathcal{L}' = \mathcal{L} + \lambda_{\mathrm{ksv}} L_{\mathrm{ksv}}
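A sketch of the Knowledge State Vector loss as written above, assuming ℓ is a squared-error regression loss and that q and a hold the T+1 observed skill ids q_1..q_{T+1} and labels a_1..a_{T+1}; the cumulative sums start at s = 2 as on the slide. The total objective is then ℒ' = ℒ + λ_ksv L_ksv. The function name ks_vector_loss is ours.

```python
import torch
import torch.nn.functional as F

def ks_vector_loss(y_tilde: torch.Tensor, q: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    """L_ksv = sum_{t=1}^{T} ell( y_tilde_t o sum_{s=2}^{t+1} delta_m(q_s),
                                  sum_{s=2}^{t+1} a_s delta_m(q_s) ).

    y_tilde: (T, m) predictions; q, a: (T+1,) skill ids (long) and correctness labels.
    """
    T, m = y_tilde.shape
    onehot = F.one_hot(q[1:], num_classes=m).float()                     # delta_m(q_s) for s = 2..T+1
    counts = torch.cumsum(onehot, dim=0)                                 # skill occurrence counts up to s = t+1
    corrects = torch.cumsum(a[1:].float().unsqueeze(1) * onehot, dim=0)  # correct-answer counts up to s = t+1
    loss = y_tilde.new_zeros(())
    for t in range(T):
        loss = loss + F.mse_loss(y_tilde[t] * counts[t], corrects[t], reduction="sum")
    return loss
```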
  7. pre-training

     • Create dummy data of consecutive correct answers and consecutive wrong answers, and have the model learn the correct→correct / wrong→wrong relationships (a construction sketch follows this slide).
     • Learning these relationships before training on the real data keeps the model from falling into the local optima of correct→wrong / wrong→correct.

       x'_{\mathrm{wro}} = \{(q_i, 0), \ldots, (q_i, 0)\}, \; y'_{\mathrm{wro}} = (q_i, 0), \qquad
       x'_{\mathrm{cor}} = \{(q_i, 1), \ldots, (q_i, 1)\}, \; y'_{\mathrm{cor}} = (q_i, 1)
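The dummy-data construction can be written down directly; a minimal sketch, representing each step as a (skill id, correctness) pair, with the run length being our own choice rather than a value stated on the slide.

```python
def make_pretraining_data(num_skills: int, length: int = 20):
    """Build the x'_wro / x'_cor runs for every skill: the same (q_i, a) pair repeated,
    with target y' = (q_i, a) so the model learns to predict the same answer again."""
    data = []
    for q_i in range(num_skills):
        for a in (0, 1):                  # a = 0: all-wrong run, a = 1: all-correct run
            seq = [(q_i, a)] * length     # x' = {(q_i, a), ..., (q_i, a)}
            data.append((seq, (q_i, a)))  # (input sequence, target y')
    return data
```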
  8. Knowledge State Vector loss

     • With λ_ksv = 0.5 the model could be trained so that the KS Vector loss decreases; the AUC, w_1, and w_2 results also improved.
     • EDDKT raised AUC above the baseline, and its KS Vector loss stayed low without relying on λ_ksv.
     Figure: comparison of DKT (baseline) and DKT with λ_ksv = 0.5
  9. pre-training

     • The number of skills with s = \hat{y}'_{cor} - \hat{y}'_{wro} < 0 was reduced from 12 or more to 5.
     • w_1 and w_2 became slightly lower than the baseline.
     Figure: the vertical axis is the predicted value; the horizontal axis is the number of correct answers contained in the dummy input.
  10. References

     • [1] Corbett, A. T. and Anderson, J. R.: Knowledge tracing: Modeling the acquisition of procedural knowledge, User Modeling and User-Adapted Interaction, Vol. 4, No. 4, pp. 253–278 (1994).
     • [2] Piech, C., Bassen, J., Huang, J., Ganguli, S., Sahami, M., Guibas, L. J. and Sohl-Dickstein, J.: Deep knowledge tracing, Advances in Neural Information Processing Systems, pp. 505–513 (2015).
     • [3] Yeung, C.-K. and Yeung, D.-Y.: Addressing two problems in deep knowledge tracing via prediction-consistent regularization, arXiv preprint arXiv:1806.02180 (2018).