
Graduation Research Final Presentation

Qiushi Pan
January 08, 2020


Improving Deep Knowledge Tracing: pre-training and encoder-decoder architecture
—Towards analysis of hidden vector representations—


Transcript

  1. Improving Deep Knowledge Tracing: pre-training and encoder-decoder architecture
     —Towards analysis of hidden vector representations—
     University of Tsukuba, School of Informatics, College of Knowledge and Library Sciences
     201511548  Qiushi Pan
  2. Dataset: Assistments 2009-2010 corrected
     • Students: 4,417
     • Responses: 328,291
     • Skills: 124 types
     • As with skill 13 (understanding square roots), skill 41 (comparing square roots and
       integers), and skill 4673 (understanding perfect squares), the required skills range
       from strongly related to only weakly related.
  3. Deep Knowledge Tracing [2] (2)
     • The input x_t encodes the answered skill ID q_t ∈ {0, …, m} and the correctness of
       the answer a_t ∈ {0, 1}; m is the number of skills.
     • The loss compares the predicted correctness \tilde{y}^\top \delta_m(q_{t+1}) for the
       skill answered at time t+1 with the actual result a_{t+1}:
       \mathcal{L} = \sum_t \ell\bigl(\tilde{y}^\top \delta_m(q_{t+1}),\, a_{t+1}\bigr)
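As a minimal NumPy sketch (not the authors' code), the DKT input can be one-hot encoded over 2m slots — skill ID plus correctness, the standard formulation from Piech et al. [2] — and the per-step loss computed with binary cross-entropy standing in for ℓ:

```python
import numpy as np

def encode_input(q_t, a_t, m):
    """One-hot encode a (skill ID, correctness) pair into a 2*m vector:
    index q_t if wrong (a_t = 0), index m + q_t if correct (a_t = 1)."""
    x = np.zeros(2 * m)
    x[a_t * m + q_t] = 1.0
    return x

def delta(q, m):
    """One-hot vector delta_m(q) selecting skill q."""
    d = np.zeros(m)
    d[q] = 1.0
    return d

def bce(p, a):
    """Binary cross-entropy ell(p, a) for a single prediction."""
    eps = 1e-12
    return -(a * np.log(p + eps) + (1 - a) * np.log(1 - p + eps))

def dkt_loss(y_tilde, q_next, a_next, m):
    """L = sum_t ell(y_t^T delta_m(q_{t+1}), a_{t+1}).

    y_tilde: (T, m) per-skill predicted correctness probabilities.
    q_next, a_next: length-T sequences of the next skill ID and result.
    """
    total = 0.0
    for t in range(len(q_next)):
        p = y_tilde[t] @ delta(q_next[t], m)  # pick the answered skill
        total += bce(p, a_next[t])
    return total
```

Here `y_tilde` would come from the RNN's sigmoid output layer; the encoding and loss above only fix the interface around it.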
  4. Wavy transition problem [3]
     • In DKT, the predicted values oscillate up and down between time steps.
     • It is natural to assume that a student's ability changes gradually.
     • Evaluation measures w_1, w_2 based on the L1 and L2 norms of the difference between
       predictions y_{t+1} and y_t (Yeung et al. 2018):
       w_1 = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T_i - 1} \left\| y^i_{t+1} - y^i_t \right\|_1}{M \sum_{i=1}^{n} (T_i - 1)}, \qquad
       w_2^2 = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T_i - 1} \left\| y^i_{t+1} - y^i_t \right\|_2^2}{M \sum_{i=1}^{n} (T_i - 1)}
       \mathcal{L}' = \mathcal{L} + \lambda_{w_1} w_1 + \lambda_{w_2} w_2^2
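The two waviness measures follow directly from the definitions above. A minimal NumPy sketch, assuming `preds` holds one (T_i × M) prediction matrix per student:

```python
import numpy as np

def wavy_penalties(preds, M):
    """Waviness measures w1 and w2^2 from Yeung et al. (2018).

    preds: list of (T_i, M) arrays, the per-skill predictions y^i_t
    for each student i; M is the number of skills.
    Returns (w1, w2_squared).
    """
    num1 = num2 = 0.0
    steps = 0
    for y in preds:
        diff = y[1:] - y[:-1]          # y^i_{t+1} - y^i_t for each t
        num1 += np.abs(diff).sum()     # summed L1 norms
        num2 += (diff ** 2).sum()      # summed squared L2 norms
        steps += y.shape[0] - 1        # T_i - 1 transitions
    denom = M * steps
    return num1 / denom, num2 / denom
```

The regularized loss would then add `lambda_w1 * w1 + lambda_w2 * w2_squared` to the DKT loss during training.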
  5. Correct-rate reproduction problem
     1. The prediction ỹ does not necessarily reflect the correct-answer rate of the input
        sequence.
     2. When dummy all-correct and all-wrong sequences are fed in, the prediction ŷ'_cor for
        the former can fall below ŷ'_wro for the latter (s = ŷ'_cor − ŷ'_wro < 0).
     → A counter-intuitive result: the estimated ability drops even though the student
       answered correctly in a row.
     Figure: the vertical axis is the predicted value; the horizontal axis is the number of
     correct answers contained in the dummy input.
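The diagnostic can be automated per skill: feed the model an all-correct and an all-wrong dummy sequence and flag the inverted cases. A sketch, where `predict` is a hypothetical callback (not from the source) returning the model's predicted correctness for a skill after a given input sequence:

```python
def count_inversions(predict, m, length=10):
    """Count skills with s = y'_cor - y'_wro < 0.

    predict(seq, q): hypothetical model hook; seq is a list of
    (skill, correctness) pairs, q the skill to predict for.
    m: number of skills; length: dummy sequence length (an assumption).
    """
    bad = 0
    for q in range(m):
        y_cor = predict([(q, 1)] * length, q)  # after all-correct input
        y_wro = predict([(q, 0)] * length, q)  # after all-wrong input
        if y_cor - y_wro < 0:                  # counter-intuitive case
            bad += 1
    return bad
```

A well-behaved model should drive this count toward zero, which is exactly what the pre-training slide later reports.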
  6. Knowledge State Vector loss
     • Compute a loss between the Hadamard product of the prediction and the per-skill
       occurrence counts, ỹ_t ∘ Σ_s δ_m(q_s), and the per-skill correct counts
       Σ_s a_s δ_m(q_s).
     • Instead of estimating the probability of a correct answer for binary classification
       (the existing approach), this turns the task into a regression onto the correct-answer
       rate of the skills appearing in the input:
       L_{ksv} = \sum_{t=1}^{T} \ell\left( \tilde{y}_t \circ \sum_{s=2}^{t+1} \delta_m(q_s),\; \sum_{s=2}^{t+1} a_s \delta_m(q_s) \right)
       \mathcal{L}' = \mathcal{L} + \lambda_{ksv} L_{ksv}
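A minimal NumPy sketch of this loss, assuming squared error for ℓ (the slide only specifies that ℓ compares the two count vectors) and cumulative counts that include the current step:

```python
import numpy as np

def ksv_loss(y_tilde, q, a, m):
    """Knowledge State Vector loss (sketch).

    y_tilde: (T, m) predicted per-skill correctness probabilities.
    q, a: length-T skill IDs and correctness of the input sequence.
    Compares y_t ∘ (occurrence counts) against (correct counts),
    i.e. expected vs. actual number of correct answers per skill.
    """
    total = 0.0
    counts = np.zeros(m)    # how often each skill has appeared so far
    correct = np.zeros(m)   # how often it was answered correctly
    for t in range(len(q)):
        counts[q[t]] += 1.0
        correct[q[t]] += a[t]
        expected = y_tilde[t] * counts          # Hadamard product
        total += ((expected - correct) ** 2).sum()
    return total
```

When the prediction for a skill equals its running correct rate, `expected` matches `correct` exactly and the loss is zero, which is the regression target the slide describes.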
  7. Pre-training
     • Create dummy all-correct and all-wrong sequences and have the model learn the
       correct→correct / wrong→wrong relations.
     • Learning these relations before training on real data prevents the model from falling
       into the local optima of correct→wrong / wrong→correct:
       X'_wro = {(q_i, 0), …, (q_i, 0)},  y'_wro = (q_i, 0)
       X'_cor = {(q_i, 1), …, (q_i, 1)},  y'_cor = (q_i, 1)
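Generating this dummy corpus is straightforward; a sketch following the definitions above, where the sequence length is an assumption not stated on the slide:

```python
def make_dummy_sequences(m, length=10):
    """Build the pre-training dummy data: for each skill q_i, an
    all-wrong sequence X'_wro with target (q_i, 0) and an all-correct
    sequence X'_cor with target (q_i, 1).

    Returns a list of (sequence, target) pairs, where a sequence is a
    list of (skill, correctness) tuples.
    """
    data = []
    for q in range(m):
        for a in (0, 1):                 # 0: all-wrong, 1: all-correct
            seq = [(q, a)] * length      # constant (q_i, a) repeated
            data.append((seq, (q, a)))   # target repeats the same pair
    return data
```

Pre-training on these pairs before the real data is what the slide credits with steering the model away from the correct→wrong / wrong→correct local optima.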
  8. Knowledge State Vector loss
     • With λ_ksv = 0.5, the model could be trained to reduce the KS Vector loss; moreover,
       the AUC, w_1, and w_2 results also improved.
     • EDDKT raised AUC above the baseline while keeping the KS Vector loss low without
       relying on λ_ksv.
     Figure: comparison of DKT (baseline) and DKT with λ_ksv = 0.5.
  9. Pre-training
     • The number of skills with s = ŷ'_cor − ŷ'_wro < 0 was reduced from 12 or more to 5.
     • w_1 and w_2 became slightly lower than the baseline.
     Figure: the vertical axis is the predicted value; the horizontal axis is the number of
     correct answers contained in the dummy input.
  10. References
     • [1] Corbett, A. T. and Anderson, J. R.: Knowledge tracing: Modeling the acquisition of
       procedural knowledge, User Modeling and User-Adapted Interaction, Vol. 4, No. 4,
       pp. 253–278 (1994).
     • [2] Piech, C., Bassen, J., Huang, J., Ganguli, S., Sahami, M., Guibas, L. J. and
       Sohl-Dickstein, J.: Deep knowledge tracing, Advances in Neural Information Processing
       Systems, pp. 505–513 (2015).
     • [3] Yeung, C.-K. and Yeung, D.-Y.: Addressing two problems in deep knowledge tracing
       via prediction-consistent regularization, arXiv preprint arXiv:1806.02180 (2018).