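Attention across the two modalities has been shown to be effective for VQA (Su et al.).
Aligning the text with the image lets the model cope with ambiguous expressions in which one spelling carries several meanings (example: "mouse", the animal or the computer peripheral).

$$\vec{I}_a = \mathrm{FFN}(\mathrm{MHA}(\vec{I}, \vec{T}, \vec{T}))$$
$$\vec{T}_a = \mathrm{FFN}(\mathrm{MHA}(\vec{T}, \vec{I}, \vec{I}))$$

Attention within each modality has been shown to be effective for Image Captioning (Yao et al.).

$$\vec{I}_i = \mathrm{FFN}(\mathrm{MHA}(\vec{I}_a, \vec{I}_a, \vec{I}_a))$$
$$\vec{T}_i = \mathrm{FFN}(\mathrm{MHA}(\vec{T}_a, \vec{T}_a, \vec{T}_a))$$

Experiment Setting
• Number of attention heads
• Dimension of the feed-forward network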
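The two attention stages, FFN(MHA(I, T, T)) / FFN(MHA(T, I, I)) followed by FFN(MHA(I_a, I_a, I_a)) / FFN(MHA(T_a, T_a, T_a)), can be sketched in NumPy as below. This is a minimal illustration, not the slides' implementation. It assumes the MHA arguments are ordered (query, key, value), uses random weights, and omits residual connections and layer normalization because the formulas do not show them. The class name `Block` and all sizes (`d_model`, `n_heads`, `d_ff`, sequence lengths) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class Block:
    """FFN(MHA(q, k, v)) with random weights (illustration only)."""

    def __init__(self, d_model, n_heads, d_ff):
        assert d_model % n_heads == 0
        self.h = n_heads
        self.d_head = d_model // n_heads
        s = 1.0 / np.sqrt(d_model)
        # Query, key, value and output projections of the attention layer.
        self.w_q, self.w_k, self.w_v, self.w_o = (
            rng.normal(0, s, (d_model, d_model)) for _ in range(4)
        )
        # Two-layer position-wise feed-forward network.
        self.w_1 = rng.normal(0, s, (d_model, d_ff))
        self.w_2 = rng.normal(0, 1.0 / np.sqrt(d_ff), (d_ff, d_model))

    def _split(self, x):
        # (n, d_model) -> (heads, n, d_head)
        return x.reshape(x.shape[0], self.h, self.d_head).transpose(1, 0, 2)

    def mha(self, q, k, v):
        qh = self._split(q @ self.w_q)
        kh = self._split(k @ self.w_k)
        vh = self._split(v @ self.w_v)
        # Scaled dot-product attention per head: (heads, n_q, n_k).
        scores = qh @ kh.transpose(0, 2, 1) / np.sqrt(self.d_head)
        ctx = softmax(scores) @ vh                      # (heads, n_q, d_head)
        ctx = ctx.transpose(1, 0, 2).reshape(q.shape[0], -1)
        return ctx @ self.w_o

    def ffn(self, x):
        return np.maximum(x @ self.w_1, 0.0) @ self.w_2  # ReLU FFN

    def __call__(self, q, k, v):
        return self.ffn(self.mha(q, k, v))

# Hypothetical sizes; the slides' actual settings are not reproduced here.
d_model, n_heads, d_ff = 16, 4, 32
img = rng.normal(size=(5, d_model))   # I: 5 image-region features
txt = rng.normal(size=(7, d_model))   # T: 7 token features

cross_img, cross_txt = Block(d_model, n_heads, d_ff), Block(d_model, n_heads, d_ff)
self_img, self_txt = Block(d_model, n_heads, d_ff), Block(d_model, n_heads, d_ff)

# Stage 1: each modality attends to the other.
I_a = cross_img(img, txt, txt)
T_a = cross_txt(txt, img, img)
# Stage 2: each aligned representation attends to itself.
I_i = self_img(I_a, I_a, I_a)
T_i = self_txt(T_a, T_a, T_a)
```

Each output keeps the sequence length of its query: `I_a` and `I_i` have one row per image region, `T_a` and `T_i` one row per token. The number of heads and the feed-forward dimension are the two knobs listed under Experiment Setting.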
• Spatial (Lu et al. 2017)
• NBT (Lu et al. 2017)

Baseline (Visual Question Answering)
• BUTB (Anderson et al. 2018)
• NBT (Kim, Jun and Zhang 2018)