Challenge 1: Document Image Understanding
The task of understanding and reading real-world documents visually (as images)

VisualMRC [Tanaka & Nishida+, AAAI’21]
PubLayNet [Zhong+, ICDAR’19]
Screen2Words [Wang+, UIST’21]

Zhong+, PubLayNet: largest dataset ever for document layout analysis, ICDAR’19
Tanaka+, VisualMRC: Machine Reading Comprehension on Document Images, AAAI’21
Wang+, Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning, UIST’21