BERT from the Perspective of Graph
Han Shi, Jiahui Gao, Hang Xu, Xiaodan Liang, Zhenguo Li, Lingpeng Kong, Stephen M. S. Lee, James Kwok
ICLR 2022
https://openreview.net/forum?id=dUV91uaXm3
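− Treat each individual embedding as a set of scalars, then center it and standardize it (divide by the standard deviation)
📄 Xiong+, On Layer Normalization in the Transformer Architecture (ICML 2020)
A. The key to over-smoothing in BERTs is the minimum standard deviation of the embeddings that enter layer normalization
📄 Kobayashi+, Incorporating Residual and Normalization Layers into Analysis of Masked Language Models (EMNLP 2021)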
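The centering and standardization step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation from any of the cited papers: the function names `layer_normalize` and `min_input_std`, the `eps` value, and the toy token vectors are all assumptions made for the example, and the learnable gain and bias of a full layer normalization are left out.

```python
import math


def layer_normalize(embedding, eps=1e-12):
    """Center and scale one embedding across its own components.

    The learnable gain and bias of a full layer normalization are omitted;
    eps is an assumed small constant for numerical stability.
    """
    n = len(embedding)
    mean = sum(embedding) / n
    # Population variance over the components of this single vector.
    var = sum((x - mean) ** 2 for x in embedding) / n
    std = math.sqrt(var + eps)
    return [(x - mean) / std for x in embedding]


def min_input_std(embeddings):
    """Smallest per-token standard deviation among the input embeddings."""
    stds = []
    for e in embeddings:
        mean = sum(e) / len(e)
        stds.append(math.sqrt(sum((x - mean) ** 2 for x in e) / len(e)))
    return min(stds)


# Hypothetical toy inputs: two token embeddings of dimension 4.
tokens = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 10.0, 10.0]]
normalized = [layer_normalize(t) for t in tokens]
smallest_std = min_input_std(tokens)
```

After normalization each token vector has mean close to 0 and standard deviation close to 1, while `min_input_std` reports the quantity the slide singles out: the minimum standard deviation among the embeddings entering layer normalization.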