2022/09/27 第14回最先端NLP勉強会 (14th Advanced NLP Study Group)

Slide note: very similar — the FFN module behaves much like an attention head, i.e., as a key-value memory.

Figure 2 (paper caption): Illustration of how an FFN module in a Transformer block works as a key-value memory. The first linear layer FFN^(key) computes intermediate neurons through inner product. Taking the activation of these neurons as weights, the second linear layer FFN^(val) integrates value vectors through weighted sum. We hypothesize that knowledge neurons in the FFN module are responsible for expressing factual knowledge.

Excerpt (paper, Section 1): [...] effectiveness of the proposed knowledge attribution method. First, suppressing and amplifying knowledge neurons notably affects the expression of the corresponding knowledge. Second, we find that knowledge neurons of a fact tend to be activated more by the corresponding knowledge-expressing prompts. Third, given the knowledge neurons of a fact, the top activating prompts retrieved from open-domain texts usually express the corresponding fact, while the bottom activating prompts do not express the correct relation. In our case studies, we try to leverage knowledge neurons to explicitly edit factual knowledge in pretrained Transformers without any fine-tuning. We present two preliminary studies: updating facts, and erasing relations. After identifying the knowledge neurons, we perform a knowledge surgery in Transformers, even without any fine-tuning.

2 Background: Transformer

Transformer (Vaswani et al., 2017) is one of the most popular and effective NLP architectures. A Transformer encoder is stacked with L identical blocks. Each Transformer block mainly contains two modules: a self-attention module and a feed-forward network (abbreviated as FFN) module. Let X ∈ ℝ^{n×d} denote the input matrix; the two modules can be formulated as follows:

    Q_h = X W_h^Q,  K_h = X W_h^K,  V_h = X W_h^V,      (1)
    Self-Att_h(X) = softmax(Q_h K_h^\top) V_h,           (2)
    FFN(H) = gelu(H W_1) W_2,                            (3)

where W_h^Q, W_h^K, W_h^V, W_1, W_2 are parameter matrices (the slide labels W_1 and W_2 as weight matrices).

[Figure: an attention head as a key-value memory — inner products between the query and the key vectors give attention weights, which form a weighted sum of the value vectors.]

Slide annotation (translated from Japanese):
1. Treat the input as a query and compute a weight for each key via its inner product with the query.
2. Sum the values scaled by these weights (a weighted sum).
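To make the key-value-memory reading of Eqs. (1)–(3) concrete, here is a minimal NumPy sketch (not from the paper; the dimensions n, d, d_ff, the random weights, and the tanh-approximate gelu are all illustrative assumptions). It computes one self-attention head and one FFN layer exactly as written in the equations, then checks that the FFN output at a single position equals a weighted sum of the rows of W_2 (the "values"), with weights given by the intermediate activations gelu(h · k_i) against the columns of W_1 (the "keys") — the view sketched in Figure 2 and in the slide annotation above.

import numpy as np

def gelu(x):
    # tanh approximation of GELU (illustrative choice)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: n tokens, hidden size d, FFN inner size d_ff.
n, d, d_ff = 4, 8, 16
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))

# Eqs. (1)-(2): one self-attention head with per-head projections W_h^Q, W_h^K, W_h^V.
W_Q, W_K, W_V = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ W_Q, X @ W_K, X @ W_V
self_att = softmax(Q @ K.T) @ V              # softmax(Q_h K_h^T) V_h

# Eq. (3): FFN(H) = gelu(H W1) W2.
W1 = rng.normal(size=(d, d_ff))
W2 = rng.normal(size=(d_ff, d))
H = self_att                                 # stand-in for the FFN input
ffn = gelu(H @ W1) @ W2

# Key-value-memory view (Figure 2): for one position h,
#   FFN(h) = sum_i gelu(h . k_i) * v_i,
# where k_i is the i-th column of W1 (a "key") and v_i the i-th row of W2 (a "value").
h = H[0]
weights = gelu(h @ W1)                       # activations of the intermediate neurons
kv_view = sum(weights[i] * W2[i] for i in range(d_ff))
assert np.allclose(kv_view, ffn[0])

The assertion is just the observation that a matrix product with W_2 can be read row by row: each intermediate neuron selects how strongly its "value" row of W_2 is written into the output, which is why a handful of strongly activated ("knowledge") neurons can dominate the expression of a fact.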