Practical and Interpretable Deep Learning Techniques in the Iyatomi Lab

Slides for "The 1st Univ. Carthage - Hosei International Joint Webinar with honorable support by the Embassy of the Republic of Tunisia in Japan: Recent Issues in Intelligent Robotics, Machine Learning and Distributed System", Mar. 17th, 2021.

Shunsuke KITADA

March 17, 2021

Transcript

  1. Practical and Interpretable Deep Learning Techniques in the Iyatomi Lab

    Shunsuke Kitada
    1st-year Ph.D. student, Major in Applied Informatics, Graduate School of Science and Engineering, Hosei University
    The 1st Univ. Carthage - Hosei International Joint Webinar with honorable support by the Embassy of the Republic of Tunisia in Japan: Recent Issues in Intelligent Robotics, Machine Learning and Distributed System, Mar. 17th, 2021
    The figures and formulas in this presentation are reproduced from the cited papers.
  2. Self-introduction

    Shunsuke Kitada
    • 1st-year Ph.D. student at Hosei Univ.
    • JSPS Research Fellow (DC2)
    Research interests:
    • Natural Language Processing (NLP)
      ◦ Learning character-level compositionality
        ▪ from Kanji (Japanese characters) [Kitada+ AIPRW’18, Aoki+ AACL SRW’20]
        ▪ from Arabic [Daif+ ACL SRW’20]
      ◦ Developing perturbation-robust and interpretable deep learning models [Kitada+ IEEE Access’21, Kitada+ CoRR’21]
    • Medical image processing
      ◦ Recognizing skin cancer from skin images [Kitada+ CoRR’18]
    • Computational advertising
      ◦ Supporting the creation of effective ad creatives [Kitada+ KDD’19]
    HP: shunk031.me / GitHub
  3. About the Iyatomi lab

    Research topics: automatic plant disease diagnosis, cybersecurity, CBIR on MRI, skin cancer, natural language processing (NLP)
  5. Natural language processing with deep learning models

    • Natural language processing (NLP)
      ◦ A field of AI that gives machines the ability to read, understand, and derive meaning from human languages
      ◦ Deep learning models have achieved excellent prediction performance in this field as well
    (Figure: the Japanese character 猫, “cat”, and its byte-level representation)
    However, these models generally become black boxes whose predictions are difficult to interpret
    ➜ In recent years, deep learning research has placed more emphasis on interpretability and robustness
    The key ideas for improving these two aspects are attention mechanisms and adversarial training
  6. Attention Mechanisms in NLP

    Attention mechanisms [Bahdanau+’14] (image from Bahdanau+’14)
    • learn conditional distributions over input units to compose a weighted context vector (a minimal sketch follows below)
    • contribute significantly to improving the performance of NLP tasks, e.g., text classification [Lin+’17], question answering [Golub+’16], natural language inference [Parikh+’16]
    Interpretability through the mechanisms
    • Attention weights are often claimed to afford insights into the “inner workings” of models
      ➜ “Attention provides an important way to explain the workings of neural models” [Li+’16]
    • Claims that attention provides interpretability are common in the literature [Xu+’15, Choi+’16, Xie+’17, Lin+’17]
    (Image from Lin+’17: attention heatmap of a 5-star Yelp review)
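    Below is a minimal PyTorch sketch of such an additive attention mechanism; the class name `AdditiveAttention` and the dimensions are illustrative choices of mine, not code from the cited papers.

    ```python
    import torch
    import torch.nn as nn

    class AdditiveAttention(nn.Module):
        """Additive (Bahdanau-style) attention over encoder hidden states."""

        def __init__(self, hidden_dim: int, attn_dim: int = 64):
            super().__init__()
            self.proj = nn.Linear(hidden_dim, attn_dim)      # W h_t + b
            self.score = nn.Linear(attn_dim, 1, bias=False)  # v^T tanh(.)

        def forward(self, hidden_states: torch.Tensor):
            # hidden_states: (batch, seq_len, hidden_dim)
            scores = self.score(torch.tanh(self.proj(hidden_states))).squeeze(-1)
            attn = torch.softmax(scores, dim=-1)             # (batch, seq_len)
            # Weighted context vector: hidden states weighted by attention
            context = torch.einsum("bs,bsh->bh", attn, hidden_states)
            return context, attn

    # The weights in `attn` are what is inspected for interpretability.
    h = torch.randn(2, 10, 128)                              # toy encoder output
    context, attn = AdditiveAttention(128)(h)
    ```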
  7. Attention Mechanisms in NLP (cont.)

    However, it has been pointed out that DNN models tend to be locally unstable: even tiny perturbations to the original inputs [Szegedy+’13] or to the attention mechanisms [Jain+’19] can mislead the models
    ➜ Such malicious perturbations are called adversarial examples or adversarial perturbations
  8. Overcoming the vulnerability to adversarial examples: Adversarial Training

    Adversarial Training (AT) [Goodfellow+’14] (images from Goodfellow+’14 and Miyato+’16)
    • aims to improve the robustness of a model to input perturbations by training on adversarial examples (a minimal sketch of the objective follows below)
    • was primarily explored in the image recognition field, where it demonstrated enhanced robustness [Shaham+’18]
    AT is widely used across the NLP field:
    • Text classification [Miyato+’16, Sato+’18]
    • Part-of-speech tagging [Yasunaga+’18]
    • Relation extraction [Wang+’18]
    However, in the context of attention mechanisms in NLP, the specific robustness effects of AT remain unclear.
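    Below is a minimal PyTorch sketch of the AT objective described above, assuming a continuous input `x` (e.g., pixels or embeddings); the function name and default ε are illustrative, not the authors' implementation.

    ```python
    import torch
    import torch.nn.functional as F

    def adversarial_loss(model, x, y, eps=0.05):
        """Clean loss plus loss on an adversarial example of size eps."""
        x = x.clone().detach().requires_grad_(True)
        clean_loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(clean_loss, x, retain_graph=True)
        # Worst-case perturbation approximated by the (L2-normalized)
        # gradient direction, scaled to size eps
        r_adv = eps * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
        adv_loss = F.cross_entropy(model(x + r_adv.detach()), y)
        return clean_loss + adv_loss  # train on clean + adversarial examples
    ```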
  9. Adversarial training in NLP

    Adversarial perturbation of word embeddings (image from Sato+’18: AT [Miyato+’16] vs. iAT [Sato+’18]):
    • AT for word embeddings [Miyato+’16]
      ◦ improves text classification performance by applying AT in the word embedding space
    • Interpretable AT (iAT) for word embeddings [Sato+’18]
      ◦ restricts the direction of the perturbations toward existing words in the word embedding space (see the sketch below)
    The attention weight of each word is considered an indicator of that word’s importance
    ➜ In terms of interpretability, attention weights are a higher-order feature than word embeddings, so applying AT to attention mechanisms is expected to be even more effective
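    The following is a hedged sketch of the direction restriction behind iAT; the names `emb_table` and `alpha` are illustrative, and the adversarial choice of the combination weights in [Sato+’18] is omitted here.

    ```python
    import torch

    def interpretable_perturbation(word_vec, emb_table, alpha, eps=0.05):
        # word_vec: (dim,); emb_table: (vocab, dim); alpha: (vocab,) weights
        directions = emb_table - word_vec      # directions toward existing words
        weights = torch.softmax(alpha, dim=0)  # how much each word pulls
        r = weights @ directions               # (dim,) combined direction
        return eps * r / (r.norm() + 1e-12)    # scaled to perturbation size eps
    ```

    Because the perturbation points toward real words, the resulting adversarial example can be read off as “this word shifted toward that word”, which is what makes the method interpretable.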
  10. Main contributions of adversarial training for attention mechanisms (from my recent work [Kitada+ IEEE Access’21])

    Investigating the idea of employing AT for attention mechanisms, we obtained the following findings. AT for attention mechanisms:
    • improves the prediction performance of various NLP tasks
    • helps the model learn cleaner attention
    • is much less dependent on the perturbation size
    (Image from Kitada+’21)
  11. Brief introduction of Adversarial Training for Attention Mechanisms

    Base model: following [Jain+’19], a 1-layer bi-LSTM with an additive attention mechanism (image from Kitada+’21; a minimal sketch follows below)
    • Input layer
      ◦ Word embeddings
    • Intermediate layer
      ◦ Additive attention mechanism
        ▪ AT for attention mechanisms is applied to this layer
    • Output layer
      ◦ Prediction for the target task
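    A minimal PyTorch sketch of such a base model; the class name, hyperparameters, and the `attn_perturbation` hook (used by the sketches on the next slides) are illustrative, not the code of [Jain+’19].

    ```python
    import torch
    import torch.nn as nn

    class AttentionClassifier(nn.Module):
        """Embeddings -> 1-layer bi-LSTM -> additive attention -> linear head."""

        def __init__(self, vocab_size, emb_dim=300, hidden=128, n_classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                                bidirectional=True)
            self.attn_score = nn.Sequential(
                nn.Linear(2 * hidden, 64), nn.Tanh(),
                nn.Linear(64, 1, bias=False))
            self.out = nn.Linear(2 * hidden, n_classes)

        def forward(self, tokens, attn_perturbation=None):
            h, _ = self.lstm(self.emb(tokens))        # (batch, seq, 2*hidden)
            scores = self.attn_score(h).squeeze(-1)   # pre-softmax attention
            if attn_perturbation is not None:         # hook for Attention AT/iAT
                scores = scores + attn_perturbation
            attn = torch.softmax(scores, dim=-1)
            context = torch.einsum("bs,bsh->bh", attn, h)
            return self.out(context), attn
    ```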
  12. Attention AT: Adversarial Training for Attention Mechanisms

    The main idea is to apply AT to the attention score ã:
    • The adversarial perturbation r_adv is defined as the worst-case perturbation of size ε that maximizes the loss function of the current model:
        r_adv = argmax_{r : ||r|| ≤ ε} L(X, ã + r; y)
      (X: input word sequence; ã + r: perturbed attention score; y: ground truth)
    • The adversarial attention is constructed as ã_adv = ã + r_adv
    • Train the model with these adversarial examples (see the sketch below)
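    A hedged sketch of this objective, reusing the `AttentionClassifier` sketch from the previous slide; it mirrors the idea described here, not the paper’s exact implementation.

    ```python
    import torch
    import torch.nn.functional as F

    def attention_at_loss(model, tokens, y, eps=1.0):
        logits, _ = model(tokens)
        clean_loss = F.cross_entropy(logits, y)

        # Worst-case perturbation of the attention scores, approximated by
        # the (L2-normalized) gradient direction scaled to size eps
        zeros = torch.zeros(tokens.shape, dtype=torch.float, requires_grad=True)
        adv_logits, _ = model(tokens, attn_perturbation=zeros)
        grad, = torch.autograd.grad(F.cross_entropy(adv_logits, y), zeros)
        r_adv = eps * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)

        # Train on the adversarially perturbed attention as well
        adv_logits, _ = model(tokens, attn_perturbation=r_adv.detach())
        return clean_loss + F.cross_entropy(adv_logits, y)
    ```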
  13. Attention iAT: Interpretable Adversarial Training for Attention Mechanisms

    Attention iAT enhances the differences in attention between words; these differences lead to clear and interpretable attention. It:
    • defines the normalized difference vector d_ij as the normalized difference between the attention scores of words i and j in a sentence:
        d_ij = (ã_i − ã_j) / ||ã_i − ã_j||
    • defines the perturbation for attention as a combination of the difference vectors with trainable weights w:
        r_i = Σ_j w_j d_ij
    • seeks the worst-case weights w of the difference vectors that maximize the loss function L(X, ã + εr; y)
      (X: input word sequence; ã + εr: perturbed attention score; y: ground truth; see the sketch below)
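    A hedged sketch of the Attention iAT perturbation, again reusing the `AttentionClassifier` sketch; the single gradient ascent step on the weights `w` is my simplification of the worst-case search.

    ```python
    import torch
    import torch.nn.functional as F

    def iat_perturbation(model, tokens, y, eps=1.0):
        # Pairwise normalized differences of attention scores in each sentence
        with torch.no_grad():
            _, attn = model(tokens)                   # (batch, seq)
        diff = attn.unsqueeze(2) - attn.unsqueeze(1)  # (batch, seq, seq)
        diff = diff / (diff.norm(dim=-1, keepdim=True) + 1e-12)

        # Trainable combination weights; one ascent step on the loss
        w = torch.zeros(tokens.shape, requires_grad=True)
        r = torch.einsum("bj,bij->bi", torch.softmax(w, dim=-1), diff)
        logits, _ = model(tokens, attn_perturbation=eps * r)
        grad, = torch.autograd.grad(F.cross_entropy(logits, y), w)

        # Worst-case weights and the resulting perturbation
        w_adv = torch.softmax(w.detach() + grad, dim=-1)
        return eps * torch.einsum("bj,bij->bi", w_adv, diff)
    ```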
  14. Experiments | Task & model settings

    • Tasks and datasets
      ◦ Binary classification (BC): 4 datasets
        ▪ Stanford Sentiment Treebank (SST) [Socher+’13], IMDB Movie Review Corpus [Maas+’11], 20 Newsgroups Corpus [Lang+’95], AGNews Corpus [Zhang+’15]
      ◦ Question answering (QA): 2 datasets
        ▪ CNN News [Hermann+’15], bAbI tasks 1, 2, 3 [Weston+’16]
      ◦ Natural language inference (NLI): 2 datasets
        ▪ SNLI [Bowman+’15], MultiNLI [Williams+’17]
    • Model settings
      ◦ Vanilla model (described in the base-model section) [Jain+’19]
      ◦ Word AT [Miyato+’16]: AT applied to word embeddings
      ◦ Word iAT [Sato+’18]: iAT applied to word embeddings
      ◦ Attention RP: random perturbation applied to attention
      ◦ Attention AT (proposed): AT applied to attention
      ◦ Attention iAT (proposed): iAT applied to attention
  15. Evaluation criteria

    • Prediction performance (following [Jain+’19])
      ◦ F1 score for BC, accuracy for QA, and micro-F1 for NLI
    • Correlation with word importance
      ◦ How well the attention weights obtained by the proposed methods agree with word importance calculated from gradients [Simonyan+’13], measured by Pearson’s correlation (see the sketch below)
    • Effect of perturbation size
      ◦ Randomly chose ε in the range 0-30 and ran training 100 times
    (Figure: for the sentence “The movie was pretty good”, word importance from backpropagated gradients vs. learned attention weights)
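    A minimal sketch of this agreement measure, reusing the attributes of the base-model sketch (`emb`, `lstm`, `attn_score`, `out`); it computes gradient-based word importance [Simonyan+’13] and its per-sentence Pearson correlation with the attention weights.

    ```python
    import torch
    import torch.nn.functional as F

    def word_importance_correlation(model, tokens, y):
        # Forward pass that exposes the embeddings for gradient computation
        emb = model.emb(tokens).detach().requires_grad_(True)
        h, _ = model.lstm(emb)
        scores = model.attn_score(h).squeeze(-1)
        attn = torch.softmax(scores, dim=-1)
        logits = model.out(torch.einsum("bs,bsh->bh", attn, h))

        # Gradient-based word importance: norm of d(loss)/d(embedding)
        grad, = torch.autograd.grad(F.cross_entropy(logits, y), emb)
        importance = grad.norm(dim=-1)                # (batch, seq)

        # Pearson correlation per sentence between attention and importance
        a = attn - attn.mean(-1, keepdim=True)
        g = importance - importance.mean(-1, keepdim=True)
        return (a * g).sum(-1) / (a.norm(dim=-1) * g.norm(dim=-1) + 1e-12)
    ```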
  16. Results | Binary classification task

    • Prediction performance
      ◦ Attention AT/iAT showed a clear advantage over the model without AT as well as over the other AT-based techniques
    • Correlation with word importance
      ◦ Attention to words obtained with Attention AT/iAT correlated notably with word importance as determined by the gradients
  17. Results | QA and NLI tasks
  18. Results | QA and NLI tasks

    We observed similar trends in the other datasets/tasks. The details of the results are shown in [Kitada+’20].
  19. Visualization of learned attention weights for each word

    (Panels: Vanilla / Attention AT / Attention iAT)
  20. (Attention visualizations: Vanilla / Attention AT / Attention iAT)
  21. Attention AT yielded clearer attention than the Vanilla model or Attention iAT

    ➜ Attention AT tended to focus attention strongly on a few words
    (Attention visualizations: Vanilla / Attention AT / Attention iAT)
  23. (Attention visualizations: Vanilla / Attention AT / Attention iAT)
  24. Correlation between attention-based and gradient-based word importance

    Attention iAT demonstrated higher similarity than the other models.
    (Panels: Vanilla / Attention AT / Attention iAT)
  26. Effect of perturbation size ε

    • The performance of the conventional Word AT/iAT deteriorated as the perturbation size increased
    • Attention AT/iAT maintained almost the same prediction performance regardless of the perturbation size
  27. Conclusion | Adversarial training for attention mechanisms

    • The key ideas for improving model interpretability and prediction performance in deep learning:
      ◦ attention mechanisms and adversarial training
    • My recent work proposed Attention AT and Attention iAT, training techniques for robust and interpretable attention mechanisms that exploit adversarial training
      ◦ They achieve better performance than techniques applying AT to word embeddings
    • Attention iAT introduced adversarial perturbations that
      ◦ emphasize differences in the importance of words
      ◦ combine high accuracy with clear attention, which is strongly correlated with gradient-based word importance
    Thank you for your kind attention :)
    [email protected] / HP: shunk031.me / Feel free to contact me!