
[Reading group materials] A Frustratingly Easy Approach for Entity and Relation Extraction

tossy
June 03, 2022

Slides explaining the paper "A Frustratingly Easy Approach for Entity and Relation Extraction".

Transcript

1. A Frustratingly Easy Approach for Entity and Relation Extraction
    Zexuan Zhong, Danqi Chen. NAACL 2021 (North American Chapter of the Association for Computational Linguistics)
    URL: https://aclanthology.org/2021.naacl-main.5/ (citations: 53)
    What is this paper about? Paper contributions and key points:
    • Proposes a simple and effective approach for named entity recognition and relation extraction
      ・Learns two independent encoders
      ・Inserts typed entity markers when training the relation model
    • Outperforms all previous joint models on three datasets
    • An efficient approximation: 8-16x speedup with a small accuracy drop
    Related work: Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP '19
2. Introduction: Entity and Relation Extraction
    Named Entity Recognition [01],[02] and Relation Extraction [03],[04]
    Input example: "morpa is a fully implemented parser for a text-to-speech system"
    [01] Sang+: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition, CoNLL '03
    [02] Ratinov+: Design challenges and misconceptions in named entity recognition, CoNLL '09
    [03] Zelenko+: Kernel methods for relation extraction, EMNLP '02
    [04] Bunescu+: A shortest path dependency kernel for relation extraction, EMNLP '05
3. Problem definition
    Input: a sentence X of n tokens, where each token represents a single word. S denotes the set of all m spans of up to length L = 8, where a span is a contiguous sequence of tokens; START(i) and END(i) denote the start and end indices of span s_i. (A span-enumeration sketch follows.)
    Output (entity model): for each span s_i, an entity type y_e(s_i) ∈ E ∪ {ε}.
    Output (relation model): for each subject/object span pair (s_i, s_j), a relation type y_r(s_i, s_j) ∈ R ∪ {ε}.
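To make the span definition concrete, here is a minimal Python sketch (not from the slides; the function name and output format are illustrative) that enumerates all candidate spans of up to L = 8 tokens:

```python
def enumerate_spans(tokens, max_length=8):
    """Enumerate candidate spans s_i as inclusive (START(i), END(i)) index
    pairs, covering every contiguous sequence of up to max_length tokens."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start, min(start + max_length, len(tokens))):
            spans.append((start, end))
    return spans

tokens = "Bill Smith was in the hotel room".split()
spans = enumerate_spans(tokens)
print(len(spans))    # m = 28 candidate spans for this 7-token sentence
print(spans[:3])     # [(0, 0), (0, 1), (0, 2)]
```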
4. Introduction: Early and Recent Work
    Early work: pipelined approach
    • Trains the entity model and the relation model separately [05],[06],[07]
    Recent work: end-to-end approach
    • Models entities and relations jointly [08]-[17]
    • Joint models can better capture the interactions between entities and relations
    [05] Zhou+: Exploring various knowledge in relation extraction, ACL '05
    [06] Kambhatla+: Combining lexical, syntactic, and semantic features with maximum entropy models for information extraction, ACL '04
    [07] Chan+: Exploiting syntactico-semantic structures for relation extraction, ACL-HLT '11
    [08] Li+: Incremental joint extraction of entity mentions and relations, ACL '14
    [09] Miwa+: End-to-end relation extraction using LSTMs on sequences and tree structures, ACL '16
    [10] Katiyar+: Going out on a limb: Joint extraction of entity mentions and relations without dependency trees, ACL '17
    [11] Zhang(a)+: End-to-end neural relation extraction with global optimization, EMNLP '17
    [12] Zhang(b)+: Position-aware attention and supervised data improve slot filling, EMNLP '17
    [13] Li+: Entity-relation extraction as multi-turn question answering, ACL '19
    [14] Luan+: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction, EMNLP '18
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP '19
    [16] Lin+: A joint neural model for information extraction with global features, ACL '20
    [17] Wang+: Two are better than one: Joint entity and relation extraction with table-sequence encoders, EMNLP '20
5. Introduction: Joint models
    Multi-task learning [16]: span representations are shared between the two tasks.
    The output vectors serve as inputs to the relation propagation layer and to the entity and relation prediction layers.
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL '19
6. This work's approach
    • Is it good to share span representations between the entity and relation tasks, as joint models do?
    • → Learns two independent encoders
    • Inserts typed entity markers when training the relation model
    • An efficient approximation: 8-16x speedup with a small accuracy drop
    Insert markers: Bill Smith was in the hotel room → [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
7. Entity model
    Input: "Bill Smith was in the hotel room"
    An entity encoder (bert-base-uncased / albert-xxlarge-v1 / scibert-scivocab-uncased) produces contextualized token representations, which are combined into span representations ("Bill", "Bill Smith", ..., "room").
8. Entity model
    Each span representation is fed into a 2-layer FFNN with 150 hidden units, followed by a dense layer and a softmax that predicts the entity type (e.g., PER for "Bill Smith").
9. Entity model
    The same dense + softmax classifier is applied to every candidate span in the sentence (a sketch of the full model follows).
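A minimal PyTorch sketch of the entity model built up on slides 7-9, assuming the paper's span representation (start-token vector, end-token vector, and a learned span-width embedding, concatenated); the encoder is stubbed out with random vectors, and all names and sizes are illustrative:

```python
import torch
import torch.nn as nn

class SpanEntityClassifier(nn.Module):
    """Classify each candidate span into an entity type (incl. 'none')."""

    def __init__(self, hidden_size=768, width_dim=150, num_types=8, max_width=8):
        super().__init__()
        self.width_emb = nn.Embedding(max_width, width_dim)
        # 2-layer FFNN with 150 hidden units, as on slide 8.
        self.ffnn = nn.Sequential(
            nn.Linear(2 * hidden_size + width_dim, 150), nn.ReLU(),
            nn.Linear(150, 150), nn.ReLU(),
        )
        self.classifier = nn.Linear(150, num_types)  # dense layer before softmax

    def forward(self, token_reprs, spans):
        # token_reprs: (seq_len, hidden) contextualized vectors from the encoder.
        starts = torch.tensor([s for s, _ in spans])
        ends = torch.tensor([e for _, e in spans])
        span_repr = torch.cat(
            [token_reprs[starts], token_reprs[ends], self.width_emb(ends - starts)],
            dim=-1,
        )
        return self.classifier(self.ffnn(span_repr))  # one logit row per span

# Random vectors stand in for BERT outputs over the 7-token sentence.
token_reprs = torch.randn(7, 768)
logits = SpanEntityClassifier()(token_reprs, [(0, 1), (5, 6)])  # "Bill Smith", "hotel room"
print(logits.shape)  # torch.Size([2, 8]); argmax per row gives the entity type
```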
10. Relation model
    Input: "Bill Smith was in the hotel room" with entities Bill Smith (PER ※ Person), hotel (FAC ※ Facilities), room (FAC ※ Facilities)
    Output:
    • (Bill Smith [PER], room [FAC]) → PHYS (※ Physical)
    • (hotel [FAC], room [FAC]) → PART-WHOLE
11. Relation model: Inserting Markers
    For each candidate (subject, object) pair in "Bill Smith was in the hotel room" (entities: Bill Smith [PER], hotel [FAC], room [FAC]), typed entity markers are inserted (a sketch follows this slide):
    • (Bill Smith, hotel): [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
    • (Bill Smith, room): [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
    • (hotel, room): Bill Smith was in the [S:FAC] hotel [/S:FAC] [O:FAC] room [/O:FAC]
    • ……
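A sketch of the marker insertion shown above, assuming spans are given as inclusive token indices and the subject and object spans do not nest; only the [S:TYPE] … [/S:TYPE] / [O:TYPE] … [/O:TYPE] marker format comes from the slides, the rest is illustrative:

```python
def insert_typed_markers(tokens, subj, obj):
    """Wrap the subject and object spans in typed entity markers.

    subj / obj: (start, end, entity_type) with inclusive token indices.
    """
    s_start, s_end, s_type = subj
    o_start, o_end, o_type = obj
    out = []
    for i, tok in enumerate(tokens):
        if i == s_start:
            out.append(f"[S:{s_type}]")
        if i == o_start:
            out.append(f"[O:{o_type}]")
        out.append(tok)
        if i == s_end:
            out.append(f"[/S:{s_type}]")
        if i == o_end:
            out.append(f"[/O:{o_type}]")
    return out

tokens = "Bill Smith was in the hotel room".split()
print(" ".join(insert_typed_markers(tokens, (0, 1, "PER"), (6, 6, "FAC"))))
# [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
```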
12. Relation model
    Modified input: [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
    A relation encoder (bert-base-uncased / albert-xxlarge-v1 / scibert-scivocab-uncased) produces contextualized representations; the vectors at the two start markers [S:PER] and [O:FAC] are concatenated into the span-pair representation for Bill Smith → room.
13. Relation model
    The span-pair representation is fed into a dense layer and a softmax to predict the relation type (here PHYS). A sketch follows.
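A sketch of the relation classifier on slides 12-13, assuming (as in the paper) that the span-pair representation concatenates the encoder outputs at the two start-marker positions; all names and sizes are illustrative:

```python
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    """Predict a relation type (incl. 'none') for one subject/object pair
    from the contextualized vectors at the two start-marker positions."""

    def __init__(self, hidden_size=768, num_relations=7):
        super().__init__()
        self.classifier = nn.Linear(2 * hidden_size, num_relations)

    def forward(self, token_reprs, subj_marker_idx, obj_marker_idx):
        # token_reprs: (seq_len, hidden) relation-encoder outputs over the
        # marker-augmented sentence.
        pair = torch.cat(
            [token_reprs[subj_marker_idx], token_reprs[obj_marker_idx]], dim=-1
        )
        return self.classifier(pair)  # logits over relation types

# The marker-augmented example has 11 tokens; [S:PER] is at index 0 and
# [O:FAC] at index 8. Random vectors stand in for the encoder outputs.
token_reprs = torch.randn(11, 768)
logits = RelationClassifier()(token_reprs, 0, 8)
print(logits.shape)  # torch.Size([7]); argmax gives the relation type (PHYS)
```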
14. Cross-sentence context
    • Cross-sentence context helps predict entity types and relations [15],[16]
    • Related work: add a 3-sentence context window [15]
    • This work: given an input sentence with n words, augment the input with (W - n) / 2 words each from the left and right context, up to a fixed window size W → Entity model: W = 300, Relation model: W = 100 (a sketch follows)
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP '19
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL '19
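A sketch of this context augmentation, assuming token lists for the sentence and its surrounding document text are available; the function and variable names are illustrative:

```python
def add_context(sentence, left_context, right_context, window):
    """Extend an n-token sentence toward a fixed window size by taking
    (window - n) // 2 tokens from each side of the surrounding document."""
    extra = max(window - len(sentence), 0) // 2
    left = left_context[-extra:] if extra else []
    right = right_context[:extra] if extra else []
    return left + sentence + right

sentence = "Bill Smith was in the hotel room".split()
doc_left = "He arrived late at night .".split()
doc_right = "The next morning he left .".split()
# window=13 pads the 7-token sentence with 3 tokens on each side.
print(add_context(sentence, doc_left, doc_right, window=13))
```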
15. Experiments: Dataset
    • ACE04 [18], ACE05 [19]: a variety of domains (newswire and online forums)
      Entity types: PER, ORG, GPE, LOC, FAC, VEH, WEA
      Relation types: Physical, Part-Whole, Personal-Social, ORG-Affiliation, Agent-Artifact, Gen-Affiliation
    • SciERC [20]: 500 AI paper abstracts, scientific terms and relations
      Entity types: Task, Method, Evaluation Metric, Material, Other Scientific Terms, Generic
      Relation types: Used-for, Feature-of, Hyponym-of, Part-of, Compare, Conjunction
    [18] https://catalog.ldc.upenn.edu/LDC2005T09
    [19] https://catalog.ldc.upenn.edu/LDC2006T06
    [20] Luan+: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction, EMNLP '18, http://nlp.cs.washington.edu/sciIE/
16. Experiments: Evaluation metrics
    • Micro F1
    • Named Entity Recognition: both the span boundaries and the predicted entity type must be correct
    • Relation extraction (a sketch of both metrics follows):
      (1) Boundaries evaluation (Rel): the boundaries of the two spans and the predicted relation type are correct
      (2) Strict evaluation (Rel+): in addition to Rel, the predicted entity types must also be correct
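A sketch of the two relation metrics, assuming gold and predicted relations are given as (subject span, subject type, object span, object type, relation type) tuples over the whole evaluation set; the tuple layout is illustrative:

```python
def relation_f1(gold, pred, strict=False):
    """Micro F1 over relation tuples.

    Rel  (strict=False): span boundaries and relation type must match.
    Rel+ (strict=True):  the two entity types must also be correct.
    """
    def key(r):
        subj_span, subj_type, obj_span, obj_type, rel_type = r
        if strict:
            return (subj_span, subj_type, obj_span, obj_type, rel_type)
        return (subj_span, obj_span, rel_type)

    gold_keys, pred_keys = {key(r) for r in gold}, {key(r) for r in pred}
    tp = len(gold_keys & pred_keys)
    precision = tp / len(pred_keys) if pred_keys else 0.0
    recall = tp / len(gold_keys) if gold_keys else 0.0
    return 2 * precision * recall / (precision + recall) if tp else 0.0

gold = [((0, 1), "PER", (6, 6), "FAC", "PHYS")]
pred = [((0, 1), "PER", (6, 6), "GPE", "PHYS")]  # wrong object entity type
print(relation_f1(gold, pred))               # 1.0 under Rel
print(relation_f1(gold, pred, strict=True))  # 0.0 under Rel+
```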
17. Main results: Entity model
    • Cross-sentence information is useful, and pre-trained transformer encoders can capture long-range dependencies from a large context.
    Encoder legend: L = LSTM, L+E = LSTM+ELMo, Bb = BERT-base, Bl = BERT-large, ALB = ALBERT-xxlarge-v1, SciB = SciBERT; the results table also marks models that use cross-sentence information or are trained with additional data (e.g., coreference).
18. Main results: Relation model
    • The gains come from learning separate representations for entities and for the relations of different entity pairs, as well as from early fusion of entity information in the relation model.
    (Encoder legend as on slide 17.)
19. Main results: compared to the previous SOTA
    • Even compared to the previous SOTA models, and without using cross-sentence context, this result clearly demonstrates the superiority of the proposed model.
    (Encoder legend as on slide 17.)
20. Analysis: Importance of Typed Text Markers (Relation F1, predicting PHYS)
    • Typed markers (proposed): [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC] → 72.6%
    • Untyped markers: [S] Bill Smith [/S] was in the hotel [O] room [/O] → 70.5%
    • Untyped markers + entity auxiliary loss [15],[16] (entity types PER/FAC supervised via the loss) → 70.7%
    • No marker: Bill Smith was in the hotel room → 67.6%
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP '19
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL '19
21. Analysis: Modeling Entity-Relation Interactions
    • Does sharing encoders help? → No: sharing the two encoders does not improve performance.
    • The two tasks have different input formats and require different features for predicting entity types and relations → separate encoders indeed learn better task-specific features.
22. Approximation model with batch computations
    The relation model builds one marker-augmented copy of the sentence per candidate pair:
    • [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
    • [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
    • Bill Smith was in the [S:FAC] hotel [/S:FAC] [O:FAC] room [/O:FAC]
    • ……
    One shortcoming of this approach is that the relation model has to be run once for every pair of entities.
23. Approximation model with batch computations
    Instead, the markers for all candidate pairs ([S:PER] [/S:PER] [O:FAC] [/O:FAC], ……) are appended to a single copy of the sentence, and all pairs are scored in one run of the relation model → 8-16x speedup with only a ~1% accuracy drop. A sketch of the trick follows.
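A sketch of the batching trick, assuming (as described in the paper) that each appended marker reuses the position id of the span boundary it stands for, and that attention is masked so text tokens never attend to markers while each pair's markers attend to the text and to each other; the function name and output format are illustrative, and the encoder itself is omitted:

```python
import torch

def build_batched_inputs(n_tokens, pairs):
    """Append 4 markers per (subject_span, object_span) pair after the text.

    Returns position ids (markers reuse the position of the span boundary
    they mark) and a boolean attention mask: text attends only to text,
    and each pair's markers attend to the text plus that pair's markers.
    """
    marker_positions = []
    for (s_start, s_end), (o_start, o_end) in pairs:
        marker_positions += [s_start, s_end, o_start, o_end]
    total = n_tokens + len(marker_positions)
    position_ids = list(range(n_tokens)) + marker_positions

    attend = torch.zeros(total, total, dtype=torch.bool)
    attend[:n_tokens, :n_tokens] = True              # text -> text only
    for p in range(len(pairs)):
        idx = torch.arange(n_tokens + 4 * p, n_tokens + 4 * p + 4)
        attend[idx, :n_tokens] = True                # markers -> text
        attend[idx.unsqueeze(1), idx] = True         # markers -> own pair
    return torch.tensor(position_ids), attend

# "Bill Smith was in the hotel room": score two candidate pairs in one pass.
pos, mask = build_batched_inputs(7, [((0, 1), (5, 6)), ((0, 1), (6, 6))])
print(pos.tolist())  # [0..6] then the reused span-boundary positions
print(mask.shape)    # torch.Size([15, 15])
```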
24. Conclusion
    • Proposed a simple and effective approach for named entity recognition and relation extraction
    • Learns two independent encoders
    • Inserts typed entity markers when training the relation model
    • An efficient approximation: 8-16x speedup with a small accuracy drop
    Insert markers: [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room