Watson
August 04, 2018

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing / snlp2018

Transcript

  1. StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

    Original paper authors: Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig
    Presenter: Takuto Asakura (NII / University of Tokyo, Miyao Lab)
    10th State-of-the-Art NLP Study Group (SNLP), August 4, 2018
  2. Meta overview

    Paper information
      Accepted at ACL 2018
      Tied for 7th place in the SNLP 2018 vote
      Presentation slides: http://pcyin.me/structVAE.acl.pdf
      Archive: https://arxiv.org/abs/1806.07832v1
    StructVAE
      A semantic parsing model trained in a semi-supervised manner
      Implementation: https://github.com/pcyin/structvae (planned)
    Note: unless otherwise stated, the figures and tables that follow are taken from the original paper and its presentation slides.
    Note: for cited references, see the reference list of the original paper.
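  3. Motivation

    Semantic Parsing
      Converts a natural-language sentence into a formal representation of its "meaning"
      Formal representations: λ-expressions, SQL queries, general-purpose programming languages, etc.
      Natural language: Show me flights to Boston.
        → Semantic Parser →
      Formal representation: $\lambda x.\, \mathrm{flight}(x) \wedge \mathrm{to}(x, \mathrm{BOSTON})$
      (figure made by the presenter)
    Challenges in the field
      Supervised training of NN models requires a huge amount of data
      Creating labeled data for semantic parsing is costly
        ∵ building the data requires knowledge and skill, and costs money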
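  4. Semi-supervised learning model

    Task: convert a natural-language sentence x into a meaning representation z
      → in this work, z is a tree-structured representation (AST)
    Training data: two kinds
      Labeled data $L = \{\langle x, z \rangle\}$
      Unlabeled data $U = \{x\}$
    $$J = \underbrace{\sum_{\langle x, z \rangle \in L} \log p_\phi(z \mid x)}_{\text{supervised } (J_s)} + \alpha \underbrace{\sum_{x \in U} \log p(x)}_{\text{unsupervised } (J_u)}$$
    Note: α is a tuning parameter.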
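The combined objective above can be sketched in a few lines. This is only an illustration: the function name `semi_supervised_objective` and the toy log-probabilities are assumptions, and in practice $\log p(x)$ is not available exactly and is replaced by a variational lower bound.

```python
from typing import Sequence


def semi_supervised_objective(
    labeled_log_probs: Sequence[float],
    unlabeled_log_marginals: Sequence[float],
    alpha: float,
) -> float:
    """J = sum over L of log p_phi(z|x) + alpha * sum over U of log p(x)."""
    j_s = sum(labeled_log_probs)        # supervised term J_s
    j_u = sum(unlabeled_log_marginals)  # unsupervised term J_u
    return j_s + alpha * j_u


# Toy values (hypothetical, not from the paper).
labeled = [-0.5, -1.2]          # log p_phi(z | x) for each pair in L
unlabeled = [-3.0, -2.5, -4.0]  # log p(x) for each sentence in U
objective = semi_supervised_objective(labeled, unlabeled, alpha=0.1)
```

A small α keeps the unlabeled term from dominating when U is much larger than L.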
  5. StructVAE: VAEs with Tree-structured Latent Variables

    VAE: Variational Auto-Encoding
      An NN model that incorporates a probability distribution over latent variables
      → has been applied to Seq2Seq semi-supervised learning [Miao+ 2016; Kociský+ 2016]
    StructVAE
      Assumes a latent tree structure behind each natural-language expression
      → unsupervised learning with a VAE
      Reconstruction model $p_\theta(x \mid z)$
      Inference model $q_\phi(z \mid x)$
      → setting $q_\phi(\cdot) \triangleq p_\phi(\cdot)$ makes the inference model the semantic parser itself
    [Figure: first page of the original paper (title, authors, Language Technologies Institute, Carnegie Mellon University, beginning of the abstract) together with "Figure 1: Graphical Representation of STRUCTVAE". The figure shows the inference model $q_\phi(z \mid x)$ mapping the utterance "Sort my_list in descending order" into the structured latent semantic space (MRs) with prior p(z), and the reconstruction model $p_\theta(x \mid z)$ mapping z back to the utterance.]
  6. NN architecture

    Prior p(z): a standard LSTM language model [Zaremba+ 2014]
      The sequence representation $z_s$ of z is used here
      Code generation: AST; Semantic Parsing: linearized s-expression
    Reconstruction model $p_\theta(x \mid z)$:
      A standard attentional Seq2Seq network [Luong+ 2015]
      Copy mechanism (Pointer Networks) [Vinyals+ 2015]
    Inference model $p_\phi(z \mid x)$: based on the authors' paper from the previous year [Yin+ 2017]
      A neural Seq2Seq network
      The decoder is guided by the topology of the AST
    Note: see Appendix B of the original paper for details.
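As an illustration of the sequence representation fed to the prior, here is a minimal sketch of linearizing a tree into s-expression tokens. The nested-tuple encoding, the `linearize` helper, and the example logical form are assumptions made for this sketch, not the paper's preprocessing.

```python
from typing import List, Tuple, Union

# A tree is either a leaf token or a tuple of subtrees (hypothetical encoding).
Tree = Union[str, Tuple["Tree", ...]]


def linearize(tree: Tree) -> List[str]:
    """Flatten a tree into s-expression tokens with explicit brackets."""
    if isinstance(tree, str):
        return [tree]
    tokens = ["("]
    for child in tree:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens


# Toy logical form for "Show me flights to Boston."
logical_form = ("lambda", "$0", ("and", ("flight", "$0"), ("to", "$0", "boston")))
z_s = linearize(logical_form)
```

A language model over such token sequences can then score how plausible a candidate meaning representation is.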
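  7. Semi-supervised learning (1)

    Supervised learning
      Labeled data is used to optimize both the reconstruction model and the inference model:
      $$J_s \triangleq \sum_{(x, z) \in L} \bigl(\log q_\phi(z \mid x) + \log p_\theta(x \mid z)\bigr)$$
    Unsupervised learning
      The VAE maximizes a variational lower bound on the sentence log-probability $\log p(x)$:
      $$\log p(x) \ge \mathcal{L} = \mathbb{E}_{z \sim q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr] - \lambda \cdot \mathrm{KL}\bigl[q_\phi(z \mid x) \,\|\, p(z)\bigr]$$
      λ: tuning parameter
      $\mathrm{KL}[q_\phi \,\|\, p]$: Kullback-Leibler divergence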
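To make the lower bound concrete, the sketch below evaluates it exactly on a toy latent space with three candidate trees. All distributions and names are hypothetical; a real model cannot enumerate z and has to sample instead. With λ = 1 this is the standard bound, which becomes tight when q equals the true posterior.

```python
import math

# Hypothetical toy distributions over three candidate trees for one sentence x.
prior = {"z1": 0.5, "z2": 0.3, "z3": 0.2}        # p(z)
likelihood = {"z1": 0.6, "z2": 0.1, "z3": 0.05}  # p_theta(x | z)
posterior = {"z1": 0.8, "z2": 0.15, "z3": 0.05}  # q_phi(z | x)


def log_marginal(prior, likelihood):
    """log p(x) = log sum_z p(z) p_theta(x | z), by exact enumeration."""
    return math.log(sum(prior[z] * likelihood[z] for z in prior))


def elbo(prior, likelihood, posterior, lam=1.0):
    """E_q[log p_theta(x|z)] - lam * KL[q(z|x) || p(z)], by exact enumeration."""
    reconstruction = sum(q * math.log(likelihood[z]) for z, q in posterior.items())
    kl = sum(q * math.log(q / prior[z]) for z, q in posterior.items())
    return reconstruction - lam * kl


log_px = log_marginal(prior, likelihood)
bound = elbo(prior, likelihood, posterior, lam=1.0)
```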
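  8. Semi-supervised learning (2)

    Unsupervised learning (continued)
      The gradient is approximated with samples S(x) drawn from $q_\phi(\cdot \mid x)$:
      $$\frac{\partial \mathcal{L}}{\partial \phi} = \frac{\partial}{\partial \phi}\, \mathbb{E}_{z \sim q_\phi(z \mid x)} \underbrace{\bigl[\log p_\theta(x \mid z) - \lambda \bigl(\log q_\phi(z \mid x) - \log p(z)\bigr)\bigr]}_{\text{learning signal } l'(x, z)} \approx \frac{1}{|S(x)|} \sum_{z_i \in S(x)} l'(x, z_i)\, \frac{\partial \log q_\phi(z_i \mid x)}{\partial \phi}$$
      To stabilize training, the learning signal is redefined as:
      $$l(x, z) \triangleq l'(x, z) - \underbrace{\bigl(a \cdot \log p(x) + c\bigr)}_{\text{baseline}}$$
      $\log p(x)$: given by a pre-trained LSTM language model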
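The sampling-based estimate above is a score-function (REINFORCE-style) estimator. The sketch below is a toy version under loud assumptions: the inference model is a softmax over three candidate trees with free logits, the scalar `baseline` stands in for $a \cdot \log p(x) + c$, and every name and number is hypothetical. It also computes the exact gradient by enumeration so the two can be compared.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_q(probs, i):
    """d log q(z_i) / d logits for a softmax-parameterised categorical."""
    return [(1.0 if j == i else 0.0) - p for j, p in enumerate(probs)]


def learning_signal(i, probs, log_lik, log_prior, lam):
    """l'(x, z_i) = log p(x|z_i) - lam * (log q(z_i|x) - log p(z_i))."""
    return log_lik[i] - lam * (math.log(probs[i]) - log_prior[i])


def estimate_gradient(logits, log_lik, log_prior, lam, baseline, n_samples, rng):
    """Monte Carlo estimate: mean of (l' - baseline) * grad log q over samples."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(n_samples):
        i = rng.choices(range(len(probs)), weights=probs)[0]
        signal = learning_signal(i, probs, log_lik, log_prior, lam) - baseline
        for j, g in enumerate(grad_log_q(probs, i)):
            grad[j] += signal * g / n_samples
    return grad


def exact_gradient(logits, log_lik, log_prior, lam, baseline=0.0):
    """Same expectation computed by enumerating every z (toy spaces only)."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for i, q in enumerate(probs):
        signal = learning_signal(i, probs, log_lik, log_prior, lam) - baseline
        for j, g in enumerate(grad_log_q(probs, i)):
            grad[j] += q * signal * g
    return grad


# Hypothetical toy setup with three candidate trees.
logits = [0.2, -0.1, 0.4]
log_lik = [math.log(0.6), math.log(0.1), math.log(0.05)]
log_prior = [math.log(0.5), math.log(0.3), math.log(0.2)]
exact = exact_gradient(logits, log_lik, log_prior, lam=0.5)
sampled = estimate_gradient(logits, log_lik, log_prior, lam=0.5, baseline=-2.0,
                            n_samples=20000, rng=random.Random(0))
```

Subtracting a baseline that does not depend on z leaves the expected gradient unchanged, because the expectation of $\partial \log q_\phi / \partial \phi$ under $q_\phi$ is zero; it only affects the variance of the estimate.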
  9. Semi-supervised learning (3)

    Unsupervised learning (continued)
      The gradient for the reconstruction model is easy to compute:
      $$\frac{\partial \mathcal{L}}{\partial \theta} \approx \frac{1}{|S(x)|} \sum_{z_i \in S(x)} \frac{\partial \log p_\theta(x \mid z_i)}{\partial \theta}$$
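The reason this gradient is easy can be spelled out in one short derivation. It assumes that the prior p(z) shares no parameters with θ (an assumption of this sketch), so neither the sampling distribution $q_\phi$ nor the KL term depends on θ:

```latex
\begin{align*}
\frac{\partial \mathcal{L}}{\partial \theta}
  &= \frac{\partial}{\partial \theta}\,
     \mathbb{E}_{z \sim q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
     - \lambda\, \frac{\partial}{\partial \theta}\,
       \mathrm{KL}\bigl[q_\phi(z \mid x) \,\|\, p(z)\bigr] \\
  &= \mathbb{E}_{z \sim q_\phi(z \mid x)}
     \left[\frac{\partial \log p_\theta(x \mid z)}{\partial \theta}\right] - 0 \\
  &\approx \frac{1}{|S(x)|} \sum_{z_i \in S(x)}
     \frac{\partial \log p_\theta(x \mid z_i)}{\partial \theta}
\end{align*}
```

Unlike the gradient with respect to φ, no score-function term is needed because the expectation is taken over a distribution that does not depend on θ.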
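  10. Why does unsupervised learning help?

    The gradient of the variational lower bound $\mathcal{L}$ scales with the learning signal:
    $$\frac{\partial \mathcal{L}}{\partial \phi} \propto \mathbb{E}_{z' \sim q_\phi}\left[\, l(x, z') \times \frac{\partial \log q_\phi(z' \mid x)}{\partial \phi} \right]$$
    Meanwhile, the learning signal (before the baseline is subtracted) depends on the prior p(z) and the reconstruction model $p_\theta(x \mid z)$:
    $$l'(x, z) = \log p_\theta(x \mid z) - \lambda \bigl(\log q_\phi(z \mid x) - \log p(z)\bigr)$$
    Intuition
      Unsupervised learning helps the semantic parser in the following way:
      z is concise and natural → p(z) is large
      z faithfully reflects the meaning of x → $p_\theta(x \mid z)$ is large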
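The intuition above can be checked numerically with a minimal sketch. The probabilities and the value of λ are made up for illustration; only the formula for the raw learning signal comes from the slide.

```python
import math


def raw_learning_signal(log_p_x_given_z, log_q_z_given_x, log_p_z, lam):
    """l'(x, z) = log p_theta(x|z) - lam * (log q_phi(z|x) - log p(z))."""
    return log_p_x_given_z - lam * (log_q_z_given_x - log_p_z)


# Three hypothetical samples with the same parser probability q(z|x) = 0.3.
faithful = raw_learning_signal(math.log(0.6), math.log(0.3), math.log(0.2), lam=0.1)
# Low p(x|z): the tree does not explain the sentence well.
unfaithful = raw_learning_signal(math.log(0.01), math.log(0.3), math.log(0.2), lam=0.1)
# Low p(z): the tree is implausible under the prior.
unnatural = raw_learning_signal(math.log(0.6), math.log(0.3), math.log(0.001), lam=0.1)
```

With everything else fixed, lowering either $p_\theta(x \mid z)$ or p(z) lowers the signal, so the parser is pushed less strongly (or negatively, once a baseline is subtracted) toward such samples.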
  11. Inference model: Transition-based Parser

    The AST is built with three actions [Yin+ 2017]:
      ApplyConstr[c], Reduce, GenToken[v]
    The model factorizes over the sequence of action probabilities $\{a_t\}$:
    $$q_\phi(z \mid x) = \prod_t p(a_t \mid a_{<t}, x)$$
    [Figure from the presentation slides: "The Inference Model: a Transition-based Parser". A transition-based parser transduces natural-language utterances into general-purpose Abstract Syntax Trees [Yin and Neubig, 2017; Rabinovich et al. 2017]. For the input utterance "Sort my_list in descending order", the transition system emits ApplyConstr(Expr), ApplyConstr(Call), ApplyConstr(Name), GenToken(sorted), ... according to a grammar specification, producing an AST with the nodes Expr, Call, Name (sorted), Name (my_list), Keyword, ...]
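The three actions can be illustrated with a toy transition system. This is a simplified sketch, not the authors' implementation: the three-constructor `GRAMMAR`, the `Node` class, the action encoding, and the per-action probabilities are all assumptions, and the example drops the keyword argument shown in the figure.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Toy grammar (assumption): constructor -> [(field name, cardinality)].
# "single" takes one child, "multiple" is closed by Reduce, "token" by GenToken.
GRAMMAR: Dict[str, List[Tuple[str, str]]] = {
    "Expr": [("value", "single")],
    "Call": [("func", "single"), ("args", "multiple")],
    "Name": [("id", "token")],
}

Action = Tuple[str, str]  # ("ApplyConstr", c), ("GenToken", v) or ("Reduce", "")


@dataclass
class Node:
    constructor: str
    children: Dict[str, list] = field(default_factory=dict)


def build_ast(actions: List[Action]) -> Node:
    """Replay an action sequence depth-first, left-to-right, into a tree."""
    pos = 0

    def expand() -> Node:
        nonlocal pos
        kind, arg = actions[pos]
        if kind != "ApplyConstr":
            raise ValueError(f"expected ApplyConstr, got {kind}")
        pos += 1
        node = Node(arg)
        for name, cardinality in GRAMMAR[arg]:
            if cardinality == "token":
                kind, token = actions[pos]
                if kind != "GenToken":
                    raise ValueError(f"expected GenToken, got {kind}")
                pos += 1
                node.children[name] = [token]
            elif cardinality == "single":
                node.children[name] = [expand()]
            else:  # "multiple": expand children until Reduce closes the field
                items = []
                while actions[pos][0] != "Reduce":
                    items.append(expand())
                pos += 1  # consume the Reduce action
                node.children[name] = items
        return node

    root = expand()
    if pos != len(actions):
        raise ValueError("action sequence has trailing actions")
    return root


def sequence_log_prob(action_probs: List[float]) -> float:
    """log q(z|x) = sum_t log p(a_t | a_<t, x)."""
    return sum(math.log(p) for p in action_probs)


# Actions for a simplified `sorted(my_list)` call.
actions = [
    ("ApplyConstr", "Expr"), ("ApplyConstr", "Call"),
    ("ApplyConstr", "Name"), ("GenToken", "sorted"),
    ("ApplyConstr", "Name"), ("GenToken", "my_list"),
    ("Reduce", ""),
]
root = build_ast(actions)
log_q = sequence_log_prob([0.9] * len(actions))  # hypothetical per-action probabilities
```

Because every tree corresponds to one action sequence under this scheme, scoring a tree reduces to scoring a sequence, which is what lets a Seq2Seq-style decoder serve as the inference model.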
  12. Experiments

    Tasks and datasets
      Semantic Parsing: convert to λ-expressions
        Atis: 5,410 flight-information queries [Dong+ 2016]
      Code generation: generate Python code
        Django: 18,805 lines of Python code [Oda+ 2015]
      All datasets are fully labeled. A random sample is used as the labeled set L, and the rest is used as the unlabeled set U.
    Research Questions
      RQ1: Does StructVAE outperform purely supervised learning?
      RQ2: Is there statistical evidence that the learning signal helps StructVAE?
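The data setup described above (sample a labeled subset, treat the rest as unlabeled) can be sketched as follows. The function name, the seed handling, and the toy pairs are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[str, str]  # (utterance x, meaning representation z)


def split_labeled_unlabeled(
    dataset: Sequence[Example], n_labeled: int, seed: int = 0
) -> Tuple[List[Example], List[str]]:
    """Randomly keep n_labeled pairs as L; strip z from the rest to form U."""
    if not 0 <= n_labeled <= len(dataset):
        raise ValueError("n_labeled must be between 0 and len(dataset)")
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    labeled = [dataset[i] for i in indices[:n_labeled]]
    unlabeled = [dataset[i][0] for i in indices[n_labeled:]]  # utterances only
    return labeled, unlabeled


# Hypothetical toy dataset.
toy = [(f"utterance {i}", f"mr {i}") for i in range(5)]
labeled, unlabeled = split_labeled_unlabeled(toy, n_labeled=2, seed=0)
```

Fixing the seed keeps the L/U split reproducible across runs that compare different amounts of labeled data.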
  13. RQ1: StructVAE v.s. baselines

    Performance on code generation (Django)
      Compared against two baselines: supervised learning and SelfTrain
      With a small amount of labeled data, StructVAE gives better results than supervised learning
      → with the full labeled data (16,000), it does not beat prior methods
    [Figure from the presentation slides: "RQ1: STRUCTVAE v.s. Baselines". All available training utterances are used as unlabeled data. Compared systems: the inference model as a standalone supervised parser, Self Training (semi-supervised baseline), and STRUCTVAE. Annotation: the gap is much more obvious when a mediocre parser is used.]
  14. RQ2: Effectiveness of the learning signal

    For sentences not included in L, the distribution of the learning signal is examined
    The learning signal tends to be larger for gold samples
    [Figure from the presentation slides: "RQ2: Empirical Statistics of Learning Signals". For each unlabeled utterance, the learning signal is computed for gold latent samples and for other (imperfect) samples. Two histograms of the learning signal: Gold Samples, Avg. = 2.59; Imperfect Samples, Avg. = -5.12.]
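  15. Summary and impressions

    Training a semantic parser requires a huge amount of labeled data
      but creating labeled data is costly
    Semi-supervised learning compensates for the small amount of data
      Assumes a latent tree structure behind each sentence
      Supervised: optimizes both the inference model and the reconstruction model
      Unsupervised: the VAE maximizes a variational lower bound on the sentence probability
    Evaluation
      Semantic Parsing: Atis dataset
      Code Generation: Django dataset
      With a small amount of labeled data, StructVAE outperforms supervised learning
    → The results do improve somewhat,
      but the explanation of why the model is justified leaves an unsatisfying, hazy feeling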