dentist I hate going to the dentist 1. Turkers ʹΑΔݴ͍͑ 3. ݩͷπΠʔτʹ͋ͬͨ ΄͏ͷ୯ޠΛ target words ʹՃ͑Δ 2. Unsupervised alignment ͰϖΞΛ ݟ͚ͭΔ 4 (1) Ϋϥυιʔγϯά 1000 πΠʔτ from searching for #sarcasm/#sarcastic 5000 πΠʔτ by 5 Turkers
algorithm for paraphrase detection (Barzilay and McKeown, 2001) 2. Statistical machine translation alignment (IBM Model 4 with HMM alignment in Giza+ +; Och and Ney, 2000) 5 → 367 ϖΞ → 70 ϖΞ obtained # of t φ ≥ 0.8
... suppose: u = “I love going to the dentist” (t = “love”) love ͷจ຺ޠ ck in S love ͷจ຺ޠ wj in u -0.9 -0.9 -0.9 -0.9 -0.9 -0.9 -0.9 0.8 0.5 0.3 0.5 -0.9 -0.9 -0.9 -0.9 -0.9 -0.9 -0.9 0.3 -0.9 0.1 -0.1 -0.9 -0.9 -0.9 Sim = 0 ... ... ... ... ... ߦྻ M Sim = 0.8 0.3 repeat until max = 0 or size(M) = (0, 0) 16