We present an approach for automatically learning synonyms from a paraphrase corpus of tweets. Substituting the extracted synonyms into the training set improves performance on the task of paraphrase detection. The synonyms are learned by using chunks from a shallow parse to create candidate synonyms and their context windows. They are then incorporated into a paraphrase detection system that uses machine translation metrics as features for a classifier. We demonstrate a 2.29% improvement in F1 when we train and test on the paraphrase training set, with better coverage than previous systems, which shows the potential power of synonyms that are representative of a specific topic.
Revised: September 24, 2015
Published: May 18, 2015
Citation
Antoniak M.A., E.B. Bell, and F. Xia. 2015. Leveraging Paraphrase Labels to Extract Synonyms from Twitter. In Proceedings of the 28th International Florida Artificial Intelligence Research Society Conference (FLAIRS-28), May 18-20, 2015, Hollywood, Florida, 3-7. Palo Alto, California: AAAI Press. PNNL-SA-106823.