Department of Computer Science Faculty Scholarship and Creative Works

In God We Trust. All Others Must Bring Data. - W. Edwards Deming using Word Embeddings to Recognize Idioms

Jing Peng, Montclair State University
Anna Feldman, Montclair State University

Document Type

Conference Proceeding

Publication Date

1-1-2016

Abstract

Expressions such as "add fuel to the fire" can be interpreted literally or idiomatically depending on the context in which they occur. Many Natural Language Processing applications could improve their performance if idiom recognition were improved. Our approach is based on the idea that idioms violate cohesive ties in local contexts, while literal expressions do not. We propose two approaches: (1) Compute the inner product of context word vectors with the vector representing a target expression. Since literal vectors predict local contexts well, their inner product with contexts should be larger than that of idiomatic ones, thereby telling literals apart from idioms; and (2) Compute literal and idiomatic scatter (covariance) matrices from local contexts in word vector space. Since the scatter matrices represent context distributions, we can then measure the difference between the distributions using the Frobenius norm. For comparison, we implement the methods of Fazly et al. (2009), Sporleder and Li (2009), and Li and Sporleder (2010b) and apply them to our data. We provide experimental results validating the proposed techniques.
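
The two scoring ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names are hypothetical, and several details are assumptions the abstract does not state: inner products are averaged over the context words, the scatter matrix is divided by the number of context vectors, and random vectors stand in for real word embeddings.

```python
import numpy as np


def inner_product_score(target_vec, context_vecs):
    """Approach (1): mean inner product of the target-expression vector
    with each context word vector (averaging is an assumption)."""
    context_vecs = np.asarray(context_vecs, dtype=float)
    target_vec = np.asarray(target_vec, dtype=float)
    return float(np.mean(context_vecs @ target_vec))


def scatter_matrix(context_vecs):
    """Approach (2): scatter (covariance) matrix of context word vectors,
    normalized by the number of vectors (normalization is an assumption)."""
    X = np.asarray(context_vecs, dtype=float)
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]


def scatter_distance(contexts_a, contexts_b):
    """Frobenius norm of the difference between two scatter matrices."""
    diff = scatter_matrix(contexts_a) - scatter_matrix(contexts_b)
    return float(np.linalg.norm(diff, ord="fro"))


# Toy demonstration with random stand-ins for word embeddings.
rng = np.random.default_rng(0)
dim = 5
target = rng.normal(size=dim)                   # hypothetical target-expression vector
literal_ctx = rng.normal(size=(8, dim))         # hypothetical literal-usage contexts
idiom_ctx = 2.0 * rng.normal(size=(8, dim))     # hypothetical idiomatic-usage contexts

print("inner-product score:", inner_product_score(target, literal_ctx))
print("scatter distance:", scatter_distance(literal_ctx, idiom_ctx))
```

Under approach (1), a test context would be labeled literal when its score exceeds some threshold. Under approach (2), a test context's scatter matrix would be compared with the literal and idiomatic reference matrices and assigned to the closer one. Both decision rules are assumed readings of the abstract rather than details it states.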

Montclair State University Digital Commons Citation

Peng, Jing and Feldman, Anna, "In God We Trust. All Others Must Bring Data. - W. Edwards Deming using Word Embeddings to Recognize Idioms" (2016). Department of Computer Science Faculty Scholarship and Creative Works. 335.
https://digitalcommons.montclair.edu/compusci-facpubs/335

This document is currently not available here.
