Document Type
Conference Proceeding
Publication Date
1-1-2024
Journal / Book Title
EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2024
Abstract
This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot cross-lingual transfer takes place. We also show cases where multilingual models outperform monolingual models on the task by a statistically significant margin, indicating that multilingual data gives models additional opportunities to learn cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic "categories" such as death and bodily functions, among others. To further understand the nature of the cross-lingual transfer, we test whether cross-lingual data from the same domain matters more than within-language data from other domains.
Journal ISSN / Book ISBN
85188697328 (Scopus)
Montclair State University Digital Commons Citation
Lee, Patrick; Trujillo, Alain Chirino; Plancarte, Diana Cuevas; Ojo, Olumide Ebenezer; Liu, Xinyi; Shode, Iyanuoluwa; Zhao, Yuan; Peng, Jing; and Feldman, Anna, "MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms" (2024). School of Computing Faculty Scholarship and Creative Works. 45.
https://digitalcommons.montclair.edu/computing-facpubs/45