Date of Award
5-2018
Document Type
Thesis
Degree Name
Master of Science (MS)
College/School
College of Science and Mathematics
Department/Program
Mathematical Sciences
Thesis Sponsor/Dissertation Chair/Project Chair
Aihua Li
Committee Member
Christopher Leberknight
Committee Member
Haiyan Su
Abstract
The paper presents a statistical analysis that explores methods for measuring controversy in online news articles collected from 23 RSS feeds. Several baseline datasets are used to re-evaluate previous work and determine the predictive quality of unigrams for classifying controversial documents. This is achieved by comparing controversy and sentiment, exploring sentiment variance, and considering entropy and standard deviation as potential features. The paper tests whether there are more controversial words in negative sentiment than in positive sentiment, as well as whether there are more non-controversial words in positive sentiment than in negative sentiment. Unlike previous studies, we determine that words alone are not useful for detecting controversy, as they do not provide enough context. Consequently, further analysis yields a more fruitful approach that detects controversy using features such as standard deviation and entropy. Results demonstrate that entropy and standard deviation discriminate controversial documents better than positive and negative sentiment do. Since words alone are not enough, we perform a crowdsourcing experiment on titles, which provide more context than individual words. Although the titles are more beneficial than the words, we go one step further and utilize the summaries of the articles, which provide even more context. These features, along with the improvements, provide a cleaner separation of data for classifying controversial documents and may provide useful insight for the design of future classification models.
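As a rough illustration of the entropy and standard deviation features mentioned above, the following is a minimal sketch, assuming that each item (a word, title, or summary) receives several numeric ratings from crowd workers. The abstract does not specify how the features are computed, so the function names, the 1-to-5 rating scale, and the choice of Shannon entropy over the rating distribution are all assumptions made for illustration, not the thesis's actual method.

```python
import math
from collections import Counter
from statistics import pstdev


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution.

    Hypothetical helper: higher values mean the raters were more divided.
    """
    counts = Counter(ratings)
    total = len(ratings)
    # Sum p * log2(1/p) over each distinct rating value.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def rating_std(ratings):
    """Population standard deviation of the ratings (hypothetical helper)."""
    return pstdev(ratings)


# Assumed example data on a 1-to-5 scale (not taken from the thesis).
agreed = [1, 1, 1, 1]  # all raters agree
split = [1, 5, 1, 5]   # raters are evenly divided

for name, ratings in [("agreed", agreed), ("split", split)]:
    print(name, rating_entropy(ratings), rating_std(ratings))
```

Under these assumptions, an item on which raters agree scores zero on both features, while an item that divides raters scores higher on both, which is the kind of separation the abstract attributes to entropy and standard deviation.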
Recommended Citation
Kaplun, Kateryna, "Classifying Controversiality in Article Data" (2018). Theses, Dissertations and Culminating Projects. 133.
https://digitalcommons.montclair.edu/etd/133