Document Type

Conference Proceeding

Publication Date

7-9-2018

Abstract

This study provides preliminary insights into the linguistic features that contribute to Internet censorship in mainland China. We collected a corpus of 344 censored and uncensored microblog posts published on Sina Weibo and built a Naive Bayes classifier based on linguistic, topic-independent features. The classifier achieves 79.34% accuracy in predicting whether a blog post would be censored on Sina Weibo.

DOI

10.1145/3200947.3201037

Published Citation

Ng, K. Y., Feldman, A., & Leberknight, C. (2018, July). Detecting censorable content on Sina Weibo: A pilot study. In Proceedings of the 10th Hellenic Conference on Artificial Intelligence (pp. 1-5).

Included in

Linguistics Commons
