Our goal is to use natural language processing to identify deceptive and non-deceptive passages in transcribed narratives. We begin by motivating an analysis of language-based deception that relies on specific linguistic indicators to discover deceptive statements. The indicator tags are assigned to a document using a mix of automated and manual methods. Once the tags are assigned, an interpreter automatically discriminates between deceptive and truthful statements based on tag densities. The texts used in our study come entirely from "real world" sources: criminal statements, police interrogations, and legal testimony. The corpus was hand-tagged for the truth value of all propositions that could be externally verified as true or false. Classification and Regression Tree techniques suggest that the approach is feasible, with the model correctly classifying 74.9% of the true/false propositions. An automatic tagger implemented for a large subset of the tags performed well on test data, averaging 68.6% recall and 85.3% precision when compared with human taggers on the same subset.
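As an illustration of how the tagger scores reported above are conventionally computed, the following minimal sketch compares one set of automatically assigned tags against a human-assigned set. It is not the authors' evaluation code: the function name, the representation of a tag as a (position, label) pair, and the tag labels themselves are all assumptions made for this example.

```python
def precision_recall(auto_tags, human_tags):
    """Score automatic tags against human tags.

    Both arguments are sets of (token_position, tag_label) pairs.
    This representation is an assumption made for illustration.
    """
    # A tag counts as correct only if position and label both match.
    true_positives = len(auto_tags & human_tags)
    # Precision: share of automatic tags that the human tagger also assigned.
    precision = true_positives / len(auto_tags) if auto_tags else 0.0
    # Recall: share of human tags that the automatic tagger recovered.
    recall = true_positives / len(human_tags) if human_tags else 0.0
    return precision, recall


# Hypothetical tag labels and positions, invented for this example.
human = {(3, "HEDGE"), (7, "NEGATION"), (12, "HEDGE"), (20, "TENSE_SHIFT")}
auto = {(3, "HEDGE"), (7, "NEGATION"), (15, "HEDGE")}

p, r = precision_recall(auto, human)
print(f"precision={p:.3f} recall={r:.3f}")
```

A tagger with high precision and lower recall, as in the reported averages, assigns few spurious tags but misses some of the tags a human would assign.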
MSU Digital Commons Citation
Bachenko, Joan; Fitzpatrick, Eileen; and Schonwetter, Michael, "Verification and Implementation of Language-Based Deception Indicators in Civil and Criminal Narratives" (2008). Department of Linguistics Faculty Scholarship and Creative Works. 14.
Bachenko, J., Fitzpatrick, E., & Schonwetter, M. (2008, August). Verification and implementation of language-based deception indicators in civil and criminal narratives. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008) (pp. 41-48).