Title
Evaluating and Automating the Annotation of a Learner Corpus
Document Type
Article
Publication Date
3-2014
Journal / Book Title
Language Resources and Evaluation
Abstract
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, but also classified. The annotation scheme is tested on a data set including approx. 175,000 words with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute manual annotation.
DOI
10.1007/s10579-013-9226-3
Journal ISSN / Book ISBN
1574-020X
MSU Digital Commons Citation
Rosen, Alexandr; Hana, Jirka; Štindlová, Barbora; and Feldman, Anna, "Evaluating and Automating the Annotation of a Learner Corpus" (2014). Department of Linguistics Faculty Scholarship and Creative Works. 7.
https://digitalcommons.montclair.edu/linguistics-facpubs/7
Published Citation
Rosen, A., Hana, J., Štindlová, B., & Feldman, A. (2014). Evaluating and automating the annotation of a learner corpus. Language Resources and Evaluation, 48(1), 65-92. https://doi.org/10.1007/s10579-013-9226-3