Document Type

Conference Proceeding

Publication Date

2018

Abstract

This paper describes the development of an idiom-annotated corpus of Russian. The corpus is compiled from freely available online resources and contains texts of different genres. The paper outlines the idiom extraction, the annotation procedure, and a pilot experiment using the new corpus. Given the scarcity of publicly available annotated corpora of Russian, the corpus is a much-needed resource that can be used for literary and linguistic studies, for pedagogy, and for various Natural Language Processing tasks.

Published Citation

Aharodnik, Katsiaryna, Anna Feldman, and Jing Peng. "Designing a Russian idiom-annotated corpus." Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). 2018.

Included in

Linguistics Commons