Department of Linguistics Faculty Scholarship and Creative Works

Annotating an Arabic Learner Corpus for Error

Ghazi Abuhakema, Montclair State University
Reem Faraj, Montclair State University
Anna Feldman, Montclair State University
Eileen Fitzpatrick, Montclair State University

Document Type

Conference Proceeding

Publication Date

5-1-2008

Journal / Book Title

Proceedings of the 6th International Conference on Language Resources and Evaluation

Abstract

This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysis (CEA) on the data. We adapted the French Interlanguage Database FRIDA tagset (Granger, 2003a) to the data. We chose FRIDA in order to follow a known standard and to see whether the changes needed to move from a French to an Arabic tagset would give us a measure of the distance between the two languages with respect to learner difficulty. The current collection of texts, which is constantly growing, contains intermediate and advanced-level student writings. We describe the need for such corpora, the learner data we have collected and the tagset we have developed. We also describe the error frequency distribution of both proficiency levels and the ongoing work.

Journal ISSN / Book ISBN

ISBN 978-295174084-6

MSU Digital Commons Citation

Abuhakema, Ghazi; Faraj, Reem; Feldman, Anna; and Fitzpatrick, Eileen, "Annotating an Arabic Learner Corpus for Error" (2008). Department of Linguistics Faculty Scholarship and Creative Works. 10.
https://digitalcommons.montclair.edu/linguistics-facpubs/10

Rights

This Open Access conference proceeding is being shared by the publisher under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License.

Published Citation

Abuhakema, G., Faraj, R., Feldman, A., & Fitzpatrick, E. (2008). Annotating an Arabic learner corpus for error. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008 (pp. 1347-1350). European Language Resources Association (ELRA).

Included in

Linguistics Commons
