Department of Linguistics Faculty Scholarship and Creative Works

A Cross-language Approach to Rapid Creation of New Morpho-syntactically Annotated Resources

Document Type

Conference Proceeding

Publication Date

2006

Journal / Book Title

5th Edition of the International Conference on Language Resources and Evaluation

Abstract

We take a novel approach to rapid, low-cost development of morpho-syntactically annotated resources without using parallel corpora or bilingual lexicons. The overall research question is how to exploit language resources and properties to facilitate and automate the creation of morphologically annotated corpora for new languages. This portability issue is especially relevant to minority languages, for which such resources are likely to remain unavailable in the foreseeable future. We compare the performance of our system on languages that belong to different language families (Romance vs. Slavic), as well as different language pairs within the same language family (Portuguese via Spanish vs. Catalan via Spanish). We show that across language families, the most difficult category is that of nominals (noun homonymy is challenging for morphological analysis, and the order variation of adjectives within a sentence makes it challenging to create a reliable model), whereas different language families present different challenges with respect to their morpho-syntactic descriptions: for the Slavic languages, case is the most challenging category; for the Romance languages, gender is more challenging than case. In addition, we present an alternative evaluation metric for our system, where we measure how much human labor will be needed to convert the result of our tagging to a high-precision annotated resource.

MSU Digital Commons Citation

Feldman, Anna; Hana, Jirka; and Brew, Chris, "A Cross-language Approach to Rapid Creation of New Morpho-syntactically Annotated Resources" (2006). Department of Linguistics Faculty Scholarship and Creative Works. 15.
https://digitalcommons.montclair.edu/linguistics-facpubs/15

Rights

This Open Access conference proceeding is being shared by the publisher under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License.

Published Citation

Feldman, A., Hana, J., & Brew, C. (2006). A cross-language approach to rapid creation of new morpho-syntactically annotated resources. In 5th International Conference on Language Resources and Evaluation, LREC 2006.

Included in

Linguistics Commons
