Classification of Errors in Text


Authors

JAKUBÍČEK Miloš, BUŠTA Jan, HLAVÁČKOVÁ Dana, PALA Karel

Year of publication 2009
Type Article in Proceedings
Conference RASLAN 2009 : Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit Faculty of Informatics

Citation
Web http://nlp.fi.muni.cz/raslan/2009/
Field Linguistics
Keywords errors in text; classification of errors
Description This paper presents two classifications of errors in Czech texts. As a basic resource we use the Chyby (Errors) corpus, which has been under continuous development since 1999-2000 [1]. The corpus texts contain various kinds of errors: spelling, typographical, grammatical, semantic, lexical, and stylistic. The errors have been corrected manually and annotated according to a classification of errors (annotation scheme) developed for this purpose. For the annotation we implemented a tool named WinCorr. We mention the first annotation scheme and discuss the second one, which was designed recently to obtain a more adequate description of the errors occurring in texts. We also discuss the principles on which both classifications are based.