Typ kladenští jako problém automatické morfologické analýzy

Title in English Kladenští type as a problem of automatic morphological analysis
Authors

OSOLSOBĚ Klára, ŽIŽKOVÁ Hana

Year of publication 2022
Type Article in Periodical
Magazine / Source Jazykovedný časopis
MU Faculty or unit

Faculty of Arts

Citation
Web https://www.juls.savba.sk/ediela/jc/2021/4/jc21-04.pdf
Doi http://dx.doi.org/10.2478/jazcas-2022-0011
Keywords automatic morphological analysis; derivational type Kladenští; part of speech transition
Description The aim of our paper is to demonstrate how a corpus, namely the Araneum Bohemicum IV Maximum (Czech, 20.03) 7.10 G (hereinafter Araneum), can be used to obtain the data needed to refine tools for automatic morphological analysis of Czech. In particular, we focus on proper nouns (propria) of the Kladenští type, i.e., substantivized adjectives denoting groups of persons according to their affiliation. The probe into the Aranea web corpus has two goals: 1) a corpus-based description of the frequent properties of the Kladenští type, which can serve as a starting point for rule-based disambiguation; 2) a list of the most frequent lemmas belonging to the Kladenští type, which can then be included in the dictionaries of automatic morphological analyzers (e.g. the MorfFlex dictionary by Hajič and Hlaváčová). We believe that the probe can help improve the results of tools for automatic morphological analysis of Czech.