Information Extraction for Czech Based on Syntactic Analysis

Year of publication 2011
Type Article in Proceedings
Conference Human Language Technologies as a Challenge for Computer Science and Linguistics, Proceedings of 5th Language and Technology Conference
MU Faculty or unit Faculty of Informatics

Field Informatics
Keywords information extraction; syntactic analysis; semantic classification; morphological disambiguation
Description We present a complex pipeline of natural language processing tools for Czech that extracts the basic facts presented in a text. The input to the tool is plain text; the output contains verb and noun phrases with a basic semantic classification. Automatic syntactic analysis of Czech plays a crucial role in the pipeline. In this paper, we describe the individual tools used in the system, give an example of its usage, and conclude with a basic evaluation of the overall system accuracy.