Information Extraction for Czech Based on Syntactic Analysis

Authors	BAISA Vít; KOVÁŘ Vojtěch
Year of publication	2011
Type	Article in Proceedings
Conference	Human Language Technologies as a Challenge for Computer Science and Linguistics, Proceedings of 5th Language and Technology Conference
MU Faculty or unit	Faculty of Informatics
Field	Informatics
Keywords	information extraction; syntactic analysis; semantic classification; morphological disambiguation
Description	We present a complex pipeline of natural language processing tools for Czech that extracts the basic facts presented in a text. The input to the tool is plain text; the output contains verb and noun phrases with a basic semantic classification. Automatic syntactic analysis of Czech plays a crucial role in the pipeline. In this paper, we describe the individual tools used in the system, give an example of its usage, and conclude with a basic evaluation of the overall system accuracy.
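Related projects:	Legal e-dictionary - PES; Temporální aspekty znalostí a informací (Temporal Aspects of Knowledge and Information); Analýza přirozeného jazyka v prostředí internetu (Natural Language Analysis in the Internet Environment)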