Power Networks Dialogs - Enhancing Domain-Specific Text Processing Techniques and Resources

Authors

KOVÁŘ Vojtěch HORÁK Aleš JAKUBÍČEK Miloš

Year of publication 2008
Type Article in Proceedings
Conference Proceedings of ELNET 2008
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords electrical power networks;Czech domain-specific resources;syntax analysis;text corpora
Description In this paper, we describe the development of analytical approaches adapted for working with technical texts specialized in the domain of electrical power networks (EPN). The process includes improving the quality of the EPN resources. The new data represent one of the largest domain-specific corpora, containing more than 5 million text tokens. We show the details of building the new large domain-specific corpus, its analysis, and further processing such as filtering, morphological and syntactic analysis, and phrase detection, and present how these steps help to improve the dialog system.
