A Distributional Multi-word Thesaurus in Sketch Engine

Authors

JAKUBÍČEK Miloš, RYCHLÝ Pavel

Year of publication 2019
Type Article in Proceedings
Conference Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019
MU Faculty or unit

Faculty of Informatics

Keywords text corpus; Sketch Engine; MWE; multi-word expressions; thesaurus
Description In this paper we present an extension of the current distributional thesaurus as available in the Sketch Engine corpus management system towards multi-word units. We explain how multi-word sketches are used to generate multi-word unit candidates, thus preserving access to the underlying corpus texts. Finally, we present sample results on the British National Corpus and discuss future development as well as difficulties in evaluation.