A Distributional Multi-word Thesaurus in Sketch Engine

This publication falls under the Faculty of Informatics, not the Faculty of Arts. The official publication page is on the muni.cz website.

Authors

JAKUBÍČEK Miloš, RYCHLÝ Pavel

Year of publication 2019
Type Article in proceedings
Conference Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019
MU Faculty / Unit

Faculty of Informatics

Citation
Keywords text corpus; Sketch Engine; MWE; multi-word expressions; thesaurus
Description In this paper we present an extension of the current distributional thesaurus, as available in the Sketch Engine corpus management system, towards multi-word units. We explain how multi-word sketches are used to generate multi-word unit candidates, thus preserving access to the underlying corpus texts. Finally, we present sample results on the British National Corpus and discuss future development as well as difficulties in evaluation.