Development of HAMOD: a High Agreement Multi-lingual Outlier Detection dataset

Authors

JAKUBÍČEK Miloš, ROMANI Emma, RYCHLÝ Pavel, HERMAN Ondřej

Year of publication 2021
Type Article in Proceedings
Conference Recent Advances in Slavonic Natural Language Processing (RASLAN 2021)
MU Faculty or unit Faculty of Informatics

Keywords HAMOD; Distributional thesaurus; Outlier detection; Word embeddings; Sketch Engine
Description In this paper we describe further development of the High Agreement Multi-lingual Outlier Detection dataset (HAMOD), which is used to evaluate automatic distributional thesauri. We briefly introduce the task and the methodological motivation for developing such a dataset, then present the current status of the dataset and related tools, as well as the results measured on the dataset so far (both in terms of agreement rates and thesaurus evaluation). Finally, we discuss future developments of HAMOD.