Improving Coverage of Translation Memories with Language Modelling

Warning

This publication does not belong to the Faculty of Arts, but to the Faculty of Informatics. The official publication page is on the muni.cz website.

Authors	BAISA Vít, BUŠTA Josef, HORÁK Aleš
Year of publication	2014
Type	Article in proceedings
Conference	Eighth Workshop on Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit	Faculty of Informatics
Citation
Web	https://nlp.fi.muni.cz/raslan/2014/11.pdf
Field	Informatics
Keywords	translation memory; CAT; segment; subsegment leveraging; partial translation; Moses; GIZA++; word matrix; METEOR; MemoQ; language model
Description	In this paper, we describe and evaluate current improvements to methods for enlarging translation memories. Compared with our previous results from 2013, we have improved coverage by almost 35 percentage points on the same test data. The basic subsegment splitting of the translation pairs is done with the Moses and (M)GIZA++ tools, which provide the subsegment translation probabilities. The obtained phrases are then combined using subsegment combination techniques and filtered by large target language models.
Related projects:	Technology Transfer at Masaryk University
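
The description above outlines a pipeline in which subsegment translations are combined and the results are filtered by a target language model. The following is only a minimal sketch of that general idea, not the authors' implementation: the paper obtains subsegment translation probabilities from Moses and (M)GIZA++ and uses large target language models, whereas this sketch swaps in a hand-written phrase table and a tiny add-one smoothed bigram model. All function names, the toy corpus, and the example phrases are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def train_bigram_lm(sentences):
    """Collect unigram and bigram counts from tokenised target-language sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def combine_and_filter(source_subsegments, phrase_table, lm, top_n=1):
    """Join candidate translations of consecutive subsegments,
    then keep the top_n candidates ranked by language-model score."""
    options = [phrase_table[s] for s in source_subsegments]
    candidates = [" ".join(parts) for parts in product(*options)]
    unigrams, bigrams = lm
    ranked = sorted(
        candidates,
        key=lambda c: log_prob(c, unigrams, bigrams),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical toy data: a small English target corpus and a
# Czech-to-English phrase table with one good and one bad option each.
lm = train_bigram_lm([
    "the printer is ready",
    "the printer is offline",
    "press the button",
])
phrase_table = {
    "tiskárna": ["the printer", "printer the"],
    "je připravena": ["is ready", "ready is"],
}
best = combine_and_filter(["tiskárna", "je připravena"], phrase_table, lm)
print(best)
```

In this toy setting the language model acts purely as a fluency filter over the joined candidates; a realistic system would also weigh the subsegment translation probabilities, which this sketch leaves out.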