MUNI-NLP Systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian Machine Translation @ WMT22

Authors

SIGNORONI Edoardo, RYCHLÝ Pavel

Year of publication 2022
Type Article in Proceedings
Conference Proceedings of the Seventh Conference on Machine Translation
MU Faculty or unit

Faculty of Informatics

Web https://www.statmt.org/wmt22/pdf/2022.wmt-1.109.pdf
Keywords NLP;machine translation;low-resource
Description We describe our neural machine translation systems for the WMT22 shared task on unsupervised MT and very low resource supervised MT. We submit supervised NMT systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian translation in both directions. By using a novel tokenization algorithm, data augmentation techniques such as Data Diversification (DD), and parameter optimization, we improve on our baselines by 10.5-10.77 BLEU for Lower Sorbian-German and by 1.52-1.88 BLEU for Lower Sorbian-Upper Sorbian.