Cthulhu Hails from Wales: N-gram Frequency Analysis of R'lyehian


Year of publication 2020
Type Article in Proceedings
Conference Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020
MU Faculty or unit Faculty of Informatics

Keywords H. P. Lovecraft; language identification; N-grams; R'lyehian

R'lyehian is a fictional language created by the prolific 20th-century horror fiction author H. P. Lovecraft. Prior work on the Lovecraftian mythos has not studied the similarities between R'lyehian and natural languages, which are crucial for determining its true origins.

We produced a comprehensive wordlist of R'lyehian and used open-source $N$-gram-based language identification tools to find the natural languages most similar to R'lyehian. From the same wordlist, we also constructed a frequency table of all unigraphs and digraphs in R'lyehian.
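As a rough illustration of what a unigraph and digraph frequency table involves, the following minimal Python sketch counts character N-grams within each word and converts the counts to relative frequencies. The function names are hypothetical, and the sample words are only the well-known phrase from "The Call of Cthulhu", not the authors' wordlist. Treating the apostrophe as an ordinary character and lowercasing the input are assumptions of this sketch, not choices confirmed by the paper.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count character n-grams inside each word (none span word boundaries)."""
    counts = Counter()
    for word in words:
        w = word.lower()  # assumption: case is not distinctive
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts


def frequency_table(counts):
    """Map each n-gram to its relative frequency, most frequent first."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.most_common()}


# Illustrative input only; the apostrophe is kept as a regular character.
sample_words = ["ph'nglui", "mglw'nafh", "Cthulhu", "R'lyeh", "wgah'nagl", "fhtagn"]

unigraphs = frequency_table(ngram_counts(sample_words, 1))
digraphs = frequency_table(ngram_counts(sample_words, 2))

if __name__ == "__main__":
    for gram, freq in list(digraphs.items())[:5]:
        print(f"{gram!r}\t{freq:.3f}")
```

Counting within words rather than across a running text suits a wordlist, where neighbouring entries have no linguistic relationship to each other.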

We show that R'lyehian is most similar to Celtic languages, which supports our hypothesis that R'lyeh, where Cthulhu lies dreaming, might be a place in Wales.
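To make the notion of N-gram similarity concrete, here is a minimal Python sketch that ranks reference texts by the cosine similarity of their character-digraph counts to a sample. This is a generic technique chosen for illustration, not the open-source tools the authors used. The function names and the `references` mapping (language name to reference text) are hypothetical, and no reference corpora are supplied here.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_languages(sample, references, n=2):
    """Sort reference languages by similarity to the sample, best match first.

    `references` is a hypothetical dict mapping a language name to a
    reference text in that language; it must be supplied by the caller.
    """
    profile = char_ngrams(sample, n)
    scores = {
        lang: cosine_similarity(profile, char_ngrams(text, n))
        for lang, text in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In practice, each reference text would need to be a reasonably large corpus for the ranking to be meaningful; with short inputs the scores are dominated by chance overlaps.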

Our frequency tables should prove a useful resource for future work on the Lovecraftian mythos.
