Semi-automatic Theme-Rheme Identification





Year of publication 2013
Type Article in Proceedings
Conference Seventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013
MU Faculty or unit Faculty of Informatics

Field Informatics
Keywords theme-rheme; Functional Sentence Perspective; topic-focus articulation
Description In this paper we start from the theory of Functional Sentence Perspective, developed primarily by Firbas [1] and Svoboda [2] and also by Sgall, Hajičová [3], and attempt to formulate a procedure that semi-automatically recognizes which sentence constituents carry information that is contextually dependent and thus known to the addressee (theme), which contain new information (rheme), and which bear information that is neither thematic nor rhematic (transition). Once themes and rhemes are recognized as reliably as possible, we also hope to investigate thematic progression (the thematic line) in texts in the future. The paper describes the core of the procedure and its experimental implementation for Czech, using the bushbank corpus CBB.Blog [4] as a data source. Because the task is complex, we offer only a basic evaluation, which, in our view, shows that the task is feasible.
