Doing numbers and Cognitive Linguistics

Keywords: corpus linguistics, forced choice task, logistic regression, Estonian

The paper gives a short overview of recent trends in Cognitive Linguistics. It focuses on the methodological aspects involved and shows how the performance of a corpus-based statistical model can be evaluated by comparing it with the behaviour of native speakers in a linguistic experiment. A mixed-effects logistic regression model is fitted to corpus data on the adessive case and the adposition peal ‘on’ in present-day written Estonian. To evaluate the goodness of the corpus-based model, its performance is compared with the behaviour of native speakers in a forced choice task. Overall, the results of the study show that an adequately constructed probabilistic model based on richly annotated corpus data can perform at roughly the same level as human beings.

Jane Klavan (b. 1983), PhD, University of Tartu, Lecturer in English Language



