Quantitative shift in linguistics – an old discussion revived

Keywords: quantitative methods, theoretical linguistics, formal linguistics, philosophy of science, scientific method

The recent trend of incorporating large amounts of quantitative data into linguistic research has sparked a wave of scientific positivism among both researchers and the public. In addition to the inevitable comparison and conflict between qualitative and quantitative methods, it has raised questions about the relationship between data-based approaches and what are commonly called theoretical approaches to language science, with the former casting doubt on the validity of the latter. On closer inspection, however, the new debate, which regards itself as methodologically oriented, sometimes shifts its focus from the question of which methodology should be used and when to the much older divide in linguistics between formal and usage-based approaches. The discussion then becomes focussed on the nature of language as a research object. This is why the question asked is “What will become of theory?” rather than “What will become of non-empirical linguistics?”

The article proposes that the concept of “theory” can be defined in two different ways: it can refer to the traditional part of the scientific model, i.e. the “why?” and “how?” questions of every research project, or it can be used as a synonym for formal linguistics. The article claims that theory in neither sense is in conflict with empirical approaches. As a step in the scientific method, theory is in fact unavoidable by definition, because even without explicitly speculating about why a tendency occurs in the data, a researcher is still forced to use theoretically based terminology. Positing a conflict between formal approaches and data is equally misguided, because all branches of linguistics (descriptive, typological and formal) are capable of being, and often are, quantitatively empirical. Regardless of the amount of data, the role of theory cannot become marginalized, because one cannot exist without the other: theory provides data with meaning and data validates theory.

Mari Aigro (b. 1990), PhD Student, University of Tartu

