Computational insights into the variation of Finnic folk songs

“Searching for the Comb” and “Sword from the Sea”


Keywords: folklore, oral poetry, runosong, digital humanities, Finnic languages, variation

The article introduces the joint Finnic runosong database and the associated web environments and applications, developed collaboratively by computer scientists and folklorists from Finland and Estonia. These tools enable new approaches to analyzing the extensive dataset. Within the research framework, various computational solutions have been devised to identify and link similar verses and texts that differ in orthography, language, and content. These methods have also been implemented in the web environment Runoregi (runoregi.rahtiapp.fi), which allows researchers and enthusiasts of traditional oral poetry to navigate the network of variant verses, motifs and texts, and to compare texts and their elements. In addition, a web application for maps and other visualizations is integrated with the database and the Runoregi environment.
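To illustrate how verses that differ only in orthography can still be matched, the following minimal sketch compares verses by the cosine similarity of their character-bigram counts. The choice of measure is an assumption made for illustration, since the summary does not name the similarity measure used. The function names are hypothetical, and the example verses are invented spelling variants, not quotations from the database.

```python
from collections import Counter
from math import sqrt


def char_bigrams(verse: str) -> Counter:
    """Count character bigrams of a lowercased verse, padded with spaces."""
    text = f" {verse.lower().strip()} "
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def bigram_cosine(a: str, b: str) -> float:
    """Cosine similarity between the bigram count vectors of two verses."""
    va, vb = char_bigrams(a), char_bigrams(b)
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = sqrt(sum(c * c for c in va.values()))
    norm_b = sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented example verses (assumption): two spelling variants and one
# unrelated verse.
variant_1 = "läksin merda pühkimaie"
variant_2 = "läksin merda pühkimaje"
unrelated = "mõõk tõusis merest"

close_score = bigram_cosine(variant_1, variant_2)
far_score = bigram_cosine(variant_1, unrelated)
print(f"variants: {close_score:.2f}, unrelated: {far_score:.2f}")
```

A character-based measure of this kind tolerates small spelling differences, but it can miss variants of the same verse in linguistically distant dialects, where few character sequences are shared.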

While Runoregi is a valuable tool for the close reading and comparison of texts, obtaining an overview of large numbers of texts (the database currently contains over 280,000 texts) remains a challenge. We address this issue by examining the frequently contaminated song types “Searching for the Comb” and “Sword from the Sea”. Because not all texts in the database have been consistently typologized by folklorists, our sample also includes texts that similarity calculations identified as similar to those classified under the two types. We clustered the verses by their similarity scores using the Chinese whispers method, computed adjacency scores for the resulting verse clusters, and presented the results as a network graph (with verse clusters as nodes and adjacency scores as edges). The groups that appear in the network reflect regional plot developments and elements. Although the texts share plotlines and even poetic formulas, a clear divide emerged between the Northern and Southern Finnic texts. As our verse similarity calculations may not capture linguistically distant variants, we manually consolidated variants of the same verse (those with an identical root composition of content words) across dialects and languages. After adjacency computation and network visualization were applied to the consolidated data, the graph represents the general Finnic plot with its main alternative developments. The graph also highlights the more stable cross-Finnic verse types, which are associated with significant plot turns.
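The pipeline described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study. The verse identifiers, similarity weights and texts are invented. The adjacency score is assumed here to be the number of times a verse of one cluster is directly followed by a verse of another cluster within a text, because the summary does not give the exact definition. The function names chinese_whispers and adjacency_scores are hypothetical.

```python
import random
from collections import Counter, defaultdict


def chinese_whispers(edges, iterations=20, seed=0):
    """Cluster the nodes of a weighted similarity graph by label propagation."""
    neighbours = defaultdict(dict)
    for u, v, weight in edges:
        neighbours[u][v] = weight
        neighbours[v][u] = weight
    labels = {node: node for node in neighbours}  # each node starts alone
    rng = random.Random(seed)
    nodes = sorted(neighbours)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for node in nodes:
            scores = Counter()
            for other, weight in neighbours[node].items():
                scores[labels[other]] += weight
            # Adopt the label with the largest total edge weight; ties are
            # broken by label order so that the result is reproducible.
            labels[node] = max(sorted(scores), key=scores.__getitem__)
    return labels


def adjacency_scores(texts, labels):
    """Count how often a verse of one cluster directly precedes another."""
    scores = Counter()
    for verses in texts:
        # Verses without any similar verse form their own cluster.
        clusters = [labels.get(v, v) for v in verses]
        for first, second in zip(clusters, clusters[1:]):
            if first != second:
                scores[(first, second)] += 1
    return scores


# Invented data (assumption): verse ids with pairwise similarity scores.
edges = [("a1", "a2", 0.90), ("a2", "a3", 0.80), ("a1", "a3", 0.85),
         ("b1", "b2", 0.90)]
# Each text is a sequence of verse ids in performance order.
texts = [["a1", "b1"], ["a2", "b2"], ["b1", "a3"]]

labels = chinese_whispers(edges)
scores = adjacency_scores(texts, labels)
for (source, target), count in sorted(scores.items()):
    print(f"{source} -> {target}: {count}")
```

The resulting dictionary of cluster pairs and counts corresponds to the weighted, directed edges of the network graph, with the cluster labels as nodes.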


Mari Sarv (b. 1972), PhD, Lead Research Fellow, Estonian Folklore Archives of the Estonian Literary Museum (Vanemuise 42, 51003 Tartu), mari@haldjas.folklore.ee

Kati Kallio (b. 1977), PhD, Academy Research Fellow at the University of Helsinki and the Finnish Literature Society (Hallituskatu 1, 00171 Helsinki, Finland), kati.kallio@finlit.fi

Maciej Michał Janicki (b. 1989), PhD, Postdoctoral Researcher, Department of Digital Humanities, University of Helsinki, maciej.janicki@helsinki.fi



References to the manuscript sources of the digitized example texts in the Estonian Folklore Archives (ERA) of the Estonian Literary Museum (EKM):

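AES – collection of the Academic Mother Tongue Society (Akadeemiline Emakeele Selts)

E – folklore collection of Matthias Johann Eisen

ERA – folklore collection of the Estonian Folklore Archives

ERM – folklore collection of the Estonian National Museum (Eesti Rahva Muuseum)

EÜS – folklore collection of the Estonian Students' Society (Eesti Üliõpilaste Selts)

H – folklore collection of Jakob Hurt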


Web resources and electronic tools

ERAB = Eesti regilaulude andmebaas [Estonian runosong database]. Compiled by Janika Oras, Liina Saarlo, Mari Sarv, Kanni Labi, Merli Uus, Reda Šmitaite. Estonian Folklore Archives of the Estonian Literary Museum. https://www.folklore.ee/regilaul/andmebaas

FILTER database. Compiled by Maciej Janicki, Eetu Mäkelä, and the FILTER project team. University of Helsinki, Finnish Literature Society, Estonian Literary Museum.

FILTER visualizations. Created by Maciej Janicki, Kati Kallio, Eetu Mäkelä, Jukka Saarinen, Mari Sarv. Working version. University of Helsinki, Finnish Literature Society, Estonian Literary Museum. https://filter-visualizations.rahtiapp.fi

JR = Julkaisemattomat runot, digitized version. Finnish Literature Society.

Octavo UI. Created by Eetu Mäkelä. University of Helsinki. https://jiemakel.github.io/octavo-nui

Runoregi. Created by Maciej Janicki, Kati Kallio, Mari Sarv, Eetu Mäkelä. University of Helsinki (HELDIG), Finnish Literature Society, Estonian Literary Museum (version of 28 November 2023). https://runoregi.rahtiapp.fi

Harja otsimine [Searching for the Comb]. https://runoregi.rahtiapp.fi/type?id=erab_001001003; https://runoregi.rahtiapp.fi/type?id=erab_orig2312; https://runoregi.rahtiapp.fi/type?id=erab_orig10132

Miekka merestä. https://runoregi.rahtiapp.fi/dendrogram?source=type&type_id=skvr_t010100_2270

Mõõk merest [Sword from the Sea]. https://runoregi.rahtiapp.fi/type?id=erab_001001013; https://runoregi.rahtiapp.fi/type?id=erab_orig958; https://runoregi.rahtiapp.fi/type?id=erab_orig1619

Suka mereen. https://runoregi.rahtiapp.fi/type?id=skvr_t010100_3380

SKVR = SKVR-tietokanta – kalevalaisten runojen verkkopalvelu. Suomalaisen Kirjallisuuden Seura. https://skvr.fi



Bastian, Mathieu; Heymann, Sébastien; Jacomy, Mathieu 2009. Gephi: an open source software for exploring and manipulating networks. – Proceedings of the International AAAI Conference on Weblogs and Social Media, vol. 3, no. 1, pp. 361–362. https://doi.org/10.1609/icwsm.v3i1.13937

Blondel, Vincent D.; Guillaume, Jean-Loup; Lambiotte, Renaud; Lefebvre, Etienne 2008. Fast unfolding of communities in large networks. – Journal of Statistical Mechanics: Theory and Experiment, no. 10, P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008

Cohen, Margaret 1999. The Sentimental Education of the Novel. Princeton: Princeton University Press.

Hiiemäe, Mall 2006. Kosmogoonilise harja otsimine. – Regilaul – esitus ja tõlgendus. (Eesti Rahvaluule Arhiivi toimetused 23.) Ed. Aado Lintrop. Tartu: Eesti Kirjandusmuuseum, pp. 21–48.

Janicki, Maciej 2023. Large-scale weighted sequence alignment for the study of intertextuality in Finnic oral folk poetry. – Journal of Data Mining and Digital Humanities, no. NLP4DH. https://doi.org/10.46298/jdmdh.11390

Janicki, Maciej; Kallio, Kati; Sarv, Mari 2023. Exploring Finnic written oral folk poetry through string similarity. – Digital Scholarship in the Humanities, vol. 38, no. 1, pp. 180–194. https://doi.org/10.1093/llc/fqac034

Kalkun, Andreas 2015. Seto laul eesti folkloristika ajaloos. Lisandusi representatsiooniloole. (Eesti Rahvaluule Arhiivi toimetused 33.) Tartu: Eesti Kirjandusmuuseum.

Kallio, Kati; Janicki, Maciej; Mäkelä, Eetu; Saarinen, Jukka; Sarv, Mari; Saarlo, Liina 2023. Eteneminen omalla vastuulla. Lähdekriittinen laskennallinen näkökulma sähköisiin kansanrunoaineistoihin. – Elore, vol. 30, no. 1, pp. 59–90. https://doi.org/10.30666/elore.126008

Kikas, Katre 2014. Folklore collecting as vernacular literacy: Establishing a social position for writing in the 1890s Estonia. – Vernacular literacies – Past, present and future. Eds. Ann-Catrine Edlund, Lars-Eric Edlund, Susanne Haugen. (Northern Studies Monographs 3. Vardagligt skriftbruk 3.) Umeå: Umeå University, Royal Skyttean Society, pp. 309–323.

Kilgarriff, Adam; Rychly, Pavel; Smrz, Pavel; Tugwell, David 2004. The sketch engine. – Proceedings of the Eleventh EURALEX International Congress, EURALEX 2004. Lorient, France, July 6–10, 2004. Lorient: Université de Bretagne-Sud, pp. 105–115.

Kundozerova 2022 = Мария Кундозерова, База данных «Карельские руны»: идея создания, концепция, перспективы. – Альманах североевропейских и балтийских исследований, вып. 7, pp. 233–240. https://doi.org/10.15393/j103.art.2022.2386

Lintrop, Aado 1999. Suur tamm, kuduvad neiud ja punane paat, kadunud harjast rääkimata. – Mäetagused, no. 10, pp. 7–23. https://doi.org/10.7592/MT1999.10.tamm

Lintrop, Aado 2000. Suur tamm ja õde-venda. – Mäetagused, no. 13, pp. 24–42. https://doi.org/10.7592/MT2000.13.suurtamm

Lintrop, Aado 2024. Kosmogooniline hari ja selestiline kiik. – Keel ja Kirjandus, no. 3, pp. 219–237. https://doi.org/10.54013/kk795a1

Moretti, Franco 2000. The Slaughterhouse of Literature. – Modern Language Quarterly, vol. 61, no. 1, pp. 207–227. http://muse.jhu.edu/journals/mlq/summary/v061/61.1moretti.html

Mäkelä, Eetu; Koivunen, Anu; Kanner, Antti; Janicki, Maciej; Harju, Auli; Hokkanen, Julius; Seuri, Olli 2020. An approach for agile interdisciplinary digital humanities research – a case study in journalism. – TwinTalks 2020: Understanding and Facilitating Collaboration in Digital Humanities 2020. Proceedings of the Twin Talks 2 and 3 Workshops at DHN 2020 and DH 2020. (CEUR Workshop Proceedings 2717.) Eds. Steven Krauwer, Darja Fišer. Aachen: RWTH Aachen University, pp. 4–14. http://ceur-ws.org/Vol-2717/paper01.pdf

Piela, Ulla 2023. Toiveiden maa. Ylioppilaiden matkakertomuksia autonomian ajalta. (Tietolipas 282.) Helsinki: Suomalaisen Kirjallisuuden Seura.

Salmela, Alfred 1964. Päivän suka. – Kalevalaseuran vuosikirja, vol. 44, pp. 100–116.

Tarkka, Lotte 2005. Rajarahvaan laulu. Tutkimus Vuokkiniemen kalevalamittaisesta runokulttuurista 1821–1921. (Suomalaisen Kirjallisuuden Seuran Toimituksia 1033.) Helsinki: Suomalaisen Kirjallisuuden Seura.