Computational insights into the variation of Finnic folk songs

“Searching for the Comb” and “Sword from the Sea”


Keywords: folklore, oral poetry, runosong, digital humanities, Finnic languages, variation

The article introduces the joint Finnic runosong database and the associated web environments and applications, developed collaboratively by computer scientists and folklorists from Finland and Estonia. These tools open up new approaches to analyzing the extensive dataset. Within this research framework, various computational methods have been devised to identify and link similar verses and texts that differ in orthography, language, and content. These methods have also been implemented in the web environment Runoregi (runoregi.rahtiapp.fi), which allows researchers and enthusiasts interested in traditional oral poetry to navigate the network of variant verses, motifs, and texts and to compare texts and their elements. In addition, a web application for maps and other visualizations is integrated with the database and the Runoregi environment.

While Runoregi serves as a valuable tool for the close reading and comparison of texts, obtaining an overview of a large number of texts (the database currently contains over 280,000) remains a challenge. We address this issue by examining the frequently contaminated song types “Searching for the Comb” and “Sword from the Sea”. Because not all texts in the database have been consistently typologized by folklorists, our sample also includes texts that similarity calculations identified as similar to those classified under the two types. We clustered verses by their similarity scores using the Chinese whispers method, computed adjacency scores for the resulting verse clusters, and presented the results as a network graph (with verse clusters as nodes and adjacency scores as edges). The groups that appear in the network reflect regional plot developments and elements. Although the texts share plotlines and even poetic formulas, a clear divide emerged between Northern and Southern Finnic texts. As our verse similarity calculations may not capture linguistically distant variants, we manually consolidated variants of the same verse (those with an identical root composition of content words) across different dialects and languages. With adjacency computation and network visualization applied to the consolidated data, the graph represents the general Finnic plot with its main alternative developments. The graph also highlights the more stable cross-Finnic verse types associated with significant plot turns.
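The clustering and adjacency steps described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the verse identifiers, similarity weights, and texts are invented toy data, and the adjacency score is assumed here to be a simple count of how often a verse from one cluster is directly followed by a verse from another cluster within a text. The summary does not define the score, so this is only one plausible reading.

```python
import random
from collections import Counter, defaultdict


def chinese_whispers(similarities, iterations=20, seed=0):
    """Cluster nodes of a weighted undirected graph by label propagation.

    `similarities` maps (verse_a, verse_b) pairs to a similarity weight.
    Every verse starts in its own cluster and repeatedly adopts the label
    that carries the largest total weight among its neighbours.
    """
    neighbours = defaultdict(dict)
    for (a, b), weight in similarities.items():
        neighbours[a][b] = weight
        neighbours[b][a] = weight

    labels = {verse: verse for verse in neighbours}
    order = sorted(neighbours)
    rng = random.Random(seed)

    for _ in range(iterations):
        rng.shuffle(order)
        changed = False
        for verse in order:
            votes = defaultdict(float)
            for other, weight in neighbours[verse].items():
                votes[labels[other]] += weight
            # Ties are broken by label name so that runs are reproducible.
            best = max(votes.items(), key=lambda kv: (kv[1], kv[0]))[0]
            if best != labels[verse]:
                labels[verse] = best
                changed = True
        if not changed:
            break
    return labels


def cluster_adjacency(texts, labels):
    """Count how often one cluster directly precedes another in a text.

    ASSUMPTION: this counting scheme is a stand-in for the adjacency score.
    Verses without any similarity edge keep their own id as cluster label.
    """
    counts = Counter()
    for verses in texts:
        clusters = [labels.get(v, v) for v in verses]
        for first, second in zip(clusters, clusters[1:]):
            if first != second:
                counts[(first, second)] += 1
    return counts


# Toy data (hypothetical): two groups of mutually similar verse variants.
similarities = {
    ("v1", "v2"): 0.9, ("v1", "v3"): 0.9, ("v2", "v3"): 0.9,
    ("v4", "v5"): 0.8, ("v4", "v6"): 0.8, ("v5", "v6"): 0.8,
}
# Each text is a sequence of verse ids in performance order.
texts = [["v1", "v4"], ["v2", "v5"], ["v3", "v6"]]

labels = chinese_whispers(similarities)
adjacency = cluster_adjacency(texts, labels)
for (source, target), score in adjacency.items():
    print(f"{source} -> {target}: {score}")
```

In this sketch the cluster pairs and their counts would become the edges of the network graph, with the clusters themselves as nodes. Chinese whispers is randomized, so on real data the result can vary with the seed and the iteration order.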


Mari Sarv (b. 1972), PhD, Lead Research Fellow, Estonian Folklore Archives of the Estonian Literary Museum (Vanemuise 42, 51003 Tartu), mari@haldjas.folklore.ee

Kati Kallio (b. 1977), PhD, Academy Research Fellow at the University of Helsinki and the Finnish Literature Society (Hallituskatu 1, 00171 Helsinki, Finland), kati.kallio@finlit.fi

Maciej Michał Janicki (b. 1989), PhD, Postdoctoral Researcher, Department of Digital Humanities, University of Helsinki, maciej.janicki@helsinki.fi



References to the manuscript sources of the digitized example texts in the Estonian Folklore Archives (ERA) of the Estonian Literary Museum (EKM):

AES – collection of the Akadeemiline Emakeele Selts (Academic Mother Tongue Society)

E – folklore collection of Matthias Johann Eisen

ERA – folklore collection of the Estonian Folklore Archives

ERM – folklore collection of the Estonian National Museum

EÜS – folklore collection of the Estonian Students' Society

H – folklore collection of Jakob Hurt


Web resources and electronic tools

ERAB = Eesti regilaulude andmebaas (Estonian Runosong Database). Compiled by Janika Oras, Liina Saarlo, Mari Sarv, Kanni Labi, Merli Uus, Reda Šmitaite. Estonian Folklore Archives of the Estonian Literary Museum. https://www.folklore.ee/regilaul/andmebaas

FILTER database. Compiled by Maciej Janicki, Eetu Mäkelä, and the FILTER project working group. University of Helsinki, Finnish Literature Society, Estonian Literary Museum.

FILTER visualizations. Created by Maciej Janicki, Kati Kallio, Eetu Mäkelä, Jukka Saarinen, Mari Sarv. Working version. University of Helsinki, Finnish Literature Society, Estonian Literary Museum. https://filter-visualizations.rahtiapp.fi

JR = Julkaisemattomat runot, digitized version. Finnish Literature Society.

Octavo UI. Created by Eetu Mäkelä. University of Helsinki. https://jiemakel.github.io/octavo-nui

Runoregi. Created by Maciej Janicki, Kati Kallio, Mari Sarv, Eetu Mäkelä. University of Helsinki (HELDIG), Finnish Literature Society, Estonian Literary Museum (version of 28 November 2023). https://runoregi.rahtiapp.fi

Harja otsimine. https://runoregi.rahtiapp.fi/type?id=erab_001001003; https://runoregi.rahtiapp.fi/type?id=erab_orig2312; https://runoregi.rahtiapp.fi/type?id=erab_orig10132

Miekka merestä. https://runoregi.rahtiapp.fi/dendrogram?source=type&type_id=skvr_t010100_2270

Mõõk merest. https://runoregi.rahtiapp.fi/type?id=erab_001001013; https://runoregi.rahtiapp.fi/type?id=erab_orig958; https://runoregi.rahtiapp.fi/type?id=erab_orig1619

Suka mereen. https://runoregi.rahtiapp.fi/type?id=skvr_t010100_3380

SKVR = SKVR-tietokanta – kalevalaisten runojen verkkopalvelu (SKVR database: online service for Kalevala-metre poems). Finnish Literature Society. https://skvr.fi



