Foreign Accent Corpus and studies of non-native speech


The article begins with an overview of the nature and theoretical models of foreign accent, then introduces the Estonian Foreign Accent Corpus (EFAC) developed at the Laboratory of Phonetics and Speech Technology, Institute of Cybernetics at Tallinn University of Technology, and finally discusses some research results. As a rule, adult language learners have difficulty pronouncing foreign sounds, and these difficulties result in a foreign accent. The corpus was initiated in 2006 in order to study different aspects of foreign accent. Its aim is to provide systematic, high-quality non-native speech data for experimental phonetic studies and for language technology development.

The text corpus comprises 130 neutral sentences covering the main phonological oppositions of Estonian, eight questions, two passages, and prompts to elicit spontaneous speech (a self-introduction and a description of three pictures).

Most of the speech recordings were made in our own recording studio; several subjects were recorded at their home universities in Finland (Oulu, Helsinki and Turku), France (Paris), Austria (Vienna) and Latvia (Riga). All subjects filled in a questionnaire about their age, native language, age of learning Estonian, where and how they learnt the language, how often they use it, and their self-assessed knowledge of Estonian. To date, we have recorded 162 non-native (L2) speakers of Estonian representing 18 different language backgrounds. In addition, a reference group of 20 native (L1) speakers of Estonian has been recorded.

The EFAC constitutes a valuable resource for phonetic studies of L2 speech. So far, we have focused on the perception and production of Estonian vowels, short/long oppositions and quantity degrees by native speakers of Russian and Finnish as compared to L1 subjects. The results reveal differences between the L1 and L2 groups that are attributable to differences in the vowel systems and to the different role of duration in Finnish, Russian and Estonian.