Replication Data for: Perceiving and identifying vowels in regional accents of English: Evidence from Dutch- and Spanish-speaking L2 listeners
doi:10.18710/FEC2BO
DataverseNO; 2026-01-29; 1
Verbeke, Gil; Escudero, Paola; Mitterer, Holger; Simon, Ellen, 2026, "Replication Data for: Perceiving and identifying vowels in regional accents of English: Evidence from Dutch- and Spanish-speaking L2 listeners", https://doi.org/10.18710/FEC2BO, DataverseNO, V1
Verbeke, Gil; Escudero, Paola; Mitterer, Holger; Simon, Ellen
Ghent University; 2025; Flanders, Belgium
R; R Studio; Praat; PsychoPy; MS Excel
1178623N; K253023N; K131425N
DataverseNO; The Tromsø Repository of Language and Linguistics (TROLLing)
Verbeke, Gil
2025-05-03
Arts and Humanities
English vowels; L2 perception; acoustic similarity; perceived similarity; vowel identification; English as a Foreign Language
Dataset abstract
This dataset contains the results of a study on cross-language and second-language vowel perception in Dutch-speaking and Spanish-speaking learners of English. The dataset includes both acoustic similarity predictions and behavioral data from two perceptual tasks. For the acoustic comparisons, Linear Discriminant Analysis (LDA) models were trained on native vowel data from Dutch and Spanish speakers, recorded in earlier studies. The models were tested on English vowel tokens produced by speakers of Southern British English (S.Eng), Northern British English (N.Eng), and Australian English (AusE), and predict how similar these English vowels are to Dutch and Spanish vowels based on acoustic properties, such as formant frequencies and vowel duration. In addition to these acoustic predictions, the dataset includes behavioral responses collected during two experimental sessions. 
In the first session, 40 L1 Dutch and 40 L1 Spanish participants completed (i) a demographic and language background questionnaire, (ii) a cross-language vowel categorization task consisting of 210 trials, and (iii) a general vocabulary test (LexTALE; Lemhöfer & Broersma, 2012). During the cross-language categorization task, participants listened to English vowels produced in the three accents and indicated which vowel from their native language was most similar to that vowel, followed by a goodness-of-fit rating (i.e., how good an example of that vowel the sound was). In the second session, the same participants completed a second-language vowel categorization task with the same 210 trials, in which they were asked to identify which English vowel they heard and to rate how good an example of that vowel it was. The participants’ cross-language categorization responses were compared to the acoustic similarity scores from the LDA models, to assess how perceived (phonetic) similarity and acoustic similarity align. Participants’ identification accuracy in the second-language task was analyzed using a mixed-effects logistic regression model. The repository includes all raw and processed data, the R code used for statistical analysis, and the model outputs.
Article abstract
This study examines how L2 English listeners perceive and categorize vowels produced in three regional accents of English: Southern British (S.Eng), Northern British (N.Eng), and Australian English (AusE). Specifically, we investigate how L1 speakers of Belgian Dutch and European Spanish classify these vowels in terms of their native vowel categories, and how such perceptual classifications relate to acoustic similarity between L1-L2 vowels and L2 vowel identification accuracy. 
To quantify cross-language acoustic similarity and predict which L2 vowel contrasts would be perceptually challenging, Linear Discriminant Analysis (LDA) models were trained on Dutch and Spanish vowel data and tested on English vowel data. 40 Dutch-speaking and 40 Spanish-speaking participants then completed a cross-language categorization task and second-language vowel identification task using naturally produced /CVC/ syllables. The results demonstrate that LDA-based acoustic similarity largely predicts cross-language perception, although certain vowel categorization patterns point to differences in acoustic cue-weighting between the LDA models and participants. Compared to Spanish listeners, Dutch listeners’ classifications showed greater divergence from the LDA model, likely reflecting the denser vowel inventory of Dutch and the resulting increase in category competition. Additionally, participants’ cross-language vowel categorization responses predicted their L2 vowel identification accuracy. That is, L2 vowels consistently mapped onto a (single) different L1 category with high goodness-of-fit were more likely to be identified correctly. Identification accuracy was highest for S.Eng vowels, aligning with participants’ greater self-reported familiarity with that accent. Together, our findings highlight the complex interplay between cross-language similarity, vowel inventory and accent familiarity in shaping L2 perception.
2025-03-01; 2025-03-31
Belgium
sociodemographic and linguistic background information; LexTALE scores; experimental data; cross-language vowel categorization data; second-language vowel categorization data
<ul> <li> Recordings of European Spanish speakers were sourced from the following study: Chládková, K., Escudero, P., & Boersma, P. (2011). Context-specific acoustic differences between Peruvian and Iberian Spanish vowels. The Journal of the Acoustical Society of America, 130(1), 416–428. 
<a href="https://doi.org/10.1121/1.3592242" target="_blank">https://doi.org/10.1121/1.3592242</a> </li>
<li> Recordings of Belgian Dutch speakers were sourced from the following study: Adank, P., Van Hout, R., & Velde, H. V. D. (2007). An acoustic description of the vowels of northern and southern standard Dutch II: Regional varieties. The Journal of the Acoustical Society of America, 121(2), 1130–1141. <a href="https://doi.org/10.1121/1.2409492" target="_blank">https://doi.org/10.1121/1.2409492</a> </li>
<li> Recordings of Australian English speakers were sourced from the following study: Elvin, J., Williams, D., & Escudero, P. (2016). Dynamic acoustic properties of monophthongs and diphthongs in Western Sydney Australian English. The Journal of the Acoustical Society of America, 140(1), 576–581. <a href="https://doi.org/10.1121/1.4952387" target="_blank">https://doi.org/10.1121/1.4952387</a> </li>
<li> Recordings of Australian English speakers were sourced from the following study: Estival, D., Cassidy, S., Cox, F., & Burnham, D. (2014). AusTalk: An audio-visual corpus of Australian English. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14) (pp. 3105–3109). European Language Resources Association (ELRA). <a href="http://www.lrec-conf.org/proceedings/lrec2014/pdf/520_Paper.pdf" target="_blank">http://www.lrec-conf.org/proceedings/lrec2014/pdf/520_Paper.pdf</a> </li>
<li> Recordings of Northern British English speakers were sourced from the following study: Strycharczuk, P., Kirkham, S., Gorman, E., & Nagamine, T. (2025). Dimensionality reduction in lingual articulation of vowels: Evidence from lax vowels in Northern Anglo-English. Language and Speech. <a href="https://doi.org/10.1177/00238309251320581" target="_blank">https://doi.org/10.1177/00238309251320581</a> </li>
</ul>
<a href="http://creativecommons.org/publicdomain/zero/1.0">CC0 1.0</a>
Verbeke, G., Escudero, P., Mitterer, H., & Simon, E. (under review). 
Perceiving and identifying vowels in regional accents of English: Evidence from Dutch- and Spanish-speaking L2 listeners.
0_RegionalAccents_README.txt: README file with general and methodological information about the study, as well as an overview of the data and files. (text/plain)
RegionalAccents_DPIA.pdf: Short assessment of whether the open publication of this dataset may be said to be in line with applicable legal regulations and research-ethical guidelines. (application/pdf)
RegionalAccents_InformationLetter.pdf: Information letter participants received before giving informed consent. (application/pdf)
RegionalAccents_InformedConsent.pdf: Informed consent sheet. (application/pdf)
RegionalAccents_S.Eng_SentenceReading.pdf: Contains the sentences spoken by the S.Eng (Southern English) speaker during the recording session. (application/pdf)
RegionalAccents_S.Eng_Speakers.csv: Demographic and linguistic background of the S.Eng speakers. (text/comma-separated-values)
RegionalAccents_AusE.csv: Measurements of Australian English vowels, based on vowel productions in Elvin et al. (2016) and Estival et al. (2014). (text/comma-separated-values)
RegionalAccents_Dutch.csv: Measurements of Belgian Dutch vowels, based on vowel productions in Adank et al. (2007). (text/comma-separated-values)
RegionalAccents_N.Eng.csv: Measurements of Northern British English vowels, based on vowel productions in Strycharczuk et al. (2025). (text/comma-separated-values)
RegionalAccents_S.Eng.csv: Measurements of Southern British English vowels. (text/comma-separated-values)
RegionalAccents_Spanish.csv: Measurements of European Spanish vowels, based on vowel productions in Chládková et al. (2011). (text/comma-separated-values)
RegionalAccents_Stimuli_Acoustics.csv: Acoustic properties of the stimulus materials. 
(text/comma-separated-values)
Dutch_AusE.probs.csv: Output of the Dutch-trained LDA model, tested on Australian English vowel data. (text/comma-separated-values)
Dutch_N.Eng.probs.csv: Output of the Dutch-trained LDA model, tested on Northern British English vowel data. (text/comma-separated-values)
Dutch_S.Eng.probs.csv: Output of the Dutch-trained LDA model, tested on Southern British English vowel data. (text/comma-separated-values)
Spanish_AusE.probs.csv: Output of the Spanish-trained LDA model, tested on Australian English vowel data. (text/comma-separated-values)
Spanish_N.Eng.probs.csv: Output of the Spanish-trained LDA model, tested on Northern British English vowel data. (text/comma-separated-values)
Spanish_S.Eng.probs.csv: Output of the Spanish-trained LDA model, tested on Southern British English vowel data. (text/comma-separated-values)
RegionalAccents_Predictions_Dutch.csv: Cross-language acoustic similarity predictions based on the Dutch-trained LDA model. (text/comma-separated-values)
RegionalAccents_Predictions_Spanish.csv: Cross-language acoustic similarity predictions based on the Spanish-trained LDA model. (text/comma-separated-values)
RegionalAccents_Categorization_Dutch.csv: Dutch-speaking participants' cross-language and second-language categorization responses. (text/comma-separated-values)
RegionalAccents_Categorization_Spanish.csv: Spanish-speaking participants' cross-language and second-language categorization responses. (text/comma-separated-values)
RegionalAccents_Familiarity_Dutch.csv: Dutch-speaking participants' self-reported familiarity with regional accents of English. (text/comma-separated-values)
RegionalAccents_Familiarity_Spanish.csv: Spanish-speaking participants' self-reported familiarity with regional accents of English. (text/comma-separated-values)
RegionalAccents_Questionnaire_Dutch.csv: Dutch-speaking participants' responses to the background questionnaire. 
(text/comma-separated-values)
RegionalAccents_Questionnaire_Spanish.csv: Spanish-speaking participants' responses to the background questionnaire. (text/comma-separated-values)
RegionalAccents_Cross-Language_Second_Language_Classification.pdf: PDF output of the R Markdown script used to analyze participants' cross-language and second-language vowel categorization responses. (application/pdf)
RegionalAccents_Cross-Language_Second_Language_Classification.Rmd: R Markdown for the analysis of participants' cross-language and second-language vowel categorizations. (text/x-r-notebook)
RegionalAccents_Cross-Language_Vowel_Predictions.pdf: PDF output of the R Markdown script used to generate cross-language vowel categorization predictions based on acoustic similarity models. (application/pdf)
RegionalAccents_Cross-Language_Vowel_Predictions.Rmd: R Markdown for predicting cross-language acoustic similarity. (text/x-r-notebook)
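Illustrative sketch (not part of the deposited files). The cross-language predictions described in this record come from LDA models trained on L1 vowel tokens and tested on English vowel tokens. The Python sketch below shows the general idea only, under loud assumptions: the formant values are invented for illustration, only F1 and F2 are used (the record also mentions vowel duration), and the function names `fit_lda` and `posterior` are hypothetical. The authors' actual analysis is in the R Markdown files listed above and may differ in every detail.

```python
import math
from collections import defaultdict


def fit_lda(samples):
    """Fit a two-feature LDA: class means, priors, and a pooled covariance.

    samples is a list of (label, (f1_hz, f2_hz)) pairs.
    """
    by_label = defaultdict(list)
    for label, x in samples:
        by_label[label].append(x)
    n = len(samples)
    means, priors = {}, {}
    sxx = sxy = syy = 0.0  # pooled within-class scatter
    for label, xs in by_label.items():
        m1 = sum(x[0] for x in xs) / len(xs)
        m2 = sum(x[1] for x in xs) / len(xs)
        means[label] = (m1, m2)
        priors[label] = len(xs) / n
        for a, b in xs:
            sxx += (a - m1) ** 2
            sxy += (a - m1) * (b - m2)
            syy += (b - m2) ** 2
    dof = n - len(by_label)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    # Inverse of the shared 2x2 covariance matrix.
    inv_cov = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, inv_cov, priors


def posterior(model, x):
    """Return the posterior probability of each L1 category for token x."""
    means, inv, priors = model
    scores = {}
    for label, m in means.items():
        d1, d2 = x[0] - m[0], x[1] - m[1]
        maha = (d1 * (inv[0][0] * d1 + inv[0][1] * d2)
                + d2 * (inv[1][0] * d1 + inv[1][1] * d2))
        scores[label] = math.log(priors[label]) - 0.5 * maha
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}


# Invented L1 training tokens (F1, F2 in Hz); NOT values from the dataset.
l1_tokens = [
    ("i", (300.0, 2300.0)), ("i", (320.0, 2250.0)),
    ("i", (280.0, 2350.0)), ("i", (310.0, 2200.0)),
    ("a", (750.0, 1400.0)), ("a", (780.0, 1350.0)),
    ("a", (720.0, 1450.0)), ("a", (760.0, 1500.0)),
    ("u", (330.0, 800.0)), ("u", (350.0, 850.0)),
    ("u", (310.0, 750.0)), ("u", (340.0, 900.0)),
]
model = fit_lda(l1_tokens)

# An invented English test token: which L1 category is it closest to?
probs = posterior(model, (310.0, 2100.0))
print(max(probs, key=probs.get), probs)
```

In this toy setup, the posterior probabilities play the role of the acoustic similarity predictions: averaging them over all test tokens of one English vowel would give the proportion of tokens assigned to each L1 category. Whether the `*.probs.csv` files follow exactly this layout is not stated in the record.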