Description of dataset: The dataset contains quantitative experimental data from a language classification task collected from 9th–10th grade lower secondary school pupils (n = 398) from three Norwegian municipalities. The municipalities were located in Eastern Norway (Telemark county), Northern Norway (Nordland county), and Western Norway (Vestland county). In this task, the participants were asked to classify single sentences as either standard or dialectal Norwegian. The sentences were presented in standard written Norwegian (Bokmål and Nynorsk varieties) and various versions of dialect writing (Eastern, Northern, and Western). Participants' accuracy and reaction times were measured. The file with experimental data also contains by-participant measures of average reading speed in the two standard varieties of written Norwegian (Bokmål and Nynorsk) for a subset of 288 participants; these measures were derived from a separate study. The file further contains information on the writing habits in private digital communication of a subset of participants (n = 179), expressed as the number of words the participants shared with the researchers, as well as the number and proportion of words containing speech-like deviations from the standard.
The dataset also includes qualitative background questionnaire data collected from a subset of the same participants (n = 352), which includes information about their age, gender, and various language use patterns, both oral and written. The dataset further includes all stimuli used in the experimental task, the list of questions from the background questionnaire, the analysis code, the OpenSesame experiment file, and the consent form template.
Abstract of the related publication: This study investigates how Norwegian adolescents with different literacy profiles perceive and distinguish standard and dialectal varieties of written Norwegian. Using a variety classification task, we analyzed response accuracy and reaction times (RTs) among 398 lower secondary school pupils across three regions: Eastern Norway (primarily Bokmål-literate), Northern Norway (exposed to Bokmål and dialect writing), and Western Norway (familiar with Nynorsk, Bokmål, and dialect writing). Participants with broader literacy profiles (Western Norway) achieved higher accuracy and showed longer RTs, indicating more thorough processing, while those with more limited exposure (Eastern Norway) employed more reductive or guessing-based strategies. However, even for Western participants, distinguishing the lesser-used standard, Nynorsk, from dialect writing was more challenging than distinguishing Bokmål. A supplementary analysis revealed that frequent use of dialect writing in private digital communication did not undermine standard literacy overall, although in the Western group it was associated with slightly lower accuracy, highlighting the vulnerable status of Nynorsk. These findings suggest that exposure to multiple written varieties enhances metalinguistic awareness and processing depth, while revealing regional asymmetries in the attainment of the two written standards. The study underscores the importance of explicitly addressing both standard varieties and dialect writing in education, with broader relevance for contexts where minoritized or non-standard varieties coexist alongside a dominant written language.