Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects (doi:10.18710/NSFN2B)

Document Description
Citation
Title:	Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects
Identification Number:	doi:10.18710/NSFN2B
Distributor:	DataverseNO
Date of Distribution:	2021-11-05
Version:	1
Bibliographic Citation:	Lybaert, Chloé; De Clerck, Bernard; Saelens, Jorien; De Cuypere, Ludovic, 2021, "Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects", https://doi.org/10.18710/NSFN2B, DataverseNO, V1
Study Description
Citation
Title:	Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects
Identification Number:	doi:10.18710/NSFN2B
Authoring Entity:	Lybaert, Chloé (Ghent University)
	De Clerck, Bernard (Ghent University)
	Saelens, Jorien (Ghent University)
	De Cuypere, Ludovic (Vrije Universiteit Brussel - Ghent University)
Producer:	Ghent University
Date of Production:	2018
Software used in Production:	MS Excel
Software used in Production:	R
Software used in Production:	RStudio
Distributor:	DataverseNO
Distributor:	The Tromsø Repository of Language and Linguistics (TROLLing)
Access Authority:	De Cuypere, Ludovic
Depositor:	De Cuypere, Ludovic
Date of Deposit:	2021-11-02
Date of Distribution:	2021-11-02
Holdings Information:	https://doi.org/10.18710/NSFN2B
Study Scope
Keywords:	Arts and Humanities, Linguistic data, corpus data, spoken corpus data, Dutch, Dutch dialects, West Flemish, French Flemish, Mixed-effects logistic regression analysis, Inversion, word order, V2, Subject-Verb inversion, constituent order, syntactic alternation
Abstract:	<p><b>Dataset abstract</b></p> <p>The dataset comprises N = 1413 annotated sentences (or parts thereof) taken from authentic spoken corpus data from West Flemish and French Flemish (dialects of Dutch). The sentences are annotated for V2 variation (Subject-Verb inversion, the outcome variable of the associated study) and seven predictor variables, including city, region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield. The dataset also includes geographical data for creating a dialect map that shows the relative frequencies of V2 variation. An R Notebook with the data analysis is provided.</p>
	<p><b> Article abstract</b></p> <p>This paper explores V2 variation in West Flemish and French Flemish dialects of Dutch based on an extensive corpus of authentic spoken data. After taking stock of the existing literature, we probe into the effect of region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield on (non)inverted word order. This is the first study that carries out regression analysis on the combined impact of these variables in the entire West Flemish and French Flemish region, with additional visualization of effect sizes. The results show that noninversion is generally more widespread than originally anticipated, with unexpected higher occurrence of noninversion in continental West Flemish and lower frequencies in western West Flemish. With the exception of the variable number of constituents in the prefield, all other variables had a significant impact on word order: Clausal topicalized elements, elements that have peripheral functions, and elements that lack prosodic integration all favor noninverted word order. The form of the subject also impacted word order, but its effect is sometimes overruled by discourse considerations.</p>
Time Period:	1960-1970
Country:	Belgium, France
Geographic Coverage:	West-Vlaanderen, Oost-Vlaanderen, Département du Nord - Département 59
Geographic Bounding Box:	West Bounding Longitude: 2.2 East Bounding Longitude: 3.7 South Bounding Latitude: 50.7 North Bounding Latitude: 51.5
Kind of Data:	corpus data
Methodology and Processing
Mode of Data Collection:	Manual selection of corpus data
Sources Statement
Data Sources:	Dialectloket: stemmen uit het verleden: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden
Origins of Sources:	<p> Data for the present study was gathered from the dialect recordings collected by Ghent University and the Meertens Institute in Amsterdam in the 1960s and 1970s; see Dialectloket: Stemmen uit het verleden. URL: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden. </p> <p>The purpose of these recordings was to capture authentic local dialects that had been affected as little as possible by Standard Dutch or other dialects. Recorded speakers had to meet several criteria: they had to have been born and raised in the same place, be relatively old (over 60), and have a low level of education. Ideally, both their parents and their partner spoke the same dialect. Most of the recorded dialect speakers who met these criteria were farmers born around 1900. All of the dialect speakers were born and raised well before the democratisation of education and the introduction of the mass media, which enhanced the spread of Standard Dutch in Flanders. </p> <p> The authentic local dialects were collected by means of what Mesthrie et al. (2009:90) refer to as sociolinguistic interviews: an interviewer asks questions about the interviewee’s youth, profession, experiences in times of war, and so on. To minimize the distance between the “middle-class researcher versus the subject” (Mesthrie et al. 2009:90) and the impact of age or class differences between interviewer and interviewee, the interviews took place in an informal environment, with the interviewer taking on the role of a student. </p> <p> References: <br> Mesthrie, Rajend, Joan Swann, Ana Deumert, & William L. Leap. 2009. Introducing sociolinguistics. 2nd edn. Edinburgh: Edinburgh University Press. </p>
Documentation and Access to Sources:	http://www.dialectloket.be/geluid/stemmen-uit-het-verleden/
Cleaning Operations:	consistency checking
Data Access
Notes:	<a href="http://creativecommons.org/licenses/by-nc/4.0">CC BY-NC 4.0</a>
Other Study Description Materials
Related Materials
	Dialectloket: stemmen uit het verleden: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden
	Oostendorp, M. van (2014). “Phonological and phonetic databases at the Meertens Institute.” The Oxford Handbook of Corpus Phonology. Eds. J. Durand & G. Kristoffersen. Oxford: OUP. 546-551.
Related Publications
Citation
Title:	Lybaert, C., De Clerck, B., Saelens, J., & De Cuypere, L. (2019). A Corpus-Based Analysis of V2 Variation in West Flemish and French Flemish Dialects. Journal of Germanic Linguistics, 31(1), 43-100. doi:10.1017/S1470542718000028
Identification Number:	10.1017/S1470542718000028
Bibliographic Citation:	Lybaert, C., De Clerck, B., Saelens, J., & De Cuypere, L. (2019). A Corpus-Based Analysis of V2 Variation in West Flemish and French Flemish Dialects. Journal of Germanic Linguistics, 31(1), 43-100. doi:10.1017/S1470542718000028
Other Study-Related Materials
Label:	0_ReadMe_Inversion_20211105.txt
Notes:	text/plain
Other Study-Related Materials
Label:	Inversion_20211102.csv
Text:	Data file, csv file with ";" as separator. "NA" = missing value.
Notes:	text/csv
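The file description above (";" as separator, "NA" as the missing-value marker) can be turned into a small loading routine. The following is a minimal sketch using only Python's standard library; it is not the authors' workflow, which is documented in the R Notebook. The helper name `read_semicolon_csv`, the sample rows, and the column names `City` and `Inversion` are invented for illustration; only `ID` is named in this record.

```python
import csv
import io


def read_semicolon_csv(handle, missing="NA"):
    """Read a ';'-separated CSV into a list of dicts.

    Cells equal to the missing-value marker are converted to None.
    """
    reader = csv.DictReader(handle, delimiter=";")
    rows = []
    for row in reader:
        rows.append(
            {key: (None if value == missing else value) for key, value in row.items()}
        )
    return rows


# Illustrative input only: apart from ID, column names and values are invented.
sample = io.StringIO("ID;City;Inversion\n1;PlaceA;yes\n2;PlaceA;NA\n")
rows = read_semicolon_csv(sample)
```

For the deposited file, the same function could be called on `open("Inversion_20211102.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting.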
Other Study-Related Materials
Label:	Inversion_data_analysis_20211102.Rmd
Text:	R Notebook with data analysis and source code for statistical analysis and figures in the related publication.
Notes:	application/octet-stream
Other Study-Related Materials
Label:	Region_20211102.csv
Text:	Data file, csv file with ";" as separator. Location data with geographical coordinates for cities included in Inversion_20211102.csv.
Notes:	text/csv
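Since this file holds coordinates for the cities in the study, one simple consistency check is to compare them with the Geographic Bounding Box stated under Study Scope (longitude 2.2 to 3.7, latitude 50.7 to 51.5). The sketch below is only an illustration under assumptions: the column names `City`, `Longitude`, and `Latitude`, the sample rows, and the helper `rows_outside` are hypothetical, and the coordinates are assumed to be decimal degrees written with a decimal point.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    west: float
    east: float
    south: float
    north: float

    def contains(self, longitude: float, latitude: float) -> bool:
        """Return True if the point lies inside the box (edges included)."""
        return (
            self.west <= longitude <= self.east
            and self.south <= latitude <= self.north
        )


# Values taken from the Geographic Bounding Box field of this record.
STUDY_AREA = BoundingBox(west=2.2, east=3.7, south=50.7, north=51.5)


def rows_outside(rows, box, lon_key="Longitude", lat_key="Latitude"):
    """Collect rows whose coordinates fall outside the bounding box.

    The default column names are assumptions, not taken from the data file.
    """
    outside = []
    for row in rows:
        if not box.contains(float(row[lon_key]), float(row[lat_key])):
            outside.append(row)
    return outside


# Invented sample rows, not real locations from Region_20211102.csv.
sample_rows = [
    {"City": "PlaceA", "Longitude": "3.0", "Latitude": "51.0"},
    {"City": "PlaceB", "Longitude": "4.4", "Latitude": "51.2"},
]
flagged = rows_outside(sample_rows, STUDY_AREA)
```

A row flagged this way is not necessarily an error, because the bounding box in the metadata is rounded to one decimal place.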
Other Study-Related Materials
Label:	Sentences_20211102.csv
Text:	Data file, csv file with ";" as separator. Source observations for Inversion_20211102.csv. ID is the key between both files.
Notes:	text/comma-separated-values
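Because ID is the key between Sentences_20211102.csv and Inversion_20211102.csv, the annotations can be linked back to their source sentences with a simple keyed join. The following is a hedged sketch in plain Python, not the authors' code: `join_on_id` is a hypothetical helper, and the sample rows and the column names `Sentence` and `Inversion` are invented for illustration.

```python
def join_on_id(annotations, sentences, key="ID"):
    """Attach the source sentence row to each annotation row that shares its key.

    Returns the merged rows plus the keys of annotations with no matching sentence.
    """
    by_id = {row[key]: row for row in sentences}
    joined = []
    unmatched = []
    for row in annotations:
        source = by_id.get(row[key])
        if source is None:
            unmatched.append(row[key])
            continue
        merged = dict(source)  # copy so the input rows are left untouched
        merged.update(row)     # annotation values win if column names collide
        joined.append(merged)
    return joined, unmatched


# Invented rows for illustration; only the ID column is named in this record.
annotations = [
    {"ID": "1", "Inversion": "yes"},
    {"ID": "3", "Inversion": "no"},
]
sentences = [
    {"ID": "1", "Sentence": "example sentence one"},
    {"ID": "2", "Sentence": "example sentence two"},
]
joined, unmatched = join_on_id(annotations, sentences)
```

Reporting unmatched keys instead of silently dropping them makes it easier to spot an ID that appears in one file but not the other.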