Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects (doi:10.18710/NSFN2B)

Document Description

Citation

Title:

Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects

Identification Number:

doi:10.18710/NSFN2B

Distributor:

DataverseNO

Date of Distribution:

2021-11-05

Version:

1

Bibliographic Citation:

Lybaert, Chloé; De Clerck, Bernard; Saelens, Jorien; De Cuypere, Ludovic, 2021, "Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects", https://doi.org/10.18710/NSFN2B, DataverseNO, V1

Study Description

Citation

Title:

Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects

Identification Number:

doi:10.18710/NSFN2B

Authoring Entity:

Lybaert, Chloé (Ghent University)

De Clerck, Bernard (Ghent University)

Saelens, Jorien (Ghent University)

De Cuypere, Ludovic (Vrije Universiteit Brussel - Ghent University)

Producer:

Ghent University

Date of Production:

2018

Software used in Production:

MS Excel

R

RStudio

Distributor:

DataverseNO

The Tromsø Repository of Language and Linguistics (TROLLing)

Access Authority:

De Cuypere, Ludovic

Depositor:

De Cuypere, Ludovic

Date of Deposit:

2021-11-02

Date of Distribution:

2021-11-02

Holdings Information:

https://doi.org/10.18710/NSFN2B

Study Scope

Keywords:

Arts and Humanities, Linguistic data, corpus data, spoken corpus data, Dutch, Dutch dialects, West Flemish, French Flemish, Mixed-effects logistic regression analysis, Inversion, word order, V2, Subject-Verb inversion, constituent order, syntactic alternation

Abstract:

Dataset abstract

The dataset comprises N = 1413 annotated sentences (or parts thereof) taken from authentic spoken corpus data from West Flemish and French Flemish (dialects of Dutch). The sentences are annotated for V2 variation (Subject-Verb inversion, the outcome variable of the associated study) and seven predictor variables, including city, region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield. The dataset also includes geographical data for creating a dialect map that shows the relative frequencies of V2 variation. An R Notebook with the data analysis is provided.

Article abstract

This paper explores V2 variation in West Flemish and French Flemish dialects of Dutch based on an extensive corpus of authentic spoken data. After taking stock of the existing literature, we probe into the effect of region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield on (non)inverted word order. This is the first study that carries out regression analysis on the combined impact of these variables in the entire West Flemish and French Flemish region, with additional visualization of effect sizes. The results show that noninversion is generally more widespread than originally anticipated, with unexpected higher occurrence of noninversion in continental West Flemish and lower frequencies in western West Flemish. With the exception of the variable number of constituents in the prefield, all other variables had a significant impact on word order: Clausal topicalized elements, elements that have peripheral functions, and elements that lack prosodic integration all favor noninverted word order. The form of the subject also impacted word order, but its effect is sometimes overruled by discourse considerations.

Time Period:

1960-1970

Country:

Belgium, France

Geographic Coverage:

West-Vlaanderen, Oost-Vlaanderen, Département du Nord - Département 59

Geographic Bounding Box:

  • West Bounding Longitude: 2.2
  • East Bounding Longitude: 3.7
  • South Bounding Latitude: 50.7
  • North Bounding Latitude: 51.5
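
As a rough illustration of how the bounding box above could be used, the following Python sketch checks whether a coordinate pair falls inside it. The function name `in_bounding_box` and the sample coordinates are illustrative assumptions and are not taken from the dataset.

```python
# Bounding box values as stated in this record (decimal degrees).
WEST, EAST = 2.2, 3.7
SOUTH, NORTH = 50.7, 51.5


def in_bounding_box(longitude, latitude):
    """Return True if the point lies inside the documented bounding box."""
    return WEST <= longitude <= EAST and SOUTH <= latitude <= NORTH


# Hypothetical sample points, chosen only to exercise the check.
print(in_bounding_box(3.0, 51.0))
print(in_bounding_box(4.5, 51.0))
```

A check of this kind could serve as a sanity test on the coordinates in Region_20211102.csv, whose column names are not documented in this record.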

Kind of Data:

corpus data

Methodology and Processing

Mode of Data Collection:

Manual selection of corpus data

Sources Statement

Data Sources:

Dialectloket: stemmen uit het verleden: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden

Origins of Sources:

Data for the present study was gathered from the dialect recordings collected by Ghent University and the Meertens Institute in Amsterdam in the 1960s and 1970s; see Dialectloket: Stemmen uit het verleden. URL: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden.

The purpose of these recordings was to capture authentic local dialects that had been affected as little as possible by Standard Dutch or other dialects. Recorded speakers had to meet several criteria: they had to be born and raised in the same place, be relatively old (over 60), and have a low level of education. Ideally, both their parents and their partner spoke the same dialect. Most of the recorded dialect speakers who met these criteria were farmers born around 1900. All of the dialect speakers were born and raised well before the democratisation of education and the introduction of the mass media, which enhanced the spread of Standard Dutch in Flanders.

The dialect data were collected by means of what Mesthrie et al. (2009:90) refer to as sociolinguistic interviews: an interviewer asks questions about the interviewee’s youth, profession, experiences in times of war, and so on. To minimize the distance between the “middle-class researcher versus the subject” (Mesthrie et al. 2009:90) and the impact of age or class differences between the interviewer and the interviewee, the interviews took place in an informal environment, with the interviewer taking on the role of a student.

References:
Mesthrie, Rajend, Joan Swann, Ana Deumert, & William L. Leap. 2009. Introducing sociolinguistics. 2nd edn. Edinburgh: Edinburgh University Press.

Documentation and Access to Sources:

http://www.dialectloket.be/geluid/stemmen-uit-het-verleden/

Cleaning Operations:

consistency checking

Data Access

Notes:

CC BY-NC 4.0: http://creativecommons.org/licenses/by-nc/4.0

Other Study Description Materials

Related Materials

Dialectloket: stemmen uit het verleden: http://www.dialectloket.be/geluid/stemmen-uit-het-verleden

Oostendorp, M. van (2014). “Phonological and phonetic databases at the Meertens Institute.” The Oxford Handbook of Corpus Phonology. Eds. J. Durand & G. Kristoffersen. Oxford: OUP. 546-551.

Related Publications

Citation

Title:

A Corpus-Based Analysis of V2 Variation in West Flemish and French Flemish Dialects

Identification Number:

10.1017/S1470542718000028

Bibliographic Citation:

Lybaert, C., De Clerck, B., Saelens, J., & De Cuypere, L. (2019). A Corpus-Based Analysis of V2 Variation in West Flemish and French Flemish Dialects. Journal of Germanic Linguistics, 31(1), 43-100. doi:10.1017/S1470542718000028

Other Study-Related Materials

Label:

0_ReadMe_Inversion_20211105.txt

Notes:

text/plain

Other Study-Related Materials

Label:

Inversion_20211102.csv

Text:

Data file in CSV format with ";" as separator. Missing values are coded as "NA".

Notes:

text/csv
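
The file description above implies two parsing details: a ";" delimiter and "NA" as the missing-value code. A minimal Python sketch of a reader that honours both is given below. The helper name `read_semicolon_csv`, the column names, and the sample rows are hypothetical stand-ins, since this record does not list the actual columns.

```python
import csv
import io


def read_semicolon_csv(handle):
    """Read a ';'-separated CSV and map the literal string "NA" to None."""
    reader = csv.DictReader(handle, delimiter=";")
    rows = []
    for row in reader:
        # Replace the missing-value code with None, keep everything else as text.
        rows.append({key: (None if value == "NA" else value)
                     for key, value in row.items()})
    return rows


# In-memory stand-in for the real file; column names and values are hypothetical.
sample = io.StringIO("ID;City;Inversion\n1;CityA;yes\n2;NA;no\n")
rows = read_semicolon_csv(sample)
print(rows)
```

For the real file, the same function could be given an open file handle, for example `open("Inversion_20211102.csv", newline="")`; the appropriate text encoding is not stated in this record and would need to be checked.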

Other Study-Related Materials

Label:

Inversion_data_analysis_20211102.Rmd

Text:

R Notebook with data analysis and source code for statistical analysis and figures in the related publication.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Region_20211102.csv

Text:

Data file in CSV format with ";" as separator. Location data with geographical coordinates for the cities included in Inversion_20211102.csv.

Notes:

text/csv

Other Study-Related Materials

Label:

Sentences_20211102.csv

Text:

Data file in CSV format with ";" as separator. Source observations for Inversion_20211102.csv. The ID column is the key that links the two files.

Notes:

text/comma-separated-values
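
Since ID is described as the key between Sentences_20211102.csv and Inversion_20211102.csv, the two files can be combined with a simple lookup join. The Python sketch below illustrates the idea on in-memory rows. Apart from `ID`, every column name and value here is a hypothetical placeholder, and `join_on_id` is an illustrative helper, not code from the deposited R Notebook.

```python
def join_on_id(annotations, sentences, key="ID"):
    """Attach the source sentence row to each annotated row via the shared key."""
    by_id = {row[key]: row for row in sentences}
    merged = []
    for row in annotations:
        # Rows without a matching sentence are kept with annotation columns only.
        source = by_id.get(row[key], {})
        # On a column-name clash, the annotation value takes precedence.
        merged.append({**source, **row})
    return merged


# Hypothetical rows standing in for the two CSV files.
annotations = [{"ID": "1", "Inversion": "yes"}, {"ID": "2", "Inversion": "no"}]
sentences = [{"ID": "1", "Sentence": "example sentence one"},
             {"ID": "2", "Sentence": "example sentence two"}]

merged = join_on_id(annotations, sentences)
print(merged)
```

The lookup dictionary assumes that each ID occurs at most once in the sentences file; if that does not hold, later rows would silently overwrite earlier ones.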