Replication data for: Allomorphs of French de in coordination: a reproducible study (doi:10.18710/GF8QZ5)

Document Description

Citation

Title:

Replication data for: Allomorphs of French de in coordination: a reproducible study

Identification Number:

doi:10.18710/GF8QZ5

Distributor:

DataverseNO

Date of Distribution:

2014-12-29

Version:

1

Bibliographic Citation:

Zuraw, Kie, 2014, "Replication data for: Allomorphs of French de in coordination: a reproducible study", https://doi.org/10.18710/GF8QZ5, DataverseNO, V1

Study Description

Citation

Title:

Replication data for: Allomorphs of French de in coordination: a reproducible study

Identification Number:

doi:10.18710/GF8QZ5

Authoring Entity:

Zuraw, Kie (University of California, Los Angeles)

Producer:

University of California, Los Angeles

Date of Production:

2015

Distributor:

DataverseNO

Distributor:

The Tromsø Repository of Language and Linguistics (TROLLing)

Access Authority:

Zuraw, Kie

Date of Deposit:

2014-12-27

Holdings Information:

https://doi.org/10.18710/GF8QZ5

Study Scope

Keywords:

Arts and Humanities, French, coordination, phonology-syntax interface, Google n-grams, reproducibility

Topic Classification:

Field: Phonology, Time-depth: synchronic, Topic: adpositions, Topic: conjunctions

Abstract:

It is known that French de ‘of’ can take wide scope in coordination; that is, the coordination can optionally be reduced by omitting the second de: de X et/ou (de) Y, meaning roughly ‘of X and/or (of) Y’. De has an allomorph d’ that is used when the following word begins with a vowel. This paper shows, using a large written corpus, that the two allomorphs, de and d’, do not behave the same when it comes to reduction/wide scope. Two main factors seem to be at play: resistance of the d’ allomorph to taking wide scope, and hiatus avoidance between et/ou (which are both vowel-final) and a following vowel-initial word. The existence of phonological factors that affect reduction rate implies that the grammar and/or processing architecture must retrieve some phonological information about X and Y before the final “decision” about reduction is made, or that the phonology is powerful enough to delete the second de on its own. This paper also aims to make a methodological contribution to reproducibility. The web materials accompanying the paper (scripts and documentation) allow the reader to reproduce all the steps of the data processing and analysis, starting from a publicly available corpus.

Kind of Data:

corpus

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Zuraw, Kie. "Allomorphs of French de in coordination: a reproducible study" Linguistics Vanguard, vol. 1, no. 1, 2015, pp. 57-68. https://doi.org/10.1515/lingvan-2014-1017

Identification Number:

10.1515/lingvan-2014-1017

Bibliographic Citation:

Zuraw, Kie. "Allomorphs of French de in coordination: a reproducible study" Linguistics Vanguard, vol. 1, no. 1, 2015, pp. 57-68. https://doi.org/10.1515/lingvan-2014-1017

Other Study-Related Materials

Label:

compile_ngrams_de.py

Text:

Python script that reads the input french_de_grepped.txt and creates the output french_de_collated.txt. Used in Step 3 as outlined in the introduction of the HTML/Rmd file.

Notes:

text/plain; charset=UTF-8

Other Study-Related Materials

Label:

FrenchDe_Zuraw_ForPosting.html

Text:

HTML-format guide to reproducing the study. Includes narrative, R code, and R output and figures.

Notes:

text/html

Other Study-Related Materials

Label:

FrenchDe_Zuraw_ForPosting.Rmd

Text:

Rmd-format file that generates the HTML file. Includes narrative and R code. Open this file in RStudio and knit it (using the knitr package) to re-create the HTML file.

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

french_de_collated.txt

Text:

Output of the Python script (compile_ngrams_de.py) that collates relevant lines from the Google n-grams corpus. Second file in data pipeline. Output of Step 3 as outlined in introduction of HTML/Rmd file.

Notes:

text/plain; charset=UTF-8

Other Study-Related Materials

Label:

french_de_grepped.txt

Text:

Relevant lines extracted from Google n-grams file. First processed file in pipeline. Output of Step 2 as outlined in introduction of HTML/Rmd file.

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

french_de_handcoded.txt

Text:

Result of applying hand-coding (with criteria described in HTML/Rmd file) to french_de_collated.txt. Third file in data pipeline. Output of Step 4 as outlined in introduction of HTML/Rmd file.

Notes:

text/plain; charset=UTF-8

Other Study-Related Materials

Label:

french_de_inferences.txt

Text:

Result of automatically applying close inferences to french_de_handcoded.txt. Fourth file in data pipeline. Output of Step 5 as outlined in introduction of HTML/Rmd file.

Notes:

text/plain

Other Study-Related Materials

Label:

french_de_more_inferences.txt

Text:

Result of applying further automatic inferences to french_de_inferences.txt. Fifth and final file in data pipeline. All subsequent analysis is done in the R script, without writing a new data file. Output of Step 6 as outlined in introduction of HTML/Rmd file.

Notes:

text/plain
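
Note on file order:

The five data files described above form a linear pipeline (Steps 2 to 6), each file being the input to the next step. As a minimal sketch, and not part of the deposited materials, the order can be written down and checked as follows. The file names and step numbers are taken from the descriptions above; the function names and the idea of checking a local download folder are assumptions made for illustration only.

```python
from pathlib import Path

# (step number, output file) pairs, copied from the file descriptions above.
PIPELINE = [
    (2, "french_de_grepped.txt"),
    (3, "french_de_collated.txt"),   # written by compile_ngrams_de.py
    (4, "french_de_handcoded.txt"),  # hand-coded, not produced by a script
    (5, "french_de_inferences.txt"),
    (6, "french_de_more_inferences.txt"),
]


def missing_outputs(data_dir="."):
    """Return the (step, filename) pairs whose file is absent from data_dir."""
    root = Path(data_dir)
    return [(step, name) for step, name in PIPELINE
            if not (root / name).is_file()]


def first_step_to_rerun(data_dir="."):
    """Return the earliest step whose output is missing, or None if none is."""
    missing = missing_outputs(data_dir)
    return missing[0][0] if missing else None
```

Calling first_step_to_rerun() on a folder holding all five files returns None; otherwise it returns the earliest step whose output still has to be downloaded or regenerated. Step 4 is hand-coding, so its output cannot be regenerated automatically.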