Replication Data for: The decade construction rivalry in Russian: Using a corpus to study historical linguistics (doi:10.18710/QKHCVE)

Document Description

Citation

Title:

Replication Data for: The decade construction rivalry in Russian: Using a corpus to study historical linguistics

Identification Number:

doi:10.18710/QKHCVE

Distributor:

DataverseNO

Date of Distribution:

2017-12-19

Version:

1

Bibliographic Citation:

Nesset, Tore; Makarova, Anastasia, 2017, "Replication Data for: The decade construction rivalry in Russian: Using a corpus to study historical linguistics", https://doi.org/10.18710/QKHCVE, DataverseNO, V1, UNF:6:Db2qzplaVWoISX4NVN/VXw== [fileUNF]

Study Description

Citation

Title:

Replication Data for: The decade construction rivalry in Russian: Using a corpus to study historical linguistics

Identification Number:

doi:10.18710/QKHCVE

Authoring Entity:

Nesset, Tore (UiT The Arctic University of Norway)

Makarova, Anastasia (UiT The Arctic University of Norway)

Producer:

UiT The Arctic University of Norway

Distributor:

DataverseNO

Distributor:

The Tromsø Repository of Language and Linguistics (TROLLing)

Access Authority:

Nesset, Tore

Depositor:

Nesset, Tore

Date of Deposit:

2017-12-18

Holdings Information:

https://doi.org/10.18710/QKHCVE

Study Scope

Keywords:

Arts and Humanities, Russian, corpus linguistics, temporal adverbials, leveling, sociolinguistic differentiation, semantic differentiation, CART analysis, Rival forms, historical linguistics

Abstract:

This dataset contains 3 data files, 5 files with R code, and a short read-me file with documentation. The data files document the development of two competing constructions in Russian temporal adverbials. The R files contain the code used to analyze the data files.

ARTICLE ABSTRACT: What can a corpus do for the historical linguist? How can corpus data shed light on the diachronic development of so-called rival forms, i.e., words or grammatical constructions that appear to be synonyms? This article addresses these questions based on a detailed empirical analysis of two seemingly synonymous constructions in Russian. Corresponding to the English ‘decade construction’ in the twenties, Russian has two rival constructions, viz. v dvadcatye gody [lit. “in the twentieth years”] (with the numeral and noun in the accusative) and v dvadcatyx godax (with the numeral and noun in the locative case). Three hypotheses about rival forms are considered: leveling (whereby one form ousts its rival), sociolinguistic differentiation (whereby the two rivals survive in different varieties of a language) and semantic differentiation (whereby the two rivals develop different meanings over time). Contrary to what has been suggested in the literature, we find little evidence for semantic and sociolinguistic differentiation. Instead, we demonstrate that leveling is taking place, since the accusative construction is in the process of ousting its rival. While our study shows that corpus data facilitate detailed analysis of the interaction between leveling, sociolinguistic differentiation and semantic differentiation, our analysis also points to limitations, especially when it comes to corpus-based analysis of sociolinguistic and semantic factors.

Kind of Data:

Corpus data

Methodology and Processing

Sources Statement

Data Sources:

Russian National Corpus (www.ruscorpora.ru)

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Nesset, Tore & Anastasia Makarova (2018). The decade construction rivalry in Russian: Using a corpus to study historical linguistics. Diachronica 35(1). 71–106. doi: https://doi.org/10.1075/dia.16043.nes

Identification Number:

10.1075/dia.16043.nes

Bibliographic Citation:

Nesset, Tore & Anastasia Makarova (2018). The decade construction rivalry in Russian: Using a corpus to study historical linguistics. Diachronica 35(1). 71–106. doi: https://doi.org/10.1075/dia.16043.nes

File Description--f2193

File: 03DecadeDatabaseForCART.tab

  • Number of cases: 3128

  • No. of variables per record: 34

  • Type of File: text/tab-separated-values

Notes:

UNF:6:qSqbdOZ2A2CU0wHdnZTtQw==

File Description--f2194

File: 04 DecadeDatabaseForCARTOneExPerAuthor.tab

  • Number of cases: 783

  • No. of variables per record: 34

  • Type of File: text/tab-separated-values

Notes:

UNF:6:gUurEdr7JF1El8uhGH8iTw==
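
The case and variable counts listed for the two .tab files can be checked after download. The following is a minimal Python sketch, not part of the deposited materials; it assumes the first row of each file holds the variable names and that fields are separated by tabs. The helper name table_shape and the sample values are hypothetical.

```python
import csv
import io

def table_shape(handle):
    """Return (number of data rows, number of columns) for a tab-separated table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)            # assumption: first row holds the variable names
    n_rows = sum(1 for _ in reader)  # every remaining row counts as one case
    return n_rows, len(header)

# Tiny in-memory stand-in: three of the 34 variable names, made-up values.
sample = io.StringIO("CASE\tDECADE\tAuthorGender\nx\ty\tz\nx\ty\tz\n")
shape = table_shape(sample)
print(shape)  # (2, 3)
```

For the real files, one would open the path with open(path, encoding="utf-8", newline="") and pass the handle to table_shape. The UTF-8 encoding is an assumption. If the assumptions hold, the result should match the counts listed above (3128 cases and 34 variables for f2193, 783 cases and 34 variables for f2194).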

Variable Description

List of Variables:

Variables

ReversedLeftContext

f2193 Location:

Variable Format: character

Notes: UNF:6:uHoLsexvHurhewGqjRdjWw==

ReversedCenter

f2193 Location:

Variable Format: character

Notes: UNF:6:lEtDrOv2CzahB83pPS9UuQ==

Left context

f2193 Location:

Variable Format: character

Notes: UNF:6:te08dlSpOeoLpEj21YgOKQ==

Center

f2193 Location:

Variable Format: character

Notes: UNF:6:ojQ/ET9pKs5yZelWY8olJw==

RightContext

f2193 Location:

Variable Format: character

Notes: UNF:6:37tHh/xbIdAAuCJzomOxdA==

CASE

f2193 Location:

Variable Format: character

Notes: UNF:6:zZOcA/e2LMaiw4rkHLFJQQ==

DECADE

f2193 Location:

Variable Format: character

Notes: UNF:6:OUC5IdddDa/S6EqT/brCsg==

Title

f2193 Location:

Variable Format: character

Notes: UNF:6:cYzzUvr9+16HKQvQMOIs7w==

Author

f2193 Location:

Variable Format: character

Notes: UNF:6:M3QHSgvmV75hNHnrR8llMw==

AuthorPreferenceWritersWithMoreThanThreeAttestationsOfDecadeConstructions

f2193 Location:

Variable Format: character

Notes: UNF:6:PvgChN82Nf1ok8YyjdH4ug==

AuthorGender

f2193 Location:

Variable Format: character

Notes: UNF:6:q0dQMwK/ZcFnwsKzH9tl1w==

OneExPerAuthor

f2193 Location:

Summary Statistics: Min. 1.0; Max. 2.0; StDev 0.4332663811717813; Mean 1.749680306905371; Valid 3128.0

Variable Format: numeric

Notes: UNF:6:iIdVHD6REhg1xt8PzH837Q==
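
The summary statistics for OneExPerAuthor in f2193 can be related to the size of f2194. The sketch below is an illustration rather than part of the deposit. It assumes the variable takes only the values 1 and 2, which the codebook suggests (Min. 1.0, Max. 2.0) but does not state. Under that assumption, the reported mean implies how many rows are coded 1.

```python
import math

n = 3128                   # valid cases reported for f2193
mean = 1.749680306905371   # reported mean of OneExPerAuthor

# Assumption: the variable only takes the values 1 and 2.
# Then mean = (ones * 1 + (n - ones) * 2) / n, so ones = n * (2 - mean).
ones = round(n * (2 - mean))
twos = n - ones

# Sample standard deviation (n - 1 denominator) implied by those counts.
sum_sq = ones * (1 - mean) ** 2 + twos * (2 - mean) ** 2
stdev = math.sqrt(sum_sq / (n - 1))

print(ones, twos)
print(stdev)
```

Under the stated assumption, 783 rows are coded 1. This equals the number of cases in 04 DecadeDatabaseForCARTOneExPerAuthor.tab, where OneExPerAuthor is constant at 1. The implied sample standard deviation agrees with the reported StDev to several decimal places.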

Birthday

f2193 Location:

Variable Format: character

Notes: UNF:6:GjmTXuYeNTeHElRWJCRSPA==

BirthdayPeriod

f2193 Location:

Variable Format: character

Notes: UNF:6:6dM2AQk/85A3TdDn82SYtQ==

Header

f2193 Location:

Variable Format: character

Notes: UNF:6:RLnCqAGKDF4IuouLK1g04Q==

Created

f2193 Location:

Variable Format: character

Notes: UNF:6:71zKrqyQ1YZo6yc+5zWCOA==

PeriodCreated

f2193 Location:

Variable Format: character

Notes: UNF:6:zZsMbdet2MJ5zM7yl1sq+w==

Sphere

f2193 Location:

Variable Format: character

Notes: UNF:6:J4EedCLdQme451ldOBKR8g==

SphereConflated

f2193 Location:

Variable Format: character

Notes: UNF:6:PBIgjGYoDDUBfANIKU+lig==

Type

f2193 Location:

Variable Format: character

Notes: UNF:6:k3/gzI+fPPy9cynAd8ZVqg==

Topic

f2193 Location:

Variable Format: character

Notes: UNF:6:YLWx0ZJgxp7KcszWlkAc4A==

Publication

f2193 Location:

Variable Format: character

Notes: UNF:6:Ccqhah2v2u4PWcl3ZcrBcg==

PublicationYear

f2193 Location:

Variable Format: character

Notes: UNF:6:i4fnTCNm+C4jRLjL+nVQAA==

Medium

f2193 Location:

Variable Format: character

Notes: UNF:6:zezKbTSuahNxa4Qs+JmN+w==

Ambiguity

f2193 Location:

Variable Format: character

Notes: UNF:6:TewDMRwgDRVSn2kD0SBjxA==

AppositiveParticipleOrGerundConstruction

f2193 Location:

Variable Format: character

Notes: UNF:6:+h5NTFfakuSibSwoitVJsw==

ASPECT

f2193 Location:

Variable Format: character

Notes: UNF:6:OLMQVTFWCH+KgntK3ApFHQ==

WordOrder

f2193 Location:

Variable Format: character

Notes: UNF:6:1xw6lHFKhSsm0cSofKprww==

AspectualType

f2193 Location:

Variable Format: character

Notes: UNF:6:pM8+I59fWG1eBSwiDVRPfw==

FrequencyPredicateLemmaExWithBirthday

f2193 Location:

Variable Format: character

Notes: UNF:6:eYYpUKISszgbV/g2v1DqdQ==

PredicateLemma

f2193 Location:

Variable Format: character

Notes: UNF:6:FTW7EefIc25dxPBzscdrOw==

PredicateInflected

f2193 Location:

Variable Format: character

Notes: UNF:6:7UKz6wrjH42XP1PhJ4m3yw==

PredicateGrammaticalForm

f2193 Location:

Variable Format: character

Notes: UNF:6:OuMiDdUK0CbjErrHPWfbUQ==

FullContext

f2193 Location:

Variable Format: character

Notes: UNF:6:3kL2tyOTMs9ndAR6EYHy5Q==

ReversedLeftContext

f2194 Location:

Variable Format: character

Notes: UNF:6:5A58JfoKldFR7fKZR8aTjw==

ReversedCenter

f2194 Location:

Variable Format: character

Notes: UNF:6:L9ONMsb5YHz9cG9xD1035w==

Left context

f2194 Location:

Variable Format: character

Notes: UNF:6:aA/CAaLSsjnTr+8Lq16Q7A==

Center

f2194 Location:

Variable Format: character

Notes: UNF:6:it/MyQ2QwIaOvgK3nmZPgg==

RightContext

f2194 Location:

Variable Format: character

Notes: UNF:6:soPV8uRmwwH/6rO0WqAN2A==

CASE

f2194 Location:

Variable Format: character

Notes: UNF:6:tNr/1GRbb3dFsO2OLlRyyQ==

DECADE

f2194 Location:

Variable Format: character

Notes: UNF:6:pMtDAG/08dxxs4m/tZgMNA==

Title

f2194 Location:

Variable Format: character

Notes: UNF:6:BPaD/rBEcHIY1bQA3uHS0g==

Author

f2194 Location:

Variable Format: character

Notes: UNF:6:D7GOiHuYdteJsuxGSXWHZw==

AuthorPreferenceWritersWithMoreThanThreeAttestationsOfDecadeConstructions

f2194 Location:

Variable Format: character

Notes: UNF:6:lEfeSepCbS/w/q8eEvNe4g==

AuthorGender

f2194 Location:

Variable Format: character

Notes: UNF:6:VM9lC/8pzbi0vJ74jHTjBw==

OneExPerAuthor

f2194 Location:

Summary Statistics: Max. 1.0; Mean 1.0; Valid 783.0; Min. 1.0; StDev 0.0

Variable Format: numeric

Notes: UNF:6:+Asbv4bifxHo4VrFsM8law==

Birthday

f2194 Location:

Variable Format: character

Notes: UNF:6:lIJ4fBtV1//M5LxE1nhUhQ==

BirthdayPeriod

f2194 Location:

Variable Format: character

Notes: UNF:6:/5pUEEdQqh8cxgqkCZz3GQ==

Header

f2194 Location:

Variable Format: character

Notes: UNF:6:RF3YKv5o5hjuU3nV5w76ag==

Created

f2194 Location:

Variable Format: character

Notes: UNF:6:ibjJodkYY9bNcr3EzUwXew==

PeriodCreated

f2194 Location:

Variable Format: character

Notes: UNF:6:DkYHR6NG5d7xrry9Vjf21g==

Sphere

f2194 Location:

Variable Format: character

Notes: UNF:6:Fg4jGACobLfWP0Vap61LBg==

SphereConflated

f2194 Location:

Variable Format: character

Notes: UNF:6:QDdNm0RM317r6lZq/JnMtg==

Type

f2194 Location:

Variable Format: character

Notes: UNF:6:OtZYFg+BTPNkvtbN3Gb93g==

Topic

f2194 Location:

Variable Format: character

Notes: UNF:6:AmwZOaf74K30WWJeQKBpPQ==

Publication

f2194 Location:

Variable Format: character

Notes: UNF:6:wQjtUMn+PYF+0evCFyPP5g==

PublicationYear

f2194 Location:

Variable Format: character

Notes: UNF:6:O4AWArmuSENg3gnNf/+JBQ==

Medium

f2194 Location:

Variable Format: character

Notes: UNF:6:7KljvotmOlZnbz/jpw7lag==

Ambiguity

f2194 Location:

Variable Format: character

Notes: UNF:6:INqAmC8mI2Avl31lFM5Y7w==

AppositiveParticipleOrGerundConstruction

f2194 Location:

Variable Format: character

Notes: UNF:6:/4IxnfTFn8pecDwwiLj7YA==

ASPECT

f2194 Location:

Variable Format: character

Notes: UNF:6:zsCg+xnt9JC78YJ55xqhQg==

WordOrder

f2194 Location:

Variable Format: character

Notes: UNF:6:eDGPwfJMKaWLEkMfdN1LDA==

AspectualType

f2194 Location:

Variable Format: character

Notes: UNF:6:uGcFKcjm5PVxn+SGj/jdWw==

FrequencyPredicateLemmaExWithBirthday

f2194 Location:

Variable Format: character

Notes: UNF:6:e/r8xSDOCHN2tOjjjC3GKg==

PredicateLemma

f2194 Location:

Variable Format: character

Notes: UNF:6:4YNHTXOZoAsj7ZA6FsRXFA==

PredicateInflected

f2194 Location:

Variable Format: character

Notes: UNF:6:3mbT5AsB63K5+CWbW1Uscw==

PredicateGrammaticalForm

f2194 Location:

Variable Format: character

Notes: UNF:6:SVmYo1+tgD8/rXgkmsaPgg==

FullContext

f2194 Location:

Variable Format: character

Notes: UNF:6:6hOcm2rMI5vkvux76BNd1A==

Other Study-Related Materials

Label:

01ReadMeDecadeForTrolling.txt

Text:

This file contains a description of all files, as well as documentation for the three tables included in this TROLLing post.

Notes:

text/plain

Other Study-Related Materials

Label:

02EntireDecadeDatabase.csv

Text:

This file contains all collected data about the choice of case in Russian temporal adverbials describing decades.

Notes:

text/csv

Other Study-Related Materials

Label:

02EntireDecadeDatabase.txt

Text:

This file contains all collected data about the choice of case in Russian temporal adverbials describing decades. (.txt format)

Notes:

text/plain

Other Study-Related Materials

Label:

05GenreRCode.txt

Text:

This file gives the R code described in section 4.1 of the article, which analyzes the importance of genre for the choice of temporal adverbial construction.

Notes:

text/plain

Other Study-Related Materials

Label:

06GenderRCode.txt

Text:

This is the R code for gender reported in section 4.2 of the article, which analyzes the importance of author gender (male vs. female author) for the choice of temporal adverbial construction.

Notes:

text/plain

Other Study-Related Materials

Label:

07AspectualTypeRCode.txt

Text:

This file gives the R code for the statistical tests of punctual verbs reported in footnotes 16-17 in section 5.2 of the article, which analyzes the importance of aspectual types (Vendler-inspired classes) for the choice of temporal adverbial construction.

Notes:

text/plain

Other Study-Related Materials

Label:

08DecadeCARTandTreeForestRCode.txt

Text:

This is the R code for the CART and Tree and Forest analysis reported in section 6 (footnote 18, Figure 7), which investigates the importance of a number of factors for the choice of temporal adverbial construction.

Notes:

text/plain

Other Study-Related Materials

Label:

09DecadeCARTandTreeForestOnePerAuthorRCode.txt

Text:

This file contains the R code reported in figure 8 in section 6 of the article, which investigates the importance of genre for the choice of temporal adverbial construction. This file has one example per author.

Notes:

text/plain