Replication Data for: The history of Slavonic clausal complementation: a corpus view (doi:10.18710/FY7R8N)

Document Description

Citation

Title:

Replication Data for: The history of Slavonic clausal complementation: a corpus view

Identification Number:

doi:10.18710/FY7R8N

Distributor:

DataverseNO

Date of Distribution:

2021-09-01

Version:

1

Bibliographic Citation:

Eckhoff, Hanne, 2021, "Replication Data for: The history of Slavonic clausal complementation: a corpus view", https://doi.org/10.18710/FY7R8N, DataverseNO, V1

Study Description

Citation

Title:

Replication Data for: The history of Slavonic clausal complementation: a corpus view

Identification Number:

doi:10.18710/FY7R8N

Authoring Entity:

Eckhoff, Hanne (University of Oxford)

Producer:

University of Oxford

Software used in Production:

R

Distributor:

DataverseNO

Distributor:

The Tromsø Repository of Language and Linguistics (TROLLing)

Access Authority:

Eckhoff, Hanne

Depositor:

Eckhoff, Hanne

Date of Deposit:

2019-08-27

Holdings Information:

https://doi.org/10.18710/FY7R8N

Study Scope

Keywords:

Arts and Humanities, syntax, complementation, Old Church Slavonic, Old East Slavonic, Middle Russian

Abstract:

This dataset provides replication data for an article on complementation structures in early Slavonic. Syntactic annotation of historical text, with no access to native-speaker intuitions, poses a number of problems for the annotator, who is faced with the task of giving a single analysis of each sentence. The article reports on experiences from annotating complementation structures in Old Church Slavonic and Old East Slavonic in the PROIEL and TOROT treebanks.

Two case studies are examined: complement clauses in Old Church Slavonic and the history of Russian čьto ‘what, which, that’. In the first case the annotation scheme is shown to work well in terms of interannotator agreement and retrievability. However, the price is that a large number of examples with jako ‘that’ are analysed as complement clauses with a subjunction, even though many of these examples are in fact ambiguous and jako can equally well be interpreted as a quotative particle followed by direct speech.

The second case study looks at a development from a situation where čьto could be taken to be an interrogative pronoun in all subordinate clauses to a situation where a subjunction analysis and a relative pronoun analysis are also available. This leads to a large number of ambiguous occurrences. The solution in TOROT is to analyse unambiguous interrogative pronoun and subjunction examples at face value, while all of the remaining occurrences are analysed as relative clauses. This makes the annotator's job manageable, but causes retrievability problems, since individual researchers will have to sift through the relative clause examples themselves.

Time Period:

0863-01-01 to 1675-12-31

Date of Collection:

2017-03-26 to 2017-04-05

Country:

Macedonia, the Former Yugoslav Republic of; Bulgaria; Ukraine; Russian Federation

Kind of Data:

datasets with linguistic annotation

Kind of Data:

R script

Methodology and Processing

Sources Statement

Data Sources:

The PROIEL Treebank. Available at https://proiel.github.io/.

Dag T. T. Haug and Marius L. Jøhndal. 2008. 'Creating a Parallel Treebank of the Old Indo-European Bible Translations'. In Caroline Sporleder and Kiril Ribarov (eds.). Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008), pp. 27-34. Available at http://www.hf.uio.no/ifikk/english/research/projects/proiel/Activities/proiel/publications/marrakech.pdf.

The TOROT Treebank. Available at https://torottreebank.github.io/.

Hanne Martine Eckhoff and Aleksandrs Berdicevskis. 2015. 'Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank'. Scripta & e-Scripta 14–15, pp. 9-25. Open Access version available at https://hdl.handle.net/10037/22366.

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Hanne Eckhoff (2021): The history of Slavonic clausal complementation: a corpus view. In Björn Wiemer and Barbara Sonnenhauser (eds.): Clausal Complementation in South Slavic. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110725858-008

Identification Number:

10.1515/9783110725858-008

Bibliographic Citation:

Hanne Eckhoff (2021): The history of Slavonic clausal complementation: a corpus view. In Björn Wiemer and Barbara Sonnenhauser (eds.): Clausal Complementation in South Slavic. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110725858-008

Other Study-Related Materials

Label:

apos.csv

Text:

Additional dataset for correlative structures, to be processed by comps.r

Notes:

text/csv

Other Study-Related Materials

Label:

comps.r

Text:

R script processing the main datasets

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

comps260317.csv

Text:

Full data set to be read by comps.r

Notes:

text/csv

Other Study-Related Materials

Label:

readme.txt

Text:

README with descriptions of all the other files

Notes:

text/plain

Other Study-Related Materials

Label:

thirdperson_tagged.csv

Text:

Dataset with annotation for conversion of person reference in jako-clauses

Notes:

text/csv