Replication Data for: The history of Slavonic clausal complementation: a corpus view (doi:10.18710/FY7R8N)

Document Description
Citation
Title:	Replication Data for: The history of Slavonic clausal complementation: a corpus view
Identification Number:	doi:10.18710/FY7R8N
Distributor:	DataverseNO
Date of Distribution:	2021-09-01
Version:	1
Bibliographic Citation:	Eckhoff, Hanne, 2021, "Replication Data for: The history of Slavonic clausal complementation: a corpus view", https://doi.org/10.18710/FY7R8N, DataverseNO, V1
Study Description
Citation
Title:	Replication Data for: The history of Slavonic clausal complementation: a corpus view
Identification Number:	doi:10.18710/FY7R8N
Authoring Entity:	Eckhoff, Hanne (University of Oxford)
Producer:	University of Oxford
Software used in Production:	R
Distributor:	DataverseNO
Distributor:	The Tromsø Repository of Language and Linguistics (TROLLing)
Access Authority:	Eckhoff, Hanne
Depositor:	Eckhoff, Hanne
Date of Deposit:	2019-08-27
Holdings Information:	https://doi.org/10.18710/FY7R8N
Study Scope
Keywords:	Arts and Humanities, syntax, complementation, Old Church Slavonic, Old East Slavonic, Middle Russian
Abstract:	<p>This dataset provides replication data for an article on complementation structures in early Slavonic. Syntactic annotation of historical text, with no access to native-speaker intuitions, poses a number of problems for the annotator, who is faced with the task of giving a single analysis of each sentence. The article reports on experience gained from annotating complementation structures in Old Church Slavonic and Old East Slavonic in the PROIEL and TOROT treebanks.</p> <p>Two case studies are examined: complement clauses in Old Church Slavonic and the history of Russian čьto ‘what, which, that’. In the first case the annotation scheme is shown to work well in terms of interannotator agreement and retrievability. However, the price is that a large number of examples with jako ‘that’ are analysed as complement clauses with a subjunction, even though many of these examples are in fact ambiguous and jako can equally well be interpreted as a quotative particle followed by direct speech.</p> <p>The second case study looks at a development from a situation where čьto could be taken to be an interrogative pronoun in all subordinated clauses to a situation where a subjunction analysis and a relative pronoun analysis are also available. This leads to a large number of ambiguous occurrences. The solution in TOROT is to analyse unambiguous interrogative pronoun and subjunction examples at face value, while all of the remaining occurrences are analysed as relative clauses. This makes the annotator's job manageable, but causes retrievability problems, since individual researchers will have to sift through the relative clause examples themselves.</p>
Time Period:	0863-01-01 to 1675-12-31
Date of Collection:	2017-03-26 to 2017-04-05
Country:	Macedonia, the Former Yugoslav Republic of; Bulgaria; Ukraine; Russian Federation
Kind of Data:	datasets with linguistic annotation
Kind of Data:	R script
Methodology and Processing
Sources Statement
Data Sources:	<p>The PROIEL Treebank. Available at <a href="https://proiel.github.io/" title="PROIEL" target="_blank">https://proiel.github.io/</a>.</p> <p>Dag T. T. Haug and Marius L. Jøhndal. 2008. '<a href="http://www.hf.uio.no/ifikk/english/research/projects/proiel/Activities/proiel/publications/marrakech.pdf" title="Haug+Joehndal+2008" target="_blank">Creating a Parallel Treebank of the Old Indo-European Bible Translations</a>'. In Caroline Sporleder and Kiril Ribarov (eds.). <i>Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008)</i>, pp. 27-34.</p>
	<p>The TOROT Treebank. Available at <a href="https://torottreebank.github.io/" title="TOROT" target="_blank">https://torottreebank.github.io/</a>.</p> <p>Hanne Martine Eckhoff and Aleksandrs Berdicevskis. 2015. 'Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank'. <i>Scripta & e-Scripta</i> 14–15, pp. 9-25. Open Access version available at <a href="https://hdl.handle.net/10037/22366" title="OA version" target="_blank">https://hdl.handle.net/10037/22366</a>.</p>
Data Access
Other Study Description Materials
Related Publications
Citation
Title:	Hanne Eckhoff (2021): The history of Slavonic clausal complementation: a corpus view. In Björn Wiemer and Barbara Sonnenhauser: Clausal Complementation in South Slavic. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110725858-008
Identification Number:	10.1515/9783110725858-008
Bibliographic Citation:	Hanne Eckhoff (2021): The history of Slavonic clausal complementation: a corpus view. In Björn Wiemer and Barbara Sonnenhauser: Clausal Complementation in South Slavic. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110725858-008
Other Study-Related Materials
Label:	apos.csv
Text:	Additional dataset for correlative structures, to be processed by comps.r
Notes:	text/csv
Other Study-Related Materials
Label:	comps.r
Text:	R script processing the main datasets
Notes:	type/x-r-syntax
Other Study-Related Materials
Label:	comps260317.csv
Text:	Full data set to be read by comps.r
Notes:	text/csv
Other Study-Related Materials
Label:	readme.txt
Text:	README with descriptions of all the other files
Notes:	text/plain
Other Study-Related Materials
Label:	thirdperson_tagged.csv
Text:	Dataset with annotation for conversion of person reference in jako-clauses
Notes:	text/csv