Replication Data for: The history of Slavonic clausal complementation: a corpus view
https://doi.org/10.18710/FY7R8N
Eckhoff, Hanne
DataverseNO
2021-09-01
2022-07-18T17:26:31Z
<p>This dataset provides replication data for an article on complementation structures in early Slavonic. Syntactic annotation of historical text, with no access to native-speaker intuitions, poses a number of problems for the annotator, who is faced with the task of giving a single analysis of each sentence. The article reports on experiences from annotating complementation structures in Old Church Slavonic and Old East Slavonic in the PROIEL and TOROT treebanks.</p>
<p></p>
<p>Two case studies are examined: complement clauses in Old Church Slavonic and the history of Russian čьto ‘what, which, that’. In the first case the annotation scheme is shown to work well in terms of interannotator agreement and retrievability. However, the price is that a large number of examples with jako ‘that’ are analysed as complement clauses with a subjunction, even though many of these examples are in fact ambiguous and jako can equally well be interpreted as a quotative particle followed by direct speech.</p>
<p></p>
<p>The second case study looks at a development from a situation where čьto could be taken to be an interrogative pronoun in all subordinate clauses, to a situation where a subjunction and a relative pronoun analysis are also available. This leads to a large number of ambiguous occurrences. The solution in TOROT is to analyse unambiguous interrogative pronoun and subjunction examples at face value, while all of the remaining occurrences are analysed as relative clauses. This makes the annotator's job manageable, but causes retrievability problems, since individual researchers will have to sift through the relative clause examples themselves.</p>
Arts and Humanities
syntax
complementation
Old Church Slavonic
Old East Slavonic
Middle Russian
English
Hanne Eckhoff (2021): The history of Slavonic clausal complementation: a corpus view. In Björn Wiemer and Barbara Sonnenhauser: Clausal Complementation in South Slavic. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110725858-008, doi, 10.1515/9783110725858-008, https://doi.org/10.1515/9783110725858-008
2021-09-01
Eckhoff, Hanne
2019-08-27
0863-01-01
1675-12-31
2017-03-26
2017-04-05
datasets with linguistic annotation
R script
<p>The PROIEL Treebank. Available at <a href="https://proiel.github.io/"
title="PROIEL" target="_blank">https://proiel.github.io/</a>.</p>
<p></p>
<p>Dag T. T. Haug and Marius L. Jøhndal. 2008. '<a href="http://www.hf.uio.no/ifikk/english/research/projects/proiel/Activities/proiel/publications/marrakech.pdf"
title="Haug+Joehndal+2008" target="_blank">Creating a Parallel Treebank of the Old Indo-European Bible Translations</a>'. In Caroline Sporleder and Kiril Ribarov (eds.). <i>Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008) (2008)</i>, pp. 27-34.</p><p>The TOROT Treebank. Available at <a href="https://torottreebank.github.io/"
title="TOROT" target="_blank">https://torottreebank.github.io/</a>.</p>
<p></p>
<p>Hanne Martine Eckhoff and Aleksandrs Berdicevskis. 2015. 'Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank'. <i>Scripta & e-Scripta</i> 14–15, pp. 9-25. Open Access version available at <a href="https://hdl.handle.net/10037/22366"
title="OA version" target="_blank">https://hdl.handle.net/10037/22366</a>.</p>
Macedonia, the Former Yugoslav Republic of
Bulgaria
Ukraine
Russian Federation
<p>This dataset, "Replication Data for: The history of Slavonic clausal complementation: a corpus view", may be reused according to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license as described here: <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/"
title="TermsOfUse" target="_blank">https://creativecommons.org/licenses/by-nc-sa/4.0/</a>.</p>
<p></p>
<p>The raw data annotated and enriched in the tabular files in this dataset, "Replication Data for: The history of Slavonic clausal complementation: a corpus view", has been obtained from the following sources as described in the README file contained in this dataset:</p>
<p></p>
<p>The PROIEL Treebank; available at <a href="https://proiel.github.io/"
title="PROIEL" target="_blank">https://proiel.github.io/</a>; used under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license (<a href="https://creativecommons.org/licenses/by-nc-sa/4.0/"
title="TermsOfUse" target="_blank">https://creativecommons.org/licenses/by-nc-sa/4.0/</a>).</p>
<p></p>
<p>The TOROT Treebank; available at <a href="https://torottreebank.github.io/"
title="TOROT" target="_blank">https://torottreebank.github.io/</a>; used under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States (CC BY-NC-SA 3.0 US) license (<a href="https://creativecommons.org/licenses/by-nc-sa/3.0/us/"
title="TermsOfUse" target="_blank">https://creativecommons.org/licenses/by-nc-sa/3.0/us/</a>).</p>
<p></p>
<p>According to Creative Commons (cf. <a href="https://creativecommons.org/share-your-work/licensing-considerations/compatible-licenses"
title="CC Compatible Licenses" target="_blank">Compatible Licenses</a>), BY-SA licenses are compatible with, i.a., "BY-SA 3.0, or a later version of the BY-SA license".</p>