Title: Replication Data for: Predicting Russian aspect by frequency across genres
DOI: doi:10.18710/BIIGT6
Authors: Eckhoff, Hanne; Janda, Laura; Lyashevskaya, Olga Nikolayevna
Institution: UiT The Arctic University of Norway
Repository: DataverseNO; The Tromsø Repository of Language and Linguistics (TROLLing)
Additional names listed: Eckhoff, Hanne; Eckhoff, Hanne Martine
Dates listed: 2017-12-03; 2017-03-12
Version: 1
Subject: Arts and Humanities
Keywords: semantics; aspect; correspondence analysis; Russian; verbs; frequency
Data source: Russian National Corpus

Citation:
Eckhoff, Hanne; Janda, Laura; Lyashevskaya, Olga Nikolayevna, 2017, "Replication Data for: Predicting Russian aspect by frequency across genres", https://doi.org/10.18710/BIIGT6, DataverseNO, V1, UNF:6:TCa0jCAvvGll3zm3uYltvg== [fileUNF]

Abstract:
We ask whether the aspect of individual verbs can be predicted based on the statistical distribution of their inflectional forms and how this is influenced by genre. To address these questions, we present an analysis of the “grammatical profiles” (relative frequency distributions of inflectional forms) of three samples of verbs extracted from the Russian National Corpus, representing three genres: Journalistic prose, Fiction, and Scientific-Technical prose. We find that the aspect of a given verb can be correctly predicted from the distribution of its forms alone with an average accuracy of 92.7%. Remarkably, this accuracy is statistically indistinguishable from the accuracy of prediction of aspect based on morphological marking. We maintain that it would be possible for first language learners to use distributional tendencies, in addition to morphological and other cues (for example semantic and syntactic cues), in acquiring the verbal category of aspect in Russian.

Related publication:
Eckhoff, Hanne M., et al. “Predicting Russian aspect by frequency across genres.” The Slavic and East European Journal, vol. 61, no. 4, 2017, pp. 844–75. JSTOR, http://www.jstor.org/stable/26633829.

Data files (all text/tab-separated-values):

fic50factor1tagged.tab
  Cases: 225; variables: 8
  File UNF: UNF:6:FcLokpcNo9/lqGqnJr8/Fg==
  Variables:
    lemma    UNF:6:HoWxUNy2PWjvdYWHzOl7mA==
    factor1  UNF:6:lgoQd+tZGQUXbw5ma5mnbQ==
    asp      UNF:6:4rtcn5WCDZb6YRIwFHnOGA==
    freq     UNF:6:P84VF4ky5m0q3mrmLquyQw==
    morph1   UNF:6:eoMDWnwqhOBUVE0m6sz02w==
    morph2   UNF:6:wyfWAFbhAsHmmHO439tUAw==
    sem      UNF:6:cPBWumPehzhjlw8/kOeimg==
    comment  UNF:6:nn4mszXTUcjzg8WKMW8fSA==

journ50_factor1tagged.tab
  Cases: 185; variables: 8
  File UNF: UNF:6:i5RDeUNEke185uknffIvNQ==
  Variables:
    lemma    UNF:6:qA08d87TxeEzG1VL1HMfdw==
    factor1  UNF:6:Ur3Rto8lreR+VjbEqaQlPQ==
    asp      UNF:6:souK4BkC/a/+ANTJYGPojQ==
    freq     UNF:6:34q56UPy+THVgL3eG8NF4g==
    morph1   UNF:6:/7N0nE01ywwJVyWDk3e58A==
    morph2   UNF:6:rF6gT6VapA3UoLWbfDF6+w==
    sem      UNF:6:zAucHmAEUBKQXlbm/OiQWw==
    comment  UNF:6:Ry0bWcXYoeVasNxIuvfdCg==

scitech50factor1tagged.tab
  Cases: 172; variables: 8
  File UNF: UNF:6:RDO6l2SoPtSoBKdY0RjAEQ==
  Variables:
    lemma    UNF:6:gq6d5HyY6z4AmuXNSFKk9w==
    factor1  UNF:6:y92Kx7unB59nX0LH5LULpw==
    asp      UNF:6:Q7kbZVw9RK+mIVZLWcZ78A==
    freq     UNF:6:Thb1CfmbMiC4O8lcpXfVtQ==
    morph1   UNF:6:x2MshY4GGm6tDAG1nR68IQ==
    morph2   UNF:6:IRDT9xGx3ELGUEaJ5bw+2g==
    sem      UNF:6:Y54lVSCvRiXEtY3YJ1GRng==
    comment  UNF:6:ByMWKGd6JsNaL9pSNROKgg==

rus.fiction.tab
  Cases: 78084; variables: 1
  File UNF: UNF:6:Qvntfkg9Nzf20k7M+Vi4lA==
  Variable (a single semicolon-delimited column):
    FormTranslit;LemmaTranslit;MoodTense;Trans;Voice;VoicePartcp;Person;Number;Gender;Long;AspPair;Aspect;Mood;Tense

rus.journ.tab
  Cases: 52716; variables: 1
  File UNF: UNF:6:rGFY0vp0eA7zN5+s5q19GQ==
  Variable (a single semicolon-delimited column):
    FormTranslit;LemmaTranslit;MoodTense;Trans;Voice;VoicePartcp;Person;Number;Gender;Long;AspPair;Aspect;Mood;Tense

rus.scitech_corrected.tab
  Cases: 43528; variables: 1
  File UNF: UNF:6:7JWXykZeJAzpCsOhSolXpQ==
  Variable (a single semicolon-delimited column):
    FormTranslit;LemmaTranslit;MoodTense;Trans;Voice;VoicePartcp;Person;Number;Gender;Long;AspPair;Aspect;Mood;Tense

Other files:

00_readme_file.txt (text/plain)
  ReadMe file for the dataset, with a description of the individual files.

01journ.r (type/x-r-syntax)
  R script that analyses the data from the journalistic register (rus.journ.tab, original format csv).

02fic.r (type/x-r-syntax)
  R script that analyses the data from the fiction register (rus.fiction.tab, original format csv).

03scitech.r (type/x-r-syntax)
  R script that analyses the data from the scientific-technical register (rus.scitech_corrected.tab, original format csv).

04verbtags.r (type/x-r-syntax)
  R script that handles the analysis of verbs by derivational morphology and semantics. Uses the three dataset files fic50_factor1tagged.tab, journ50_factor1tagged.tab and scitech50_factor1tagged.tab (original format csv).