DOI: 10.18710/QAJKZW
Creator: Cvrček, Václav (ORCID 0000-0003-3977-2393), Czech National Corpus
Title: Multi-Dimensional Analysis of Czech
Publisher: DataverseNO
Publication year: 2018
Subjects: Arts and Humanities; multi-dimensional analysis; register variation; factor analysis; corpus; Czech
Contributors: Lukeš, David (Czech National Corpus); Czech National Corpus; Cvrček, Václav; Komrsková, Zuzana; Lukeš, David; Poukarová, Petra; Řehořková, Anna; Zasina, Adrian Jan; The Tromsø Repository of Language and Linguistics (TROLLing)
Dates: 2018-10-12; 2018-10-12; 2023-09-28; 2017/2018
Resource type: corpus data
Related publication: 10.1515/cllt-2018-0020
Sizes (run together in the source): 4928862701874206271007
Formats: application/vnd.openxmlformats-officedocument.wordprocessingml.document; application/pdf; text/tab-separated-values; type/x-r-syntax
Version: 1.1
License: CC0 1.0

Description:
<p> Original data for a general-purpose multi-dimensional analysis model of register variation in Czech. </p>
<p> This post contains a CSV data set of 137 linguistic features measured on 3428 Czech text chunks, and an R script which performs a factor analysis on this data set. The results of this factor analysis were used as the basis for an 8-dimensional model of register variation in Czech (see Related Publications), following the methodology introduced by Douglas Biber (see, e.g., his seminal 1988 work <a href="https://doi.org/10.1017/CBO9780511621024"> Variation Across Speech and Writing </a> for details on the methodology, or his 2014 article <a href="https://doi.org/10.1075/lic.14.1.02bib"> “Using multi-dimensional analysis to explore cross-linguistic universals of register variation” </a> for a review of MDA results across a variety of languages). </p>
<p> The data is derived from the <a href="https://wiki.korpus.cz/doku.php/en:cnk:koditex"> Koditex corpus </a>, which aims to be as diversified as possible, covering various forms of spoken and written (both print and on-line) Czech. The corpus was compiled to provide a solid empirical basis for a comprehensive general-purpose model of register variation in Czech. </p>
<p> Apart from this data set and related publications, additional resources pertaining to the project are available via the <a href="https://github.com/czcorpus/mda"> czcorpus/mda </a> GitHub repository. </p>

Software: R: A Language and Environment for Statistical Computing, 3.4.3; psych: Procedures for Personality and Psychological Research (R package), 1.7.8
Location: Prague, Czech Republic
Funder: European Regional Development Fund
Grant number: CZ.02.1.01/0.0/0.0/16_013/0001758
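
The description refers to an R script that runs a factor analysis on a table of linguistic features measured per text chunk. That script is not reproduced in this record. As a rough illustration of the kind of input a factor analysis usually starts from, the Python sketch below standardizes each feature column and computes the Pearson correlation matrix between features. The function names (`zscores`, `correlation_matrix`) and the toy table are made up for illustration and are not taken from the data set or the script; the real data has 137 features and 3428 chunks.

```python
import statistics


def zscores(column):
    """Standardize one feature column to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def correlation_matrix(rows):
    """Pearson correlations between the feature columns of a chunk-by-feature table."""
    n = len(rows)
    # Transpose rows into columns, then standardize each feature.
    columns = [zscores(col) for col in zip(*rows)]
    # With z-scored columns, the correlation is the sum of products over n - 1.
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
        for ci in columns
    ]


# Hypothetical toy table: 5 text chunks (rows) x 3 features (columns).
toy_rows = [
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 6.0, 8.0],
    [4.0, 8.0, 3.0],
    [5.0, 10.0, 1.0],
]

corr = correlation_matrix(toy_rows)
for row in corr:
    print([round(r, 3) for r in row])
```

In the toy table the second feature is exactly twice the first, so the two correlate perfectly, while the third feature falls as the first rises and correlates negatively with it. Factor analysis looks for a small number of latent dimensions that account for such patterns of co-occurrence across many features.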