2024-03-28T11:31:40Zhttps://dataverse.no/oai
doi:10.18710/09GQFO2024-03-20T02:03:53ZtrollingHarvesterDataverseNOhvldataverseno
Replication data for: Playing with fire compoundshttps://doi.org/10.18710/09GQFOStrand, Bror-Magnus S.DataverseNO<p>The dataset contains: </p> <p></p> <p>Praat scripts for extracting and annotating relevant utterances from larger sound files, and extracting data (F0) from shorter sound files for further analysis. </p> <p></p> <p>Sound files (.wav) containing single utterances </p> <p>Praat Pitch files with F0 contours of pitch accent tones </p> <p>Praat TextGrid Files </p> <p></p> <p>R script for smoothing F0 contours using functional data analysis (fda), and making plots from and calculating correlation coefficients on the contours.</p> <p></p> <p>All material from a corpus of 7 children engaging in free peer interaction and self recording of 5 adults for baseline data.</p> <p></p> <p></p> <p>Publication abstract:</p> <p></p> <p>Prosodic features are some of the most salient features of dialect variation in Norway. It is therefore no wonder that the switch in prosodic systems is what is first recognized by caretakers and scholars when Norwegian children code-switch to something resembling the dialect of the capital (henceforth Urban East Norwegian, UEN) in role play. With focus on the Scandinavian system of lexical accent tones, this paper investigates the spontaneous speech of North Norwegian children engaging in peer social role play. The paper makes the case that children fail to apply the target accent tone in compounds in consistency with UEN in role play, although the production of accent tones otherwise seems to be phonetically target like UEN. 
Put in other words, they perform in accordance with UEN phonetics, but not UEN morpho-phonology.</p>Arts and HumanitiesProsodyRole PlayToneAccent toneTone accentFunctional data analysisPhonologyPhoneticsAcoustic analysisPlayRole playCompoundsNorwegianNorthern NorwegianNorth NorwegianUrban East NorwegianEnglish2021-03-26Strand, Bror-Magnus S.Anderssen, MereteVangsnes, Øystein A.AcqVA AuroraStrand, Bror-Magnus S., 2020, "Replication Data for: Morphological variation and development in a Northern Norwegian role play register", https://doi.org/10.18710/TU1GSY, DataverseNO,Experimental dataSound filesPitch data filesPraat scriptsR scripts
doi:10.18710/0JC95M2024-03-20T02:03:46ZtrollingdataversenoHarvesterDataverseNOhvluit
Replication data for: Prefix variation in путать: в-. за-, пере- and с-https://doi.org/10.18710/0JC95MNordrum, MariaDataverseNOThis case study of the four Natural Perfectives of the Russian simplex verb путать ‘tangle’ sheds light on the following questions: Is it possible to predict the choice of prefix when there is prefix variation in Russian? And if yes, how? Since these questions are particularly relevant for second-language learners, the author also discusses how the present study and similar ones can be used to make second-language learning of Russian more effective. The analysis is based on a database of 630 sentences from the Russian National Corpus (RNC) and takes two factors into consideration: type of construction and semantic category of the internal argument. The uploaded data contain 3 files: "Database, everything": Each sentence is tagged according to prefix, form of the verb (Active vs Passive), type of construction and semantic category of the internal argument. The four types of constructions and four types of semantic categories are explained with examples from the database inside the article. "Database_simplified": This version of the database contains the three parameters for the sentences: prefix, type of construction and semantic category of the internal argument. The simplified database was created to do statistical analyses in R. "R_putat": The R script that was used to produce the cTree presented in the article.Arts and HumanitiesRussianaspectprefix variationNatural Perfectivesclassification tree analysissecond language learningEnglish2014-06corpus
doi:10.18710/0U0KN22024-03-20T02:03:49ZtrollinguithvlHarvesterDataverseNOdataverseno
Norwegian compounds and their Russian equivalentshttps://doi.org/10.18710/0U0KN2Nesset, ToreDataverseNOThis post contains the dataset discussed in two related publications: Nesset, Tore (2018a): When a single word is enough: Norwegian compounds and their Russian counterparts. Slovo. http://www.moderna.uu.se/slaviska/slovo/ Nesset, Tore (2018b): How to translate compounds into Russian? Scando-Slavica 64.2.Arts and Humanitiescompoundword-formationRussianNorwegianrelational adjectivegenitive constructionEnglish2018-09-13Nesset, ToreJosefsen, Linn TheaSkjølsvold, Jens KristianSverdrupsen, HåkonZubchenko, IrinaReynolds, RobertSentsova, Ulianacorpus data
doi:10.18710/1GNZSC2024-03-20T02:03:46ZtrollingdataversenoHarvesterDataverseNOhvluit
Metonymy in Word-Formation: Russian, Czech, and Norwegianhttps://doi.org/10.18710/1GNZSCJanda, Laura A.DataverseNOPublication abstract: A foundational goal of cognitive linguistics is to explain linguistic phenomena in terms of general cognitive strategies rather than postulating an autonomous language module (Langacker 1987: 12-13). Metonymy is identified among the imaginative capacities of cognition (Langacker 2009: 46-47). Whereas the majority of scholarship on metonymy has focused on lexical metonymy, this study explores the systematic presence of metonymy in word-formation. I argue that in many cases, the semantic relationships between stems, affixes, and the words they form can be analyzed in terms of metonymy, and that this analysis yields a better, more insightful classification than traditional descriptions of word-formation. I present a metonymic classification of suffixal word-formation in three languages: Russian, Czech, and Norwegian. The system of classification is designed to maximize comparison between lexical and word-formational metonymy. This comparison supports another central claim of cognitive linguistics, namely that grammar (in this case word-formation) and lexicon form a continuum (Langacker 1987: 18-19), since I show that metonymic relationships in the two domains can be described in nearly identical terms. While many metonymic relationships are shared across the lexical and grammatical domains, some are specific to only one domain, and the two domains show different preferences for SOURCE and TARGET concepts. Furthermore, I find that the range of metonymic relationships expressed in word-formation is more diverse than what has been found in lexical metonymy. 
There is remarkable similarity in word-formational metonymy across the three languages, despite their typological differences: Russian and Czech present lexicons comprised almost entirely of word-formational families (Dokulil 1962: 14), whereas Norwegian is more heavily invested in compounding. Although this study is limited to three Indo-European languages, the goal is to create a classification system that could be implemented (perhaps with modifications) across a wider spectrum of languages. This study involves the collection of three databases representing the types of suffixal word-formation found in Russian, Czech and Norwegian and their metonymic interpretations, giving the vehicle (starting point) for the metonymy (also called the source in the published article), and the target of the metonymy, and a single example for each type. Other factors examined were the number of metonymy designations (vehicle-target pairs) for each suffix, whether a given metonymy designation was also represented in lexical metonymy, and whether a given metonymy designation could be reversed (i.e. both agent for action and action for agent).Arts and HumanitiesRussianCzechNorwegianmetonymyword-formationmorphologysuffixationEnglish2011corpus
doi:10.18710/1J0YZG2024-03-20T02:03:46ZtrollinguitHarvesterDataverseNOdataversenohvl
Solving Russian velars: Palatalization, the lexicon and gradient contrast utilizationhttps://doi.org/10.18710/1J0YZGParker, JeffDataverseNOThis dataset consists of (1) an excel file with type and token counts of all paired consonants word-finally and before non-front vowels, their probabilities, and the entropies of the pairs in each context; (2) the same entropies in separate files for word-final and before non-front vowels; and (3) R code to generate plots and perform statistical analysis. Article abstract: Palatalized velars in Russian are often considered exceptional because they are neither fully predictable, nor clearly unpredictable. They are an example of a common phonological relationship in which sounds have the potential to distinguish words but are only utilized in limited contexts and/or lexical items. These 'intermediate phonological relationships' (Goldsmith 1995) are problematic for traditional phonological theories which make a binary distinction between predictable sounds (allophones; dealt with in the grammar) and unpredictable sounds (phonemes; dealt with in the lexicon). To deal with intermediate phonological relationships in a principled way we must reconsider assumptions about the type and amount of information stored in the lexicon. In this paper I show that in Russian, both palatalized and non-palatalized velars occur in a variety of contexts, evidence that they have the potential to distinguish words. I also show, using information-theoretic metrics, that the potential is utilized to a minimal degree across both lexical items and phonetic contexts. However, and importantly, I show that many other consonants likewise do not fully utilize the (same) palatalization contrast across contexts. This suggests that velars are not an 'exception'; instead, they represent a relationship which lies at one end of a continuum along which the palatalization contrast is utilized. 
I argue that it is not velars, or intermediate phonological relationships more generally, that are problematic. Rather, it is our assumptions about the type and amount of information speakers store that are at issue. I argue that memory-rich models of the lexicon, which assume a great deal of storage of phonetic, contextual and distributional information, better account for velars in Russian. Moreover, the type of relationship that velars represent is a natural and expected outcome in such models. Thus, Russian velars provide important evidence that pushes us to reconsider some of the basic assumptions of our phonological models and phonological relationships more generally, and the problem that has long vexed Slavists can be solved within a memory-rich model of the lexicon.Arts and HumanitiesRussianvelarspalatalizationcontrastphonologygradienceEnglish2014corpus
doi:10.18710/1JMFVR2024-03-20T02:03:53ZtrollinghvldataversenoHarvesterDataverseNO
Concessive constructions in varieties of English: Corpus datahttps://doi.org/10.18710/1JMFVRSchützler, OleDataverseNOThe data were used in a corpus-based study that investigates the variation of concessive constructions across nine varieties of English. Concessive constructions are here taken to consist of a subordinate clause linked to a matrix clause using one of the three subordinating conjunctions 'although', 'though' or 'even though'. For each occurrence, the data contain information concerning its semantic properties, the position of the subordinate clause, the conjunction that was used, the finite or nonfinite status of the subordinate clause as well as its length. Further, each token is annotated for variety, mode of production (spoken vs. written) and genre (or text type). It is also possible to model the text frequencies of conjunctions and semantic subtypes, since in the respective data tables counts are given for each text in the corpora, along with the total word count per text.Arts and HumanitiesCorpus linguisticsConcessivesEnglishSubordinating conjunctionsEnglish2021-02-11Schützler, OleVetter, FabianCorpus dataNine components of The International Corpus of English: Great Britain, Ireland, Canada, Australia, Jamaica, Nigeria, India, Singapore, Hong Kong
doi:10.18710/1U2AQJ2024-03-20T02:03:57ZtrollingHarvesterDataverseNOdataversenohvl
Replication Data for: The Verbal prefix do- in Russian and Ukrainianhttps://doi.org/10.18710/1U2AQJSchledewitz, DavidDataverseNO<b>Dataset description:</b> <p>This dataset contains corpus data used in the paper described below.</p> <p>The dataset consists of HTML pages that contain the results for corpus searches in the Russian National Corpus (RNC) as described in the methodology of the corresponding paper and in the methodological information of this README file. Furthermore, it contains the scripts that were used to save these HTML pages and to extract the relevant information from them. The scripts created csv files which were then imported into a LibreOffice Calc document with the ".ods" extension.</p><b>Article description:</b> <p>The present small-scale study compares the usage of the verbal prefix do- in contemporary Russian and Ukrainian using the Ukrainian parallel corpus of the Russian National Corpus. Two datasets were analyzed: In the first one, translations of Russian do- verbs into Ukrainian were analyzed, whereas the second dataset dealt with translations of Ukrainian do- verbs into Russian. The focus of the discussion was on cognate translations with different prefixes.</p> <p>While the amount of data does not allow any strong conclusions, it is shown that in both languages do- prefixes can express the same meanings, namely REACH, REACH (ABSTRACT), ADD, CONVEY, and, when used together with postfix -sja, EXCESS. As the discussion shows, there is reason to believe that the CONVEY meaning is less productive in Russian where it is used in words restricted to official contexts and in fixed expressions.</p> <p>A quantitative analysis showed that among cognate translations from Ukrainian into Russian, the prefix was more often different than in translations from Russian into Ukrainian. 
This can be seen as a further clue for a wider application of Ukrainian do- compared to its Russian counterpart.</p>Arts and HumanitiesRussianUkrainiancorpus dataverbal prefixescognatestranslationEnglish2023-05-15Schledewitz, Davidcorpus data<p>The Ukrainian parallel corpus of the Russian National Corpus (RNC), available at <a href="https://ruscorpora.ru/">ruscorpora.ru</a>.</p> <p>The extracted text fragments that are contained in the data files of this dataset only represent insubstantial portions of the source listed above, and they do not represent coherent larger texts. Reuse of such excerpts is permitted under exceptions in IPR and database protection regulations, such as Fair use (cf. <a href="https://www.copyright.gov/fair-use/more-info.html">US Copyright Act</a>), the <a href="http://data.europa.eu/eli/dir/1996/9/oj">EU Database Directive</a> (cf. art 8 Rights and obligations of lawful users), and the Norwegian Copyright Act (cf. <a href="https://lovdata.no/lov/2018-06-15-40/§24">§ 24 Eneretten til databaser</a>).</p>
doi:10.18710/2CPQHQ2024-03-25T02:03:46ZtrollingdataversenohvlHarvesterDataverseNOearth_and_environmental
doi:10.18710/2NKJPG2024-03-20T02:03:46ZtrollinghvlHarvesterDataverseNOdataversenouit
Replication data for: Slangs go online, or the rise and fall of the Olbanian languagehttps://doi.org/10.18710/2NKJPGBerdicevskis, AleksandrsZvereva, VeraDataverseNOAll the data were taken from the website udaff.com (the center of the padonki culture and one of the cradles of the Olbanian language), from the section kreativy ('creative stories') where users upload their own short stories. This is one of the oldest and most important sections on the website, and its name is a symbol of padonki culture. It was chosen as the largest and most diachronically representative collection of texts a) with a large number of erratic spellings; b) written by people who identify themselves as padonki, i.e. "native speakers" of Olbanian. Texts were selected from 975 webpages covering the time period from January 2001 to December 2011. One text was selected randomly from each page (each page contained 50 texts), and a random fragment of 100 words was extracted for analysis. If a text was for some reason not suitable for analysis (e.g. it was shorter than 100 words), another random text was selected. This resulted in 975 100-word fragments produced by 729 authors (156 authors produced more than one text, the largest number of texts per author was nine, the mean was 1.34). No adjustment was made for the fact that some authors had more than one fragment included in the sample: while this gives their idiolect additional chances to contribute to the observed variation, that must mirror the actual situation. For every word, it was noted how many deviations from the norm it contained. All kinds of deviations were counted, and not all of them are strictly Olbanian. 
However, the analysis of the distribution of deviations across different types shows that the number of indisputably non-Olbanian deviations is relatively small and constant and does not distort the general picture.Arts and Humanitiesslanganti-languageorthographynorm deviationcomputer-mediated communicationpadonkiOlbanianEnglish2012corpus
doi:10.18710/2UJHHU2024-03-20T02:03:53ZtrollingHarvesterDataverseNOhvldataverseno
Dataset for "Scrabble yourself to success: Methods in teaching transcription."https://doi.org/10.18710/2UJHHUSönning, LukasDataverseNOThis dataset is from a quasi-experimental study that evaluated two methods for teaching phonemic transcription to university students of English: (i) the transcription of auditory stimuli and (ii) an activity-based approach employing a phonemic version of the classic board game Scrabble®. Participants were recruited from three (parallel) "English Phonetics & Phonology" courses, where the final exam involves phonemic transcriptions. A pretest-posttest control group design was used, with group sizes of 9 (control group), 15 (audio group) and 12 (scrabble group). Participants signed up for two different time slots, which were then assigned to different methods of instruction. The training groups were offered five 90-minute sessions, which were identical for both groups in the first 45 minutes each week, but used different methods in the second half of each session. Transcription ability was assessed prior to and after training using the same (40-minute) test, which covered a wide range of segmental and suprasegmental features, with a maximum of 345 points attainable. The dataset also includes further information on the students, which was elicited using a questionnaire (e.g. whether (or how long) they had spent time abroad, whether they did their homework for the course regularly).Arts and Humanitiesteaching methodsphonemic transcriptionevaluationTEFLTeaching English as a Foreign Languagetertiary educationphoneticsphonologyquasi-experimental studyEnglishL1 GermanEnglish2010-06-30Sönning, Lukasquasi-experimental data
b2Zmc2V0OjoxMHxzZXQ6OnRyb2xsaW5nfHByZWZpeDo6b2FpX2Rj