Persistent Identifier
|
doi:10.18710/MHGXDH |
Publication Date
|
2024-06-16 |
Title
| Background data for: Regression and random forests: Synergies for variationist corpus research |
Author
| Romasanta, Raquel P. (University of Santiago de Compostela) - ORCID: 0000-0001-7508-4745 |
Point of Contact
|
Romasanta, Raquel P. (University of Santiago de Compostela) |
Description
| This dataset contains tabular files recording occurrences of the verb REGRET complemented by a that- or -ing-complement clause (CC) in the GloWbE corpus. Tokens were retrieved using the online interface (https://www.english-corpora.org/glowbe/) and manually annotated for several syntactic and semantic variables (variety, text type, finiteness, meaning of the verb regret, voice of the CC, words in the CC, coreferentiality, intervening material, negation in the CC, temporal relation). See ReadMe file for more details. Related publication: Sönning, Lukas, Jason Grafmiller & Raquel P. Romasanta. 2024. Regression and random forests: Synergies for variationist corpus research. ICAME 45, University of Vigo, 18-22 June 2024. (2024-06-12) |
Subject
| Arts and Humanities |
Keyword
| corpus linguistics
complementation
GloWbE
British English
American English
Singaporean English
English |
Related Publication
| Sönning, Lukas, Jason Grafmiller & Raquel P. Romasanta. 2024. Regression and random forests: Synergies for variationist corpus research. ICAME 45, University of Vigo, 18-22 June 2024. |
Language
| English |
Producer
| University of Vigo https://www.uvigo.gal/en |
Funding Information
| The Spanish Ministry of Economy and Competitiveness: FFI2017-82162-P
The Spanish Ministry of Economy and Competitiveness: PRE2018-083249
The Spanish Ministry of Science and Innovation funded by MCIN/AEI/10.13039/501100011033: PID2020-117030GB-I00
The Recovery, Transformation, and Resilience Plan of the European Union “NextGenerationEU”, University of Vigo: 585507 |
Distributor
| The Tromsø Repository of Language and Linguistics (TROLLing) https://trolling.uit.no/ |
Depositor
| Romasanta, Raquel P. |
Deposit Date
| 2024-06-12 |
Time Period
| Start Date: 2012 ; End Date: 2013 |
Date of Collection
| Start Date: 2017-01-01 ; End Date: 2017-12-31 |
Data Type
| Annotated corpus data |
Software
| Excel
R |
Related Material
| The R code written for all aspects of data analysis in the publication cited above is available at https://osf.io/5u8bt/. |
Data Source
| GloWbE: Davies, Mark (2015) Introducing the 1.9 billion word Global Web-Based English Corpus (GloWbE). The 21st Century Text, 5. Available online at https://www.english-corpora.org/glowbe/.
The extracted text fragments contained in the data file of this dataset represent only non-substantial portions of the source listed above, and they do not form coherent larger texts. Therefore, the reuse (including redistribution) of these excerpts is permitted by the exception rules in IPR and database protection regulations, such as Fair use (USA; cf. US Copyright Act), Fair dealing (UK; cf. Exceptions to copyright), "lover, forskrifter, rettsavgjørelser og andre vedtak av offentlig myndighet" (Norway; cf. § 14 in Åndsverkloven), "uvesentlige deler av databaser" (Norway; cf. § 24 in Åndsverkloven), "sitatretten" (Norway; cf. § 29 in Åndsverkloven). |