|
Persistent Identifier
|
doi:10.18710/SJ89E3 |
|
Publication Date
|
2025-12-05 |
|
Title
| Replication Data for: Metaphor analysis meets lexical strings: Finetuning the Metaphor Identification Procedure for quantitative semantic analyses |
|
Author
| De Backer, Laurence (Ghent University), ORCID 0000-0003-3950-0566 |
|
Point of Contact
|
De Backer, Laurence (Ghent University) |
|
Description
| Dataset Abstract: This dataset underlies a methodological article that proposes and illustrates two ways to extend the Metaphor Identification Procedure so that it can capture (metaphorical) lexical strings in addition to (simple) metaphor-related lexical units. It includes a sample of 25 linguistic metaphors drawn from a larger corpus compiled by the first author between October 2021 and May 2023. The corpus contains newspaper articles published in the Spanish-language, US-based newspaper El Diario (the El Paso and Juárez local editions). These articles concern the DACA (Deferred Action for Childhood Arrivals) immigration program and were published between November 2020 and May 2023. (2023-05-08)
Article abstract: Recent years have witnessed the development of the Metaphor Identification Procedure (MIP/VU), a step-by-step protocol designed to identify metaphorically-used words in discourse. However, MIP(VU)’s merits notwithstanding, the procedure poses a problem to scholars intending to use its output as the basis for a semantic field analysis involving a quantitative component. Depending on the research question, metaphor analysts may be interested in chunks of language situated above the procedure’s standardized level of analysis (i.e., the lexical unit), including phrases and sentences. Yet, attempts to decenter the method’s exclusive focus on metaphor-related words have been the target of critique, among others on the grounds of their lack of clear unit-formation guidelines and, hence, their inconsistent unit of analysis and measurement. Drawing on data derived from a Spanish-language US-based newspaper’s coverage of the migration program known as DACA (Deferred Action for Childhood Arrivals), this article describes challenges that analysts can run into when attempting to use a dataset containing atomized metaphor-related words as the input for subsequent quantitative semantic analyses. Its main methodological contribution consists in a proposal and illustration of two possible ways to extend the existing MIP(VU)-protocol in such a way as to allow it to catch metaphorical strings, on top of words, in a reliable and systematic manner. One approach is procedural, and entails formulating a-priori grouping-directives based on the research question(s). The other is exploratory, involving the ad hoc grouping of units and adding a descriptive parameter meant to keep track of grouping-decisions made by the analyst, thereby safeguarding transparency at all times. (2025-11-17) |
|
Subject
| Arts and Humanities |
|
Keyword
| Spanish
metaphors
semantic fields
referents
corpus
collocational analysis
DACA
El Diario |
|
Related Publication
| De Backer, L., Enghels, R., & Goethals, P. (2023). Metaphor analysis meets lexical strings: Finetuning the metaphor identification procedure for quantitative semantic analyses. Frontiers in Psychology, 14. https://doi.org/10.3389/fpsyg.2023.1214699 |
|
Language
| English |
|
Producer
| Ghent University https://www.ugent.be |
|
Contributor
| Data Curator: Patrick Goethals
Data Collector: Sven Van Hulle |
|
Funding Information
| Fonds Wetenschappelijk Onderzoek Vlaanderen |
|
Distributor
| The Tromsø Repository of Language and Linguistics (TROLLing) https://trolling.uit.no/ |
|
Depositor
| De Backer, Laurence |
|
Deposit Date
| 2023-05-09 |
|
Time Period
| Start Date: 2020-11-03; End Date: 2023-05-01 |
|
Date of Collection
| Start Date: 2021-10-01; End Date: 2023-05-01 |
|
Data Type
| Linguistic data; Corpus data |
|
Software
| MS Excel |
|
Data Source
|
- The data contained in this dataset are annotations of data retrieved from a corpus of newspaper articles compiled from the newspaper El Diario by the first author, between October 2021 and May 2023.
- R script for a collostructional analysis based on Gries, S.T. (2010). Behavioral profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. The mental lexicon, 5(3), 323-346. Script downloaded: Gries, S.T. (2024). Coll.analysis 4.1. A script for R to compute perform collostructional analyses. https://www.stgries.info/teaching/groningen/index.html.
- 100-token sample of N + ADJ (punto + intermedio) and 100-token sample of V+N (asestar + golpe) gathered from the Spanish reference corpus of esTenTen10, accessed via the Sketch Engine platform (Kilgarriff & Renau 2013). These samples served as the input for a collocational strength analysis performed in R. Full reference: Kilgarriff, A., & Renau, I. (2013). esTenTen, a Vast Web Corpus of Peninsular and American Spanish. Procedia - Social and Behavioral Sciences, 95, 12-19. https://doi.org/10.1016/j.sbspro.2013.10.617.
As mentioned above, the data contained in this dataset originate from the Spanish reference corpus of esTenTen10, accessed via Sketch Engine. The corpus was used under the Sketch Engine Terms of Use. Additionally, the data originate from the newspaper El Diario, all rights reserved.
The extracted text fragments contained in the data files of this dataset represent only non-substantial portions of the corpus, and they do not represent coherent larger texts. Therefore, the reuse (including redistribution) of these excerpts is permitted by the exception rules in IPR and database protection regulations, such as Fair use (USA; cf. US Copyright Act), Fair dealing (UK; cf. Exceptions to copyright), the EU Database Directive (cf. article 8 Rights and obligations of lawful users), "lover, forskrifter, rettsavgjørelser og andre vedtak av offentlig myndighet" (Norway; cf. § 14 in Åndsverkloven), "uvesentlige deler av databaser" (Norway; cf. § 24 in Åndsverkloven), "sitatretten" (Norway; cf. § 29 in Åndsverkloven). |