{"id":7740,"identifier":"7TSABU","persistentUrl":"https://doi.org/10.18710/7TSABU","protocol":"doi","authority":"10.18710","publisher":"DataverseNO","publicationDate":"2019-06-06","storageIdentifier":"S3://10.18710/7TSABU","datasetVersion":{"id":3429,"datasetId":7740,"datasetPersistentId":"doi:10.18710/7TSABU","storageIdentifier":"S3://10.18710/7TSABU","versionNumber":1,"versionMinorNumber":3,"versionState":"RELEASED","productionDate":"2016","UNF":"UNF:6:/bvw/86x+A+n0yewX4UV7A==","lastUpdateTime":"2023-09-28T19:45:52Z","releaseTime":"2023-09-28T19:45:52Z","createTime":"2023-09-17T14:36:29Z","publicationDate":"2019-06-06","citationDate":"2019-06-06","license":{"name":"CC0 1.0","uri":"http://creativecommons.org/publicdomain/zero/1.0","iconUri":"https://licensebuttons.net/p/zero/1.0/88x31.png"},"fileAccessRequest":true,"metadataBlocks":{"citation":{"displayName":"Citation Metadata","name":"citation","fields":[{"typeName":"title","multiple":false,"typeClass":"primitive","value":"Replication data for: Chunking or predicting – frequency information and reduction in the perception of multi-word sequences"},{"typeName":"author","multiple":true,"typeClass":"compound","value":[{"authorName":{"typeName":"authorName","multiple":false,"typeClass":"primitive","value":"Lorenz, David"},"authorAffiliation":{"typeName":"authorAffiliation","multiple":false,"typeClass":"primitive","value":"University of Rostock"},"authorIdentifierScheme":{"typeName":"authorIdentifierScheme","multiple":false,"typeClass":"controlledVocabulary","value":"ORCID"},"authorIdentifier":{"typeName":"authorIdentifier","multiple":false,"typeClass":"primitive","value":"0000-0002-7451-099X"}},{"authorName":{"typeName":"authorName","multiple":false,"typeClass":"primitive","value":"Tizón-Couto, David"},"authorAffiliation":{"typeName":"authorAffiliation","multiple":false,"typeClass":"primitive","value":"University of 
Vigo"},"authorIdentifierScheme":{"typeName":"authorIdentifierScheme","multiple":false,"typeClass":"controlledVocabulary","value":"ORCID"},"authorIdentifier":{"typeName":"authorIdentifier","multiple":false,"typeClass":"primitive","value":"0000-0003-0788-7954"}}]},{"typeName":"datasetContact","multiple":true,"typeClass":"compound","value":[{"datasetContactName":{"typeName":"datasetContactName","multiple":false,"typeClass":"primitive","value":"Lorenz, David"},"datasetContactAffiliation":{"typeName":"datasetContactAffiliation","multiple":false,"typeClass":"primitive","value":"University of Rostock"},"datasetContactEmail":{"typeName":"datasetContactEmail","multiple":false,"typeClass":"primitive","value":"david.lorenz2@uni-rostock.de"}}]},{"typeName":"dsDescription","multiple":true,"typeClass":"compound","value":[{"dsDescriptionValue":{"typeName":"dsDescriptionValue","multiple":false,"typeClass":"primitive","value":"
These are the data and code from a word-monitoring task in which participants responded to the word 'to' in verb + to-infinitive structures (V-to-Vinf) in English, where 'to' could occur in a full or a reduced pronunciation. Accuracy and response times were analysed with mixed-effects generalized additive models (GAMM); the code also includes visualisations of these models. The accompanying paper was published in Cognitive Linguistics (2019).\nThe experiment was run with OpenSesame (version 3.0.7 for Mac, cf. Mathôt et al. 2012). The data include frequencies of occurrence of words and bigrams; these were extracted from the Corpus of Contemporary American English (COCA, Davies 2008–). We used R (R Core Team 2017) for all data analyses, so the analyses are best replicated in R.
\n\n\nAbstract:\nFrequently used linguistic structures become entrenched in memory; this is often assumed to make their consecutive parts more predictable, as well as fuse them into a single unit (chunking). High frequency moreover leads to a propensity for phonetic reduction. We present a word recognition experiment which tests how frequency information (string frequency, transitional probability) interacts with reduction in speech perception. Detection of the element to is tested in V-to-Vinf sequences in English (e.g. need to Vinf), where to can undergo reduction (“needa”). Results show that reduction impedes recognition, but this can be mitigated by the predictability of the item. Recognition generally benefits from surface frequency, while a modest chunking effect is found in delayed responses to reduced forms of high-frequency items. Transitional probability shows a facilitating effect on reduced but not on full forms. Reduced forms also pose more difficulty when the phonological context obscures the onset of to. We conclude that listeners draw on frequency information in a predictive manner to cope with reduction. High-frequency structures are not inevitably perceived as chunks, but depend on cues in the phonetic form – reduction leads to perceptual prominence of the whole over the parts and thus promotes a holistic access.
"},"dsDescriptionDate":{"typeName":"dsDescriptionDate","multiple":false,"typeClass":"primitive","value":"2019-04-18"}}]},{"typeName":"subject","multiple":true,"typeClass":"controlledVocabulary","value":["Arts and Humanities"]},{"typeName":"keyword","multiple":true,"typeClass":"compound","value":[{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"speech perception"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"phonetic reduction"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"chunking"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"frequency information"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"entrenchment"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"English"}}]},{"typeName":"publication","multiple":true,"typeClass":"compound","value":[{"publicationCitation":{"typeName":"publicationCitation","multiple":false,"typeClass":"primitive","value":"Lorenz, David and Tizón-Couto, David. \"Chunking or predicting – frequency information and reduction in the perception of multi-word sequences \" Cognitive Linguistics, vol. 30, no. 4, 2019, pp. 751-784. 
https://doi.org/10.1515/cog-2017-0138"},"publicationIDType":{"typeName":"publicationIDType","multiple":false,"typeClass":"controlledVocabulary","value":"doi"},"publicationIDNumber":{"typeName":"publicationIDNumber","multiple":false,"typeClass":"primitive","value":"10.1515/cog-2017-0138"},"publicationURL":{"typeName":"publicationURL","multiple":false,"typeClass":"primitive","value":"https://doi.org/10.1515/cog-2017-0138"}}]},{"typeName":"language","multiple":true,"typeClass":"controlledVocabulary","value":["English"]},{"typeName":"producer","multiple":true,"typeClass":"compound","value":[{"producerName":{"typeName":"producerName","multiple":false,"typeClass":"primitive","value":"University of Freiburg"},"producerURL":{"typeName":"producerURL","multiple":false,"typeClass":"primitive","value":"https://uni-freiburg.de/en/"}},{"producerName":{"typeName":"producerName","multiple":false,"typeClass":"primitive","value":"University of Vigo"},"producerURL":{"typeName":"producerURL","multiple":false,"typeClass":"primitive","value":"https://www.uvigo.gal/en"}}]},{"typeName":"productionDate","multiple":false,"typeClass":"primitive","value":"2016"},{"typeName":"productionPlace","multiple":true,"typeClass":"primitive","value":["Freiburg","Vigo"]},{"typeName":"grantNumber","multiple":true,"typeClass":"compound","value":[{"grantNumberAgency":{"typeName":"grantNumberAgency","multiple":false,"typeClass":"primitive","value":"Spanish Ministry of Economy and Competitiveness"},"grantNumberValue":{"typeName":"grantNumberValue","multiple":false,"typeClass":"primitive","value":"FFI2016-77018-P"}},{"grantNumberAgency":{"typeName":"grantNumberAgency","multiple":false,"typeClass":"primitive","value":"European Regional Development Fund"},"grantNumberValue":{"typeName":"grantNumberValue","multiple":false,"typeClass":"primitive","value":"IJCI-2015-25843"}},{"grantNumberAgency":{"typeName":"grantNumberAgency","multiple":false,"typeClass":"primitive","value":"Xunta de 
Galicia"},"grantNumberValue":{"typeName":"grantNumberValue","multiple":false,"typeClass":"primitive","value":"ED431C 2017/50"}},{"grantNumberAgency":{"typeName":"grantNumberAgency","multiple":false,"typeClass":"primitive","value":"Wissenschaftliche Gesellschaft Freiburg"}}]},{"typeName":"distributor","multiple":true,"typeClass":"compound","value":[{"distributorName":{"typeName":"distributorName","multiple":false,"typeClass":"primitive","value":"The Tromsø Repository of Language and Linguistics (TROLLing)"},"distributorAbbreviation":{"typeName":"distributorAbbreviation","multiple":false,"typeClass":"primitive","value":"TROLLing"},"distributorURL":{"typeName":"distributorURL","multiple":false,"typeClass":"primitive","value":"https://trolling.uit.no/"}}]},{"typeName":"depositor","multiple":false,"typeClass":"primitive","value":"Lorenz, David"},{"typeName":"dateOfDeposit","multiple":false,"typeClass":"primitive","value":"2019-04-18"},{"typeName":"timePeriodCovered","multiple":true,"typeClass":"compound","value":[{"timePeriodCoveredStart":{"typeName":"timePeriodCoveredStart","multiple":false,"typeClass":"primitive","value":"2016-05-09"},"timePeriodCoveredEnd":{"typeName":"timePeriodCoveredEnd","multiple":false,"typeClass":"primitive","value":"2016-11-24"}}]},{"typeName":"dateOfCollection","multiple":true,"typeClass":"compound","value":[{"dateOfCollectionStart":{"typeName":"dateOfCollectionStart","multiple":false,"typeClass":"primitive","value":"2016-05-09"},"dateOfCollectionEnd":{"typeName":"dateOfCollectionEnd","multiple":false,"typeClass":"primitive","value":"2016-11-24"}}]},{"typeName":"kindOfData","multiple":true,"typeClass":"primitive","value":["experimental 
data"]},{"typeName":"software","multiple":true,"typeClass":"compound","value":[{"softwareName":{"typeName":"softwareName","multiple":false,"typeClass":"primitive","value":"OpenSesame"},"softwareVersion":{"typeName":"softwareVersion","multiple":false,"typeClass":"primitive","value":"3.0.7."}},{"softwareName":{"typeName":"softwareName","multiple":false,"typeClass":"primitive","value":"R"}}]},{"typeName":"dataSources","multiple":true,"typeClass":"primitive","value":["– recorded sentences\n– Corpus of Contemporary American English https://www.english-corpora.org/coca/"]}]},"geospatial":{"displayName":"Geospatial Metadata","name":"geospatial","fields":[{"typeName":"geographicCoverage","multiple":true,"typeClass":"compound","value":[{"country":{"typeName":"country","multiple":false,"typeClass":"controlledVocabulary","value":"United States"}}]}]}},"files":[{"description":"Description of the data sets","label":"00_ReadMe_ChunkingPredicting.txt","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Documentation"],"dataFile":{"id":7879,"persistentId":"doi:10.18710/7TSABU/CFGO19","pidURL":"https://doi.org/10.18710/7TSABU/CFGO19","filename":"00_ReadMe_ChunkingPredicting.txt","contentType":"text/plain","filesize":8470,"description":"Description of the data sets","categories":["Documentation"],"storageIdentifier":"S3://2002-yellow-dataverseno:16b2cb984e2-3b7511a84d03","rootDataFileId":-1,"md5":"ab842df2a5d9114643a2ce86c3130552","checksum":{"type":"MD5","value":"ab842df2a5d9114643a2ce86c3130552"},"creationDate":"2019-06-06"}},{"description":"R data frames sall_results, sall_clean, sall_items, sall_target","label":"ChunkingPredicting_Data.RData","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Data"],"dataFile":{"id":7741,"persistentId":"doi:10.18710/7TSABU/NUVYNK","pidURL":"https://doi.org/10.18710/7TSABU/NUVYNK","filename":"ChunkingPredicting_Data.RData","contentType":"application/x-rlang-transport","filesize":316412,"description":"R data 
frames sall_results, sall_clean, sall_items, sall_target","categories":["Data"],"storageIdentifier":"S3://2002-yellow-dataverseno:16a30689a7d-041a86c79023","rootDataFileId":-1,"md5":"5305268bb0abaf0d3a129ce48a983aa3","checksum":{"type":"MD5","value":"5305268bb0abaf0d3a129ce48a983aa3"},"creationDate":"2019-04-18"}},{"description":"mixed-effects generalized additive model (GAMM) for response times, and visualizations of the results","label":"ChunkingPredicting_GAMM_and_plots.R.txt","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Code"],"dataFile":{"id":7745,"persistentId":"doi:10.18710/7TSABU/QW4EH4","pidURL":"https://doi.org/10.18710/7TSABU/QW4EH4","filename":"ChunkingPredicting_GAMM_and_plots.R.txt","contentType":"text/plain","filesize":31345,"description":"mixed-effects generalized additive model (GAMM) for response times, and visualizations of the results","categories":["Code"],"storageIdentifier":"S3://2002-yellow-dataverseno:16a3068e9dc-937ddcc82a8b","rootDataFileId":-1,"md5":"a15ee576d24ea22b34c1ea5eabd5b971","checksum":{"type":"MD5","value":"a15ee576d24ea22b34c1ea5eabd5b971"},"creationDate":"2019-04-18"}},{"description":"mixed-effects generalized additive model (GAMM) for accuracy, and visualizations of the results","label":"ChunkingPredicting_accuracy-model.R.txt","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Code"],"dataFile":{"id":7747,"persistentId":"doi:10.18710/7TSABU/H60HS9","pidURL":"https://doi.org/10.18710/7TSABU/H60HS9","filename":"ChunkingPredicting_accuracy-model.R.txt","contentType":"text/plain","filesize":25648,"description":"mixed-effects generalized additive model (GAMM) for accuracy, and visualizations of the results","categories":["Code"],"storageIdentifier":"S3://2002-yellow-dataverseno:16a3068c6d7-bffe912912a3","rootDataFileId":-1,"md5":"59f0e6205722082c189644cf640b854b","checksum":{"type":"MD5","value":"59f0e6205722082c189644cf640b854b"},"creationDate":"2019-04-18"}},{"description":"includes 
only ‘correct’ responses","label":"sall_clean.tab","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Data"],"dataFile":{"id":7883,"persistentId":"doi:10.18710/7TSABU/WLOESE","pidURL":"https://doi.org/10.18710/7TSABU/WLOESE","filename":"sall_clean.tab","contentType":"text/tab-separated-values","filesize":575470,"description":"includes only ‘correct’ responses","categories":["Data"],"storageIdentifier":"S3://2002-yellow-dataverseno:16b2cc60ad1-3465213c6d0e","originalFileFormat":"text/csv","originalFormatLabel":"Comma Separated Values","originalFileSize":526622,"originalFileName":"sall_clean.csv","UNF":"UNF:6:9i7F5I6FRlXgGVaNe1UzrQ==","rootDataFileId":-1,"md5":"ba3028c94ab1732bde4676a8aa36ab36","checksum":{"type":"MD5","value":"ba3028c94ab1732bde4676a8aa36ab36"},"creationDate":"2019-06-06"}},{"description":"includes only target items [i.e. no control and distractor items], responses marked for 'correct' (yes/no)","label":"sall_items.tab","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Data"],"dataFile":{"id":7882,"persistentId":"doi:10.18710/7TSABU/YNRZUY","pidURL":"https://doi.org/10.18710/7TSABU/YNRZUY","filename":"sall_items.tab","contentType":"text/tab-separated-values","filesize":369845,"description":"includes only target items [i.e. 
no control and distractor items], responses marked for 'correct' (yes/no)","categories":["Data"],"storageIdentifier":"S3://2002-yellow-dataverseno:16b2cc57dc8-77490dcb0074","originalFileFormat":"text/csv","originalFormatLabel":"Comma Separated Values","originalFileSize":330256,"originalFileName":"sall_items.csv","UNF":"UNF:6:dKyFgZ2A1dA3cX4/D8gKjA==","rootDataFileId":-1,"md5":"a7b71a04a0ecc1b015ff085bee1aad7b","checksum":{"type":"MD5","value":"a7b71a04a0ecc1b015ff085bee1aad7b"},"creationDate":"2019-06-06"}},{"description":"the complete ‘raw’ data","label":"sall_results.tab","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Data"],"dataFile":{"id":7884,"persistentId":"doi:10.18710/7TSABU/GJTFDE","pidURL":"https://doi.org/10.18710/7TSABU/GJTFDE","filename":"sall_results.tab","contentType":"text/tab-separated-values","filesize":599815,"description":"the complete ‘raw’ data","categories":["Data"],"storageIdentifier":"S3://2002-yellow-dataverseno:16b2cc6fde0-42efcaa36664","originalFileFormat":"text/csv","originalFormatLabel":"Comma Separated Values","originalFileSize":537033,"originalFileName":"sall_results.csv","UNF":"UNF:6:SckM1Rs9ZcBHngUF9hmZaQ==","rootDataFileId":-1,"md5":"b8cf288dcc91b059925d986f0ce8026c","checksum":{"type":"MD5","value":"b8cf288dcc91b059925d986f0ce8026c"},"creationDate":"2019-06-06"}},{"description":"includes only correct responses on target items","label":"sall_target.tab","restricted":false,"version":1,"datasetVersionId":3429,"categories":["Data"],"dataFile":{"id":7885,"persistentId":"doi:10.18710/7TSABU/PZPSRO","pidURL":"https://doi.org/10.18710/7TSABU/PZPSRO","filename":"sall_target.tab","contentType":"text/tab-separated-values","filesize":352431,"description":"includes only correct responses on target items","categories":["Data"],"storageIdentifier":"S3://2002-yellow-dataverseno:16b2cc7b16f-577c8b85007d","originalFileFormat":"text/csv","originalFormatLabel":"Comma Separated 
Values","originalFileSize":324026,"originalFileName":"sall_target.csv","UNF":"UNF:6:H947lcKHGCtYINvOUArC0w==","rootDataFileId":-1,"md5":"87b935d1f6fb25d674a4d37148aa066c","checksum":{"type":"MD5","value":"87b935d1f6fb25d674a4d37148aa066c"},"creationDate":"2019-06-06"}}],"citation":"Lorenz, David; Tizón-Couto, David, 2019, \"Replication data for: Chunking or predicting – frequency information and reduction in the perception of multi-word sequences\", https://doi.org/10.18710/7TSABU, DataverseNO, V1, UNF:6:/bvw/86x+A+n0yewX4UV7A== [fileUNF]"}}