Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Czech word and MWE lists |
Identification Number: |
doi:10.18710/PGDWXC |
Distributor: |
DataverseNO |
Date of Distribution: |
2020-04-09 |
Version: |
1 |
Bibliographic Citation: |
Cvrček, Václav, 2020, "Czech word and MWE lists", https://doi.org/10.18710/PGDWXC, DataverseNO, V1 |
Citation |
|
Title: |
Czech word and MWE lists |
Identification Number: |
doi:10.18710/PGDWXC |
Authoring Entity: |
Cvrček, Václav (Czech National Corpus) |
Other identifications and acknowledgements: |
Cvrček, Václav; Komrsková, Zuzana; Lukeš, David; Poukarová, Petra; Řehořková, Anna; Zasina, Adrian Jan |
Producer: |
Czech National Corpus |
Grant Number: |
CZ.02.1.01/0.0/0.0/16_013/0001758 |
Distributor: |
DataverseNO |
Distributor: |
The Tromsø Repository of Language and Linguistics (TROLLing) |
Access Authority: |
Lukeš, David |
Depositor: |
Lukeš, David |
Date of Deposit: |
2020-04-06 |
Holdings Information: |
https://doi.org/10.18710/PGDWXC |
Study Scope |
|
Keywords: |
Arts and Humanities, multi-dimensional analysis, lexicon, Czech, word list |
Abstract: |
This post contains the word and MWE (multi-word expression) lists used to operationalize some of the linguistic features in the project on multi-dimensional analysis (MDA) of Czech carried out at the Czech National Corpus. The MDA procedure requires identifying and operationalizing linguistic features relevant to register variation in the language under scrutiny. In the Czech MDA project, some of these features were operationalized by compiling lists of words and multi-word expressions, which can then be matched against a text to identify occurrences. Compiling such lists can be tedious and error-prone work, which is why we provide ours as a resource for other linguists, either to adopt wholesale or to use as a starting point to build on. |
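As an illustration of how a list of words and multi-word expressions might be matched against a text, here is a minimal Python sketch. It is not the procedure used in the Czech MDA project. It assumes that a list file holds one entry per line, that the words of a multi-word expression are separated by whitespace, and that matching is done on lowercased word forms; the actual file layout is described in the README files. The function names `parse_entries` and `find_occurrences` and the sample entries are hypothetical and are not taken from the dataset.

```python
from typing import Iterable, List, Set, Tuple

Entry = Tuple[str, ...]


def parse_entries(lines: Iterable[str]) -> Set[Entry]:
    """Turn list lines into lowercased token tuples.

    Assumption: one entry per line, MWE words separated by whitespace.
    """
    entries = set()
    for line in lines:
        tokens = tuple(line.strip().lower().split())
        if tokens:  # skip blank lines
            entries.add(tokens)
    return entries


def find_occurrences(tokens: List[str], entries: Set[Entry]) -> List[Tuple[int, Entry]]:
    """Return (start_index, entry) pairs, preferring the longest match at each position."""
    lowered = [t.lower() for t in tokens]
    max_len = max((len(e) for e in entries), default=0)
    hits = []
    i = 0
    while i < len(lowered):
        # Try the longest candidate first so MWEs win over single words.
        for n in range(min(max_len, len(lowered) - i), 0, -1):
            candidate = tuple(lowered[i:i + n])
            if candidate in entries:
                hits.append((i, candidate))
                i += n  # continue after the matched span
                break
        else:
            i += 1  # no entry starts at this token
    return hits


# Made-up sample entries; a real run would read one of the .txt files instead,
# e.g. parse_entries(open("MOD.txt", encoding="utf-8")), if the format assumption holds.
sample_entries = parse_entries(["v podstatě\n", "asi\n"])
sample_tokens = "To je v podstatě asi pravda".split()
print(find_occurrences(sample_tokens, sample_entries))
# [(2, ('v', 'podstatě')), (4, ('asi',))]
```

The sketch matches surface word forms only. Matching on lemmas or on corpus annotation would need a tagged text as input, which is outside the scope of this example.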
Time Period: |
1990-2014 |
Date of Collection: |
2017-2018 |
Country: |
Czech Republic |
Kind of Data: |
corpus data |
Methodology and Processing |
|
Sources Statement |
|
Data Sources: |
Koditex corpus (https://wiki.korpus.cz/doku.php/en:cnk:koditex) |
Data Access |
|
Other Study Description Materials |
|
Related Studies |
|
https://doi.org/10.18710/QAJKZW |
|
Related Publications |
|
Citation |
|
Title: |
Comparing web-crawled and traditional corpora |
Identification Number: |
10.1007/s10579-020-09487-4 |
Bibliographic Citation: |
Cvrček, V., Komrsková, Z., Lukeš, D. et al. Comparing web-crawled and traditional corpora. Lang Resources & Evaluation (2020). |
Citation |
|
Title: |
From extra- to intratextual characteristics: Charting the space of variation in Czech through MDA |
Identification Number: |
10.1515/cllt-2018-0020 |
Bibliographic Citation: |
Cvrček, V., Komrsková, Z., Lukeš, D., Poukarová, P., Řehořková, A., & Zasina, A. (2018). From extra- to intratextual characteristics: Charting the space of variation in Czech through MDA, Corpus Linguistics and Linguistic Theory (published online ahead of print). |
Label: |
00_README.docx |
Text: |
Start here. |
Notes: |
application/vnd.openxmlformats-officedocument.wordprocessingml.document |
Label: |
00_README.pdf |
Text: |
Start here. |
Notes: |
application/pdf |
Label: |
ASIM.txt |
Notes: |
text/plain |
Label: |
AUG.txt |
Notes: |
text/plain |
Label: |
COH2.txt |
Notes: |
text/plain |
Label: |
DEMI.txt |
Notes: |
text/plain |
Label: |
DP.txt |
Notes: |
text/plain |
Label: |
DT.txt |
Notes: |
text/plain |
Label: |
EXP.txt |
Notes: |
text/plain |
Label: |
FEI.txt |
Notes: |
text/plain |
Label: |
FOUU.txt |
Notes: |
text/plain |
Label: |
FUOU.txt |
Notes: |
text/plain |
Label: |
FYEJ.txt |
Notes: |
text/plain |
Label: |
GENL.txt |
Notes: |
text/plain |
Label: |
GRAAA.txt |
Notes: |
text/plain |
Label: |
KONT.txt |
Notes: |
text/plain |
Label: |
LAT.txt |
Notes: |
text/plain |
Label: |
MOD.txt |
Notes: |
text/plain |
Label: |
PAJ.txt |
Notes: |
text/plain |
Label: |
PAV.txt |
Notes: |
text/plain |
Label: |
POE.txt |
Notes: |
text/plain |
Label: |
PRE2.txt |
Notes: |
text/plain |
Label: |
PROPA.txt |
Notes: |
text/plain |
Label: |
PROPT.txt |
Notes: |
text/plain |
Label: |
PSB.txt |
Notes: |
text/plain |
Label: |
PVB.txt |
Notes: |
text/plain |
Label: |
RST.txt |
Notes: |
text/plain |
Label: |
VD.txt |
Notes: |
text/plain |
Label: |
VTS.txt |
Notes: |
text/plain |
Label: |
VUL.txt |
Notes: |
text/plain |
Label: |
VYPW.txt |
Notes: |
text/plain |