K-SPAN (Korean Surface Phones and Neighborhoods) (doi:10.18710/TWM79F)

Document Description

Citation

Title:

K-SPAN (Korean Surface Phones and Neighborhoods)

Identification Number:

doi:10.18710/TWM79F

Distributor:

DataverseNO

Date of Distribution:

2016-06-03

Version:

2

Bibliographic Citation:

Holliday, Jeffrey J.; Turnbull, Rory; Eychenne, Julien, 2016, "K-SPAN (Korean Surface Phones and Neighborhoods)", https://doi.org/10.18710/TWM79F, DataverseNO, V2, UNF:6:NWbRmiBvO5wWcDN2QHCQJw== [fileUNF]

Study Description

Citation

Title:

K-SPAN (Korean Surface Phones and Neighborhoods)

Identification Number:

doi:10.18710/TWM79F

Authoring Entity:

Holliday, Jeffrey J. (Korea University)

Turnbull, Rory (Laboratoire de Sciences Cognitives et Psycholinguistique (ENS, EHESS))

Eychenne, Julien (Hankuk University of Foreign Studies)

Distributor:

DataverseNO

Distributor:

The Tromsø Repository of Language and Linguistics (TROLLing)

Access Authority:

Eychenne, Julien

Depositor:

Eychenne, Julien

Date of Deposit:

2016-06-02

Holdings Information:

https://doi.org/10.18710/TWM79F

Study Scope

Keywords:

Arts and Humanities, Korean, corpus, phonetic forms, neighborhood density

Abstract:

This corpus provides surface phonetic forms derived from a publicly available orthographic corpus of Korean, along with neighborhood density statistics for each word in the corpus. The surface phonetic forms are rendered in an ASCII-encoded scheme, which allows users to explore and query the corpus without having to read Korean orthography.
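
As a rough illustration of what a neighborhood density statistic measures, the sketch below counts the neighbors of a form under one common definition: entries that differ by a single substitution, deletion, or insertion. Whether K-SPAN uses exactly this definition is not stated in this codebook (kspan_doc.pdf is the authority), and treating each ASCII character as one unit is an assumption. The toy lexicon and its frequencies are invented for the example and are not taken from the corpus.

```python
def one_edit_apart(a, b):
    """True if a and b differ by one substitution, deletion, or insertion."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: removing one symbol from the longer form
    # must yield the shorter form.
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def density(word, freq):
    """Return (number of neighbors, mean neighbor frequency or None)."""
    nbrs = [w for w in freq if w != word and one_edit_apart(word, w)]
    mean_freq = sum(freq[w] for w in nbrs) / len(nbrs) if nbrs else None
    return len(nbrs), mean_freq


# Invented toy lexicon: form -> frequency (not K-SPAN data).
toy = {"pal": 10, "mal": 4, "pa": 2, "palk": 6, "son": 7}
pal_stats = density("pal", toy)   # neighbors: mal, pa, palk
son_stats = density("son", toy)   # no neighbors in this toy lexicon
```

Returning None for the mean frequency of a form with no neighbors is a design choice of this sketch; a dataset could equally record 0 or leave the cell empty.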

File Description--f1010

File: kspan_base.tab

  • Number of cases: 63836

  • No. of variables per record: 19

  • Type of File: text/tab-separated-values

Notes:

UNF:6:NWbRmiBvO5wWcDN2QHCQJw==
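
A minimal sketch of reading the file with Python's standard csv module, checking the column count against the 19 variables listed above. It assumes the first row is a header of variable names and that the file is UTF-8 encoded; neither is stated in this codebook. The in-memory sample uses three columns and a placeholder value so the sketch runs without the real file.

```python
import csv
import io


def read_kspan(handle, expected_columns=19):
    """Read a tab-separated table into a list of dicts keyed by column name."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    names = reader.fieldnames
    if names is None or len(names) != expected_columns:
        found = 0 if names is None else len(names)
        raise ValueError(f"expected {expected_columns} columns, found {found}")
    return list(reader)


# Stand-in for the real file: three columns, one row, placeholder values.
sample = io.StringIO("WordNum\tModernPhon\tSyllableCount\n1\tabc\t1\n")
rows = read_kspan(sample, expected_columns=3)

# With the real file (path and encoding are assumptions):
# with open("kspan_base.tab", encoding="utf-8", newline="") as fh:
#     rows = read_kspan(fh)
```

If the header assumption holds, the number of rows read from the real file should match the 63836 cases given in the file description.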

Variable Description

List of Variables:

Variables

WordNum

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 1.0; Max. 82501.0; Mean 40961.21700922892; StDev 23875.359357897643

Variable Format: numeric

Notes: UNF:6:wP5dAt0a6vEh8x3cZQleWg==

ModernKey

f1010 Location:

Variable Format: character

Notes: UNF:6:bCt6aAcTHEUHQGMZrWx5WA==

ConservativeKey

f1010 Location:

Variable Format: character

Notes: UNF:6:4Hb4esETz9d1l6h0CpCkfQ==

OrthographyKey

f1010 Location:

Variable Format: character

Notes: UNF:6:vzMs1ju8sLQTaV6f9tpJeQ==

ModernPhon

f1010 Location:

Variable Format: character

Notes: UNF:6:Tbr7aAFn65Y14GEGtHQ2Dg==

ConservativePhon

f1010 Location:

Variable Format: character

Notes: UNF:6:wIV0mXAKw80XAEU1mnPU7w==

SyllableCount

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 1.0; Max. 10.0; Mean 2.91746036719095; StDev 1.026584050526256

Variable Format: numeric

Notes: UNF:6:c3vRvCdowbFSv2pLKWiS0A==

OrthographyNumNeighbors

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 181.0; Mean 7.429851494454548; StDev 16.387613588969945

Variable Format: numeric

Notes: UNF:6:EvTpynjUIXqduJt4776w3Q==

OrthographyMeanNeighborFreq

f1010 Location:

Summary Statistics: Valid 37574.0; Min. 1.0; Max. 10023.1; Mean 77.09740852190298; StDev 275.44037689710274

Variable Format: numeric

Notes: UNF:6:0B3xlj0lz0oI3VhEojbAQQ==
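
OrthographyMeanNeighborFreq and ConservativeMeanNeighborFreq report fewer valid values (37574 and 36510) than the 63836 cases in the file, so some cells in those columns carry no value. The sketch below shows one way to compute the same kind of summary while skipping such cells. How missing cells are encoded in the file is an assumption (empty string, "NA", or "NaN" here), and the use of the sample standard deviation is also an assumption, since the codebook does not say which variant its StDev figures use.

```python
import statistics

# Assumed encodings of a missing cell; check the actual file before relying on this.
MISSING = {"", "NA", "NaN"}


def parse_optional(cell):
    """Convert a text cell to float, or None if it is treated as missing."""
    cell = cell.strip()
    return None if cell in MISSING else float(cell)


def summarize(cells):
    """Summary statistics over the non-missing cells of one column."""
    cells = list(cells)
    values = [v for v in map(parse_optional, cells) if v is not None]
    return {
        "Valid": len(values),
        "Missing": len(cells) - len(values),
        "Min": min(values),
        "Max": max(values),
        "Mean": statistics.fmean(values),
        "StDev": statistics.stdev(values),  # sample standard deviation
    }


# Invented column values, two of them missing.
summary = summarize(["1.0", "", "3.0", "NA", "5.0"])
```

Keeping missing cells as None rather than 0 matters here: the affected columns have a minimum of 1.0, so a 0 would be indistinguishable from a real measurement only by convention.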

ModernNumNeighbors

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 234.0; Mean 9.36042358543796; StDev 21.16226491838188

Variable Format: numeric

Notes: UNF:6:WT1lELJ1bbjeHDE9gyrk4Q==

ModernMeanNeighborFreq

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 24386.0; Mean 50.685222706630775; StDev 231.20052218599767

Variable Format: numeric

Notes: UNF:6:DCa4VwB9Rsi4ENkfUVPBRA==

ConservativeNumNeighbors

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 173.0; Mean 5.698978632746256; StDev 13.66515360965798

Variable Format: numeric

Notes: UNF:6:WecqlUDo8/v8YfP/+Frf/A==

ConservativeMeanNeighborFreq

f1010 Location:

Summary Statistics: Valid 36510.0; Min. 1.0; Max. 25003.5714285714; Mean 82.38331688681004; StDev 335.1962698824006

Variable Format: numeric

Notes: UNF:6:NXAYTSuWjxYl3qgVNVwy2w==

OrthographyNumNeighborsSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 1972.0; Mean 152.62068425339794; StDev 349.7542614646666

Variable Format: numeric

Notes: UNF:6:kEcqEcKk0/H4mr0v74BsLQ==

OrthographyMeanNeighborFreqSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 4936.0; Mean 51.0810263805391; StDev 117.5061517928684

Variable Format: numeric

Notes: UNF:6:cT4Z9id+crWBuhp+gZdOdg==

ModernNumNeighborsSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 1963.0; Mean 143.23494579860804; StDev 345.500225063067

Variable Format: numeric

Notes: UNF:6:4DGiQVKEtosWPUuFiC3YVQ==

ModernMeanNeighborFreqSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 3321.0; Mean 52.209681697250424; StDev 114.11248254957268

Variable Format: numeric

Notes: UNF:6:QSkWDFczMCkzkotUUxTUTg==

ConservativeNumNeighborsSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 1963.0; Mean 138.3095118741772; StDev 344.09839390287453

Variable Format: numeric

Notes: UNF:6:uW5guQkBl41FRbcEpIz0Hw==

ConservativeMeanNeighborFreqSyll

f1010 Location:

Summary Statistics: Valid 63836.0; Min. 0.0; Max. 3327.3333333333; Mean 51.088684792538665; StDev 114.45215990922387

Variable Format: numeric

Notes: UNF:6:ia6pTWbo/DSCF/N3Fv0PtA==

Other Study-Related Materials

Label:

kspan_doc.pdf

Text:

Documentation for the K-SPAN database

Notes:

application/pdf

Other Study-Related Materials

Label:

merge_kspan.py

Text:

Script to merge K-SPAN with the NIKL corpus (see documentation)

Notes:

text/x-python
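
The general shape of such a merge can be sketched as a left join on a shared key. This is not the merge_kspan.py script, whose logic is described only in the documentation. The choice of OrthographyKey as the join column and the second table's field names ("form", "pos") are hypothetical, and the sample rows are placeholders.

```python
def left_join(kspan_rows, other_rows, kspan_key, other_key):
    """Attach fields from other_rows to each K-SPAN row with a matching key.

    Rows without a match are kept unchanged; if a key occurs more than once
    in other_rows, only the first occurrence is used.
    """
    index = {}
    for row in other_rows:
        index.setdefault(row[other_key], row)

    merged = []
    for row in kspan_rows:
        match = index.get(row[kspan_key], {})
        extra = {k: v for k, v in match.items() if k != other_key}
        merged.append({**row, **extra})
    return merged


# Placeholder rows; "form" and "pos" are hypothetical field names.
kspan = [
    {"WordNum": "1", "OrthographyKey": "a"},
    {"WordNum": "2", "OrthographyKey": "b"},
]
other = [{"form": "a", "pos": "N"}]
merged = left_join(kspan, other, kspan_key="OrthographyKey", other_key="form")
```

A left join keeps every K-SPAN row, which preserves the 63836-case row count regardless of how many entries find a match.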