Tab-Delimited - 221.2 KB - MD5: b6a77c3f6cf5451c53fcf4473e0fc9a6
This is the list of TOROT lemmata (used for lemma guessing, see 3.1 and 4.2)
Comma Separated Values - 442.2 KB - MD5: a135b411b70cc2b0177b6c4e224b0218
This is the output of the RNC tagger (to get access to the tagger itself, contact the third author)
XML - 348.9 KB - MD5: 42d6811c314b60d52ca4594aaefdcd59
This is the output of the TOROT tagger (to get access to the tagger itself, contact the second author)
XML - 387.9 KB - MD5: 878b959e625550c33f452fb2a47a6da3
This is the gold standard (see 3.2)
Unknown - 23.2 KB - MD5: 8c3ee64ea7fe6fc9c6086c68e380ae4f
This is the comparison script (use Ruby 1.9.0 or higher to launch it. Make sure files 1, 6, 7, 8 and 9 are in the same directory. Warning messages about duplicated keys can most likely be ignored, otherwise make sure all Unicode symbols are being read correctly. The script will g...
Tab-Delimited - 163.1 KB - MD5: 381d0df6e33962bc76e7245c55260a3d
This is the same output as in file 11, in a slightly different form (intended to facilitate manual comparisons). It can be generated by file 10.
Comma Separated Values - 583.6 KB - MD5: 569b847e697879571314d4e3dad67bc7
This is the morphological tagging output of the taggers, aligned with each other and with the gold standard. It can be generated by file 10 and is meant to be used as input for file 14.
Unknown - 6.7 KB - MD5: 32d111fce088bc33663f88325641350f
This is the script that performs comparison of morphological tagging (all other comparisons are made by File 10). Use Ruby 1.9.0 or higher to launch it. Make sure files 8, 9 and 13 are in the same directory. The script will generate file 16. Contact the second author if you have...
Tab-Delimited - 46.8 KB - MD5: 2d35413a615dba85139fb2c031bfd75b
This is the detailed information about whether each guess of lemma and POS by both taggers is correct or not. It can be generated by file 10.
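
Since every file in the listing carries an MD5 checksum, a download can be checked before the scripts are run. The following is a minimal sketch in Ruby, not part of the dataset. The local file names are hypothetical (the listing does not show them), and the pairing of each checksum with the description that follows it is an assumption.

```ruby
require "digest"

# Return true when the MD5 digest of the file at `path` equals `expected_md5`.
def md5_matches?(path, expected_md5)
  Digest::MD5.file(path).hexdigest == expected_md5.strip.downcase
end

# Check several files at once.
# Returns the paths that are missing or whose checksum does not match.
def failed_checksums(expected_by_path)
  failed = expected_by_path.reject do |path, md5|
    File.file?(path) && md5_matches?(path, md5)
  end
  failed.keys
end

if __FILE__ == $PROGRAM_NAME
  # Hypothetical local file names; the checksums are copied from the listing,
  # assuming each checksum line belongs to the description that follows it.
  expected = {
    "torot_lemmata.tab" => "b6a77c3f6cf5451c53fcf4473e0fc9a6",
    "gold_standard.xml" => "878b959e625550c33f452fb2a47a6da3"
  }
  bad = failed_checksums(expected)
  puts(bad.empty? ? "All checksums match." : "Missing or mismatched: #{bad.join(', ')}")
end
```

Because missing files are reported together with mismatched ones, the same check also covers the requirement that the input files sit in the same directory as the script being launched.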