This dataset contains replication data, code and documentation for two experiments – one translation task and one judgment task with Likert-scale items – which together form the empirical basis for the article “On the Proper Treatment of Particles: The Deictic Particles os’ and ot in Ukrainian” (to appear in Russian Linguistics). The translation task investigates how the Russian deictic particle vot is rendered in Ukrainian, while the judgment task investigates how Ukrainian deictic particles (os’, ot and a dummy particle) are evaluated in different contextual configurations. The dataset includes data files, variable documentation and methodological information necessary to understand, reuse and replicate the analyses reported in the article. (2026-06-23)
Abstract of paper:
The paper investigates the competition between the Ukrainian deictic discourse particles os’ and ot, which are often treated as near-synonymous counterparts of Russian vot ‘(look,) here’. The central hypothesis is that anchoring to the actual utterance time–world favors os’, while distance from the deictic ‘here-and-now’, including future temporal shift and modal displacement, favors ot. Methodologically, the study combines parallel-corpus evidence and a controlled translation task with a Ukrainian-only acceptability-judgment task and corpus-based distributional profiling.
Russian–Ukrainian parallel data from the Russian National Corpus (RNC) show that both os’ and ot frequently correspond to Russian vot, but translation data also reveal strong translationese effects that blur internal distinctions within the Ukrainian system. To obtain Ukrainian-internal evidence, ratings from a 4-point Likert-scale judgment study with a 2 × 2 design (±Future, ±Modality) were analyzed using a Bayesian ordinal mixed-effects model. The model shows that os’ is strongly preferred to ot in non-modal, non-future “here-and-now” contexts, that modality gives ot a large acceptability boost, and that future temporal shift favors ot more weakly and with greater uncertainty.
A balanced co-occurrence profile for Ukrainian originals in the RNC and preliminary large-scale evidence from the GRAC corpus corroborate this division of labor: ot is systematically overrepresented with negation and modal markers, whereas os’ is tightly linked to immediate spatial anchoring. Purely temporal markers do not robustly distinguish the two particles. The study thus refines the “subjective-modal” semantics of os’ and ot and highlights the usefulness of combining Bayesian mixed-effects modeling of acceptability judgments with transparent corpus-based profiling in the analysis of discourse particles.
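For readers unfamiliar with ordinal models, the following minimal sketch shows how a model of this general kind can map the particle choice and the 2 × 2 design (±Future, ±Modality) onto probabilities for the four rating categories. It is not the authors' analysis code. The cumulative-logit link is an assumption (the abstract does not name the link function), all coefficient and cutpoint values are hypothetical placeholders chosen only to mimic the qualitative pattern described above, and the participant and item random effects, priors and posterior sampling of the actual Bayesian model are omitted.

```python
import math

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(eta, cutpoints):
    """Return P(Y = k) for k = 1..K under a cumulative-logit model,
    where P(Y <= k) = logistic(c_k - eta) for ordered cutpoints c_k."""
    cdf = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for p in cdf:
        probs.append(p - prev)
        prev = p
    return probs

# HYPOTHETICAL values for illustration only; they are NOT estimates from the article.
CUTPOINTS = [-1.0, 0.0, 1.0]   # three thresholds for a 4-point scale
BETA = {
    "ot": -1.5,          # ot vs. os' in the non-modal, non-future baseline
    "ot_x_modal": 1.5,   # boost for ot in modal contexts
    "ot_x_future": 0.5,  # weaker shift for ot in future contexts
}

def linear_predictor(is_ot, future, modal):
    """Fixed-effects part only; random effects are left out of this sketch."""
    return (BETA["ot"] * is_ot
            + BETA["ot_x_modal"] * is_ot * modal
            + BETA["ot_x_future"] * is_ot * future)

def expected_rating(is_ot, future, modal):
    """Mean rating on the 1..4 scale implied by the category probabilities."""
    probs = category_probs(linear_predictor(is_ot, future, modal), CUTPOINTS)
    return sum(k * p for k, p in enumerate(probs, start=1))

if __name__ == "__main__":
    for is_ot in (0, 1):
        for future in (0, 1):
            for modal in (0, 1):
                label = "ot " if is_ot else "os'"
                probs = category_probs(linear_predictor(is_ot, future, modal), CUTPOINTS)
                print(label, f"future={future} modal={modal}",
                      [round(p, 3) for p in probs])
```

With placeholder values like these, the implied ratings for ot rise from the baseline to the future condition and rise further in the modal condition, which is the direction of the effects summarized in the abstract. The replication code included in the dataset remains the authoritative description of the fitted model.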