Distributive-quantitative and semantic characteristics of verbs of knowledge in Old Church Slavonic and Old Russian written sources

Oleg F. Zholobov; Victor A. Baranov

doi:10.21638/spbu09.2021.104

Authors

Oleg F. Zholobov Kazan Federal University, 18, Kremlevskaia ul., Kazan, 420008, Russia https://orcid.org/0000-0002-7178-1890
Victor A. Baranov Izhevsk State Technical University, 7, Studencheskaia ul., Izhevsk, 426069, Russia

Abstract

The article presents the first attempt at a distributive and quantitative analysis of a lexico-semantic series in Old Russian, based on three multi-genre subcorpora of the historical corpus “Manuscript” (manuscripts.ru): manuscript copies of the Gospels, menaia, and chronicles. The authors correlate the results of software processing of the diachronic corpus data with the historical and linguistic status of those data. The study examines the semantic relations between the Old Russian verbs with the general meaning ‘know’: věděti, vědati, znati. Their replacement in modern standard Russian by the single verb znat’ raises the question of this lexical group’s evolutionary dynamics. The authors establish that the entire series belongs to the original lexical system, although the verb vědati is not attested in Old Church Slavonic manuscripts and became widespread in Old Russian sources, both colloquial and literary. The analysis shows that in Old Russian written sources vědati acts as a substitute for the athematic verb věděti. Quantitative bigram indicators made it possible to establish contrasting collocations for věděti and vědati, on the one hand, and for znati, on the other. The collocates of znati included neither abstract nouns nor collocates that introduce propositions. The authors identified distributional areas in which the verbs of the series are synonymous; these areas reflect the onset of diachronic competition within the chain of lexemes. To assess the association between bigram components (the analyzed verbs and their collocates), the authors used the T-score statistical measure, and they used Pearson's correlation coefficient r to determine the degree of mutual correspondence between the distribution series.
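The two statistics named in the abstract can be illustrated with a minimal sketch. It assumes the common corpus-linguistic definition of the T-score, (observed - expected) / sqrt(observed), with the expected bigram frequency taken as f(x) * f(y) / N; the abstract does not say which variant the authors used. The function names `t_score` and `pearson_r` and all counts are hypothetical and are not taken from the corpus “Manuscript”.

```python
import math


def t_score(f_xy, f_x, f_y, n):
    """T-score of a bigram (x, y) under the assumed definition.

    f_xy: observed frequency of the bigram
    f_x, f_y: frequencies of the verb and of its collocate
    n: corpus size in tokens
    """
    if f_xy == 0:
        return 0.0  # an unattested bigram carries no evidence of association
    expected = f_x * f_y / n  # frequency expected if x and y were independent
    return (f_xy - expected) / math.sqrt(f_xy)


def pearson_r(xs, ys):
    """Pearson correlation coefficient r between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts: a verb occurring 100 times, a collocate occurring
# 200 times, and 30 co-occurrences in a subcorpus of 10,000 tokens.
score = t_score(f_xy=30, f_x=100, f_y=200, n=10_000)

# Hypothetical T-score series of two verbs over the same four collocates.
r = pearson_r([5.1, 3.2, 0.4, 2.8], [4.7, 3.0, 0.9, 2.5])
print(round(score, 3), round(r, 3))
```

Under these assumptions, a higher T-score indicates a collocate that co-occurs with the verb more often than chance would predict, and an r close to 1 between two verbs' series over the same collocates indicates similar collocational behaviour.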

Keywords:

Old Church Slavonic and Old Russian written sources, verbs from the semantic cluster ‘know’, n-grams, distributive and quantitative analysis
