Statistical analysis of near-synonymous words list and catalog in R

Authors

  • Андрей Валерьевич Носов

DOI:

https://doi.org/10.21638/spbu09.2018.310

Abstract

This article presents the results of a regression analysis of the near-synonymous words list and catalog. The purpose of the case study is to identify the most objective variant by modeling the grammatical interactions that affect the use of these words. List and catalog are defined as objective and independent lexical units within a system of distinctions and oppositions. From the probability distribution, we select the two most frequent interactions. A comparison of mean values does not consistently reveal all aspects of the phenomenon under study (i.e., the mean values of the models can be statistically identical). We therefore compare the models with the predictors PRE.MOD and GENITIVE MEAN against the model without interactions to show how they differ at the level of variance. Three statistical hypotheses are thus compared in pairs. The null hypothesis states that the variances of the three models are statistically equal; the alternative states that they differ. The estimate of the model without interactions is the predicted logit of list. The logistic regression coefficients reflect the probability of changes within interactions. At the normalization stage, we apply the Hosmer-Lemeshow test for binary choice models. Based on the results, we decide whether further normalization is necessary. We also check for correlated samples among the predictors using the lrm function, which assesses the reliability of the model and yields confidence intervals for the coefficients. This approach constitutes the novelty of the work and reveals the factors that determine the choice of one concept or the other on the basis of objective semantic criteria. Interactions are considered at four levels: academic, spoken, fiction, and news. The results make it possible to supplement the content of the words list and catalog and to present their dynamics.
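The goodness-of-fit check named in the abstract can be illustrated with a minimal sketch. The study itself works in R (the abstract mentions the lrm function); the Python below is only an illustration and not the author's code. The function name `hosmer_lemeshow`, the equal-size grouping rule, and the toy data are assumptions introduced here. The sketch sorts observations by predicted probability, splits them into g groups, and sums (O - E)^2 / (n * p * (1 - p)) over the groups, where O is the observed count of the target word, E the expected count, n the group size, and p the mean predicted probability in the group.

```python
from typing import Sequence, Tuple


def hosmer_lemeshow(y: Sequence[int], p: Sequence[float],
                    groups: int = 10) -> Tuple[float, int]:
    """Return (statistic, degrees_of_freedom) for binary outcomes y
    and predicted probabilities p (hypothetical helper, not from the paper)."""
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    if groups < 1:
        raise ValueError("groups must be at least 1")
    if any(not 0.0 <= pi <= 1.0 for pi in p):
        raise ValueError("probabilities must lie in [0, 1]")

    # Sort observations by predicted probability, then cut into
    # (nearly) equal-size groups.
    pairs = sorted(zip(p, y))
    n = len(pairs)
    statistic = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # more groups than observations
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)   # O: observed successes
        expected = sum(pi for pi, _ in chunk)   # E: sum of probabilities
        # n * p_bar * (1 - p_bar), written in terms of E and the group size
        denom = expected * (1.0 - expected / size)
        if denom > 0:
            statistic += (observed - expected) ** 2 / denom
        used += 1
    # The statistic is conventionally referred to a chi-square
    # distribution with (number of groups - 2) degrees of freedom.
    return statistic, used - 2


# Toy data (invented): 1 = "list" chosen, 0 = "catalog" chosen.
y_toy = [0, 0, 0, 1, 0, 1, 1, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]
print(hosmer_lemeshow(y_toy, p_toy, groups=4))
```

A small statistic relative to the chi-square reference distribution suggests that the predicted probabilities agree with the observed frequencies, which is the kind of evidence the abstract describes using when deciding whether further normalization is needed.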

Keywords:

computational linguistics, logistic regression, comparative analysis, semantics, synonym, list, catalog


References


Church et al. 1994 — Church K. W., Gale W., Hanks P., Hindle D., Moon R. “Lexical substitutability”. Computational Approaches to the Lexicon. Atkins B. T. S., Zampolli A. (eds.). Oxford: Oxford University Press, 1994, pp. 153–177.

Geeraerts 2010 — Geeraerts D. Theories of Lexical Semantics. Oxford: Oxford University Press, 2010, 341 p.

Gibbs 2006 — Gibbs R. W. “Metaphor Interpretation as Embodied Simulation”. Mind & Language. 21 (3), 2006: 434–458.

Gilquin 2003 — Gilquin G. “Causative ‘Get’ and ‘Have’: So Close, So Different”. Journal of English Linguistics. 31 (2), 2003: 125–148.

Glynn 2010 — Glynn D. “Synonymy, Lexical Fields, and Grammatical Constructions. Developing Usage-based Methodology for Cognitive Semantics”. Cognitive Foundations of Linguistic Usage Patterns. Schmid H.-J., Handl S. (eds.). [Berlin; New York]: De Gruyter Mouton, 2010, pp. 89–118.

Gries 2001 — Gries S. Th. “A Corpus-linguistic Analysis of -ic and -ical Adjectives”. ICAME Journal. 25, 2001: 65–108.

Gries, Otani 2010 — Gries S. Th., Otani N. “Behavioral Profiles: A Corpus-based Perspective on Synonymy and Antonymy”. ICAME Journal. 34, 2010: 121–150.

Hosmer, Lemeshow 1989 — Hosmer D. W., Lemeshow S. Applied Logistic Regression. New York: Wiley, 1989, XIII, 307 p.

Hunston 2002 — Hunston S. Corpora in Applied Linguistics. Cambridge: Cambridge University Press, 2002, 241 p.

Leitner 1993 — Leitner G. “Where to ‘Begin’ or ‘Start’? Aspectual Verbs in Dictionaries”. Data, Description, Discourse: Papers on the English Language in Honour of J. McH Sinclair on His 60th Birthday. Hoey M. (ed.). London: Harper Collins, 1993, pp. 50–63.

Levshina 2015 — Levshina N. How to Do Linguistics with R: Data Exploration and Statistical Analysis. Amsterdam; Philadelphia: John Benjamins, 2015, 443 p.

Levshina et al. 2014 — Levshina N., Geeraerts D., Speelman D. “Dutch Causative Constructions with Doen and Laten: Quantification of Meaning and Meaning of Quantification”. Corpus Methods for Semantics: Quantitative Studies in Polysemy and Synonymy. Glynn D., Robinson J. (eds.). Amsterdam: John Benjamins, 2014, pp. 205–221.

Miller, Charles 1991 — Miller G. A., Charles W. G. “Contextual Correlates of Semantic Similarity”. Language and Cognitive Processes. 6 (1), 1991: 1–28.

Minitab Inc. 2010 — “Minitab Inc.”. Softline Ltd. Educational portal. 2010. URL: http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/regression-and-correlation/regression-models/what-are-response-and-predictor-variables/ (accessed date: 29.05.2017).

Nosov 2016 — Nosov A. V. “Lingvisticheskie parametry kontseptov «list» i «catalog»: Variant obrabotki iazyka dlia komp’iuternykh sistem [Linguistic Parameters of the Concepts “LIST” and “CATALOG”: Language Processing Version for Computer Systems]”. Vestnik Permskogo un-ta: Rossiiskaia i zarubezhnaia filologiia [Bulletin of Perm University: Russian and Foreign Philology]. 4 (36), 2016: 75–82. (In Russian)

Phoocharoensil 2010 — Phoocharoensil S. “A Corpus-Based Study of English Synonyms”. International Journal of Arts and Sciences. 3 (10), 2010: 227–245.

Shah, Barnwell 2003 — Shah B. V., Barnwell B. G. “Hosmer-Lemeshow Goodness of Fit Test for Survey Data Research”. 2003 ASA Proceedings: Papers Presented at the Annual Meeting of the American Statistical Association: Joint Statistical Meetings, San Francisco, California, August 3–7, 2003, and Other ASA-sponsored Conferences. S. l.: American Statistical Association, 2003, pp. 3778–3781.

Speelman 2014 — Speelman D. “Logistic Regression: A Confirmatory Technique for Comparisons in Corpus Linguistics”. Corpus Methods for Semantics: Quantitative Studies in Polysemy and Synonymy. Glynn D., Robinson J. (eds.). Amsterdam: John Benjamins, 2014, pp. 487–533.

Published

2018-12-19

How to Cite

Носов, А. В. (2018). Statistical analysis of near-synonymous words list and catalog in R. Vestnik of Saint Petersburg University. Language and Literature, 15(3), 453–464. https://doi.org/10.21638/spbu09.2018.310

Section

Linguistics