Statistical analysis of near-synonymous words list and catalog in R
DOI:
https://doi.org/10.21638/spbu09.2018.310
Abstract
This article presents the results of a regression analysis of the near-synonymous words list and catalog. The purpose of the case study is to identify the most objective variant by modeling the grammatical interactions that influence the actualization of the words under consideration. List and catalog are defined as objective, independent lexical units within a system of distinctions and oppositions. Using the probability distribution, we identify the two most frequent interactions. Comparing mean values does not always reveal every aspect of the phenomenon under study (i.e., the mean values of the models can be statistically identical). We therefore compare the models with the predictors PRE.MOD and GENITIVE MEAN against the model without interactions to show how they differ at the level of dispersion. Hence, three statistical hypotheses are compared pairwise. The null hypothesis states that the dispersions of the three models are statistically equal; the alternative states that they differ. The model without interactions estimates the predicted logit of list. The logistic regression coefficients reflect the probability of changes within the interactions. At the normalization stage, we apply the Hosmer-Lemeshow test for binary choice models. Based on the results, we decide whether further normalization is necessary. We also determine whether correlated samples are present among the predictors using the lrm function, which assesses the reliability of the model and yields confidence intervals for the coefficients. This approach constitutes the novelty of the work and reveals the factors that determine the choice of one concept over the other on the basis of objective semantic criteria. Interactions are considered at four levels: academic, spoken, fiction and news. The results make it possible to give a fuller account of the content of the words list and catalog and to present their dynamics.
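The Hosmer-Lemeshow step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the article works in R, while the example below is plain Python written only to show how the statistic is formed. The function name `hosmer_lemeshow`, the default of ten groups, and the idea of passing already-fitted probabilities are all assumptions made for illustration.

```python
from math import fsum


def hosmer_lemeshow(y_true, p_hat, groups=10):
    """Return (statistic, degrees_of_freedom) for a Hosmer-Lemeshow check.

    y_true : observed 0/1 outcomes (e.g. 1 = "list", 0 = "catalog")
    p_hat  : fitted probabilities from some logistic model (assumed given)
    groups : number of equal-sized risk groups (10 is a common convention)
    """
    if len(y_true) != len(p_hat):
        raise ValueError("y_true and p_hat must have the same length")

    # Order observations by fitted probability, then cut into risk groups.
    pairs = sorted(zip(p_hat, y_true))
    n = len(pairs)

    statistic = 0.0
    used_groups = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = fsum(p for p, _ in chunk)   # expected number of 1s
        observed = sum(y for _, y in chunk)    # observed number of 1s
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom == 0.0:
            continue  # degenerate group (all fitted values 0 or 1)
        statistic += (observed - expected) ** 2 / denom
        used_groups += 1

    # The statistic is conventionally referred to a chi-square
    # distribution with (number of groups - 2) degrees of freedom.
    return statistic, max(used_groups - 2, 0)
```

A large statistic relative to the chi-square reference distribution suggests that the fitted probabilities do not match the observed frequencies, which is the kind of evidence the abstract uses to decide whether further normalization is needed.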
Keywords:
computational linguistics, logistic regression, comparative analysis, semantics, synonym, list, catalog
License
Articles in "Vestnik of Saint Petersburg University. Language and Literature" are open access and are distributed under the terms of the License Agreement with Saint Petersburg State University, which permits authors unrestricted distribution and self-archiving free of charge.