Variants of compound models and their application to citation analysis

Hdl Handle:
http://hdl.handle.net/2436/620467
Title:
Variants of compound models and their application to citation analysis
Authors:
Low, Wan Jing
Abstract:
This thesis develops two variant statistical models for count data, based upon compound models, for contexts in which the counts may be viewed as derived from two generations, which may or may not be independent. Unlike standard compound models, the variants model the sum of both generations. We consider cases where both generations are negative binomial or one is Poisson and the other is negative binomial. The first variant, denoted SVA, follows a zero restriction, where a zero in the first generation will automatically be followed by a zero in the second generation. The second variant, denoted SVB, is a convolution model that does not possess this zero restriction. The main properties of the SVA and SVB models are outlined and compared with standard compound models. The results show that the SVA distributions are similar to standard compound distributions for some fixed parameters. Comparisons of SVA, Poisson hurdle, negative binomial hurdle and their zero-inflated counterparts using simulated SVA data indicate that different models can give similar results, as the generating models are not always selected as the best fitting. This thesis focuses on the use of the variant models to model citation counts. We show that the SVA models are more suitable for modelling citation data than other previously used models such as the negative binomial model. Moreover, the SVA and SVB models may be used to describe the citation process. This thesis also explores model selection techniques based on log-likelihood methods, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The suitability of the models is also assessed using two diagrammatic methods, randomised quantile residual plots and Christmas tree plots. The Christmas tree plots clearly illustrate whether the observed data are within fluctuation bounds under the fitted model, but the randomised quantile residual plots utilise the cumulative distribution, and hence are insensitive to individual data values. Both plots show the presence of citation counts that are larger than expected under the fitted model in the data sets.
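The difference between the zero restriction and the convolution described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the thesis's own formulation: the first generation is taken to be Poisson, the second generation is simplified to a single negative binomial draw, and the function names and parameter values (`draw_sva`, `draw_svb`, `lam`, `r`, `p`) are hypothetical. Under these assumptions, a total of zero occurs in the restricted variant exactly when the first generation is zero, whereas in the convolution both generations must be zero.

```python
import math
import random


def poisson(rng, lam):
    # Knuth's multiplication method; adequate for the small means used here.
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def neg_binomial(rng, r, p):
    # Gamma-Poisson mixture: failures before the r-th success, success prob p.
    return poisson(rng, rng.gammavariate(r, (1.0 - p) / p))


def draw_sva(rng, lam, r, p):
    # Zero restriction (assumed form): a zero first generation forces a
    # zero second generation; the model value is the sum of both.
    first = poisson(rng, lam)
    second = neg_binomial(rng, r, p) if first > 0 else 0
    return first + second


def draw_svb(rng, lam, r, p):
    # Convolution: both generations are drawn regardless of each other.
    return poisson(rng, lam) + neg_binomial(rng, r, p)


def zero_fraction(draw, n, seed, lam, r, p):
    # Share of simulated totals that equal zero.
    rng = random.Random(seed)
    return sum(draw(rng, lam, r, p) == 0 for _ in range(n)) / n


if __name__ == "__main__":
    params = dict(lam=1.0, r=1.5, p=0.5)  # illustrative values only
    print("SVA-style zero share:", zero_fraction(draw_sva, 20000, 1, **params))
    print("SVB-style zero share:", zero_fraction(draw_svb, 20000, 1, **params))
```

With these illustrative parameters the restricted variant should produce zeros at roughly the Poisson zero probability exp(-lam), while the convolution should produce fewer zeros, roughly exp(-lam) multiplied by p to the power r.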
Issue Date:
2017
URI:
http://hdl.handle.net/2436/620467
Type:
Thesis
Language:
en
Description:
A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.
Appears in Collections:
E-Theses

Full metadata record

DC Field | Value | Language
dc.contributor.author | Low, Wan Jing | en
dc.date.accessioned | 2017-05-10T08:27:04Z | -
dc.date.available | 2017-05-10T08:27:04Z | -
dc.date.issued | 2017 | -
dc.identifier.uri | http://hdl.handle.net/2436/620467 | -
dc.description | A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. | en
dc.description.abstract | This thesis develops two variant statistical models for count data, based upon compound models, for contexts in which the counts may be viewed as derived from two generations, which may or may not be independent. Unlike standard compound models, the variants model the sum of both generations. We consider cases where both generations are negative binomial or one is Poisson and the other is negative binomial. The first variant, denoted SVA, follows a zero restriction, where a zero in the first generation will automatically be followed by a zero in the second generation. The second variant, denoted SVB, is a convolution model that does not possess this zero restriction. The main properties of the SVA and SVB models are outlined and compared with standard compound models. The results show that the SVA distributions are similar to standard compound distributions for some fixed parameters. Comparisons of SVA, Poisson hurdle, negative binomial hurdle and their zero-inflated counterparts using simulated SVA data indicate that different models can give similar results, as the generating models are not always selected as the best fitting. This thesis focuses on the use of the variant models to model citation counts. We show that the SVA models are more suitable for modelling citation data than other previously used models such as the negative binomial model. Moreover, the SVA and SVB models may be used to describe the citation process. This thesis also explores model selection techniques based on log-likelihood methods, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The suitability of the models is also assessed using two diagrammatic methods, randomised quantile residual plots and Christmas tree plots. The Christmas tree plots clearly illustrate whether the observed data are within fluctuation bounds under the fitted model, but the randomised quantile residual plots utilise the cumulative distribution, and hence are insensitive to individual data values. Both plots show the presence of citation counts that are larger than expected under the fitted model in the data sets. | en
dc.language.iso | en | en
dc.subject | AIC | en
dc.subject | BIC | en
dc.subject | compound | en
dc.subject | negative binomial | en
dc.subject | Poisson | en
dc.subject | variant | en
dc.subject | zero-inflated | en
dc.subject | citation analysis | en
dc.subject | model selection | en
dc.subject | randomised quantile residual | en
dc.title | Variants of compound models and their application to citation analysis | en
dc.type | Thesis | en
All Items in WIRE are protected by copyright, with all rights reserved, unless otherwise indicated.