Universality of citation distributions: towards an objective measure of scientific impact
We study the distributions of citations received by a single publication
within several disciplines, spanning broad areas of science. We show that the
probability that an article is cited $c$ times has large variations between
different disciplines, but all distributions are rescaled on a universal curve
when the relative indicator $c_f = c/c_0$ is considered, where $c_0$ is the
average number of citations per article for the discipline. In addition we show
that the same universal behavior occurs when citation distributions of articles
published in the same field, but in different years, are compared. These
findings provide a strong validation of $c_f$ as an unbiased indicator for
citation performance across disciplines and years. Based on this indicator, we
introduce a generalization of the h-index suitable for comparing scientists
working in different fields.
Comment: 7 pages, 5 figures. Accepted for publication in Proc. Natl Acad. Sci.
US
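As a rough illustration of the rescaling described in this abstract, the sketch below divides each raw citation count by the average for its field. The function name and the citation counts are hypothetical and chosen only so that the arithmetic is easy to follow; they are not data from the paper.

```python
from statistics import mean


def relative_indicator(citations):
    """Rescale raw citation counts by the field average (count / field mean)."""
    field_average = mean(citations)
    if field_average == 0:
        raise ValueError("field average is zero; the indicator is undefined")
    return [count / field_average for count in citations]


# Hypothetical citation counts for two fields with very different averages.
field_a = [0, 2, 4, 10, 24]      # field average: 8
field_b = [0, 10, 20, 50, 120]   # field average: 40

rescaled_a = relative_indicator(field_a)
rescaled_b = relative_indicator(field_b)
```

In this toy case the two fields differ by a factor of five in raw counts, yet the rescaled values coincide, which is the kind of collapse onto a common curve that the abstract reports for real citation distributions.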
Predicting the long-term citation impact of recent publications
A fundamental problem in citation analysis is the prediction of the long-term
citation impact of recent publications. We propose a model to predict a
probability distribution for the future number of citations of a publication.
Two predictors are used: The impact factor of the journal in which a
publication has appeared and the number of citations a publication has received
one year after its appearance. The proposed model is based on quantile
regression. We employ the model to predict the future number of citations of a
large set of publications in the field of physics. Our analysis shows that both
predictors (i.e., impact factor and early citations) contribute to the accurate
prediction of long-term citation impact. We also analytically study the
behavior of the quantile regression coefficients for high quantiles of the
distribution of citations. This is done by linking the quantile regression
approach to a quantile estimation technique from extreme value theory. Our work
provides insight into the influence of the impact factor and early citations on
the long-term citation impact of a publication, and it takes a step toward a
methodology that can be used to assess research institutions based on their
most recently published work.
Comment: 17 pages, 17 figure
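Quantile regression, as used in this abstract, fits a chosen quantile by minimising the asymmetric "pinball" (check) loss. The sketch below is a deliberately reduced case, assuming an intercept-only model with no predictors; the paper's model uses the journal impact factor and early citations as predictors, which are not reproduced here. The function names and citation counts are hypothetical.

```python
def pinball_loss(observed, predicted, tau):
    """Average pinball (check) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(observed, predicted):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(observed)


def constant_quantile_fit(observed, tau):
    """Intercept-only quantile regression: pick the constant with the lowest loss.

    The minimum of the pinball loss over constants is attained at one of the
    observed values, so it is enough to search over those.
    """
    candidates = sorted(set(observed))
    return min(
        candidates,
        key=lambda q: pinball_loss(observed, [q] * len(observed), tau),
    )


# Hypothetical long-term citation counts for nine publications.
citations = [0, 1, 1, 2, 3, 5, 8, 13, 40]

median_fit = constant_quantile_fit(citations, 0.5)
upper_fit = constant_quantile_fit(citations, 0.9)
```

With these counts the fit at tau = 0.5 is 3 (the sample median) and the fit at tau = 0.9 is 40, which shows how high quantiles are pulled toward the heavy tail of a citation distribution.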
Citation models and research evaluation
Citations in science are being studied from several perspectives. On the one
hand, there are approaches such as scientometrics and the science of science,
which take a more quantitative perspective. In this chapter I briefly review
some of the literature on citations, citation distributions and models of
citations. Citations also feature prominently in another part of the
literature, which deals with research evaluation and the role of metrics
and indicators in that process. Here I briefly review part of the discussion in
research evaluation. This also touches on the subject of how citations relate
to peer review. Finally, I try to integrate the two literatures with the aim of
clarifying what I believe the two can learn from each other. The fundamental
problem in research evaluation is that research quality is unobservable. This
has consequences for conclusions that we can draw from quantitative studies of
citations and citation models. The term "indicators" is a relevant concept in
this context, which I try to clarify. Causality is important for properly
understanding indicators, especially when indicators are used in practice: when
we act on indicators, we enter causal territory. Even when an indicator might
have been valid, through its very use, the consequences of its use may
invalidate it. By combining citation models with proper causal reasoning and
acknowledging the fundamental problem about unobservable research quality, we
may hope to make progress.
Comment: This is a draft. The final version will be available in Handbook of
Computational Social Science edited by Taha Yasseri, forthcoming 2023, Edward
Elgar Publishing Lt
The utilization of paper-level classification system on the evaluation of journal impact
CAS Journal Ranking, a ranking system of journals based on the bibliometric
indicator of citation impact, has been widely used in meso and macro-scale
research evaluation in China since its first release in 2004. The ranking
covers the journals contained in Clarivate's Journal Citation Reports (JCR).
This paper introduces the upgraded 2019 version of the CAS Journal Ranking.
To address limitations of the indicator and classification system used in
earlier editions, as well as the problem of journals' interdisciplinarity or
multidisciplinarity, we discuss three improvements in the 2019 version: (1) the
CWTS paper-level classification system, a more fine-grained system, has been
adopted; (2) a new indicator, the Field Normalized Citation Success Index
(FNCSI), which is robust not only against extremely highly cited publications
but also against wrongly assigned document types, has been used; and (3) the
indicator is calculated at the paper level. In addition, this paper presents a
small part of ranking results and an interpretation of the robustness of the
new FNCSI indicator. By exploring more sophisticated methods and indicators,
like the CWTS paper-level classification system and the new FNCSI indicator,
CAS Journal Ranking will continue its original purpose for responsible research
evaluation
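The abstract does not give the formula for FNCSI, so the following is only a sketch of the general idea behind a citation success index, under the assumption that it is the probability that a randomly chosen paper from one journal is cited more than a randomly chosen paper from a comparison set, with ties counted as half. The field normalisation step (restricting comparisons to papers of the same topic and document type) is omitted, and all names and counts are hypothetical.

```python
def citation_success_index(target, reference):
    """Probability that a random target paper outcites a random reference paper.

    Ties count as half a win (an assumption made for this sketch).
    """
    if not target or not reference:
        raise ValueError("both citation lists must be non-empty")
    wins = 0.0
    for a in target:
        for b in reference:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(target) * len(reference))


# Hypothetical citation counts.
other_journals = [1, 4, 6, 9]
journal = [3, 5, 8]
journal_with_outlier = [3, 5, 1000]  # same journal, one extremely cited paper

base_index = citation_success_index(journal, other_journals)
outlier_index = citation_success_index(journal_with_outlier, other_journals)
```

Because the index depends only on pairwise comparisons, replacing one paper with an extreme outlier moves it from 6/12 to 7/12, whereas the journal's mean citation count would jump from about 5 to 336. This is one way to read the robustness claim in the abstract.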
High‐Citation Papers in Space Physics: Examination of Gender, Country, and Paper Characteristics
The number of citations to a refereed journal article from other refereed journal articles is a measure of its impact. Papers, individuals, journals, departments, and institutions are increasingly judged by the impact they have in their disciplines, and citation counts are now a relatively easy (though not necessarily accurate or straightforward) way of attempting to quantify impact. This study examines papers published in the Journal of Geophysical Research—Space Physics in the year 2012 (n = 705) and analyzes the characteristics of high‐citation papers compared to low‐citation papers. We find that high‐citation papers generally have a large number of authors (>5) and cite significantly more articles in the reference section than low‐citation papers. We also examined the gender and country of institution of the first author and found that there is not a statistically significant gender bias, but there are some significant differences in citation statistics between articles based on the country of first‐author institution.
Plain Language Summary
The number of citations to a refereed journal article from other refereed journal articles is a measure of its impact. Papers, individuals, journals, departments, and institutions are increasingly judged by the impact they have in their disciplines, and citation counts are now a relatively easy (though not necessarily accurate) way of attempting to quantify impact. This study examines papers published in the Journal of Geophysical Research—Space Physics and analyzes the characteristics of high‐citation papers compared to low‐citation papers. We find that high‐citation papers generally have a large number of authors (>5) and cite significantly more articles in the reference section than low‐citation papers. We also found that there is not a statistically significant gender bias in terms of citation counts, but there are some significant differences in citation statistics between articles based on the country of first‐author institution.
Key Points
Large collaborative and international teams that cite the literature extensively write high‐citation papers
No gender bias is found in terms of citation rates between female and male first‐author papers, and they submit first‐author papers proportionally to their representation in the discipline
A statistically significant small difference in citations is found for papers from U.S. institutions compared to the rest of the world
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/144280/1/jgra54185_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/144280/2/jgra54185.pd
Hot Streaks in Artistic, Cultural, and Scientific Careers
The hot streak, loosely defined as winning begets more winnings, highlights a
specific period during which an individual's performance is substantially
higher than her typical performance. While widely debated in sports, gambling,
and financial markets over the past several decades, little is known about
whether hot streaks apply to individual careers. Here, building on the rich
literature on the lifecycle of creativity, we collected large-scale career
histories of
individual artists, movie directors and scientists, tracing the artworks,
movies, and scientific publications they produced. We find that, across all
three domains, hit works within a career show a high degree of temporal
regularity, each career being characterized by bursts of high-impact works
occurring in sequence. We demonstrate that these observations can be explained
by a simple hot-streak model we developed, allowing us to probe quantitatively
the hot streak phenomenon governing individual careers, which we find to be
remarkably universal across diverse domains we analyzed: The hot streaks are
ubiquitous yet unique across different careers. While the vast majority of
individuals have at least one hot streak, hot streaks are most likely to occur
only once. The hot streak emerges randomly within an individual's sequence of
works, is temporally localized, and is unassociated with any detectable change
in productivity. We show that, since works produced during hot streaks garner
significantly more impact, the uncovered hot streaks fundamentally drive the
collective impact of an individual, and that ignoring them leads us to
systematically over- or under-estimate the future impact of a career. These
results not only
deepen our quantitative understanding of patterns governing individual
ingenuity and success, they may also have implications for decisions and
policies involving predicting and nurturing individuals with lasting impact
Determining predictors of mortality in HIV positive people in South Africa, 2003 to 2009: a mixed methods approach incorporating unobserved variables
A thesis submitted to the School of Public Health, Faculty of Health Sciences, University of
the Witwatersrand, in fulfilment of the requirements for the degree of Doctor of Philosophy.
02 April 2018.
Background
The largest proportion of HIV-infected people resides in Southern Africa. In South
Africa, the government has taken the lead in the provision of free HIV treatment with
a high coverage rate. Provision of free antiretroviral treatment has led to a decline in
mortality rates and an increase in life expectancy. However, a significant number of
people with HIV continue to die despite the availability of free treatment. A large
proportion of studies have concentrated on using quantitative methods of analysis.
Very few have used mixed methods that combine quantitative time-to-event frailty
models and qualitative methods in assessing risk factors for mortality in HIV-infected
individuals. However, use of such a mixed methods approach could provide insights
that may lead to an improvement in patient care and management.
Aim
To determine mortality risk factors in HIV-infected people through incorporating
unobserved variables using a mixed methods approach in which quantitative findings
are explained by the qualitative.
Methods
To critically review statistical methods used for assessing risk factors for mortality in
HIV-infected people between the years 2002 and 2011. We conducted a literature
review on the design of studies, how data were analysed and whether suitable
statistical methods were utilised in assessing mortality risk factors in HIV-infected
people in the period 2002-2011. Only publications written in English and listed in
Pubmed/Medline were considered. In this review, papers using time-to-event
techniques were regarded as appropriate. Data were split into two equal periods
allowing for the comparison of the statistical methods over time.
To compare the different time-to-event methods, we ran 1,000 simulations of
parametric clustered data using parameters derived from an HIV study that was
conducted in South Africa by the Perinatal HIV Research Unit (PHRU). Data for 5, 10
and 20 clusters of size 50 and 100 were simulated. Survival and censoring times
were derived from a Weibull distribution. The minimum of survival and censoring
times was taken as the study time. Using the simulated data, we compared the
following time-to-event methods: Cox proportional hazards regression, shared
Gamma frailty with Weibull and exponential baseline hazards (frequentist models),
and the Bayesian integrated nested Laplace approximation (INLA) with Weibull
baseline hazard. Parameter estimates, standard errors and their fit statistics were
averaged over 1,000 simulations. Similarly, means and standard deviations from
INLA were averaged (over the 1,000 simulations). Frequentist models were
compared using the -2 log-likelihood fit statistic, while all four models were
compared using the mean square error (MSE). Additionally, we simulated semiparametric
clustered frailty models (using gamma and log-normal frailties) including
INLA, h-likelihood, penalised likelihood and penalised partial likelihood estimations.
Parameter estimates and their standard errors were presented graphically and
compared using the MSE.
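The data-generating step of the simulation described above can be sketched as follows: each cluster draws a shared gamma frailty, survival and censoring times come from Weibull distributions, and the study time is the minimum of the two. All parameter values below (shape, rate, frailty variance, censoring scale) are hypothetical placeholders, not the values derived from the PHRU study, and the function name is illustrative only.

```python
import random


def simulate_clustered_survival(n_clusters, cluster_size, shape=1.5, rate=0.1,
                                frailty_var=0.5, censor_scale=8.0, seed=0):
    """Simulate clustered time-to-event data with a shared gamma frailty.

    All default parameter values are hypothetical.
    """
    rng = random.Random(seed)
    records = []
    for cluster in range(n_clusters):
        # Gamma frailty with mean 1 and variance frailty_var
        # (gammavariate takes shape, then scale).
        frailty = rng.gammavariate(1.0 / frailty_var, frailty_var)
        for _ in range(cluster_size):
            # Weibull baseline hazard with frailty z:
            # S(t | z) = exp(-z * rate * t**shape), inverted using E ~ Exp(1).
            survival = (rng.expovariate(1.0) / (frailty * rate)) ** (1.0 / shape)
            # Censoring time from a Weibull distribution
            # (weibullvariate takes scale, then shape).
            censoring = rng.weibullvariate(censor_scale, shape)
            records.append({
                "cluster": cluster,
                "time": min(survival, censoring),  # observed study time
                "event": survival <= censoring,    # True if death was observed
            })
    return records


# One simulated data set with 5 clusters of size 50, as in the smallest design.
data = simulate_clustered_survival(n_clusters=5, cluster_size=50, seed=1)
```

Repeating this call with different seeds would give the 1,000 replicate data sets over which estimates are averaged. The model-fitting step (Cox regression, frailty models, INLA) is not shown.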
To assess mortality risk factors in HIV-infected people in South Africa in different
settings, factors associated with mortality in HIV-infected people were assessed by
INLA survival frailty model using cohort data of HIV-infected people from South
Africa. Two thirds were from Soweto (urban) and the rest from Mpumalanga (rural).
Findings were evaluated by site.
Mixed methods were used to evaluate risk factors for mortality by combining the best
fitting model applied to retrospective data and qualitative analysis on prospective
data. In order to explain the unobserved frailty modelling results, we conducted a
qualitative study that enrolled 20 participants who had confirmed knowing a person
that had died as a result of HIV. Participants were recruited from the Zazi VCT in
PHRU and were interviewed using a semi-structured interview guide. The aim of the
qualitative study was to attempt to explain the unobserved factors influencing
mortality in HIV-infected individuals using perceived reasons for death given by the
participants. These were later used to complement the potential reasons for death as
identified in the frailty modelling (quantitative) results.
Results
In the critical review, 189 studies met the inclusion criteria that included prospective
(69%) and retrospective (30%) studies. Of the 189 studies, 91 were published in the
period 2002-2006 and 98 in 2007-2011. Cox regression analysis with frailty was
used in only 7 studies (~4%), 6 of which were published in the period 2007-2011.
The simulation study showed that the shared frailty models performed better than
Cox-PH. Within the shared frailty models, the Gamma frailty model with a Weibull
baseline performed better than the Gamma frailty model with an exponential
baseline. The MSE showed that in general, the Bayesian INLA had better results. In
the semiparametric simulations, results were similar but INLA had a slightly better fit
with consistently lower MSE values relative to both gamma and log-normal frailty
models. The random effects estimate for INLA, whose method is slightly different,
had lower MSE values consistently relative to the other methods.
In the HIV cohort study, 6,690 participants were enrolled, the majority being female
(78%) and most residing in an urban area (67%). Rural participants were
older (36 years; IQR: 31-44) and had a higher mortality rate (11/100 person years).
Among those residing in rural areas, HAART treatment for between six and twelve
months (HR: 0.2, 95% CI: 0.1-0.4) and for more than 12 months (HR: 0.1, 95% CI: 0.1-0.2)
was protective relative to not being on treatment. Being on HAART treatment for
more than twelve months was protective in the urban participants (HR: 0.35,
95% CI: 0.27-0.46). Heterogeneity, assessed by the frailty variance, was significant;
it was high in rural participants and lower in urban participants.
Since the frailty modelling results suggested that the unobserved variables had a
significant effect on mortality in HIV-infected individuals, a qualitative study was
conducted to explore the potential causes of death. In the qualitative study,
participants perceived that mortality in HIV-infected individuals may have been
influenced by engagement in risky sexual behaviour such as multiple sexual
partnerships, negative attitude by healthcare workers towards HIV-infected people,
believing in the healing power of religion, traditional medicine, food security and
social support structure.
Conclusions
The study found that Cox proportional hazards regression with frailty is not as
commonly used in research on mortality in HIV-infected individuals as it is in
other fields of health research. Additionally, use of the more complex semiparametric
frailty models was even lower in this population. From simulations, we found that
frailty survival models provided a better fit in modelling mortality due to their ability to
account for unobserved variables especially the Bayesian INLA. As the unobserved
variables are complex to explain using only quantitative modelling techniques,
qualitative analysis of perceived causes of death was explored. Unobserved
variables affecting mortality were explored through qualitative analysis of perceived
reasons provided by bereaved participants. This mixed methods approach optimised
the use of data by following a quantitative approach with a qualitative one, so that
the two complemented each other. Use of optimal methods in assessing morbidity and
mortality in HIV-infected patients may improve patient care and management in
South Africa and other countries.
Key words: HIV, Mortality, Rural, Urban, unmeasured variables, HAART, Frailty
Theories of Informetrics and Scholarly Communication
Scientometrics has become an essential element in the practice and evaluation of science and research, including both the evaluation of individuals and national assessment exercises. Yet, researchers and practitioners in this field have lacked clear theories to guide their work. As early as 1981, then-doctoral student Blaise Cronin published "The need for a theory of citing", a call to arms for the fledgling scientometric community to produce foundational theories upon which the work of the field could be based. More than three decades later, the time has come to reach out to the field again and ask how it has responded to this call.
This book compiles the foundational theories that guide informetrics and scholarly communication research. It is a much-needed compilation by leading scholars in the field that gathers together the theories that guide our understanding of authorship, citing, and impact.