790 research outputs found

    Universality of citation distributions: towards an objective measure of scientific impact

    We study the distributions of citations received by a single publication within several disciplines, spanning broad areas of science. We show that the probability that an article is cited $c$ times has large variations between different disciplines, but all distributions are rescaled on a universal curve when the relative indicator $c_f = c/c_0$ is considered, where $c_0$ is the average number of citations per article for the discipline. In addition we show that the same universal behavior occurs when citation distributions of articles published in the same field, but in different years, are compared. These findings provide a strong validation of $c_f$ as an unbiased indicator for citation performance across disciplines and years. Based on this indicator, we introduce a generalization of the h-index suitable for comparing scientists working in different fields. Comment: 7 pages, 5 figures. accepted for publication in Proc. Natl Acad. Sci. US
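The relative indicator described in this abstract divides each article's citation count by the average for its discipline. A minimal Python sketch of that rescaling follows; the function name `relative_citations` and the `(discipline, citations)` input layout are illustrative assumptions, not taken from the paper, and the generalized h-index is not reproduced because the abstract does not define it.

```python
from collections import defaultdict

def relative_citations(papers):
    """Return c_f = c / c_0 for each (discipline, citations) pair.

    c_0 is the mean citation count of the paper's discipline,
    computed from the papers passed in (an assumption of this sketch).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for field, c in papers:
        totals[field] += c
        counts[field] += 1
    # Average citations per article, per discipline.
    c0 = {field: totals[field] / counts[field] for field in totals}
    # Guard against a discipline whose papers all have zero citations.
    return [c / c0[field] if c0[field] > 0 else 0.0 for field, c in papers]

# Toy data: raw counts differ by a factor of 10 between the two fields,
# but the rescaled values are directly comparable.
toy = [("math", 2), ("math", 6), ("bio", 20), ("bio", 60)]
rescaled = relative_citations(toy)
```

In the toy data a mathematics paper with 6 citations and a biology paper with 60 citations end up with the same relative value, which is the kind of cross-discipline comparison the indicator is meant to allow.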

    Predicting the long-term citation impact of recent publications

    A fundamental problem in citation analysis is the prediction of the long-term citation impact of recent publications. We propose a model to predict a probability distribution for the future number of citations of a publication. Two predictors are used: the impact factor of the journal in which a publication has appeared and the number of citations a publication has received one year after its appearance. The proposed model is based on quantile regression. We employ the model to predict the future number of citations of a large set of publications in the field of physics. Our analysis shows that both predictors (i.e., impact factor and early citations) contribute to the accurate prediction of long-term citation impact. We also analytically study the behavior of the quantile regression coefficients for high quantiles of the distribution of citations. This is done by linking the quantile regression approach to a quantile estimation technique from extreme value theory. Our work provides insight into the influence of the impact factor and early citations on the long-term citation impact of a publication, and it takes a step toward a methodology that can be used to assess research institutions based on their most recently published work. Comment: 17 pages, 17 figure
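Quantile regression, the basis of the model in this abstract, is fitted by minimizing the pinball (check) loss rather than squared error. The sketch below is not the authors' model: it only illustrates that loss and, as a sanity check, shows that the best constant prediction under it is an empirical quantile. The helper names and the toy citation counts are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (check) loss at quantile level q, with 0 < q < 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def best_constant(y, q):
    """Brute-force search over observed values for the constant
    prediction that minimizes the pinball loss."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), q))

# Toy citation counts with one highly cited outlier.
citations = [0, 1, 2, 3, 50]
median_fit = best_constant(citations, 0.5)   # central tendency
upper_fit = best_constant(citations, 0.9)    # upper tail
```

Because the loss weights under- and over-prediction asymmetrically, a high quantile level pulls the fit toward the highly cited tail, which is why the approach can describe a distribution of future citations rather than a single expected value.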

    Citation models and research evaluation

    Citations in science are being studied from several perspectives. On the one hand, there are approaches such as scientometrics and the science of science, which take a more quantitative perspective. In this chapter I briefly review some of the literature on citations, citation distributions and models of citations. These citations feature prominently in another part of the literature which is dealing with research evaluation and the role of metrics and indicators in that process. Here I briefly review part of the discussion in research evaluation. This also touches on the subject of how citations relate to peer review. Finally, I try to integrate the two literatures with the aim of clarifying what I believe the two can learn from each other. The fundamental problem in research evaluation is that research quality is unobservable. This has consequences for conclusions that we can draw from quantitative studies of citations and citation models. The term "indicators" is a relevant concept in this context, which I try to clarify. Causality is important for properly understanding indicators, especially when indicators are used in practice: when we act on indicators, we enter causal territory. Even when an indicator might have been valid, through its very use, the consequences of its use may invalidate it. By combining citation models with proper causal reasoning and acknowledging the fundamental problem about unobservable research quality, we may hope to make progress. Comment: This is a draft. The final version will be available in Handbook of Computational Social Science edited by Taha Yasseri, forthcoming 2023, Edward Elgar Publishing Lt

    The utilization of paper-level classification system on the evaluation of journal impact

    CAS Journal Ranking, a ranking system of journals based on the bibliometric indicator of citation impact, has been widely used in meso- and macro-scale research evaluation in China since its first release in 2004. The ranking covers journals contained in Clarivate's Journal Citation Reports (JCR). This paper mainly introduces the upgraded 2019 version of the CAS Journal Ranking. To address limitations of the indicator and classification system used in earlier editions, as well as the problem of journals' interdisciplinarity or multidisciplinarity, we discuss three improvements in the 2019 version: (1) the CWTS paper-level classification system, a more fine-grained system, has been adopted; (2) a new indicator, the Field Normalized Citation Success Index (FNCSI), which is robust not only against extremely highly cited publications but also against wrongly assigned document types, has been used; and (3) the indicator is calculated at the paper level. In addition, this paper presents a small part of the ranking results and an interpretation of the robustness of the new FNCSI indicator. By exploring more sophisticated methods and indicators, such as the CWTS paper-level classification system and the new FNCSI indicator, CAS Journal Ranking will continue to serve its original purpose of responsible research evaluation.

    High‐Citation Papers in Space Physics: Examination of Gender, Country, and Paper Characteristics

    The number of citations to a refereed journal article from other refereed journal articles is a measure of its impact. Papers, individuals, journals, departments, and institutions are increasingly judged by the impact they have in their disciplines, and citation counts are now a relatively easy (though not necessarily accurate or straightforward) way of attempting to quantify impact. This study examines papers published in the Journal of Geophysical Research—Space Physics in the year 2012 (n = 705) and analyzes the characteristics of high‐citation papers compared to low‐citation papers. We find that high‐citation papers generally have a large number of authors (>5) and cite significantly more articles in the reference section than low‐citation papers. We also examined the gender and country of institution of the first author and found that there is not a statistically significant gender bias, but there are some significant differences in citation statistics between articles based on the country of first‐author institution.
    Plain Language Summary: The number of citations to a refereed journal article from other refereed journal articles is a measure of its impact. Papers, individuals, journals, departments, and institutions are increasingly judged by the impact they have in their disciplines, and citation counts are now a relatively easy (though not necessarily accurate) way of attempting to quantify impact. This study examines papers published in the Journal of Geophysical Research—Space Physics and analyzes the characteristics of high‐citation papers compared to low‐citation papers. We find that high‐citation papers generally have a large number of authors (>5) and cite significantly more articles in the reference section than low‐citation papers. We also found that there is not a statistically significant gender bias in terms of citation counts, but there are some significant differences in citation statistics between articles based on the country of first‐author institution.
    Key Points: Large collaborative and international teams that cite the literature extensively write high‐citation papers. No gender bias is found in terms of citation rates between female and male first‐author papers, and they submit first‐author papers proportionally to their representation in the discipline. A statistically significant small difference in citations is found for papers from U.S. institutions compared to the rest of the world.
    Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/144280/1/jgra54185_am.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/144280/2/jgra54185.pd

    Hot Streaks in Artistic, Cultural, and Scientific Careers

    The hot streak, loosely defined as "winning begets more winnings", highlights a specific period during which an individual's performance is substantially higher than her typical performance. While hot streaks have been widely debated in sports, gambling, and financial markets over the past several decades, little is known about whether they apply to individual careers. Here, building on the rich literature on the lifecycle of creativity, we collected large-scale career histories of individual artists, movie directors and scientists, tracing the artworks, movies, and scientific publications they produced. We find that, across all three domains, hit works within a career show a high degree of temporal regularity, each career being characterized by bursts of high-impact works occurring in sequence. We demonstrate that these observations can be explained by a simple hot-streak model we developed, allowing us to probe quantitatively the hot streak phenomenon governing individual careers, which we find to be remarkably universal across the diverse domains we analyzed: hot streaks are ubiquitous yet unique across different careers. While the vast majority of individuals have at least one hot streak, hot streaks are most likely to occur only once. The hot streak emerges randomly within an individual's sequence of works, is temporally localized, and is unassociated with any detectable change in productivity. We show that, since works produced during hot streaks garner significantly more impact, the uncovered hot streaks fundamentally drive the collective impact of an individual; ignoring them leads us to systematically over- or under-estimate the future impact of a career. These results not only deepen our quantitative understanding of patterns governing individual ingenuity and success, they may also have implications for decisions and policies involving predicting and nurturing individuals with lasting impact.

    Determining predictors of mortality in HIV positive people in South Africa, 2003 to 2009: a mixed methods approach incorporating unobserved variables

    A thesis submitted to the School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, in fulfilment of the requirements for the degree of Doctor of Philosophy. 02 April 2018.
    Background: The largest proportion of HIV-infected people resides in Southern Africa. In South Africa, the government has taken the lead in the provision of free HIV treatment with a high coverage rate. Provision of free antiretroviral treatment has led to a decline in mortality rates and an increase in life expectancy. However, a significant number of people with HIV continue to die despite the availability of free treatment. A large proportion of studies have concentrated on using quantitative methods of analysis. Very few have used mixed methods that combine quantitative time-to-event frailty models and qualitative methods in assessing risk factors for mortality in HIV-infected individuals. However, use of such a mixed methods approach could provide insights that may lead to an improvement in patient care and management.
    Aim: To determine mortality risk factors in HIV-infected people through incorporating unobserved variables using a mixed methods approach in which quantitative findings are explained by the qualitative.
    Methods: To critically review statistical methods used for assessing risk factors for mortality in HIV-infected people between the years 2002 and 2011, we conducted a literature review on the design of studies, how data were analysed and whether suitable statistical methods were utilised in assessing mortality risk factors in HIV-infected people in the period 2002-2011. Only publications written in English and listed in Pubmed/Medline were considered. In this review, papers using time-to-event techniques were regarded as appropriate. Data were split into two equal periods allowing for the comparison of the statistical methods over time.
    To compare the different time-to-event methods, we ran 1,000 simulations of parametric clustered data using parameters derived from an HIV study that was conducted in South Africa by the Perinatal HIV Research Unit (PHRU). Data for 5, 10 and 20 clusters of size 50 and 100 were simulated. Survival and censoring times were derived from a Weibull distribution. The minimum of survival and censoring times was taken as the study time. Using the simulated data, we compared the following time-to-event methods: Cox proportional hazards regression, shared Gamma frailty with Weibull and exponential baseline hazards (frequentist models), and the Bayesian integrated nested Laplace approximation (INLA) with Weibull baseline hazard. Parameter estimates, standard errors and their fit statistics were averaged over 1,000 simulations. Similarly, means and standard deviations from INLA were averaged (over the 1,000 simulations). Frequentist models were compared using the -2 log-likelihood fit statistics while all four models were compared using the mean square error (MSE). Additionally, we simulated semiparametric clustered frailty models (using gamma and log-normal frailties) including INLA, h-likelihood, penalised likelihood and penalised partial likelihood estimations. Parameter estimates and their standard errors were presented graphically and compared using the MSE.
    To assess mortality risk factors in HIV-infected people in South Africa in different settings, factors associated with mortality in HIV-infected people were assessed by an INLA survival frailty model using cohort data of HIV-infected people from South Africa. Two thirds were from Soweto (urban) and the rest from Mpumalanga (rural). Findings were evaluated by site. Mixed methods were used to evaluate risk factors for mortality by combining the best fitting model applied to retrospective data and qualitative analysis on prospective data.
    In order to explain the unobserved frailty modelling results, we conducted a qualitative study that enrolled 20 participants who had confirmed knowing a person that had died as a result of HIV. Participants were recruited from the Zazi VCT in PHRU and were interviewed using a semi-structured interview guide. The aim of the qualitative study was to attempt to explain the unobserved factors influencing mortality in HIV-infected individuals using perceived reasons for death given by the participants. These were later used to complement the potential reasons for death as identified in the frailty modelling (quantitative) results.
    Results: In the critical review, 189 studies met the inclusion criteria, comprising prospective (69%) and retrospective (30%) studies. Of the 189 studies, 91 were published in the period 2002-2006 and 98 in 2007-2011. Cox regression analysis with frailty was used in only 7 studies (~4%), of which 6 were published between the years 2007 and 2011. The simulation study showed that the shared frailty models performed better than Cox-PH. Within the shared frailty models, the Gamma frailty model with a Weibull baseline performed better than the Gamma frailty model with an exponential baseline. The MSE showed that in general, the Bayesian INLA had better results. In the semiparametric simulations, results were similar but INLA had a slightly better fit with consistently lower MSE values relative to both gamma and log-normal frailty models. The random effects estimate for INLA, whose method is slightly different, had lower MSE values consistently relative to the other methods.
    In the HIV cohort study, 6,690 participants were enrolled, with the majority being female (78%) and most participants residing in an urban area (67%). Rural participants were older (36 years; IQR: 31-44) and had a higher mortality rate (11/100 person years). Among those residing in rural areas, HAART treatment for between six and twelve months (HR: 0.2, 95% CI: 0.1-0.4) and more than 12 months (HR: 0.1, 95% CI: 0.1-0.2) was protective relative to not being on treatment. Being on HAART treatment for greater than twelve months was protective in the urban participants (HR: 0.35, 95% CI: 0.27-0.46). Significant heterogeneity, assessed by frailty variance, was high in rural participants and lower in the urban. Since the frailty modelling results suggested that the unobserved variables had a significant effect on mortality in HIV-infected individuals, a qualitative study was conducted to explore the potential causes of death. In the qualitative study, participants perceived that mortality in HIV-infected individuals may have been influenced by engagement in risky sexual behaviour such as multiple sexual partnerships, negative attitude by healthcare workers towards HIV-infected people, believing in the healing power of religion, traditional medicine, food security and social support structure.
    Conclusions: The study found that Cox proportional hazards regression with frailty is not as commonly used in research on mortality in HIV-infected individuals as it is in other fields of health research. Additionally, use of the more complex semiparametric frailty models was even lower in this population. From simulations, we found that frailty survival models, especially the Bayesian INLA, provided a better fit in modelling mortality due to their ability to account for unobserved variables. As the unobserved variables are complex to explain using only quantitative modelling techniques, qualitative analysis of perceived causes of death was explored. Unobserved variables affecting mortality were explored through qualitative analysis of perceived reasons provided by bereaved participants. This mixed methods approach optimised data by using a quantitative approach followed by a qualitative one that complemented each other. Use of optimal methods in assessing morbidity and mortality in HIV-infected patients may improve patient care and management in South Africa and other countries.
    Key words: HIV, Mortality, Rural, Urban, unmeasured variables, HAART, Frailty
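The simulation design in this thesis abstract (clustered subjects, Weibull survival and censoring times, study time taken as their minimum) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the scale and shape values are placeholders rather than the PHRU-derived parameters, the function name is hypothetical, and no shared frailty term or covariate is included, so it reproduces only the censoring mechanism and not the full data-generating model.

```python
import random

def simulate_clustered_survival(n_clusters, cluster_size, seed=0,
                                surv_scale=10.0, surv_shape=1.5,
                                cens_scale=12.0, cens_shape=1.5):
    """Simulate (cluster_id, study_time, event_observed) rows.

    All scale/shape defaults are placeholder values for illustration.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    rows = []
    for cluster in range(n_clusters):
        for _ in range(cluster_size):
            # random.weibullvariate(alpha, beta): alpha = scale, beta = shape.
            t_event = rng.weibullvariate(surv_scale, surv_shape)
            t_cens = rng.weibullvariate(cens_scale, cens_shape)
            # Study time is the minimum of survival and censoring times.
            study_time = min(t_event, t_cens)
            # The event is observed only if it happens before censoring.
            rows.append((cluster, study_time, t_event <= t_cens))
    return rows

# One of the configurations named in the abstract: 5 clusters of size 50.
data = simulate_clustered_survival(n_clusters=5, cluster_size=50)
```

Repeating such a draw 1,000 times with different seeds, and fitting each candidate model to every replicate, would give the averaged estimates and MSE comparisons the abstract describes; the model fitting itself is outside the scope of this sketch.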

    Theories of Informetrics and Scholarly Communication

    Scientometrics has become an essential element in the practice and evaluation of science and research, including both the evaluation of individuals and national assessment exercises. Yet, researchers and practitioners in this field have lacked clear theories to guide their work. As early as 1981, then doctoral student Blaise Cronin published "The need for a theory of citing", a call to arms for the fledgling scientometric community to produce foundational theories upon which the work of the field could be based. More than three decades later, the time has come to reach out to the field again and ask how it has responded to this call. This book compiles the foundational theories that guide informetrics and scholarly communication research. It is a much-needed compilation by leading scholars in the field that gathers together the theories that guide our understanding of authorship, citing, and impact.