

Identifying single influential publications in a research field: New analysis opportunities of the CRExplorer

Authors: Bornmann, Lutz; Marx, Werner; Mutz, Rüdiger; Thor, Andreas
Publication date: 21/03/2018

Reference Publication Year Spectroscopy (RPYS) has been developed for identifying the cited references (CRs) with the greatest influence in a given paper set (mostly sets of papers on certain topics or fields). The program CRExplorer (see www.crexplorer.net) was specifically developed by Thor, Marx, Leydesdorff, and Bornmann (2016a, 2016b) for applying RPYS to publication sets downloaded from Scopus or Web of Science. In this study, we present some advanced methods which have been newly developed for CRExplorer. These methods are able to identify and characterize the CRs which have been influential across a longer period (many citing years). The new methods are demonstrated in this study using all the papers published in Scientometrics between 1978 and 2016. The indicators N_TOP50, N_TOP25, and N_TOP10 can be used to identify those CRs which belong to the 50%, 25%, or 10% most frequently cited publications (CRs) over many citing publication years. In the Scientometrics dataset, for example, Lotka's (1926) paper on the distribution of scientific productivity belongs to the top 10% of publications (CRs) in 36 citing years. Furthermore, the new version of CRExplorer analyzes the impact sequence of CRs across citing years. CRs can have below-average (-), average (0), or above-average (+) impact in citing years (whereby average is meant in the sense of expected values). The sequence (e.g. 00++---0--00) is used by the program to identify papers with typical impact distributions. For example, CRs can have early, but not late impact ("hot papers", e.g. +++---) or vice versa ("sleeping beauties", e.g. ---0000---++).
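The two ideas in this abstract (counting the citing years in which a CR is among the top share, and coding each citing year as -, 0, or +) can be illustrated with a minimal sketch. This is not CRExplorer's code. The toy data, the function names, the tie handling, and the cutoff of 1.0 are all assumptions made for illustration, and the expected value is taken here from the independence model of a CR-by-citing-year table (row total times column total divided by grand total), which the abstract does not spell out.

```python
import math

# Hypothetical toy data: citations[cr][citing_year] = number of citations.
citations = {
    "A": {2000: 10, 2001: 10, 2002: 1, 2003: 1},
    "B": {2000: 1, 2001: 1, 2002: 10, 2003: 10},
    "C": {2000: 5, 2001: 5, 2002: 5, 2003: 5},
}


def citing_years(counts):
    """All citing years that occur anywhere in the table, in ascending order."""
    return sorted({year for row in counts.values() for year in row})


def n_top(counts, share):
    """Count, per CR, the citing years in which it is among the top `share`.

    Assumption: ties at the threshold count as top, and at least one CR
    is in the top group in every citing year.
    """
    result = {cr: 0 for cr in counts}
    for year in citing_years(counts):
        values = sorted((row.get(year, 0) for row in counts.values()), reverse=True)
        k = max(1, math.ceil(share * len(values)))
        threshold = values[k - 1]
        for cr, row in counts.items():
            cited = row.get(year, 0)
            if cited > 0 and cited >= threshold:
                result[cr] += 1
    return result


def impact_sequence(counts, cr, cutoff=1.0):
    """Code each citing year as '+', '0', or '-' for one CR.

    Assumption: expected = row_total * column_total / grand_total, and the
    standardized residual (observed - expected) / sqrt(expected) is compared
    with a hypothetical cutoff.
    """
    row_total = sum(counts[cr].values())
    grand_total = sum(sum(row.values()) for row in counts.values())
    symbols = []
    for year in citing_years(counts):
        column_total = sum(row.get(year, 0) for row in counts.values())
        expected = row_total * column_total / grand_total
        if expected == 0:
            symbols.append("0")
            continue
        residual = (counts[cr].get(year, 0) - expected) / math.sqrt(expected)
        symbols.append("+" if residual > cutoff else "-" if residual < -cutoff else "0")
    return "".join(symbols)


print(n_top(citations, 0.10))
for cr in citations:
    print(cr, impact_sequence(citations, cr))
```

On this toy table, A is coded ++-- (a "hot paper" pattern), B is coded --++ (a "sleeping beauty" pattern), and C is coded 0000.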

arXiv.org e-Print Archive

Repository for Publications and Research Data

A multiple k-means cluster ensemble framework for clustering citation trajectories

Authors: Chakraborty, Joyita; Nandi, Subrata; Pradhan, Dinesh K.
Publication date: 10/09/2023

Citation maturity time varies for different articles. However, the impact of all articles is measured in a fixed window. Clustering their citation trajectories helps understand the knowledge diffusion process and reveals that not all articles gain immediate success after publication. Moreover, clustering trajectories is necessary for paper impact recommendation algorithms. It is a challenging problem because citation time series exhibit significant variability due to nonlinear and non-stationary characteristics. Prior works propose a set of arbitrary thresholds and a fixed rule-based approach. All methods are primarily parameter-dependent. Consequently, this leads to inconsistencies in defining similar trajectories and ambiguities regarding their specific number. Most studies only capture extreme trajectories. Thus, a generalised clustering framework is required. This paper proposes a feature-based multiple k-means cluster ensemble framework. 195,783 and 41,732 well-cited articles from the Microsoft Academic Graph data are considered for clustering short-term (10-year) and long-term (30-year) trajectories, respectively. It has linear run time. Four distinct trajectories are obtained: Early Rise Rapid Decline (2.2%), Early Rise Slow Decline (45%), Delayed Rise No Decline (53%), and Delayed Rise Slow Decline (0.8%). Individual trajectory differences for two different spans are studied. Most papers exhibit Early Rise Slow Decline and Delayed Rise No Decline patterns. The growth and decay times, cumulative citation distribution, and peak characteristics of individual trajectories are redefined empirically. A detailed comparative study reveals our proposed methodology can detect all distinct trajectory classes.
Comment: 29 pages
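As a rough illustration of what a feature-based cluster ensemble over several k-means runs can look like, the sketch below reduces each trajectory to two features, runs a small pure-Python k-means for several values of k and several random starts, and merges papers that end up together in at least half of the runs. It is not the authors' framework. The two features, the co-association consensus step, the parameter values, and all names are assumptions chosen for brevity.

```python
import random


def features(trajectory):
    """Hypothetical features: relative peak position and share of late citations."""
    peak = trajectory.index(max(trajectory)) / (len(trajectory) - 1)
    late = sum(trajectory[len(trajectory) // 2:]) / sum(trajectory)
    return (peak, late)


def dist2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd iteration; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centers[c])  # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    return labels


def consensus_clusters(points, ks=(2, 3), runs=10, seed=0, agree=0.5):
    """Merge points that share a cluster in at least `agree` of all runs."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    total = 0
    for k in ks:
        for _ in range(runs):
            labels = kmeans(points, k, rng)
            total += 1
            for i in range(n):
                for j in range(n):
                    if labels[i] == labels[j]:
                        together[i][j] += 1
    # Connected components of the "co-clustered often enough" graph.
    cluster = [-1] * n
    next_id = 0
    for start in range(n):
        if cluster[start] != -1:
            continue
        cluster[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if cluster[j] == -1 and together[i][j] >= agree * total:
                    cluster[j] = next_id
                    stack.append(j)
        next_id += 1
    return cluster


# Hypothetical trajectories: two early-rise papers, then two delayed-rise papers.
trajectories = [
    [10, 8, 5, 2, 1, 0],
    [9, 7, 4, 2, 1, 1],
    [0, 1, 2, 5, 8, 10],
    [1, 1, 2, 4, 7, 9],
]
labels = consensus_clusters([features(t) for t in trajectories])
print(labels)
```

Combining runs with different k avoids committing to a single number of clusters up front; the consensus step decides how many groups survive.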

arXiv.org e-Print Archive

Search for Evergreens in Science: A Functional Data Analysis

Authors: Mei, Yajun; Wang, Jian; Zhang, Ruizhi
Publication venue: Elsevier BV
Publication date: 16/06/2017

Evergreens in science are papers that display a continual rise in annual citations without decline, at least within a sufficiently long time period. Aiming to better understand evergreens in particular and patterns of citation trajectory in general, this paper develops a functional data analysis method to cluster citation trajectories of a sample of 1699 research papers published in 1980 in the American Physical Society (APS) journals. We propose a functional Poisson regression model for individual papers' citation trajectories, and fit the model to the observed 30-year citations of individual papers by functional principal component analysis and maximum likelihood estimation. Based on the estimated paper-specific coefficients, we apply the K-means clustering algorithm to cluster papers into different groups, for uncovering general types of citation trajectories. The result demonstrates the existence of an evergreen cluster of papers that do not exhibit any decline in annual citations over 30 years.
Comment: 40 pages, 9 figures
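The definition of an evergreen given in this abstract (a continual rise in annual citations without decline) can be made concrete with a much simpler check than the functional Poisson regression the paper describes. The sketch below is a swapped-in heuristic, not the authors' method: it smooths a trajectory with rolling window sums and tests that the smoothed series never declines. The function name, the window length, and the example trajectories are assumptions.

```python
def is_evergreen(trajectory, window=3):
    """Return True if the smoothed annual citations never decline and end higher.

    Rolling window sums are used instead of means; they are proportional to
    the moving average and keep the comparison in exact integer arithmetic.
    """
    sums = [
        sum(trajectory[i:i + window])
        for i in range(len(trajectory) - window + 1)
    ]
    if len(sums) < 2:
        return False  # too short to judge a trend
    never_declines = all(later >= earlier for earlier, later in zip(sums, sums[1:]))
    return never_declines and sums[-1] > sums[0]


# Hypothetical annual citation counts.
rising = [1, 2, 2, 4, 3, 5, 6, 6, 8, 9]   # noisy but steadily rising
peaked = [1, 4, 8, 9, 7, 5, 3, 2, 1, 0]   # early peak followed by decline

print(is_evergreen(rising))
print(is_evergreen(peaked))
```

The smoothing step tolerates a single-year dip such as the drop from 4 to 3 in the rising example, which a strict year-by-year comparison would reject.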

arXiv.org e-Print Archive

Lancaster E-Prints