Identifying single influential publications in a research field: New analysis opportunities of the CRExplorer
Reference Publication Year Spectroscopy (RPYS) has been developed for
identifying the cited references (CRs) with the greatest influence in a given
paper set (mostly sets of papers on certain topics or fields). The program
CRExplorer (see www.crexplorer.net) was specifically developed by Thor, Marx,
Leydesdorff, and Bornmann (2016a, 2016b) for applying RPYS to publication sets
downloaded from Scopus or Web of Science. In this study, we present some
advanced methods which have been newly developed for CRExplorer. These methods
are able to identify and characterize the CRs which have been influential
across a longer period (many citing years). The new methods are demonstrated in
this study using all the papers published in Scientometrics between 1978 and
2016. The indicators N_TOP50, N_TOP25, and N_TOP10 can be used to identify
those CRs which belong to the 50%, 25%, or 10% most frequently cited
publications (CRs) over many citing publication years. In the Scientometrics
dataset, for example, Lotka's (1926) paper on the distribution of scientific
productivity belongs to the top 10% publications (CRs) in 36 citing years.
Furthermore, the new version of CRExplorer analyzes the impact sequence of CRs
across citing years. CRs can have below average (-), average (0), or above
average (+) impact in citing years (whereby average is meant in the sense of
expected values). The sequence (e.g. 00++---0--00) is used by the program to
identify papers with typical impact distributions. For example, CRs can have
early, but not late impact ("hot papers", e.g. +++---) or vice versa ("sleeping
beauties", e.g. ---0000---++).
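
The impact-sequence idea described in this abstract can be illustrated with a minimal sketch. It is not CRExplorer's implementation: the function names, the 25% tolerance band around the expected value, and the "first half versus second half" rule for labelling a sequence are all assumptions made for illustration only.

```python
from typing import Sequence


def impact_sequence(observed: Sequence[float],
                    expected: Sequence[float],
                    tolerance: float = 0.25) -> str:
    """Encode each citing year as '-', '0', or '+' relative to its expected value.

    Assumption: a year counts as average ('0') when the observed citation
    count lies within +/- `tolerance` (as a fraction) of the expected count.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same citing years")
    symbols = []
    for obs, exp in zip(observed, expected):
        if obs > exp * (1 + tolerance):
            symbols.append("+")
        elif obs < exp * (1 - tolerance):
            symbols.append("-")
        else:
            symbols.append("0")
    return "".join(symbols)


def classify(sequence: str) -> str:
    """Label a sequence by where its above-average years fall (hypothetical rule)."""
    half = len(sequence) // 2
    early_plus = "+" in sequence[:half]
    late_plus = "+" in sequence[half:]
    if early_plus and not late_plus:
        return "hot paper"
    if late_plus and not early_plus:
        return "sleeping beauty"
    return "other"


print(classify("+++---"))
print(classify("---0000---++"))
```

Under these assumptions, the two example sequences quoted in the abstract come out as a "hot paper" and a "sleeping beauty", respectively.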
A multiple k-means cluster ensemble framework for clustering citation trajectories
Citation maturity time varies for different articles. However, the impact of
all articles is measured in a fixed window. Clustering their citation
trajectories helps understand the knowledge diffusion process and reveals that
not all articles gain immediate success after publication. Moreover, clustering
trajectories is necessary for paper impact recommendation algorithms. It is a
challenging problem because citation time series exhibit significant
variability due to nonlinear and non-stationary characteristics. Prior works
propose a set of arbitrary thresholds and a fixed rule-based approach. All
methods are primarily parameter dependent. Consequently, they lead to
inconsistencies in defining similar trajectories and ambiguities regarding
their specific number. Most studies only capture extreme trajectories. Thus, a
generalised clustering framework is required. This paper proposes a
feature-based multiple k-means cluster ensemble framework. 195,783 and 41,732
well-cited articles from the Microsoft Academic Graph data are considered for
clustering short-term (10-year) and long-term (30-year) trajectories,
respectively. The framework has linear run time. Four distinct trajectories are
obtained: Early Rise Rapid Decline (2.2%), Early Rise Slow Decline (45%),
Delayed Rise No Decline (53%), and Delayed Rise Slow Decline (0.8%). Individual trajectory
differences for two different spans are studied. Most papers exhibit Early Rise
Slow Decline and Delayed Rise No Decline patterns. The growth and decay times,
cumulative citation distribution, and peak characteristics of individual
trajectories are redefined empirically. A detailed comparative study reveals
our proposed methodology can detect all distinct trajectory classes.
Comment: 29 pages
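
As a rough illustration of what a multiple k-means cluster ensemble can look like, the sketch below runs plain k-means several times with different seeds and merges the runs through a co-association (co-clustering frequency) matrix. The abstract does not state how the paper combines its runs or which features it uses, so the consensus rule, the 0.5 threshold, the function names, and the single toy feature (years to citation peak) are all assumptions. This toy version is also quadratic in the number of papers, so it does not reproduce the linear run time reported above.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def _sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, seed: int, iterations: int = 50) -> List[int]:
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: _sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def consensus_labels(points: List[Point], k: int, seeds: Sequence[int],
                     threshold: float = 0.5) -> List[int]:
    """Merge several k-means runs via a co-association matrix (assumed rule)."""
    n = len(points)
    runs = len(seeds)
    together = [[0] * n for _ in range(n)]
    for seed in seeds:
        labels = kmeans(points, k, seed)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    # Points co-clustered in more than `threshold` of the runs are linked;
    # the connected components of that graph are the consensus clusters.
    consensus = [-1] * n
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and together[i][j] / runs > threshold:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus


# Hypothetical one-dimensional feature: years until a paper's citation peak.
features = [(2.0,), (3.0,), (2.5,), (20.0,), (22.0,), (21.0,)]
print(consensus_labels(features, k=2, seeds=range(5)))
```

The co-association step makes the result independent of how each individual run happens to number its clusters, which is the usual reason for preferring it over comparing raw labels across runs.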
Search for Evergreens in Science: A Functional Data Analysis
Evergreens in science are papers that display a continual rise in annual
citations without decline, at least within a sufficiently long time period.
Aiming to better understand evergreens in particular and patterns of citation
trajectory in general, this paper develops a functional data analysis method to
cluster citation trajectories of a sample of 1699 research papers published in
1980 in the American Physical Society (APS) journals. We propose a functional
Poisson regression model for individual papers' citation trajectories, and fit
the model to the observed 30-year citations of individual papers by functional
principal component analysis and maximum likelihood estimation. Based on the
estimated paper-specific coefficients, we apply the K-means clustering
algorithm to cluster papers into different groups, for uncovering general types
of citation trajectories. The result demonstrates the existence of an evergreen
cluster of papers that do not exhibit any decline in annual citations over 30
years.
Comment: 40 pages, 9 figures