Predicting Scientific Success Based on Coauthorship Networks
We address the question of to what extent the success of scientific articles is
due to social influence. Analyzing a data set of over 100,000 publications from
the field of Computer Science, we study how centrality in the coauthorship
network differs between authors who have highly cited papers and those who do
not. We further show that a machine learning classifier, based only on
coauthorship network centrality measures at time of publication, is able to
predict with high precision whether an article will be highly cited five years
after publication. By this we provide quantitative insight into the social
dimension of scientific publishing - challenging the perception of citations as
an objective, socially unbiased measure of scientific success.
Comment: 21 pages, 2 figures, incl. Supplementary Material
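As a rough illustration of the kind of input such a classifier might use, the sketch below builds a coauthorship graph from earlier papers and derives per-paper centrality features. It is a minimal sketch, not the authors' pipeline: the toy author lists, the helper names, and the choice of degree centrality with max/mean aggregation are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Map each author to the set of distinct coauthors across all papers."""
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            graph[a]  # register the author even if they have no coauthors
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Number of coauthors divided by (n - 1), the usual normalization."""
    n = len(graph)
    if n <= 1:
        return {a: 0.0 for a in graph}
    return {a: len(nbrs) / (n - 1) for a, nbrs in graph.items()}


def paper_features(authors, centrality):
    """Max and mean centrality over a paper's authors (unknown authors count as 0)."""
    values = [centrality.get(a, 0.0) for a in authors]
    return max(values), sum(values) / len(values)


# Hypothetical toy data: author lists of papers published before the target paper.
earlier_papers = [["ann", "bob"], ["ann", "cat"], ["ann", "dan"], ["eve"]]
centrality = degree_centrality(coauthorship_graph(earlier_papers))

# Features for a new paper by "ann" and "eve", measured at time of publication.
features = paper_features(["ann", "eve"], centrality)
```

Feature tuples like this one, computed only from the network as it stood at publication time, could then be passed to any standard binary classifier with "highly cited after five years" as the label.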
Understanding the Impact of Early Citers on Long-Term Scientific Impact
This paper explores an interesting new dimension to the challenging problem
of predicting long-term scientific impact (LTSI) usually measured by the number
of citations accumulated by a paper in the long term. It is well known that
early citations (within 1-2 years after publication) acquired by a paper
positively affect its LTSI. However, no prior work investigates whether the
set of authors who bring in these early citations to a paper also affects its
LTSI. In this paper, we demonstrate, for the first time, the impact of these
authors, whom we call early citers (EC), on the LTSI of a paper. Note that this
study of the complex dynamics of EC introduces a brand new paradigm in citation
behavior analysis. Using a massive computer science bibliographic dataset, we
identify two distinct categories of EC: authors with a high overall
publication/citation count in the dataset, whom we call influential, and the
remaining authors, whom we call non-influential. We investigate three characteristic
properties of EC and present an extensive analysis of how each category
correlates with LTSI in terms of these properties. In contrast to popular
perception, we find that influential EC negatively affect LTSI, possibly owing
to attention stealing. To illustrate this, we present several representative
examples from the dataset. A closer inspection of the collaboration network
reveals that this stealing effect is more pronounced if an EC is nearer to the
authors of the paper being investigated. As an intuitive use case, we show that
incorporating EC properties in the state-of-the-art supervised citation
prediction models leads to high performance margins. Finally, we present
an online portal to visualize EC statistics along with the prediction results
for a given query paper.
Measuring academic influence: Not all citations are equal
The importance of a research article is routinely measured by counting how
many times it has been cited. However, treating all citations with equal weight
ignores the wide variety of functions that citations perform. We want to
automatically identify the subset of references in a bibliography that have a
central academic influence on the citing paper. For this purpose, we examine
the effectiveness of a variety of features for determining the academic
influence of a citation. By asking authors to identify the key references in
their own work, we created a data set in which citations were labeled according
to their academic influence. Using automatic feature selection with supervised
machine learning, we found a model for predicting academic influence that
achieves good performance on this data set using only four features. The best
features, among those we evaluated, were those based on the number of times a
reference is mentioned in the body of a citing paper. The performance of these
features inspired us to design an influence-primed h-index (the hip-index).
Unlike the conventional h-index, it weights citations by how many times a
reference is mentioned. According to our experiments, the hip-index is a better
indicator of researcher performance than the conventional h-index.
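The contrast between the two indices can be sketched in a few lines. This is a minimal sketch under a stated assumption: the abstract says only that the hip-index weights citations by how many times a reference is mentioned, so here each citation is weighted by its in-text mention count and the usual h-index rule is applied to the weighted totals. The data and variable names are hypothetical.

```python
def h_index(scores):
    """Largest h such that at least h papers have a score of at least h."""
    h = 0
    for rank, score in enumerate(sorted(scores, reverse=True), start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical data for one researcher: one inner list per paper, holding the
# number of in-text mentions of that paper in each citing paper.
mentions_per_citation = [[3, 1, 2], [1, 1, 2], [4], [1]]

# Conventional h-index: every citation counts once.
citation_counts = [len(m) for m in mentions_per_citation]

# Assumed hip-index weighting: every citation counts once per in-text mention.
weighted_counts = [sum(m) for m in mentions_per_citation]

conventional = h_index(citation_counts)
influence_primed = h_index(weighted_counts)
```

In this toy example the third paper has a single citation, but that citation mentions it four times, which is enough to raise the weighted index from 2 to 3.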
Gravity-Inspired Graph Autoencoders for Directed Link Prediction
Graph autoencoders (AE) and variational autoencoders (VAE) recently emerged
as powerful node embedding methods. In particular, graph AE and VAE were
successfully leveraged to tackle the challenging link prediction problem,
aiming at figuring out whether some pairs of nodes from a graph are connected
by unobserved edges. However, these models focus on undirected graphs and
therefore ignore the potential direction of the link, which is limiting for
numerous real-life applications. In this paper, we extend the graph AE and VAE
frameworks to address link prediction in directed graphs. We present a new
gravity-inspired decoder scheme that can effectively reconstruct directed
graphs from a node embedding. We empirically evaluate our method on three
different directed link prediction tasks, for which standard graph AE and VAE
perform poorly. We achieve competitive results on three real-world graphs,
outperforming several popular baselines.
Comment: ACM International Conference on Information and Knowledge Management
(CIKM 2019)
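To make the idea of a direction-aware decoder concrete, the sketch below scores a directed edge i -> j from node embeddings plus a learned "mass" for the target node. The abstract does not give the formula, so the expression used here, sigmoid(mass_j - lam * log(||z_i - z_j||^2)), and the names `gravity_decoder`, `lam`, and `eps` are assumptions made for illustration only.

```python
import math


def gravity_decoder(z_i, z_j, mass_j, lam=1.0, eps=1e-9):
    """Assumed gravity-style score for a directed edge i -> j.

    The score grows with the target node's mass and shrinks with the squared
    embedding distance, so swapping i and j generally changes the result.
    """
    dist_sq = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    logit = mass_j - lam * math.log(dist_sq + eps)  # eps guards against log(0)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-D embeddings and masses for two nodes.
z_a, mass_a = (0.0, 0.0), 2.0
z_b, mass_b = (1.0, 0.0), -1.0

p_a_to_b = gravity_decoder(z_a, z_b, mass_b)  # low-mass target: weak pull
p_b_to_a = gravity_decoder(z_b, z_a, mass_a)  # high-mass target: strong pull
```

A symmetric decoder such as an inner product of the two embeddings would assign both directions the same score, which is the limitation the abstract describes for standard graph AE and VAE.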
Using Machine Learning to Predict the Evolution of Physics Research
The advancement of science as outlined by Popper and Kuhn is largely
qualitative, but with bibliometric data it is possible and desirable to develop
a quantitative picture of scientific progress. Furthermore it is also important
to allocate finite resources to research topics that have growth potential, to
accelerate the process from scientific breakthroughs to technological
innovations. In this paper, we address this problem of quantitative knowledge
evolution by analysing the APS publication data set from 1981 to 2010. We build
the bibliographic coupling and co-citation networks, use the Louvain method to
detect topical clusters (TCs) in each year, measure the similarity of TCs in
consecutive years, and visualize the results as alluvial diagrams. Having the
predictive features describing a given TC and its known evolution in the next
year, we can train a machine learning model to predict future changes of TCs,
i.e., their continuing, dissolving, merging and splitting. We found the number
of papers from certain journals, the degree, closeness, and betweenness to be
the most predictive features. Additionally, betweenness increases significantly
for merging events, and decreases significantly for splitting events. Our
results represent a first step from a descriptive understanding of the Science
of Science (SciSci), towards one that is ultimately prescriptive.
Comment: 24 pages, 10 figures, 4 tables, supplementary information is included
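The step of comparing TCs in consecutive years and labelling their evolution can be sketched as follows. The abstract does not state which similarity measure or decision rule is used, so this is a minimal sketch under assumptions: each TC is represented by a set of comparable item identifiers (for example, cited references), similarity is Jaccard overlap, and the 0.3 threshold and all names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_events(tcs_now, tcs_next, threshold=0.3):
    """Label each current-year TC as continuing, dissolving, merging or splitting."""
    # Successors: next-year TCs that are similar enough to a current-year TC.
    successors = {
        name: [nxt for nxt, items in tcs_next.items()
               if jaccard(current, items) >= threshold]
        for name, current in tcs_now.items()
    }
    # Predecessors: current-year TCs that link into each next-year TC.
    predecessors = {
        nxt: [name for name, succ in successors.items() if nxt in succ]
        for nxt in tcs_next
    }
    events = {}
    for name, succ in successors.items():
        if not succ:
            events[name] = "dissolving"
        elif len(succ) > 1:
            events[name] = "splitting"
        elif len(predecessors[succ[0]]) > 1:
            events[name] = "merging"
        else:
            events[name] = "continuing"
    return events


# Hypothetical toy clusters for two consecutive years.
year_t = {"A": {1, 2, 3, 4}, "B": {5, 6}, "C": {7, 8},
          "D": {9, 10, 11, 12}, "E": {20, 21}}
year_t1 = {"X": {1, 2, 3, 4, 5, 6}, "Y": {9, 10}, "Z": {11, 12}, "W": {7, 8, 13}}

events = classify_events(year_t, year_t1)
```

Labels produced this way could serve as training targets for a model whose inputs are the TC features named in the abstract, such as degree, closeness, and betweenness.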