Investigating the contribution of author- and publication-specific features to scholars' h-index prediction
Evaluation of researchers' output is vital for hiring committees and funding
bodies, and it is usually measured via their scientific productivity,
citations, or a combined metric such as the h-index. Assessing young researchers is
more critical because it takes time for citations to accumulate and for the
h-index to rise. Hence, predicting the h-index can help to discover researchers'
scientific impact. In addition, identifying the influential factors to predict
the scientific impact is helpful for researchers seeking solutions to improve
it. This study investigates the effect of author-, paper-, and venue-specific
features on the future h-index. For this purpose, we used machine learning
methods to predict the h-index and feature analysis techniques to advance the
understanding of feature impact. Utilizing the bibliometric data in Scopus, we
defined and extracted two main groups of features. The first group, which we call
'prior impact-based features', relates to prior scientific impact and includes
the number of publications, received citations, and h-index. The second group
is 'non-impact-based features' and contains the features related to author,
co-authorship, paper, and venue characteristics. We explored their importance
in predicting the h-index for researchers in three different career phases. We also
examined the temporal dimension of prediction performance for different
feature categories to find out which features are more reliable for long- and
short-term prediction. We used the gender of the authors to examine the
role of this author characteristic in the prediction task. Our findings
showed that gender has only a very slight effect on predicting the h-index. We found
that non-impact-based features are more robust predictors for younger scholars
than for senior ones in the short term. Also, prior impact-based features lose more
predictive power than other features in the long term.
Comment: 14 pages, 1 figur
Will This Paper Increase Your h-index? Scientific Impact Prediction
Scientific impact plays a central role in the evaluation of the output of
scholars, departments, and institutions. A widely used measure of scientific
impact is citations, with a growing body of literature focused on predicting
the number of citations obtained by any given publication. The effectiveness of
such predictions, however, is fundamentally limited by the power-law
distribution of citations, whereby publications with few citations are
extremely common and publications with many citations are relatively rare.
Given this limitation, in this work we instead address a related question asked
by many academic researchers in the course of writing a paper, namely: "Will
this paper increase my h-index?" Using a real academic dataset with over 1.7
million authors, 2 million papers, and 8 million citation relationships from
the premier online academic service ArnetMiner, we formalize a novel scientific
impact prediction problem to examine several factors that can drive a paper to
increase the primary author's h-index. We find that the researcher's authority
on the publication topic and the venue in which the paper is published are
crucial factors to the increase of the primary author's h-index, while the
topic popularity and the co-authors' h-indices are of surprisingly little
relevance. By leveraging relevant factors, we find a greater than 87.5%
potential predictability for whether a paper will contribute to an author's
h-index within five years. As a further experiment, we generate a
self-prediction for this paper, estimating that there is a 76% probability that
it will contribute to the h-index of the co-author with the highest current
h-index in five years. We conclude that our findings on the quantification of
scientific impact can help researchers to expand their influence and more
effectively leverage their position of "standing on the shoulders of giants."
Comment: Proc. of the 8th ACM International Conference on Web Search and Data
Mining (WSDM'15
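To make the prediction target above concrete, here is a minimal sketch of the h-index and of one plausible reading of "a paper contributes to an author's h-index". The criterion used here (the paper's citation count is at least the author's h-index at the end of the window) is an assumption for illustration and may differ from the exact definition in the paper; the function names and citation counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def contributes_to_h_index(other_citations, paper_citations):
    """Assumed criterion: the paper counts toward the h-index if its citation
    count reaches the author's h-index computed over all papers, including
    this one, at the end of the observation window."""
    h = h_index(list(other_citations) + [paper_citations])
    return h > 0 and paper_citations >= h


# Hypothetical citation counts for an author's other papers after five years.
others = [10, 8, 5, 4, 3]
print(h_index(others))
print(contributes_to_h_index(others, 6))
print(contributes_to_h_index(others, 2))
```

Under this reading, the prediction task is a binary classification: given features known at publication time, predict the boolean returned by `contributes_to_h_index` for the end of the window.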
Predicting Scientific Success Based on Coauthorship Networks
We address the question of to what extent the success of scientific articles is
due to social influence. Analyzing a data set of over 100,000 publications from
the field of Computer Science, we study how centrality in the coauthorship
network differs between authors who have highly cited papers and those who do
not. We further show that a machine learning classifier, based only on
coauthorship network centrality measures at time of publication, is able to
predict with high precision whether an article will be highly cited five years
after publication. In this way, we provide quantitative insight into the social
dimension of scientific publishing, challenging the perception of citations as
an objective, socially unbiased measure of scientific success.
Comment: 21 pages, 2 figures, incl. Supplementary Materia
A framework for the measurement and prediction of an individual scientist's performance
Quantitative bibliometric indicators are widely used to evaluate the
performance of scientists. However, traditional indicators rely little on
an analysis of the processes they are intended to measure and the practical goals of
the measurement. In this study, I propose a simple framework to measure and
predict an individual researcher's scientific performance that takes into
account the main regularities of publication and citation processes and the
requirements of practical tasks. Statistical properties of the new indicator -
a scientist's personal impact rate - are illustrated by its application to a
sample of Estonian researchers.
Comment: 12 pages, 3 figure
Measuring academic influence: Not all citations are equal
The importance of a research article is routinely measured by counting how
many times it has been cited. However, treating all citations with equal weight
ignores the wide variety of functions that citations perform. We want to
automatically identify the subset of references in a bibliography that have a
central academic influence on the citing paper. For this purpose, we examine
the effectiveness of a variety of features for determining the academic
influence of a citation. By asking authors to identify the key references in
their own work, we created a data set in which citations were labeled according
to their academic influence. Using automatic feature selection with supervised
machine learning, we found a model for predicting academic influence that
achieves good performance on this data set using only four features. The best
features, among those we evaluated, were those based on the number of times a
reference is mentioned in the body of a citing paper. The performance of these
features inspired us to design an influence-primed h-index (the hip-index).
Unlike the conventional h-index, it weights citations by how many times a
reference is mentioned. According to our experiments, the hip-index is a better
indicator of researcher performance than the conventional h-index.
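The contrast between the conventional h-index and a mention-weighted variant can be sketched as follows. This is only an illustration under a stated assumption: each citation is weighted by the number of times the reference is mentioned in the body of the citing paper, and the weighted totals are then fed into the usual h-index rule. The exact weighting used for the hip-index in the paper may differ, and the data and function names below are hypothetical.

```python
def h_index(scores):
    """Return the largest h such that at least h papers have a score >= h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


def conventional_h_index(mentions_per_paper):
    """Each citing paper counts once, however often it mentions the reference."""
    return h_index([len(mentions) for mentions in mentions_per_paper])


def mention_weighted_h_index(mentions_per_paper):
    """Assumed weighting: each citing paper counts as many times as it
    mentions the reference in its body text."""
    return h_index([sum(mentions) for mentions in mentions_per_paper])


# Hypothetical author with four papers. Each inner list holds, for every
# citing paper, how many times it mentions the cited paper.
author = [[1, 1, 1], [3, 2], [1], [4, 1, 1]]
print(conventional_h_index(author))
print(mention_weighted_h_index(author))
```

In this toy example the weighted variant comes out higher than the conventional count because two of the papers are mentioned repeatedly by the works that cite them.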