Generalized h-index for Disclosing Latent Facts in Citation Networks
What is the value of a scientist, and what is their impact upon scientific
thinking? How can we measure the prestige of a journal or of a conference? The
evaluation of the scientific work of a scientist and the estimation of the
quality of a journal or conference have long attracted significant interest,
owing to the benefits of obtaining an unbiased and fair criterion. Although it appears to
be simple, defining a quality metric is not an easy task. To overcome the
disadvantages of the present metrics used for ranking scientists and journals,
J.E. Hirsch proposed a pioneering metric, the now famous h-index. In this
article, we demonstrate several inefficiencies of this index and develop a pair
of generalizations and effective variants of it to deal with scientist ranking
and with publication forum ranking. The new citation indices are able to
disclose trendsetters in scientific research, as well as researchers who
constantly shape their field with their influential work, no matter how old
they are. We demonstrate the effectiveness and benefits of the new indices in
unfolding the full potential of the h-index, with extensive experimental results
obtained from DBLP, a widely known online digital library.
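For reference, the baseline h-index itself is simple to compute: a scientist has index h if h of their papers have received at least h citations each. A minimal Python sketch of that baseline definition (the generalizations and variants proposed in the article are not reproduced here):

```python
def h_index(citations):
    """Return the h-index: the largest h such that the author has
    at least h papers with at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Example: six papers with these citation counts give an h-index of 3,
# since three papers have at least 3 citations each, but there are not
# four papers with at least 4 citations each.
print(h_index([10, 8, 5, 3, 1, 0]))  # -> 3
```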
Finding Academic Experts on a MultiSensor Approach using Shannon's Entropy
Expert finding is an information retrieval task concerned with the search for
the most knowledgeable people in some topic, based on documents
describing people's activities. The task involves taking a user query as input
and returning a list of people sorted by their level of expertise regarding the
user query. This paper introduces a novel approach for combining multiple
estimators of expertise based on a multisensor data fusion framework together
with the Dempster-Shafer theory of evidence and Shannon's entropy. More
specifically, we defined three sensors which detect heterogeneous information
derived from the textual contents, from the graph structure of the citation
patterns for the community of experts, and from profile information about the
academic experts. Given the evidence collected, each sensor may nominate
different candidates as experts, and consequently the sensors may not agree on
a final ranking decision. To deal with these conflicts, we applied the Dempster-Shafer
theory of evidence combined with Shannon's Entropy formula to fuse this
information and come up with a more accurate and reliable final ranking list.
Experiments performed on two datasets of academic publications from the
Computer Science domain attest to the adequacy of the proposed approach over
traditional state-of-the-art approaches. We also ran experiments against
representative supervised state-of-the-art algorithms. Results revealed that
the proposed method achieved performance similar to that of these supervised
techniques, confirming the capabilities of the proposed framework.
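The two ingredients named in the abstract, Dempster's rule of combination and Shannon's entropy, can be sketched as follows. How the article couples the entropy with the combination rule is not specified in the abstract, so the sensors and mass assignments below are purely hypothetical toy values over a two-candidate frame.

```python
import math
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose
    focal elements are frozensets of candidate experts."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

def shannon_entropy(mass):
    """Shannon entropy (in bits) of a mass assignment, one possible
    measure of how uncertain a sensor's evidence is."""
    return -sum(m * math.log2(m) for m in mass.values() if m > 0)

# Hypothetical sensors over candidates {A, B}; each reserves some mass
# for the whole frame to express its own ignorance.
frame = frozenset({"A", "B"})
text_sensor  = {frozenset({"A"}): 0.6, frozenset({"B"}): 0.1, frame: 0.3}
graph_sensor = {frozenset({"A"}): 0.4, frozenset({"B"}): 0.3, frame: 0.3}

print(dempster_combine(text_sensor, graph_sensor))  # fused beliefs
print(shannon_entropy(text_sensor))                 # ~1.295 bits
```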
Learning to Rank Academic Experts in the DBLP Dataset
Expert finding is an information retrieval task that is concerned with the
search for the most knowledgeable people with respect to a specific topic, and
the search is based on documents that describe people's activities. The task
involves taking a user query as input and returning a list of people who are
sorted by their level of expertise with respect to the user query. Despite
recent interest in the area, the current state-of-the-art techniques lack
principled approaches for optimally combining different sources of evidence.
This article proposes two frameworks for combining multiple estimators of
expertise. These estimators are derived from textual contents, from the
graph structure of the citation patterns for the community of experts, and from
profile information about the experts. More specifically, this article explores
the use of supervised learning to rank methods, as well as rank aggregation
approaches, for combining all of the estimators of expertise. Several supervised
learning algorithms, which are representative of the pointwise, pairwise and
listwise approaches, were tested, and various state-of-the-art data fusion
techniques were also explored for the rank aggregation framework. Experiments
that were performed on a dataset of academic publications from the Computer
Science domain attest to the adequacy of the proposed approaches.
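The abstract does not list the specific data fusion techniques tested; CombSUM over min-max-normalised scores is one representative member of that family. A minimal sketch, with hypothetical scores from the three kinds of estimators mentioned above:

```python
def combsum(score_lists):
    """CombSUM data fusion: min-max normalise each estimator's scores,
    sum them per candidate, and rank by the fused total."""
    fused = {}
    for scores in score_lists:
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero on flat scores
        for cand, s in scores.items():
            fused[cand] = fused.get(cand, 0.0) + (s - lo) / span
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical expertise scores from text, citation-graph, and profile
# estimators for three candidate experts.
text    = {"alice": 3.2, "bob": 1.1, "carol": 2.4}
graph   = {"alice": 0.9, "bob": 0.8, "carol": 0.2}
profile = {"alice": 10,  "bob": 25,  "carol": 5}

print(combsum([text, graph, profile]))  # -> ['alice', 'bob', 'carol']
```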
Do PageRank-based author rankings outperform simple citation counts?
The basic indicators of a researcher's productivity and impact are still the
number of publications and their citation counts. These metrics are clear,
straightforward, and easy to obtain. When a ranking of scholars is needed, for
instance in grant, award, or promotion procedures, their use is the fastest and
cheapest way of prioritizing some scientists over others. However, due to their
nature, there is a danger of oversimplifying scientific achievements.
Therefore, many other indicators have been proposed, including the PageRank
algorithm, known for ranking webpages, and its modifications suited to
citation networks. Nevertheless, this recursive method is computationally
expensive, and even though it has the advantage of favouring prestige over
popularity, its application should be well justified, particularly in
comparison with standard citation counts. In this study, we analyze three
large datasets of computer science papers in the categories of artificial
intelligence, software engineering, and theory and methods, and apply
12 different ranking methods to the citation networks of authors. We compare
the resulting rankings with self-compiled lists of outstanding researchers
selected as frequent editorial board members of prestigious journals in the
field and conclude that there is no evidence of PageRank-based methods
outperforming simple citation counts.
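Of the 12 ranking methods, the abstract names only PageRank and plain citation counts, so a minimal baseline comparison might look like the sketch below: power-iteration PageRank over a toy author citation graph versus in-degree (the number of citing authors). The graph and parameters are illustrative only.

```python
def pagerank(graph, damping=0.85, iters=100, tol=1e-9):
    """Power-iteration PageRank over a directed citation graph given
    as {author: [cited_authors, ...]}; an edge u -> v means u cites v."""
    nodes = set(graph) | {v for outs in graph.values() for v in outs}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:  # dangling author: spread rank uniformly
                for v in nodes:
                    new[v] += damping * rank[u] / n
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank

citations = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c"]}
ranks = pagerank(citations)
indeg = {v: sum(v in outs for outs in citations.values()) for v in ranks}
print(sorted(ranks, key=ranks.get, reverse=True))  # PageRank ordering
print(sorted(indeg, key=indeg.get, reverse=True))  # citation-count ordering
```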
Analysing academic paper ranking algorithms using test data and benchmarks: an investigation
Research on academic paper ranking has received great attention in recent years, and many algorithms have been proposed to automatically assess large numbers of papers. How to evaluate or analyse the performance of these ranking algorithms has become an open research question. Theoretically, evaluating an algorithm requires comparing its ranking result against a ground-truth paper list. However, no such ground truth exists in the field of scholarly ranking, because there is not, and will never be, an absolutely unbiased, objective, and unified standard for formulating the impact of papers. Therefore, in practice researchers evaluate or analyse their proposed ranking algorithms by different methods, such as using domain expert decisions (test data) or comparing against predefined ranking benchmarks. The question is whether using different methods leads to different analysis results, and if so, how should we analyse the performance of the ranking algorithms? To answer these questions, this study compares test data and different citation-based benchmarks by examining their relationships and assessing the effect of the method choices on the analysis results. The results of our experiments show that analysis results do differ when employing test data and different benchmarks, and that relying exclusively on one benchmark or on test data may yield inadequate conclusions. In addition, we summarise a guideline on how to conduct a comprehensive analysis using multiple benchmarks from different perspectives, which can help provide a systematic understanding and profile of the analysed algorithms.
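One common way to quantify such differences is a rank correlation coefficient between an algorithm's output and each benchmark or test-data list. The sketch below uses Kendall's tau over hypothetical rankings; the study's actual benchmarks and comparison measures are not detailed in the abstract.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two rankings of the same items, given as
    ordered lists (first = best); returns a value in [-1, 1]."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(pos_a, 2):
        s = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    pairs = len(pos_a) * (len(pos_a) - 1) / 2
    return (concordant - discordant) / pairs

# Hypothetical output of one ranking algorithm, compared against two
# citation-based benchmarks and an expert-judged test-data list.
algorithm = ["p1", "p2", "p3", "p4", "p5"]
references = {
    "citation_count":   ["p2", "p1", "p3", "p5", "p4"],
    "future_citations": ["p1", "p3", "p2", "p4", "p5"],
    "expert_test_data": ["p1", "p2", "p4", "p3", "p5"],
}
for name, ranking in references.items():
    print(name, round(kendall_tau(algorithm, ranking), 3))
```

Seeing the tau values disagree across references illustrates the effect the study reports: each benchmark tells a partially different story about the same algorithm.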