Learning Reputation in an Authorship Network
The problem of searching for experts in a given academic field is hugely
important in both industry and academia. We study exactly this issue with
respect to a database of authors and their publications. The idea is to use
Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to perform
topic modelling in order to find authors who have worked in a query field. We
then construct a coauthorship graph and motivate the use of influence
maximisation and a variety of graph centrality measures to obtain a ranked list
of experts. The ranked lists are further improved using a Markov Chain-based
rank aggregation approach. The complete method is readily scalable to large
datasets. To demonstrate the efficacy of the approach we report on an extensive
set of computational simulations using the Arnetminer dataset. An improvement
in mean average precision is demonstrated over the baseline case of simply
using the order of authors found by the topic models.
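The coauthorship-graph step above can be sketched with a simple degree-centrality ranking. This is only one of the centrality measures the abstract mentions, and the paper-list input format below is illustrative, not the Arnetminer schema:

```python
from collections import defaultdict
from itertools import combinations

def rank_by_degree_centrality(papers):
    """Rank authors by degree centrality in the coauthorship graph.

    `papers` is a list of author-name lists; an edge links every pair
    of coauthors appearing on the same paper.
    """
    neighbours = defaultdict(set)
    for authors in papers:
        for a, b in combinations(authors, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    # Degree centrality: neighbour count normalised by the maximum possible.
    centrality = {a: len(nbrs) / (n - 1) for a, nbrs in neighbours.items()}
    return sorted(centrality, key=centrality.get, reverse=True)

papers = [["Ada", "Ben"], ["Ada", "Cai"], ["Ada", "Ben", "Cai"], ["Dee", "Ben"]]
print(rank_by_degree_centrality(papers))  # Ben is the best-connected author
```

In the full method such per-measure rankings would then be merged by the Markov chain rank aggregation step.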
The wisdom of collective grading and the effects of epistemic and semantic diversity
A computer simulation is used to study collective judgements that an expert panel reaches on the basis of qualitative probability judgements contributed by individual members. The simulated panel displays a strong and robust crowd wisdom effect. The panel's performance is better when members contribute precise probability estimates instead of qualitative judgements, but not by much. Surprisingly, it does not always hurt for panel members to interpret the probability expressions differently. Indeed, coordinating their understandings can be much worse.
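The crowd-wisdom effect described above can be reproduced in a minimal simulation: each member reports the true probability plus independent noise, and the averaged panel estimate is closer to the truth than a typical individual. The Gaussian noise model and all parameters here are deliberately simple assumptions; the paper's simulation is richer:

```python
import random

def panel_error(true_p, n_members, noise, trials=2000, seed=0):
    """Compare the averaged panel estimate with a single member.

    Returns (mean individual error, mean panel error) over many trials.
    """
    rng = random.Random(seed)
    indiv_err = panel_err = 0.0
    for _ in range(trials):
        # Each member reports a noisy estimate, clipped to [0, 1].
        reports = [min(1.0, max(0.0, true_p + rng.gauss(0, noise)))
                   for _ in range(n_members)]
        panel = sum(reports) / n_members
        indiv_err += abs(reports[0] - true_p) / trials
        panel_err += abs(panel - true_p) / trials
    return indiv_err, panel_err

indiv, panel = panel_error(true_p=0.7, n_members=15, noise=0.2)
print(f"individual error {indiv:.3f}, panel error {panel:.3f}")
```

Averaging cancels independent noise, which is the basic mechanism behind the robust effect the abstract reports.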
Aligning Vertical Collection Relevance with User Intent
Selecting and aggregating different types of content from multiple vertical search engines is becoming popular in web search. The user vertical intent, the verticals the user expects to be relevant for a particular information need, might not correspond to the vertical collection relevance, the verticals containing the most relevant content. In this work we propose different approaches to define the set of relevant verticals based on document judgments. We correlate the collection-based relevant verticals obtained from these approaches to the real user vertical intent, and show that they can be aligned relatively well. The set of relevant verticals defined by those approaches could therefore serve as an approximate but reliable ground-truth for evaluating vertical selection, avoiding the need for collecting explicit user vertical intent, and vice versa.
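One simple instance of a collection-based definition like those the paper compares is to mark a vertical relevant when a sufficient fraction of its judged documents are relevant. The threshold, data format, and vertical names below are illustrative assumptions, not the paper's actual definitions:

```python
def relevant_verticals(judgments, threshold=0.5):
    """Return the verticals whose fraction of relevant judged
    documents meets `threshold`.

    `judgments` maps a vertical name to a list of binary relevance
    labels for its judged documents.
    """
    relevant = set()
    for vertical, labels in judgments.items():
        if labels and sum(labels) / len(labels) >= threshold:
            relevant.add(vertical)
    return relevant

judgments = {"news": [1, 1, 0], "images": [0, 0, 1], "video": [1, 1, 1]}
print(relevant_verticals(judgments))  # {'news', 'video'}
```

A derived set like this is what would then be correlated against explicit user vertical intent.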
On multidimensional poverty rankings of binary attributes
We address the problem of ranking distributions of attributes in terms of poverty, when the attributes are represented by binary variables. To accomplish this task, we identify a suitable notion of “multidimensional poverty line” and characterize axiomatically the Head-Count and the Attribute-Gap poverty rankings, which are the natural counterparts of the most widely used income poverty indices. Finally, we apply our methodology and compare our empirical results with those obtained with some other well-known poverty measures.
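A plain reading of the two rankings can be sketched as follows, treating the multidimensional poverty line as a cutoff of k deprivations. This is a minimal sketch under that assumption; the paper's axiomatic definitions are more general:

```python
def head_count(profiles, k):
    """Fraction of individuals deprived in at least k binary attributes.

    `profiles` is a list of tuples where 1 means the individual has the
    attribute and 0 means they are deprived of it.
    """
    deprived = sum(1 for row in profiles if row.count(0) >= k)
    return deprived / len(profiles)

def attribute_gap(profiles):
    """Average number of deprivations per individual."""
    return sum(row.count(0) for row in profiles) / len(profiles)

profiles = [(1, 1, 1), (1, 0, 0), (0, 0, 0), (1, 1, 0)]
print(head_count(profiles, k=2))   # 0.5
print(attribute_gap(profiles))     # 1.5
```

Comparing these two numbers across distributions is what yields the corresponding poverty rankings.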
Peer review and citation data in predicting university rankings: a large-scale analysis
Most Performance-based Research Funding Systems (PRFS) draw on peer review and bibliometric indicators, two different methodologies which are sometimes combined. A common argument against the use of indicators in such research evaluation exercises is their low correlation at the article level with peer review judgments. In this study, we analyse 191,000 papers from 154 higher education institutes which were peer reviewed in a national research evaluation exercise. We combine these data with 6.95 million citations to the original papers. We show that when citation-based indicators are applied at the institutional or departmental level, rather than at the level of individual papers, surprisingly large correlations with peer review judgments can be observed, up to r = 0.802, n = 37, p < 0.001 for some disciplines. In our evaluation of ranking prediction performance based on citation data, we show we can reduce the mean rank prediction error by 25% compared to previous work. This suggests that citation-based indicators are sufficiently aligned with peer review results at the institutional level to be used to lessen the overall burden of peer review on national evaluation exercises, leading to considerable cost savings.
Context Models For Web Search Personalization
We present our solution to the Yandex Personalized Web Search Challenge. The
aim of this challenge was to use the historical search logs to personalize
top-N document rankings for a set of test users. We used over 100 features
extracted from user- and query-dependent contexts to train neural-network and
tree-based learning-to-rank and regression models. Our final submission, which
was a blend of several different models, achieved an NDCG@10 of 0.80476 and
placed 4th among the 194 teams, winning 3rd prize.
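The NDCG@10 metric used to score the challenge can be computed as below. This is the standard formulation with a log2 discount; the challenge's exact gain and label scheme may differ:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one ranked list of graded relevance labels.

    DCG discounts each gain by log2(rank + 1); NDCG normalises by the
    DCG of the ideally ordered list, so a perfect ranking scores 1.0.
    """
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([2, 0, 1, 0], k=10))  # one mis-ranked document
print(ndcg_at_k([3, 2, 1], k=10))     # already ideal: 1.0
```

The challenge score of 0.80476 is this quantity averaged over the test users' queries.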