Unsupervised, Efficient and Semantic Expertise Retrieval
We introduce an unsupervised discriminative model for the task of retrieving
experts in online document collections. We exclusively employ textual evidence
and avoid explicit feature engineering by learning distributed word
representations in an unsupervised way. We compare our model to
state-of-the-art unsupervised statistical vector space and probabilistic
generative approaches. Our proposed log-linear model achieves the retrieval
performance levels of state-of-the-art document-centric methods with the low
inference cost of so-called profile-centric approaches. It yields a
statistically significant improved ranking over vector space and generative
models in most cases, matching the performance of supervised methods on various
benchmarks. That is, by using solely text we can do as well as methods that
work with external evidence and/or relevance feedback. A contrastive analysis
of rankings produced by discriminative and generative approaches shows that
they have complementary strengths due to the ability of the unsupervised
discriminative model to perform semantic matching.
Comment: WWW2016, Proceedings of the 25th International Conference on World
Wide Web. 201
Modeling Documents as Mixtures of Persons for Expert Finding
In this paper we address the problem of searching for knowledgeable
persons within the enterprise, known as the expert finding (or
expert search) task. We present a probabilistic algorithm based on the assumption
that terms in documents are produced by the people who are mentioned
in them. We represent documents retrieved for a query as mixtures
of candidate experts' language models. We propose two methods for extracting
personal language models, as well as a way of combining
them with other evidence of expertise. Experiments conducted with the
TREC Enterprise collection demonstrate the superiority of our approach
over the best of the existing solutions.
Learning to Rank Academic Experts in the DBLP Dataset
Expert finding is an information retrieval task that is concerned with the
search for the most knowledgeable people with respect to a specific topic, and
the search is based on documents that describe people's activities. The task
involves taking a user query as input and returning a list of people who are
sorted by their level of expertise with respect to the user query. Despite
recent interest in the area, the current state-of-the-art techniques lack
principled approaches for optimally combining different sources of evidence.
This article proposes two frameworks for combining multiple estimators of
expertise. These estimators are derived from textual contents, from
graph-structure of the citation patterns for the community of experts, and from
profile information about the experts. More specifically, this article explores
the use of supervised learning to rank methods, as well as rank aggregation
approaches, for combining all of the estimators of expertise. Several supervised
learning algorithms, which are representative of the pointwise, pairwise and
listwise approaches, were tested, and various state-of-the-art data fusion
techniques were also explored for the rank aggregation framework. Experiments
that were performed on a dataset of academic publications from the Computer
Science domain attest to the adequacy of the proposed approaches.
Comment: Expert Systems, 2013. arXiv admin note: text overlap with
arXiv:1302.041
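The rank aggregation idea in the abstract above can be illustrated with a minimal sketch. It assumes CombSUM with min-max normalization as the fusion rule, which is one standard data fusion technique and not necessarily one the article uses; the function names, candidate names, and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one estimator's scores to [0, 1] so estimators are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tie under this estimator; give each the same weight.
        return {cand: 1.0 for cand in scores}
    return {cand: (s - lo) / (hi - lo) for cand, s in scores.items()}


def comb_sum(estimators):
    """Fuse several expertise estimators by summing normalized scores (CombSUM)."""
    fused = defaultdict(float)
    for scores in estimators:
        for cand, s in min_max_normalize(scores).items():
            fused[cand] += s
    # Highest fused score first; ties are broken alphabetically.
    return sorted(fused, key=lambda cand: (-fused[cand], cand))


# Hypothetical scores from three kinds of evidence (text, citations, profile).
text_scores = {"alice": 2.0, "bob": 1.0, "carol": 0.0}
citation_scores = {"alice": 10.0, "bob": 20.0, "carol": 30.0}
profile_scores = {"alice": 0.9, "bob": 0.5, "carol": 0.1}

ranking = comb_sum([text_scores, citation_scores, profile_scores])
print(ranking)
```

A supervised learning to rank method would instead learn how to weight the estimators from relevance judgments; the unweighted sum here needs no training data.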
Medical WordNet: A new methodology for the construction and validation of information resources for consumer health
A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing project to create a new lexical database called Medical WordNet (MWN), consisting of medically relevant terms used by and intelligible to non-expert subjects and supplemented by a corpus of natural-language sentences that is designed to provide medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted to consumers, and involves two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true on the basis of a rigorous process of validation, the latter of statements which non-experts believe to be true. We summarize the MWN / MFN / MBN project, and describe some of its applications.
SCSMiner: mining social coding sites for software developer recommendation with relevance propagation
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. With the advent of social coding sites, software development has entered a new era of collaborative work. Social coding sites (e.g., GitHub) can integrate social networking and distributed version control in a unified platform to facilitate collaborative development around the world. One unique characteristic of such sites is that the past development experiences of developers provided on the sites convey implicit metrics of a developer's programming capability and expertise, which can be applied in many areas, such as software developer recruitment for IT corporations. Motivated by this intuition, we aim to develop a framework to effectively locate developers with the right coding skills. To achieve this goal, we devise a generative probabilistic expert ranking model upon which consistency among projects is incorporated as graph regularization to enhance the expert ranking, and a relevance propagation perspective is introduced to illustrate it. For evaluation, StackOverflow is leveraged to complement the ground truth of experts. Finally, a prototype system, SCSMiner, which provides an expert search service based on a real-world dataset crawled from GitHub, is implemented and demonstrated.
Integrating multiple document features in language models for expert finding
We argue that expert finding is sensitive to multiple document features in an organizational intranet. These document features include multiple levels of association between experts and a query topic, from the sentence and paragraph levels up to the document level; document authority information such as the PageRank, indegree, and URL length of documents; and internal document structures that indicate the experts' relationship with the content of documents. Our assumption is that expert finding can largely benefit from the incorporation of these document features. However, existing language modeling approaches for expert finding have not sufficiently taken these document features into account. We propose a novel language modeling approach, which integrates multiple document features, for expert finding. Our experiments on two large-scale TREC Enterprise Track datasets, i.e., the W3C and CSIRO datasets, demonstrate that the natures of the two organizational intranets and the two types of expert finding tasks, i.e., key contact finding for CSIRO and knowledgeable person finding for W3C, influence the effectiveness of different document features. Our work provides insights into which document features work for certain types of expert finding tasks, and helps design expert finding strategies that are effective for different scenarios. Our main contribution is to develop an effective formal method for modeling multiple document features in expert finding, and to conduct a systematic investigation of their effects. It is worth noting that our novel approach achieves better results in terms of MAP than previous language model based approaches and the best automatic runs in both the TREC2006 and TREC2007 expert search tasks, respectively.