Unsupervised, Efficient and Semantic Expertise Retrieval
We introduce an unsupervised discriminative model for the task of retrieving
experts in online document collections. We exclusively employ textual evidence
and avoid explicit feature engineering by learning distributed word
representations in an unsupervised way. We compare our model to
state-of-the-art unsupervised statistical vector space and probabilistic
generative approaches. Our proposed log-linear model achieves the retrieval
performance levels of state-of-the-art document-centric methods with the low
inference cost of so-called profile-centric approaches. It yields a
statistically significant improved ranking over vector space and generative
models in most cases, matching the performance of supervised methods on various
benchmarks. That is, by using solely text we can do as well as methods that
work with external evidence and/or relevance feedback. A contrastive analysis
of rankings produced by discriminative and generative approaches shows that
they have complementary strengths due to the ability of the unsupervised
discriminative model to perform semantic matching.
Comment: WWW2016, Proceedings of the 25th International Conference on World
Wide Web. 201
The right expert at the right time and place: From expertise identification to expertise selection
We propose a unified and complete solution for expert finding in organizations, including not only expertise identification, but also expertise selection functionality. The latter includes the use of implicit and explicit preferences of users on meeting each other, as well as localization and planning as important auxiliary processes. We also propose a solution for privacy protection, which is urgently required in view of the huge amount of privacy-sensitive data involved. Various parts are elaborated elsewhere, and we look forward to a realization and usage of the proposed system as a whole.
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the movie's entry
in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Document expansion for image retrieval
Successful information retrieval requires effective matching
between the user's search request and the contents of relevant
documents. Often the request entered by a user may
not use the same topic-relevant terms as the authors of the
documents. One potential approach to address problems
of query-document term mismatch is document expansion
to include additional topically relevant indexing terms in a
document, which may encourage its retrieval when relevant
to queries which do not match its original contents well. We
propose and evaluate a new document expansion method
using external resources. While results of previous research
have been inconclusive in determining the impact of document
expansion on retrieval effectiveness, our method is
shown to work effectively for text-based image retrieval of
short image annotation documents. Our approach uses the
Okapi query expansion algorithm as a method for document
expansion. We further show improved performance can be
achieved by using a "document reduction" approach to include
only the significant terms in a document in the expansion
process. Our experiments on the WikipediaMM task at
ImageCLEF 2008 show an increase of 16.5% in mean average
precision (MAP) compared to a variation of the Okapi BM25 retrieval
model. To compare document expansion with query
expansion, we also test query expansion from an external resource,
which leads to an improvement of 9.84% in MAP over
our baseline. Our conclusion is that document expansion
with document reduction, in combination with query expansion,
produces the overall best retrieval results for short-length
document retrieval. For this image retrieval task, we
also concluded that query expansion from an external resource
does not outperform the document expansion method
- …
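The MAP figures quoted in the abstract above can be made concrete with a small sketch. It assumes binary relevance judgments and computes average precision per query, the mean across queries, and a relative improvement over a baseline. The function names and the toy rankings are hypothetical illustrations, not taken from the paper or from the ImageCLEF evaluation tooling, and reading the reported percentages as relative (rather than absolute) gains is an assumption.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list under binary relevance."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranking, qrels.get(query_id, set()))
        for query_id, ranking in runs.items()
    )
    return total / len(runs)


def relative_improvement(baseline: float, system: float) -> float:
    """Percentage gain of `system` over `baseline` (baseline must be non-zero)."""
    return (system - baseline) / baseline * 100.0


# Hypothetical toy data: two queries, three retrieved items each.
toy_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5", "d6"]}
toy_qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}

if __name__ == "__main__":
    print(mean_average_precision(toy_runs, toy_qrels))
```

For the toy data, query q1 has relevant items at ranks 1 and 3, giving (1/1 + 2/3) / 2, and query q2 has its single relevant item at rank 2, giving 1/2; MAP is the mean of the two values.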