63,653 research outputs found
Unsupervised, Efficient and Semantic Expertise Retrieval
We introduce an unsupervised discriminative model for the task of retrieving
experts in online document collections. We exclusively employ textual evidence
and avoid explicit feature engineering by learning distributed word
representations in an unsupervised way. We compare our model to
state-of-the-art unsupervised statistical vector space and probabilistic
generative approaches. Our proposed log-linear model achieves the retrieval
performance levels of state-of-the-art document-centric methods with the low
inference cost of so-called profile-centric approaches. It yields a
statistically significant improved ranking over vector space and generative
models in most cases, matching the performance of supervised methods on various
benchmarks. That is, by using solely text we can do as well as methods that
work with external evidence and/or relevance feedback. A contrastive analysis
of rankings produced by discriminative and generative approaches shows that
they have complementary strengths due to the ability of the unsupervised
discriminative model to perform semantic matching.Comment: WWW2016, Proceedings of the 25th International Conference on World
Wide Web. 201
Unsupervised Learning via Total Correlation Explanation
Learning by children and animals occurs effortlessly and largely without
obvious supervision. Successes in automating supervised learning have not
translated to the more ambiguous realm of unsupervised learning where goals and
labels are not provided. Barlow (1961) suggested that the signal that brains
leverage for unsupervised learning is dependence, or redundancy, in the sensory
environment. Dependence can be characterized using the information-theoretic
multivariate mutual information measure called total correlation. The principle
of Total Cor-relation Ex-planation (CorEx) is to learn representations of data
that "explain" as much dependence in the data as possible. We review some
manifestations of this principle along with successes in unsupervised learning
problems across diverse domains including human behavior, biology, and
language.Comment: Invited contribution for IJCAI 2017 Early Career Spotlight. 5 pages,
1 figur
The Incremental Multiresolution Matrix Factorization Algorithm
Multiresolution analysis and matrix factorization are foundational tools in
computer vision. In this work, we study the interface between these two
distinct topics and obtain techniques to uncover hierarchical block structure
in symmetric matrices -- an important aspect in the success of many vision
problems. Our new algorithm, the incremental multiresolution matrix
factorization, uncovers such structure one feature at a time, and hence scales
well to large matrices. We describe how this multiscale analysis goes much
farther than what a direct global factorization of the data can identify. We
evaluate the efficacy of the resulting factorizations for relative leveraging
within regression tasks using medical imaging data. We also use the
factorization on representations learned by popular deep networks, providing
evidence of their ability to infer semantic relationships even when they are
not explicitly trained to do so. We show that this algorithm can be used as an
exploratory tool to improve the network architecture, and within numerous other
settings in vision.Comment: Computer Vision and Pattern Recognition (CVPR) 2017, 10 page
- …