Accelerated structured matrix factorization
Matrix factorization exploits the idea that, in complex high-dimensional
data, the actual signal typically lies in lower-dimensional structures. These
lower dimensional objects provide useful insight, with interpretability favored
by sparse structures. Sparsity also acts as a form of regularization and
thus helps avoid over-fitting. By exploiting Bayesian
shrinkage priors, we devise a computationally convenient approach for
high-dimensional matrix factorization. The dependence between row and column
entities is modeled by inducing flexible sparse patterns within factors. The
availability of external information is accounted for in such a way that
structures are allowed while not imposed. Inspired by boosting algorithms, we
pair the proposed approach with a numerical strategy relying on a
sequential inclusion and estimation of low-rank contributions, with a data-driven
stopping rule. Practical advantages of the proposed approach are demonstrated
by means of a simulation study and the analysis of soccer heatmaps obtained
from new generation tracking data.
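The idea of sequentially including low-rank contributions until a data-driven rule says stop can be illustrated with a minimal sketch. This is not the Bayesian shrinkage method of the abstract: it swaps in a plain greedy SVD of the residual, and the function name `sequential_low_rank`, the threshold `tol`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sequential_low_rank(Y, max_rank=10, tol=0.01):
    """Greedily add rank-one terms to approximate Y.

    Stops when the next term would explain less than `tol` of the
    total sum of squares (a simple, hypothetical stopping rule).
    """
    residual = np.asarray(Y, dtype=float).copy()
    total = np.sum(residual ** 2)
    terms = []
    for _ in range(max_rank):
        # The leading singular triplet is the best rank-one fit to the residual.
        U, s, Vt = np.linalg.svd(residual, full_matrices=False)
        gain = s[0] ** 2 / total
        if gain < tol:
            break  # data-driven stop: the new contribution is negligible
        u = U[:, 0] * s[0]
        v = Vt[0]
        terms.append((u, v))
        residual -= np.outer(u, v)
    return terms, residual

# Synthetic example: a rank-3 signal plus small Gaussian noise.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
B = rng.normal(size=(3, 40))
Y = A @ B + 0.01 * rng.normal(size=(50, 40))

terms, residual = sequential_low_rank(Y)
rel_error = np.linalg.norm(residual) / np.linalg.norm(Y)
print(len(terms), rel_error)
```

In this toy setting the loop halts once only noise remains in the residual, so the number of retained terms serves as an estimate of the rank.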
Mining Text and Time Series Data with Applications in Finance
Finance is a field extremely rich in data, and has great need of methods for summarizing and understanding these data. Existing methods of multivariate analysis allow the discovery of structure in time series data but can be difficult to interpret. Often there exists a wealth of text data directly related to the time series. In this thesis it is shown that this text can be exploited to aid interpretation of, and even to improve, the structure uncovered. To this end, two approaches are described and tested. Both serve to uncover structure in the relationship between text and time series data, but do so in very different ways. The first model comes from the field of topic modelling. A novel topic model is developed, closely related to an existing topic model for mixed data. Improved held-out likelihood is demonstrated for this model on a corpus of UK equity market data and the discovered structure is qualitatively examined. To the authors’ knowledge this is the first attempt to combine text and time series data in a single generative topic model. The second method is a simpler, discriminative method based on a low-rank decomposition of time series data with constraints determined by word frequencies in the text data. This is compared to topic modelling using both the equity data and a second corpus comprising foreign exchange rate time series and text describing global macroeconomic sentiments, showing further improvements in held-out likelihood. One example of an application for the inferred structure is also demonstrated: construction of carry trade portfolios. The superior results using this second method serve as a reminder that methodological complexity does not guarantee performance gains.
Personalized Expert Recommendation: Models and Algorithms
Many large-scale information sharing systems including social media systems, question-answering
sites and rating and reviewing applications have been growing rapidly, allowing
millions of human participants to generate and consume information on an unprecedented
scale. To manage the sheer growth of information generation, there comes the need to enable
personalization of information resources for users — to surface high-quality content
and feeds, to provide personally relevant suggestions, and so on. A fundamental task in
creating and supporting user-centered personalization systems is to build rich user profiles
to aid recommendation for better user experience.
Therefore, in this dissertation research, we propose models and algorithms to facilitate
the creation of new crowd-powered personalized information sharing systems. Specifically,
we first give a principled framework to enable personalization of resources so that
information seekers can be matched with customized knowledgeable users based on their
previous historical actions and contextual information; we then focus on creating rich
user models that allow accurate and comprehensive modeling of user profiles for
long-tail users, including discovering user’s known-for profile, user’s opinion bias and user’s
geo-topic profile. In particular, this dissertation research makes two unique contributions:
First, we introduce the problem of personalized expert recommendation and propose
the first principled framework for addressing this problem. To overcome the sparsity issue,
we investigate the use of user’s contextual information that can be exploited to build robust
models of personal expertise, study how spatial preference for personally-valuable expertise
varies across regions, across topics and based on different underlying social communities,
and integrate these different forms of preferences into a matrix factorization-based
personalized expert recommender.
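As a rough illustration of what a matrix factorization-based recommender does at its core, the sketch below fits user and item factors to a handful of observed scores with stochastic gradient descent. It is a generic baseline, not the dissertation's model: it ignores contextual and spatial signals, and the function name `factorize`, the hyperparameters, and the toy scores are all hypothetical.

```python
import numpy as np

def factorize(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit score[u, i] ~ P[u] @ Q[i] on observed (user, item, score) triples."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.normal(size=(n_users, rank))  # user (seeker) factors
    Q = 0.1 * rng.normal(size=(n_items, rank))  # item (expert) factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            pu = P[u].copy()  # keep the old value for the Q update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

# Toy data: (seeker index, expert index, observed preference score).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings, n_users=3, n_items=3)

# Training error on the observed entries.
sq_errors = [(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]
rmse = float(np.sqrt(np.mean(sq_errors)))

# Predicted score for an unobserved pair (seeker 0, expert 2).
prediction = float(P[0] @ Q[2])
print(rmse, prediction)
```

Only observed pairs enter the updates, which is why sparsity hurts: users with few observations get poorly determined factors, and that is the gap additional context is meant to fill.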
Second, to support the personalized recommendation on experts, we focus on modeling
and inferring user profiles in online information sharing systems. In order to tap
the knowledge of the vast majority of users, we provide frameworks and algorithms to accurately
and comprehensively create user models by discovering user’s known-for profile,
user’s opinion bias and user’s geo-topic profile, with each described shortly as follows:
—We develop a probabilistic model called Bayesian Contextual Poisson Factorization
to discover what users are known for by others. Our model considers as input a small fraction
of users whose known-for profiles are already known and the vast majority of users for
whom we have little (or no) information, learns the implicit relationships between users’
known-for profiles and their contextual signals, and finally predicts known-for profiles for
that majority of users.
—We explore user’s topic-sensitive opinion bias, propose a lightweight semi-supervised
system called “BiasWatch” to semi-automatically infer the opinion bias of long-tail users,
and demonstrate how user’s opinion bias can be exploited to recommend other users with
similar opinion in social networks.
— We study how a user’s topical profile varies geo-spatially and how we can model
a user’s geo-spatial known-for profile as the last step in our dissertation toward creating
rich user profiles. We propose a multi-layered Bayesian hierarchical user factorization to
overcome user heterogeneity and an enhanced model to alleviate the sparsity issue by integrating
user contexts into the two-layered hierarchical user model for better representation
of user’s geo-topic preference by others.
Bayesian Methods in Tensor Analysis
Tensors, also known as multidimensional arrays, are useful data structures in
machine learning and statistics. In recent years, Bayesian methods have emerged
as a popular direction for analyzing tensor-valued data since they provide a
convenient way to introduce sparsity into the model and conduct uncertainty
quantification. In this article, we provide an overview of frequentist and
Bayesian methods for solving tensor completion and regression problems, with a
focus on Bayesian methods. We review common Bayesian tensor approaches
including model formulation, prior assignment, posterior computation, and
theoretical properties. We also discuss potential future directions in this
field. Comment: 32 pages, 8 figures, 2 tables
MetaRec: Meta-Learning Meets Recommendation Systems
Artificial neural networks (ANNs) have recently received increasing attention as powerful modeling tools to improve the performance of recommendation systems. Meta-learning, on the other hand, is a paradigm that has resurged in popularity within the broader machine learning community over the past several years. In this thesis, we will explore the intersection of these two domains and work on developing methods for integrating meta-learning to design more accurate and flexible recommendation systems.
In the present work, we propose a meta-learning framework for the design of collaborative filtering methods in recommendation systems, drawing from ideas, models, and solutions from modern approaches in both the meta-learning and recommendation system literature, applying them to recommendation tasks to obtain improved generalization performance.
Our proposed framework, MetaRec, includes and unifies the main state-of-the-art models in recommendation systems, extending them to be flexibly configured and to operate efficiently with limited data. We empirically test the architectures created under our MetaRec framework on several recommendation benchmark datasets using a range of evaluation metrics and find that taking a meta-learning approach to the collaborative filtering problem yields notable gains in predictive performance.