Matrix Factorization Equals Efficient Co-occurrence Representation
Matrix factorization is a simple and effective solution to the recommendation
problem. It has been extensively employed in industry and has attracted
much attention from academia. However, it is unclear what the
low-dimensional matrices represent. We show that matrix factorization can
actually be seen as simultaneously calculating the eigenvectors of the
user-user and item-item sample co-occurrence matrices. We then use insights
from random matrix theory (RMT) to show that picking the top eigenvectors
corresponds to removing sampling noise from user/item co-occurrence matrices.
Therefore, the low-dimensional matrices represent a reduced-noise user and item
co-occurrence space. We also analyze the structure of the top eigenvector and
show that it corresponds to global effects; removing it results in less
popular items being recommended. This increases the diversity of the items
recommended without affecting the accuracy.
Comment: RecSys 2018 LBR
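A minimal sketch of the stated connection, using a plain SVD of a toy interaction matrix rather than learned MF embeddings (the cleanest case in which the factors are exactly the eigenvectors of the co-occurrence matrices); the matrix and rank are illustrative:

```python
import numpy as np

# Toy binary user-item interaction matrix (5 users x 4 items).
R = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [1, 0, 0, 1]], dtype=float)

# SVD: R = U S V^T. The columns of U / rows of Vt are exactly the
# eigenvectors of the user-user (R R^T) and item-item (R^T R)
# co-occurrence matrices, with eigenvalues S**2.
U, S, Vt = np.linalg.svd(R, full_matrices=False)

user_cooc = R @ R.T
item_cooc = R.T @ R

# Verify: co-occurrence matrix times singular vector = eigenvalue * vector.
for k in range(len(S)):
    assert np.allclose(user_cooc @ U[:, k], (S[k] ** 2) * U[:, k])
    assert np.allclose(item_cooc @ Vt[k], (S[k] ** 2) * Vt[k])

# Truncating to the top eigenvectors keeps the low-noise part of the
# co-occurrence structure, per the abstract's RMT argument.
R_denoised = U[:, :2] @ np.diag(S[:2]) @ Vt[:2]
```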
Inter-causal Independence and Heterogeneous Factorization
It is well known that conditional independence can be used to factorize a
joint probability into a multiplication of conditional probabilities. This
paper proposes a constructive definition of inter-causal independence, which
can be used to further factorize a conditional probability. An inference
algorithm is developed, which makes use of both conditional independence and
inter-causal independence to reduce inference complexity in Bayesian networks.
Comment: Appears in Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI 1994)
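The paper's constructive definition is not reproduced in the abstract, but the noisy-OR model is the textbook instance of inter-causal independence; the sketch below (with hypothetical inhibition probabilities) shows how it factorizes a conditional with 2^n entries into n per-cause factors:

```python
from itertools import product

# Noisy-OR: each present cause c_i independently fails to produce the
# effect with inhibition probability q_i, so
#   P(e=0 | c_1..c_n) = prod_i q_i ** c_i
# factorizes into n small per-cause factors instead of one 2**n table.

def noisy_or(q, causes):
    """P(effect=1 | cause states); q[i] = inhibition prob of cause i."""
    p_off = 1.0
    for qi, ci in zip(q, causes):
        if ci:
            p_off *= qi
    return 1.0 - p_off

q = [0.2, 0.5, 0.9]  # hypothetical inhibition probabilities
for causes in product([0, 1], repeat=3):
    print(causes, round(noisy_or(q, causes), 4))
```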
Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion
Despite the popularity of deep learning, structure learning for deep models
remains a relatively under-explored area. In contrast, structure learning has
been studied extensively for probabilistic graphical models (PGMs). In
particular, an efficient algorithm has been developed for learning a class of
tree-structured PGMs called hierarchical latent tree models (HLTMs), where
there is a layer of observed variables at the bottom and multiple layers of
latent variables on top. In this paper, we propose a simple method for learning
the structures of feedforward neural networks (FNNs) based on HLTMs. The idea
is to expand the connections in the tree skeletons from HLTMs and to use the
resulting structures for FNNs. An important characteristic of FNN structures
learned this way is that they are sparse. We present extensive empirical
results to show that, compared with manually tuned standard FNNs, sparse FNNs
learned by our method achieve better or comparable classification performance
with far fewer parameters. They are also more interpretable.
Comment: 7 pages
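A minimal sketch of the mechanism the abstract describes, not the paper's exact expansion rule: given a tree skeleton mapping each latent unit to the lower-level units in its subtree (the `skeleton` dictionary here is hypothetical), build a binary connectivity mask and use it to sparsify a dense layer:

```python
import numpy as np

n_lower, n_upper = 6, 3
# Hypothetical skeleton: upper unit -> lower units it connects to.
skeleton = {0: [0, 1], 1: [2, 3], 2: [4, 5]}

# Binary mask with a 1 only on tree-skeleton edges.
mask = np.zeros((n_upper, n_lower))
for upper, lowers in skeleton.items():
    mask[upper, lowers] = 1.0

rng = np.random.default_rng(0)
W = rng.normal(size=(n_upper, n_lower))
W_sparse = W * mask          # only tree-skeleton edges survive

x = rng.normal(size=n_lower)
h = np.maximum(0.0, W_sparse @ x)   # sparse ReLU layer forward pass
```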
Building Sparse Deep Feedforward Networks using Tree Receptive Fields
Sparse connectivity is an important factor behind the success of
convolutional neural networks and recurrent neural networks. In this paper, we
consider the problem of learning sparse connectivity for feedforward neural
networks (FNNs). The key idea is that a unit should be connected to a small
number of units at the next level below that are strongly correlated. We use
Chow-Liu's algorithm to learn a tree-structured probabilistic model for the
units at the current level, use the tree to identify subsets of units that are
strongly correlated, and introduce a new unit with receptive field over the
subsets. The procedure is repeated on the new units to build multiple layers of
hidden units. The resulting model is called a TRF-net. Empirical results show
that, when compared to dense FNNs, TRF-nets achieve better or comparable
classification performance with far fewer parameters and sparser structures.
They are also more interpretable.
Comment: International Joint Conference on Artificial Intelligence 2018
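The Chow-Liu step named in the abstract is standard: estimate pairwise mutual information from samples, then take a maximum-weight spanning tree. A sketch for binary units (the synthetic data and sample size are illustrative):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mutual_info(x, y, eps=1e-12):
    """Empirical mutual information between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = np.mean((x == a) & (y == b)) + eps
            px, py = np.mean(x == a) + eps, np.mean(y == b) + eps
            mi += pxy * np.log(pxy / (px * py))
    return mi

rng = np.random.default_rng(0)
data = (rng.random((500, 4)) < 0.5).astype(int)
data[:, 1] = data[:, 0] ^ (rng.random(500) < 0.1)  # make units 0,1 correlated

d = data.shape[1]
mi = np.zeros((d, d))
for i in range(d):
    for j in range(i + 1, d):
        mi[i, j] = mutual_info(data[:, i], data[:, j])

# Max-weight spanning tree = minimum spanning tree on negated MI;
# its edges give the tree-structured model over the units.
tree = minimum_spanning_tree(-mi)
edges = list(zip(*tree.nonzero()))
print(edges)  # strongly correlated units end up adjacent in the tree
```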
Solving Asymmetric Decision Problems with Influence Diagrams
While influence diagrams have many advantages as a representation framework
for Bayesian decision problems, they have a serious drawback in handling
asymmetric decision problems. To be represented in an influence diagram, an
asymmetric decision problem must be symmetrized. A considerable amount of
unnecessary computation may be involved when a symmetrized influence diagram is
evaluated by conventional algorithms. In this paper we present an approach for
avoiding such unnecessary computation in influence diagram evaluation.
Comment: Appears in Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI 1994)
Incremental computation of the value of perfect information in stepwise-decomposable influence diagrams
To determine the value of perfect information in an influence diagram, one
needs first to modify the diagram to reflect the change in information
availability, and then to compute the optimal expected values of both the
original diagram and the modified diagram. The value of perfect information is
the difference between the two optimal expected values. This paper is about how
to speed up the computation of the optimal expected value of the modified
diagram by making use of the intermediate computation results obtained when
computing the optimal expected value of the original diagram.
Comment: Appears in Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI 1993)
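A tiny worked example of the quantity being computed (the probabilities and utilities are hypothetical): the value of perfect information is the difference between the optimal expected value with the state revealed before acting and without it.

```python
import numpy as np

p_s = np.array([0.7, 0.3])             # P(S) over a binary state
utility = np.array([[100, -40],        # U[a, s] for two actions
                    [ 20,  30]])

# Original diagram: commit to one action before seeing S.
ev_original = max(utility @ p_s)                 # = 58.0

# Modified diagram: observe S perfectly, pick the best action per state.
ev_modified = np.dot(utility.max(axis=0), p_s)   # = 79.0

vpi = ev_modified - ev_original                  # = 21.0
print(ev_original, ev_modified, vpi)
```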
Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes
Most exact algorithms for general partially observable Markov decision
processes (POMDPs) use a form of dynamic programming in which a
piecewise-linear and convex representation of one value function is transformed
into another. We examine variations of the "incremental pruning" method for
solving this problem and compare them to earlier algorithms from theoretical
and empirical perspectives. We find that incremental pruning is presently the
most efficient exact method for solving POMDPs.
Comment: Appears in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI 1997)
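The piecewise-linear convex representation mentioned above is a set of alpha-vectors with V(b) = max_i alpha_i . b. A sketch of the simplest pruning step, pointwise dominance; the complete method (and incremental pruning itself) also removes vectors via linear programs, which is omitted here:

```python
import numpy as np

def prune_pointwise(alphas):
    """Drop alpha-vectors that are pointwise dominated by another vector."""
    kept = []
    for i, a in enumerate(alphas):
        dominated = any(np.all(other >= a) and np.any(other > a)
                        for j, other in enumerate(alphas) if j != i)
        if not dominated:
            kept.append(a)
    return kept

alphas = [np.array([1.0, 0.0]),
          np.array([0.0, 1.0]),
          np.array([0.4, 0.4]),   # survives here, but never optimal on the
                                  # belief simplex; LP pruning would remove it
          np.array([0.3, 0.3])]   # pointwise-dominated by [0.4, 0.4]
print(prune_pointwise(alphas))
```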
Value Iteration Working With Belief Subsets
Value iteration is a popular algorithm for solving POMDPs. However, it is inefficient in practice. The primary reason is that it needs to conduct value updates for all the belief states in the (continuous) belief space. In this paper, we study value iteration working with a subset of the belief space, i.e., it conducts value updates only for belief states in the subset. We present a way to select the belief subset and describe an algorithm to conduct value iteration over it. The algorithm is attractive in that it works with a belief subset but still retains the quality of the generated values. Given a POMDP, we show how to determine a priori whether the selected subset is a proper subset of the belief space. If this is the case, the algorithm gains advantages in both space (representation) and time (efficiency).
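For concreteness, a sketch of value updates restricted to a finite belief subset in the style of point-based methods; this is not the paper's subset-selection scheme, and the random POMDP and belief points are illustrative:

```python
import numpy as np

def backup(b, alphas, T, Z, R, gamma=0.95):
    """One point-based value update at belief point b.
    T[a,s,s'] transitions, Z[a,s',o] observations, R[a,s] rewards."""
    best_vec, best_val = None, -np.inf
    nA, nO = T.shape[0], Z.shape[2]
    for a in range(nA):
        vec = R[a].astype(float).copy()
        for o in range(nO):
            # Project each alpha back through (a, o): g[s] = sum_s' T Z alpha
            g = [T[a] @ (Z[a, :, o] * al) for al in alphas]
            vec += gamma * max(g, key=lambda x: b @ x)
        if b @ vec > best_val:
            best_vec, best_val = vec, b @ vec
    return best_vec

rng = np.random.default_rng(0)
nS, nA, nO = 2, 2, 2
T = rng.dirichlet(np.ones(nS), size=(nA, nS))   # rows sum to 1
Z = rng.dirichlet(np.ones(nO), size=(nA, nS))
R = rng.normal(size=(nA, nS))

B = [np.array([p, 1 - p]) for p in (0.0, 0.25, 0.5, 0.75, 1.0)]  # belief subset
alphas = [np.zeros(nS)]
for _ in range(20):                 # value iteration over the subset only
    alphas = [backup(b, alphas, T, Z, R) for b in B]
```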
Using Taste Groups for Collaborative Filtering
Implicit feedback is the simplest form of user feedback that can be used for
item recommendation. It is easy to collect and domain independent. However,
there is a lack of negative examples. Existing works circumvent this problem by
making various assumptions regarding the unconsumed items, which fail to hold
when the user did not consume an item because she was unaware of it. In this
paper, we propose a novel method for addressing the lack of negative
examples in implicit feedback. The motivation is that if there is a large group
of users who share the same taste and none of them consumed an item, then it is
highly likely that the item is irrelevant to this taste. We use Hierarchical
Latent Tree Analysis (HLTA) to identify taste-based user groups and make
recommendations for a user based on her memberships in the groups.
Comment: RecSys 2018 LBRS. arXiv admin note: substantial text overlap with arXiv:1704.0188
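A sketch of the key heuristic stated in the abstract; the consumption matrix, group memberships, and size threshold are illustrative (in the paper the groups come from HLTA):

```python
import numpy as np

R = np.array([[1, 0, 1, 0],    # user-item consumption matrix (implicit feedback)
              [1, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 1, 0, 1]])

# Hypothetical taste groups: group name -> member user indices.
groups = {"taste_A": [0, 1, 2], "taste_B": [3, 4]}

min_group_size = 3
for name, members in groups.items():
    if len(members) < min_group_size:
        continue                       # small groups give weak evidence
    consumed = R[members].sum(axis=0)
    negatives = np.where(consumed == 0)[0]
    # Items no one in a large taste group consumed are treated as
    # likely negatives for members of that group.
    print(name, "-> likely irrelevant items:", negatives)
```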
Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering
We investigate a variant of variational autoencoders where there is a
superstructure of discrete latent variables on top of the latent features. In
general, our superstructure is a tree structure of multiple super latent
variables and it is automatically learned from data. When there is only one
latent variable in the superstructure, our model reduces to one that assumes
the latent features to be generated from a Gaussian mixture model. We call our
model the latent tree variational autoencoder (LTVAE). Whereas previous deep
learning methods for clustering produce only one partition of data, LTVAE
produces multiple partitions of data, each being given by one super latent
variable. This is desirable because high dimensional data usually have many
different natural facets and can be meaningfully partitioned in multiple ways.
Comment: Published in ICLR 2019
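A sketch of the single-super-latent-variable case the abstract mentions, where the latent features reduce to a Gaussian mixture: one component per value of the discrete variable y. The parameters below are illustrative; in LTVAE they are learned jointly with the autoencoder, and y generalizes to a tree of super latent variables.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 3, 2                          # mixture components, latent dimension
pi = np.array([0.5, 0.3, 0.2])       # P(y)
mu = rng.normal(scale=3.0, size=(K, d))
sigma = np.full((K, d), 0.5)

def sample_latent(n):
    y = rng.choice(K, size=n, p=pi)                 # discrete super latent
    z = mu[y] + sigma[y] * rng.normal(size=(n, d))  # z | y ~ N(mu_y, sigma_y^2)
    return y, z

y, z = sample_latent(1000)
# Each value of y induces one cluster over the latent features z;
# a decoder network would then map z to the observed data x.
```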