Top-N Recommender System via Matrix Completion
Top-N recommender systems have been investigated widely both in industry and
academia. However, the recommendation quality is far from satisfactory. In this
paper, we propose a simple yet promising algorithm. We fill the user-item
matrix based on a low-rank assumption and simultaneously keep the original
information. To do that, a nonconvex rank relaxation rather than the nuclear
norm is adopted to provide a better rank approximation and an efficient
optimization strategy is designed. A comprehensive set of experiments on real
datasets demonstrates that our method pushes the accuracy of Top-N
recommendation to a new level. Comment: AAAI 201
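The low-rank fill described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: it swaps in plain singular-value soft-thresholding (a nuclear-norm heuristic) for the nonconvex rank relaxation the abstract describes. The names `complete_low_rank`, `top_n`, and `tau`, and the toy rating matrix, are hypothetical. Observed ratings are written back after every step, which mirrors the idea of keeping the original information.

```python
import numpy as np

def complete_low_rank(R, mask, tau=1.0, n_iter=100):
    """Fill unobserved entries of R (where mask is False) with a low-rank estimate.

    Assumption: singular-value soft-thresholding stands in for the paper's
    nonconvex rank relaxation.
    """
    X = np.where(mask, R, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Shrink the singular values to push the estimate toward low rank.
        low_rank = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Keep the observed ratings exactly as given.
        X = np.where(mask, R, low_rank)
    return X

def top_n(X, mask, user, n):
    """Return the n unobserved items with the highest predicted score."""
    scores = np.where(mask[user], -np.inf, X[user])
    return np.argsort(-scores)[:n]

# Toy data (hypothetical): 6 users, 8 items, ratings 1 to 5.
rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(6, 8)).astype(float)
mask = rng.random(R.shape) < 0.6
mask[0, :3] = False  # make sure user 0 has unrated items to recommend
X = complete_low_rank(R, mask)
recommended = top_n(X, mask, user=0, n=2)
```

The threshold `tau` controls how aggressively small singular values are discarded; a nonconvex penalty would shrink large singular values less than this soft threshold does.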
ECONOMIC PERFORMANCE THROUGH TIME: A GENERAL EQUILIBRIUM MODEL
This paper presents a simple general equilibrium model of economic performance through time. The model incorporates four main determinants of economic performance: technology, capital investment, the division of labor, and institutions. It demonstrates that growth is not automatic even with technological progress. To maintain economic growth, new technologies must be implemented continuously through capital investment. The model also shows that institutional improvement promotes the social division of labor, which is an independent source of economic growth.
Keywords: economic growth, savings and investment, transaction costs, division of labor, financial and production institutions
Fast k-means based on KNN Graph
In the era of big data, k-means clustering has been widely adopted as a basic
processing tool in various contexts. However, its computational cost can be
prohibitively high when the data size and the number of clusters are large. It
is well known that the processing bottleneck of k-means lies in seeking the
closest centroid in each iteration. In this paper, a novel solution to the
scalability issue of k-means is presented. In the proposal, k-means is
supported by an approximate k-nearest neighbors graph. In each k-means
iteration, a data sample is compared only to the clusters in which its nearest
neighbors reside. Since the number of nearest neighbors considered is much
smaller than k, the processing cost of this step becomes minor and independent
of k. The processing bottleneck is therefore overcome. Most interestingly, the
k-nearest neighbor graph is constructed by iteratively calling the fast
k-means itself. Compared with existing fast k-means variants, the proposed
algorithm achieves a speed-up of hundreds to thousands of times while
maintaining high clustering quality. When tested on 10 million 512-dimensional
data points, it takes only 5.2 hours to produce 1 million clusters. In
contrast, traditional k-means would take 3 years to complete clustering at the
same scale.
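The restricted assignment step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes an exact brute-force neighbor graph in place of the approximate one (the abstract builds its graph by calling the fast k-means itself), and it uses one full assignment pass to obtain starting labels. The names `knn_graph` and `graph_kmeans` and the two-blob data are hypothetical.

```python
import numpy as np

def knn_graph(X, n_neighbors):
    """Exact brute-force neighbors; a stand-in for the approximate graph."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]

def graph_kmeans(X, init_centroids, neighbors, n_iter=10):
    centroids = np.array(init_centroids, dtype=float)
    k = len(centroids)
    # Assumption: one full assignment pass to obtain starting labels.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    labels = d2.argmin(axis=1)
    for _ in range(n_iter):
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
        # Restricted assignment: compare a sample only with the clusters
        # that hold its nearest neighbors (plus its current cluster).
        for i in range(len(X)):
            candidates = np.unique(np.append(labels[neighbors[i]], labels[i]))
            d2_i = ((centroids[candidates] - X[i]) ** 2).sum(axis=1)
            labels[i] = candidates[d2_i.argmin()]
    return labels, centroids

# Toy data (hypothetical): two well-separated blobs.
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.5, size=(50, 2))
blob_b = rng.normal(10.0, 0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])
neighbors = knn_graph(X, n_neighbors=5)
labels, centroids = graph_kmeans(X, X[[0, 99]], neighbors)
```

In the assignment loop the number of distance computations per sample is bounded by the neighbor count plus one, regardless of k, which is the property the abstract relies on.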
Variational Hamiltonian Monte Carlo via Score Matching
Traditionally, the field of computational Bayesian statistics has been
divided into two main subfields: variational methods and Markov chain Monte
Carlo (MCMC). In recent years, however, several methods have been proposed
based on combining variational Bayesian inference and MCMC simulation in order
to improve their overall accuracy and computational efficiency. This marriage
of fast evaluation and flexible approximation provides a promising means of
designing scalable Bayesian inference methods. In this paper, we explore the
possibility of incorporating variational approximation into a state-of-the-art
MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required gradient
computation in the simulation of Hamiltonian flow, which is the bottleneck for
many applications of HMC in big data problems. To this end, we use a
free-form approximation induced by a fast and flexible surrogate function
based on single-hidden-layer feedforward neural networks. The surrogate
provides sufficiently accurate approximation while allowing for fast
exploration of parameter space, resulting in an efficient approximate inference
algorithm. We demonstrate the advantages of our method on both synthetic and
real data problems.
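One way to read the surrogate construction in this abstract is as a fit in which the surrogate's gradient is matched to the target's gradient at a set of sampled points. The sketch below follows that reading under simplifying assumptions that are not taken from the paper: the hidden-layer weights are fixed at random values, the output weights are solved in closed form by ridge regression, and the activation is softplus. The name `fit_surrogate_gradient` and the quadratic toy target are hypothetical.

```python
import numpy as np

def fit_surrogate_gradient(Q, G, n_hidden=50, ridge=1e-3, seed=0):
    """Fit V(q) = sum_j w_j * softplus(a_j . q + b_j) so that grad V ~ G on Q.

    Assumptions: random fixed hidden weights (A, b) and a ridge-regularized
    least-squares solve for the output weights w.
    """
    rng = np.random.default_rng(seed)
    d = Q.shape[1]
    A = rng.normal(size=(n_hidden, d))
    b = rng.normal(size=n_hidden)
    # The derivative of softplus is the logistic sigmoid.
    S = 1.0 / (1.0 + np.exp(-(Q @ A.T + b)))            # shape (n, n_hidden)
    # grad V(q_i)[c] = sum_j w_j * S[i, j] * A[j, c]; stack rows over (i, c).
    D = (S[:, None, :] * A.T[None, :, :]).reshape(-1, n_hidden)
    w = np.linalg.solve(D.T @ D + ridge * np.eye(n_hidden), D.T @ G.reshape(-1))

    def grad_V(q):
        s = 1.0 / (1.0 + np.exp(-(A @ q + b)))
        return A.T @ (w * s)

    return grad_V

# Toy target (hypothetical): U(q) = 0.5 * ||q||^2, whose gradient is q.
rng = np.random.default_rng(1)
Q = rng.normal(size=(300, 2))
G = Q.copy()
grad_V = fit_surrogate_gradient(Q, G)
pred = np.array([grad_V(q) for q in Q])
```

Once fitted, `grad_V` costs a single small matrix product per call, which is where the savings over a full-data gradient would come from.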
Hamiltonian Monte Carlo Acceleration Using Surrogate Functions with Random Bases
For big data analysis, the high computational cost of Bayesian methods often
limits their application in practice. In recent years, there have been many
attempts to improve the computational efficiency of Bayesian inference. Here we
propose an efficient and scalable computational technique for a
state-of-the-art Markov chain Monte Carlo (MCMC) method, namely, Hamiltonian
Monte Carlo (HMC). The key idea is to explore and exploit the structure and
regularity in parameter space for the underlying probabilistic model to
construct an effective approximation of its geometric properties. To this end,
we build a surrogate function to approximate the target distribution using
properly chosen random bases and an efficient optimization process. The
resulting method provides a flexible, scalable, and efficient sampling
algorithm, which converges to the correct target distribution. We show that by
choosing the basis functions and optimization process differently, our method
can be related to other approaches for the construction of surrogate functions
such as generalized additive models or Gaussian process models. Experiments
based on simulated and real data show that our approach leads to substantially
more efficient sampling algorithms compared to existing state-of-the-art
methods.
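The claim that a surrogate-driven sampler can still converge to the correct target can be illustrated with a short sketch. It assumes the standard construction in which the leapfrog trajectory uses the cheap surrogate gradient while the accept/reject step uses the true potential. The leapfrog map remains reversible and volume-preserving for any gradient function, so the Metropolis correction keeps the target unchanged. The names `leapfrog` and `surrogate_hmc`, the Gaussian target, and the deliberately imperfect gradient `0.9 * q` are all hypothetical stand-ins, not the paper's random-basis surrogate.

```python
import numpy as np

def leapfrog(q, p, grad, step, n_steps):
    """Leapfrog integration driven by an arbitrary gradient function."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad(q)
    for _ in range(n_steps - 1):
        q += step * p
        p -= step * grad(q)
    q += step * p
    p -= 0.5 * step * grad(q)
    return q, p

def surrogate_hmc(U, grad_surrogate, q0, n_samples, step=0.1, n_steps=10, seed=0):
    """HMC whose trajectories use grad_surrogate and whose test uses the true U."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples, accepted = [], 0
    for _ in range(n_samples):
        p = rng.normal(size=q.shape)
        q_new, p_new = leapfrog(q, p, grad_surrogate, step, n_steps)
        # Metropolis correction with the true Hamiltonian.
        h_old = U(q) + 0.5 * p @ p
        h_new = U(q_new) + 0.5 * p_new @ p_new
        if np.log(rng.random()) < h_old - h_new:
            q, accepted = q_new, accepted + 1
        samples.append(q.copy())
    return np.array(samples), accepted / n_samples

# Toy target (hypothetical): standard Gaussian, U(q) = 0.5 * ||q||^2.
U = lambda q: 0.5 * q @ q
grad_surrogate = lambda q: 0.9 * q  # imperfect stand-in for a fitted surrogate
samples, rate = surrogate_hmc(U, grad_surrogate, np.zeros(2), 200)
```

A poorer surrogate does not bias the samples in this scheme; it only lowers the acceptance rate, so surrogate quality trades off against sampling efficiency.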