2,252 research outputs found
Estimating ensemble flows on a hidden Markov chain
We propose a new framework to estimate the evolution of an ensemble of
indistinguishable agents on a hidden Markov chain using only aggregate output
data. This work can be viewed as an extension of the recent developments in
optimal mass transport and Schrödinger bridges to the finite state space
hidden Markov chain setting. The flow of the ensemble is estimated by solving a
maximum likelihood problem, which has a convex formulation at the
infinite-particle limit, and we develop a fast numerical algorithm for it. We
illustrate in two numerical examples how this framework can be used to track
the flow of identical and indistinguishable dynamical systems.
Comment: 8 pages, 4 figures
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
Generative Modeling and Inference in Directed and Undirected Neural Networks
Generative modeling and inference are two broad categories in unsupervised learning whose goals are to answer the following questions, respectively: (1) Given a dataset, how do we (either implicitly or explicitly) model the underlying probability distribution from which the data came and draw samples from that distribution? (2) How can we learn an underlying abstract representation of the data? In this dissertation we provide three studies that each, in a different way, improve upon specific generative modeling and inference techniques. First, we develop a state-of-the-art estimator of a generic probability distribution's partition function, or normalizing constant, during simulated tempering. We then apply our estimator to the specific case of training undirected probabilistic graphical models and find our method able to track log-likelihoods during training at essentially no extra computational cost. We then shift our focus to variational inference in directed probabilistic graphical models (Bayesian networks) for generative modeling and inference. Second, we generalize the aggregate prior distribution to decouple the variational and generative models, which gives the model greater flexibility, and find improvements in the model's log-likelihood of test data as well as a better latent representation. Finally, we study the variational loss function and argue that, under a typical architecture, the data-dependent term of the gradient decays to zero as the latent space dimensionality increases. We use this result to propose a simple modification to random weight initialization and show that in certain models the modification gives rise to substantial improvement in training convergence time. Together, these results improve the quantitative performance of popular generative modeling and inference models in addition to furthering our understanding of them.
Rank Centrality: Ranking from Pair-wise Comparisons
The question of aggregating pair-wise comparisons to obtain a global ranking
over a collection of objects has been of interest for a very long time: be it
ranking of online gamers (e.g. MSR's TrueSkill system) and chess players,
aggregating social opinions, or deciding which product to sell based on
transactions. In most settings, in addition to obtaining a ranking, finding
'scores' for each object (e.g. player's rating) is of interest for
understanding the intensity of the preferences.
In this paper, we propose Rank Centrality, an iterative rank aggregation
algorithm for discovering scores for objects (or items) from pair-wise
comparisons. The algorithm has a natural random walk interpretation over the
graph of objects with an edge present between a pair of objects if they are
compared; the score, which we call Rank Centrality, of an object turns out to
be its stationary probability under this random walk. To study the efficacy of
the algorithm, we consider the popular Bradley-Terry-Luce (BTL) model
(equivalent to the Multinomial Logit (MNL) for pair-wise comparisons) in which
each object has an associated score which determines the probabilistic outcomes
of pair-wise comparisons between objects. In terms of the pair-wise marginal
probabilities, which are the main subject of this paper, the MNL model and the
BTL model are identical. We bound the finite sample error rates between the
scores assumed by the BTL model and those estimated by our algorithm. In
particular, the number of samples required to learn the score well with high
probability depends on the structure of the comparison graph. When the
Laplacian of the comparison graph has a strictly positive spectral gap, e.g.
each item is compared to a subset of randomly chosen items, this leads to
dependence on the number of samples that is nearly order-optimal.
Comment: 45 pages, 3 figures
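The random walk interpretation described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' reference implementation. It assumes the commonly described construction in which the walk moves from item i to a compared item j with probability equal to the fraction of i-versus-j comparisons that j won, divided by the maximum degree of the comparison graph, with the remaining probability kept as a self-loop. The function name `rank_centrality`, the `wins` matrix layout, and the power-iteration stopping rule are all assumptions made for this example.

```python
import numpy as np

def rank_centrality(wins, n_iter=10_000, tol=1e-12):
    """Score items from pairwise comparisons (illustrative sketch).

    wins[i][j] is the number of times item i beat item j (hypothetical layout).
    Returns the stationary distribution of a random walk on the comparison
    graph, which serves as the score vector.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]

    totals = wins + wins.T               # comparisons made per pair
    compared = totals > 0                # edges of the comparison graph
    np.fill_diagonal(compared, False)    # no self-comparisons

    # frac[i, j] = fraction of i-vs-j comparisons won by j
    frac = np.zeros((n, n))
    frac[compared] = wins.T[compared] / totals[compared]

    # Dividing by the maximum degree keeps every row sum at most 1
    d_max = max(int(compared.sum(axis=1).max()), 1)
    P = frac / d_max
    P[np.diag_indices(n)] = 1.0 - P.sum(axis=1)   # self-loop takes the rest

    # Power iteration for the stationary distribution (pi = pi P)
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        nxt = pi @ P
        converged = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if converged:
            break
    return pi / pi.sum()

# Example: item 0 wins most of its comparisons, item 2 the fewest
scores = rank_centrality([[0, 8, 9], [2, 0, 7], [1, 3, 0]])
print(scores)
```

Under this construction, if the win fractions exactly match BTL probabilities w_j / (w_i + w_j), the chain is reversible and its stationary distribution is proportional to the BTL weights w, which is why the stationary probabilities can be read as scores. The sketch assumes a connected comparison graph; for a disconnected graph the stationary distribution is not unique.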
Part-time Bayesians: incentives and behavioral heterogeneity in belief updating
Decisions in management and finance rely on information that often includes win-lose feedback (e.g., gains and losses, success and failure). Simple reinforcement then suggests blindly repeating choices if they led to success in the past and changing them otherwise, which might conflict with Bayesian updating of beliefs. We use finite mixture models and hidden Markov models, adapted from machine learning, to uncover behavioral heterogeneity in the reliance on different behavioral rules across and within individuals in a belief-updating experiment. Most decision makers rely both on Bayesian updating and reinforcement. Paradoxically, an increase in incentives increases the reliance on reinforcement because the win-lose cues become more salient.