Gibbs Max-margin Topic Models with Data Augmentation
Max-margin learning is a powerful approach to building classifiers and
structured output predictors. Recent work on max-margin supervised topic models
has successfully integrated it with Bayesian topic models to discover
discriminative latent semantic structures and make accurate predictions for
unseen testing data. However, the resulting learning problems are usually hard
to solve because of the non-smoothness of the margin loss. Existing approaches
to building max-margin supervised topic models rely on an iterative procedure
to solve multiple latent SVM subproblems with additional mean-field assumptions
on the desired posterior distributions. This paper presents an alternative
approach by defining a new max-margin loss. Namely, we present Gibbs max-margin
supervised topic models, a latent variable Gibbs classifier to discover hidden
topic representations for various tasks, including classification, regression
and multi-task learning. Gibbs max-margin supervised topic models minimize an
expected margin loss, which is an upper bound of the existing margin loss
derived from an expected prediction rule. By introducing augmented variables
and integrating out the Dirichlet variables analytically by conjugacy, we
develop simple Gibbs sampling algorithms with no restricting assumptions and no
need to solve SVM subproblems. Furthermore, each step of the
"augment-and-collapse" Gibbs sampling algorithms has an analytical conditional
distribution, from which samples can be easily drawn. Experimental results
demonstrate significant improvements on time efficiency. The classification
performance is also significantly improved over competitors on binary,
multi-class and multi-label classification tasks.
Comment: 35 pages
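A minimal sketch of the augmentation draw at the core of the
"augment-and-collapse" sampler described above, assuming the
inverse-Gaussian augmentation of the hinge loss; the function name, the
numerical guard, and the example values are ours, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_augmented_lambda(zeta, c):
    """Draw the per-document augmentation variable lambda_d.

    zeta : margin residual for document d (ell - y_d * f(eta, z_d))
    c    : regularization constant of the expected hinge loss

    The conditional of 1/lambda_d is inverse Gaussian with mean
    1 / (c * |zeta|), so lambda_d has a closed-form draw -- one of the
    "analytical conditional distributions" the abstract refers to.
    """
    mean = 1.0 / (c * abs(zeta) + 1e-12)  # guard against zeta == 0
    inv_lam = rng.wald(mean, 1.0)         # inverse-Gaussian sample
    return 1.0 / inv_lam

# Example: one augmentation draw for a document with margin slack 0.3.
lam = sample_augmented_lambda(zeta=0.3, c=1.0)
```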
A novel dynamic asset allocation system using Feature Saliency Hidden Markov models for smart beta investing
The financial crisis of 2008 generated interest in more transparent,
rules-based strategies for portfolio construction, with smart beta strategies
emerging as a trend among institutional investors. While they perform well in
the long run, these strategies often suffer from severe short-term drawdown
(peak-to-trough decline) with fluctuating performance across cycles. To address
cyclicality and underperformance, we build a dynamic asset allocation system
using Hidden Markov Models (HMMs). We test our system across multiple
combinations of smart beta strategies and the resulting portfolios show an
improvement in risk-adjusted returns, especially on more return-oriented
portfolios (up to 50% in excess of market annually). In addition, we propose
a novel smart beta allocation system based on the Feature Saliency HMM (FSHMM)
algorithm that performs feature selection simultaneously with the training of
the HMM, to improve regime identification. We evaluate our systematic trading
system with real-life assets using MSCI indices; further, the results (up to
60% in excess of market annually) show a model performance improvement with
respect to portfolios built using full-feature HMMs.
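A minimal sketch of HMM-driven regime switching for a single strategy
sleeve, using hmmlearn's GaussianHMM as a stand-in for the paper's FSHMM
(which additionally learns per-feature saliency during training); the
synthetic returns and the two-regime allocation rule are illustrative
assumptions, not the paper's system.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(42)
returns = rng.normal(0.0005, 0.01, size=(1000, 1))  # synthetic daily returns

# Fit a two-state HMM and read off the inferred regime path.
model = GaussianHMM(n_components=2, covariance_type="full", n_iter=100)
model.fit(returns)
regimes = model.predict(returns)

# Treat the low-volatility state as "risk-on": hold the smart beta
# strategy in calm regimes and de-risk to cash otherwise.
vols = [np.sqrt(model.covars_[k].item()) for k in range(2)]
risk_on = int(np.argmin(vols))
weights = np.where(regimes == risk_on, 1.0, 0.0)  # strategy weight per day
```

In a real backtest the regime would be inferred on a rolling,
out-of-sample basis rather than in-sample as in this toy example.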
To go deep or wide in learning?
To achieve acceptable performance for AI tasks, one can either use
sophisticated feature extraction methods as the first layer in a two-layered
supervised learning model, or learn the features directly using a deep
(multi-layered) model. While the first approach is very problem-specific, the
second approach has computational overheads in learning multiple layers and
fine-tuning of the model. In this paper, we propose an approach called wide
learning based on arc-cosine kernels, that learns a single layer of infinite
width. We propose exact and inexact learning strategies for wide learning and
show that wide learning with a single layer outperforms both single-layer and
deep architectures of finite width on several benchmark datasets.
Comment: 9 pages, 1 figure, accepted for publication in the Seventeenth
International Conference on Artificial Intelligence and Statistics
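A minimal sketch of the degree-0 and degree-1 arc-cosine kernels (Cho and
Saul, 2009) that underlie the infinite-width single layer described above;
feeding the resulting Gram matrix to a standard kernel machine is one
plausible reading of the exact strategy, not the authors' implementation.

```python
import numpy as np

def arc_cosine_kernel(X, Y, degree=1):
    """Gram matrix of the arc-cosine kernel between rows of X and Y."""
    norm_x = np.linalg.norm(X, axis=1, keepdims=True)
    norm_y = np.linalg.norm(Y, axis=1, keepdims=True)
    cos = np.clip((X @ Y.T) / (norm_x * norm_y.T), -1.0, 1.0)
    theta = np.arccos(cos)
    if degree == 0:                       # infinite layer of step units
        return 1.0 - theta / np.pi
    if degree == 1:                       # infinite layer of ReLU units
        J = np.sin(theta) + (np.pi - theta) * np.cos(theta)
        return (norm_x * norm_y.T) * J / np.pi
    raise NotImplementedError("only degrees 0 and 1 are sketched here")

# Example: feed the Gram matrix to any kernel machine, e.g. an SVM.
X = np.random.default_rng(1).normal(size=(5, 3))
K = arc_cosine_kernel(X, X)
```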
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It
generates a sequence of model states and parameters by alternating parameter
perturbations with state maximizations, where the sequence of states can be
interpreted as "samples" from an associated MRF model. Herding differs from
maximum likelihood estimation in that the sequence of parameters does not
converge to a fixed point and differs from an MCMC posterior sampling approach
in that the sequence of states is generated deterministically. Herding may be
interpreted as a"perturb and map" method where the parameter perturbations are
generated using a deterministic nonlinear dynamical system rather than randomly
from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence
rate of the controlled moments may be attributed to edge-of-chaos dynamics. The
herding algorithm can also be generalized to models with latent variables and
to a discriminative learning setting. The perceptron cycling theorem ensures
that the fast moment matching property is preserved in the more general
framework.
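A minimal sketch of the basic herding recursion described above,
alternating a state maximization with a parameter perturbation; the binary
state space, brute-force argmax, and feature map are illustrative
assumptions, not the chapter's code.

```python
import itertools
import numpy as np

def herd(phi, target_moments, n_states, n_steps=100):
    """Deterministically generate "samples" whose empirical moments
    converge to target_moments at the fast O(1/T) rate."""
    w = target_moments.astype(float).copy()  # initialize weights at moments
    states = [np.array(s) for s in itertools.product([0, 1], repeat=n_states)]
    samples = []
    for _ in range(n_steps):
        # State maximization: pick the state most aligned with w.
        s = max(states, key=lambda st: float(w @ phi(st)))
        samples.append(s)
        # Parameter perturbation: move w toward the unmet moments.
        w += target_moments - phi(s)
    return samples

# Example: match the first moments of three binary variables.
phi = lambda s: s.astype(float)
seq = herd(phi, target_moments=np.array([0.7, 0.2, 0.5]), n_states=3)
```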