5,733 research outputs found

    Gibbs Max-margin Topic Models with Data Augmentation

    Full text link
    Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for unseen testing data. However, the resulting learning problems are usually hard to solve because of the non-smoothness of the margin loss. Existing approaches to building max-margin supervised topic models rely on an iterative procedure to solve multiple latent SVM subproblems with additional mean-field assumptions on the desired posterior distributions. This paper presents an alternative approach by defining a new max-margin loss. Namely, we present Gibbs max-margin supervised topic models, a latent variable Gibbs classifier to discover hidden topic representations for various tasks, including classification, regression and multi-task learning. Gibbs max-margin supervised topic models minimize an expected margin loss, which is an upper bound of the existing margin loss derived from an expected prediction rule. By introducing augmented variables and integrating out the Dirichlet variables analytically by conjugacy, we develop simple Gibbs sampling algorithms with no restricting assumptions and no need to solve SVM subproblems. Furthermore, each step of the "augment-and-collapse" Gibbs sampling algorithms has an analytical conditional distribution, from which samples can be easily drawn. Experimental results demonstrate significant improvements on time efficiency. The classification performance is also significantly improved over competitors on binary, multi-class and multi-label classification tasks.Comment: 35 page

    A novel dynamic asset allocation system using Feature Saliency Hidden Markov models for smart beta investing

    Get PDF
    The financial crisis of 2008 generated interest in more transparent, rules-based strategies for portfolio construction, with Smart beta strategies emerging as a trend among institutional investors. While they perform well in the long run, these strategies often suffer from severe short-term drawdown (peak-to-trough decline) with fluctuating performance across cycles. To address cyclicality and underperformance, we build a dynamic asset allocation system using Hidden Markov Models (HMMs). We test our system across multiple combinations of smart beta strategies and the resulting portfolios show an improvement in risk-adjusted returns, especially on more return oriented portfolios (up to 50%\% in excess of market annually). In addition, we propose a novel smart beta allocation system based on the Feature Saliency HMM (FSHMM) algorithm that performs feature selection simultaneously with the training of the HMM, to improve regime identification. We evaluate our systematic trading system with real life assets using MSCI indices; further, the results (up to 60%\% in excess of market annually) show model performance improvement with respect to portfolios built using full feature HMMs

    To go deep or wide in learning?

    Full text link
    To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with single layer outperforms single layer as well as deep architectures of finite width for some benchmark datasets.Comment: 9 pages, 1 figure, Accepted for publication in Seventeenth International Conference on Artificial Intelligence and Statistic

    Herding as a Learning System with Edge-of-Chaos Dynamics

    Full text link
    Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as "samples" from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point and differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically. Herding may be interpreted as a"perturb and map" method where the parameter perturbations are generated using a deterministic nonlinear dynamical system rather than randomly from a Gumbel distribution. This chapter studies the distinct statistical characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may be attributed to edge of chaos dynamics. The herding algorithm can also be generalized to models with latent variables and to a discriminative learning setting. The perceptron cycling theorem ensures that the fast moment matching property is preserved in the more general framework
    • …
    corecore