Search CORE

174 research outputs found

Bayesian Modelling of Functional Whole Brain Connectivity

Author: Røge Rasmus
Publication venue: Technical University of Denmark
Publication date: 01/01/2017
Field of study

Online Research Database In Technology

$k$ -MLE: A fast algorithm for learning statistical mixture models

Author: Nielsen Frank
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2012
Field of study

We describe

k

-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique that monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering

k

-MLE algorithm iteratively assigns data to the most likely weighted component and update the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of

k

-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of

k

-MLE can be implemented using any

k

-means heuristic like the celebrated Lloyd's batched or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion that implies to update weights by taking the relative proportion of cluster points, and reiterate the mixture parameter update and mixture weight update processes until convergence. Hard EM is interpreted as a special case of

k

-MLE when both the component update and the weight update are performed successively in the inner loop. To initialize

k

-MLE, we propose

k

-MLE++, a careful initialization of

k

-MLE guaranteeing probabilistically a global bound on the best possible complete likelihood.Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201

arXiv.org e-Print Archive

Crossref

The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015

Author: Anderlucci Laura
Montanari Angela
Viroli Cinzia
Publication venue
Publication date: 01/01/2017
Field of study

In this paper we retrace the recent history of statistics by analyzing all the papers published in five prestigious statistical journals since 1970, namely: Annals of Statistics, Biometrika, Journal of the American Statistical Association, Journal of the Royal Statistical Society, series B and Statistical Science. The aim is to construct a kind of "taxonomy" of the statistical papers by organizing and by clustering them in main themes. In this sense being identified in a cluster means being important enough to be uncluttered in the vast and interconnected world of the statistical research. Since the main statistical research topics naturally born, evolve or die during time, we will also develop a dynamic clustering strategy, where a group in a time period is allowed to migrate or to merge into different groups in the following one. Results show that statistics is a very dynamic and evolving science, stimulated by the rise of new research questions and types of data

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Infinite von Mises-Fisher Mixture Modeling of Whole Brain fMRI Data

Author: Madsen Kristoffer Hougaard
Mørup Morten
Røge Rasmus
Schmidt Mikkel Nørgaard
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2017
Field of study

Online Research Database In Technology

Weakly-Supervised Neural Text Classification

Author: Han Jiawei
Meng Yu
Shen Jiaming
Zhang Chao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/09/2018
Field of study

Deep neural networks are gaining increasing popularity for the classic text classification task, due to their strong expressive power and less requirement for feature engineering. Despite such attractiveness, neural text classification models suffer from the lack of training data in many real-world applications. Although many semi-supervised and weakly-supervised text classification models exist, they cannot be easily applied to deep neural models and meanwhile support limited supervision types. In this paper, we propose a weakly-supervised method that addresses the lack of training data in neural text classification. Our method consists of two modules: (1) a pseudo-document generator that leverages seed information to generate pseudo-labeled documents for model pre-training, and (2) a self-training module that bootstraps on real unlabeled data for model refinement. Our method has the flexibility to handle different types of weak supervision and can be easily integrated into existing deep neural models for text classification. We have performed extensive experiments on three real-world datasets from different domains. The results demonstrate that our proposed method achieves inspiring performance without requiring excessive training data and outperforms baseline methods significantly.Comment: CIKM 2018 Full Pape

arXiv.org e-Print Archive

Crossref