499 research outputs found
LASS: a simple assignment model with Laplacian smoothing
We consider the problem of learning soft assignments of items to
categories given two sources of information: an item-category similarity
matrix, which encourages items to be assigned to categories they are similar to
(and to not be assigned to categories they are dissimilar to), and an item-item
similarity matrix, which encourages similar items to have similar assignments.
We propose a simple quadratic programming model that captures this intuition.
We give necessary conditions for its solution to be unique, define an
out-of-sample mapping, and derive a simple, effective training algorithm based
on the alternating direction method of multipliers. The model predicts
reasonable assignments from even a few similarity values, and can be seen as a
generalization of semisupervised learning. It is particularly useful when items
naturally belong to multiple categories, as for example when annotating
documents with keywords or pictures with tags, with partially tagged items, or
when the categories have complex interrelations (e.g. hierarchical) that are
unknown. Comment: 20 pages, 4 figures. A shorter version appears in AAAI 201
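The quadratic model described in this abstract can be sketched in a few lines. A minimal sketch, assuming we drop the simplex constraints on the assignments (which the full model handles via ADMM): the first-order condition of the unconstrained quadratic objective reduces to a single linear solve. All names and the toy similarity values below are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def lass_sketch(item_cat_sim, item_sim, lam=1.0):
    """Soft assignments Z minimizing ||Z||^2 - 2*tr(W^T Z) + lam*tr(Z^T L Z),
    where W is the item-category similarity matrix and L is the graph
    Laplacian of the item-item similarity matrix. The simplex constraints
    on the rows of Z are omitted in this sketch."""
    W = np.asarray(item_cat_sim, dtype=float)   # items x categories
    S = np.asarray(item_sim, dtype=float)       # items x items, symmetric
    L = np.diag(S.sum(axis=1)) - S              # graph Laplacian
    n = W.shape[0]
    # First-order condition of the quadratic objective: (I + lam*L) Z = W
    return np.linalg.solve(np.eye(n) + lam * L, W)

# Two strongly similar items: the second has no category information of its
# own, yet inherits most of the first item's assignment through smoothing.
W = np.array([[1.0, 0.0], [0.0, 0.0]])
S = np.array([[0.0, 5.0], [5.0, 0.0]])
Z = lass_sketch(W, S, lam=1.0)
```

The second row of `Z` is nonzero even though the second item had an all-zero similarity to every category, which is exactly the "similar items get similar assignments" behavior the abstract describes.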
Towards Swarm Calculus: Urn Models of Collective Decisions and Universal Properties of Swarm Performance
Swarm intelligence searches for methods of general applicability, with the
aim of gaining new insights into natural swarms and of developing design
methodologies for artificial swarms. An ideal solution would be a `swarm
calculus' that allows one to calculate key features of swarms, such as expected
swarm performance and robustness, from only a few parameters. To work
towards this ideal, one needs to find methods and models with high degrees of
generality. In this paper, we report two models that might be examples of
exceptional generality. First, an abstract model is presented that describes
swarm performance depending on swarm density based on the dichotomy between
cooperation and interference. Typical swarm experiments are given as examples
to show how the model fits several different results. Second, we give an
abstract model of collective decision making that is inspired by urn models.
The effects of a positive feedback probability that increases over time in a
decision-making system are understood with the help of a parameter that controls
the feedback based on the swarm's current consensus. Several applicable
methods, such as the description as a Markov process, the calculation of
splitting probabilities and mean first passage times, and measurements of
positive feedback, are discussed, and applications to artificial and natural
swarms are reported.
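The urn-model idea can be illustrated with a small simulation. This is a minimal sketch under assumptions of my own: two opinions, a feedback probability that grows linearly with the current consensus, and a scaling parameter `rho`; the paper's exact feedback schedule and parameterization may differ.

```python
import random

def urn_decision(n=100, rho=0.9, steps=5000, seed=0):
    """Sketch of an urn model of collective decision making: n agents hold
    one of two opinions. At each step an opinion is sampled in proportion
    to its frequency and one agent is recolored; with probability p_pos the
    feedback is positive (copy the sampled opinion), otherwise negative
    (adopt the opposite). p_pos rises with the swarm's current consensus,
    scaled by rho -- an illustrative schedule, not the paper's exact model.
    Returns the final fraction of agents holding opinion A."""
    rng = random.Random(seed)
    a = n // 2                                   # start fully undecided: 50/50
    for _ in range(steps):
        pick_a = rng.random() < a / n            # sample an opinion
        consensus = abs(2 * a / n - 1)           # 0 = split, 1 = unanimous
        p_pos = 0.5 * (1 + rho * consensus)      # positive-feedback probability
        copy = rng.random() < p_pos
        toward_a = pick_a if copy else not pick_a
        if toward_a and a < n:
            a += 1
        elif not toward_a and a > 0:
            a -= 1
    return a / n

frac = urn_decision()
```

Sweeping `rho` in such a simulation is one way to probe how the strength of consensus-dependent feedback shapes splitting probabilities and time to decision.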
Semantic spaces
Any natural language can be considered a tool for producing large
databases (consisting of texts, written or discursive). This tool in turn
requires, for its own description, other large databases (dictionaries,
grammars, etc.). Nowadays, the notion of a database is associated with computer processing
and computer memory. However, a natural language resides also in human brains
and functions in human communication, from the interpersonal to the
intergenerational. In this survey/research paper we discuss mathematical, in particular
geometric, constructions, which help to bridge these two worlds. In particular,
in this paper we consider the Vector Space Model of semantics based on
frequency matrices, as used in Natural Language Processing. We investigate
underlying geometries, formulated in terms of Grassmannians, projective spaces,
and flag varieties. We formulate the relation between vector space models and
semantic spaces based on semic axes in terms of projectability of subvarieties
in Grassmannians and projective spaces. We interpret Latent Semantics as a
geometric flow on Grassmannians. We also discuss how to formulate Gärdenfors'
notion of "meeting of minds" in our geometric setting. Comment: 32 pages, TeX, 1 eps figure
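The Vector Space Model the abstract starts from can be shown in miniature. A toy sketch with an invented term-document frequency matrix (all terms and counts are illustrative assumptions): documents become vectors in term space, and semantic similarity is the cosine of the angle between them.

```python
import numpy as np

# Hypothetical toy corpus: rows of F index terms, columns index documents.
terms = ["cat", "dog", "pet", "tensor"]
F = np.array([[2, 1, 0],     # "cat"
              [1, 2, 0],     # "dog"
              [3, 2, 0],     # "pet"
              [0, 0, 4]],    # "tensor"
             dtype=float)    # columns: d1, d2, d3

def cosine(u, v):
    """Cosine of the angle between two document vectors in term space."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_12 = cosine(F[:, 0], F[:, 1])   # two documents about pets
sim_13 = cosine(F[:, 0], F[:, 2])   # a pet document vs. a math document
```

The two pet documents come out nearly parallel, while the math document is orthogonal to both. The geometric program of the paper starts here: subspaces spanned by such vectors are points of Grassmannians, and low-rank reductions like Latent Semantic Analysis move among those points.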
k-MLE: A fast algorithm for learning statistical mixture models
We describe k-MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering k-MLE algorithm iteratively assigns data
to the most likely weighted component and updates the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of k-MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of k-MLE can be
implemented using any k-means heuristic, such as the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion, which amounts to updating the
weights as the relative proportions of cluster points, and reiterate the mixture
parameter update and mixture weight update processes until convergence. Hard EM
is interpreted as a special case of k-MLE when both the component update and
the weight update are performed successively in the inner loop. To initialize
k-MLE, we propose k-MLE++, a careful initialization of k-MLE that guarantees
probabilistically a global bound on the best possible complete likelihood. Comment: 31 pages. Extends a preliminary paper presented at IEEE ICASSP 201
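The assign-then-refit loop described above can be sketched for the simplest case. A toy 1-D Gaussian version of the hard-clustering idea (the paper works with general exponential families via Bregman divergences, and uses k-MLE++ rather than the simple quantile initialization assumed here):

```python
import numpy as np

def k_mle_gaussian(X, k=2, iters=50):
    """Illustrative hard-clustering sketch in the spirit of k-MLE for a 1-D
    Gaussian mixture: alternate between (1) assigning each point to its most
    likely weighted component and (2) refitting each component by maximum
    likelihood, with weights updated as relative cluster proportions."""
    mu = np.quantile(X, (np.arange(k) + 0.5) / k)   # spread-out initial means
    sigma = np.full(k, X.std() + 1e-9)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # Weighted log-density per component (constants dropped from argmax)
        ll = np.log(w) - np.log(sigma) - 0.5 * ((X[:, None] - mu) / sigma) ** 2
        z = ll.argmax(axis=1)                        # hard assignment
        for j in range(k):
            pts = X[z == j]
            if len(pts):                             # MLE refit of component j
                mu[j], sigma[j] = pts.mean(), pts.std() + 1e-9
                w[j] = len(pts) / len(X)
    return mu, sigma, w

rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0.0, 0.5, 100), rng.normal(5.0, 0.5, 100)])
mu, sigma, w = k_mle_gaussian(X, k=2)
```

On two well-separated clusters, the recovered means land near the true component centers and the weights near 1/2 each; unlike soft EM, every point contributes to exactly one component's refit.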
Non-Parametric and Regularized Dynamical Wasserstein Barycenters for Time-Series Analysis
We consider probabilistic time-series models for systems that gradually
transition among a finite number of states. We are particularly motivated by
applications such as human activity analysis where the observed time-series
contains segments representing distinct activities such as running or walking
as well as segments characterized by continuous transition among these states.
Accordingly, the dynamical Wasserstein barycenter (DWB) model introduced in
Cheng et al. in 2021 [1] associates with each state, which we call a pure
state, its own probability distribution, and models these continuous
transitions with the dynamics of the barycentric weights that combine the pure
state distributions via the Wasserstein barycenter. Here, focusing on the
univariate case where Wasserstein distances and barycenters can be computed in
closed form, we extend [1] by discussing two challenges associated with
learning a DWB model and two improvements. First, we highlight the issue of
uniqueness in identifying the model parameters. Second, we discuss the
challenge of estimating a dynamically evolving distribution given a limited
number of samples. The uncertainty associated with this estimation may cause a
model's learned dynamics to not reflect the gradual transitions characteristic
of the system. The first improvement introduces a regularization framework that
addresses this uncertainty by imposing temporal smoothness on the dynamics of
the barycentric weights while leveraging the understanding of the
non-uniqueness of the problem. This is done without defining an entire
stochastic model for the dynamics of the system as in [1]. Our second
improvement lifts the Gaussian assumption on the pure state distributions in
[1] by proposing a quantile-based non-parametric representation. We pose model
estimation in a variational framework and propose a finite approximation to the
infinite-dimensional problem.
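The closed form in the univariate case that the abstract relies on is short enough to show. A minimal sketch (function name, the probability grid, and the toy "pure states" are my assumptions): in 1-D, the W2 barycenter's quantile function is simply the weighted average of the input quantile functions.

```python
import numpy as np

def barycenter_1d(samples, weights, n_q=99):
    """Univariate W2 (Wasserstein) barycenter via its closed form: the
    barycenter's quantile function is the weighted average of the input
    quantile functions. `samples` is a list of 1-D sample arrays; returns
    the barycenter's quantiles on an evenly spaced probability grid."""
    qs = (np.arange(n_q) + 0.5) / n_q
    Q = np.stack([np.quantile(s, qs) for s in samples])  # one row per input
    return np.asarray(weights) @ Q

# A halfway point on the transition between two hypothetical "pure states":
rng = np.random.default_rng(0)
pure_a = rng.normal(0.0, 1.0, 5000)
pure_b = rng.normal(10.0, 1.0, 5000)
bary = barycenter_1d([pure_a, pure_b], [0.5, 0.5])
```

With equal weights, the barycenter of N(0,1) and N(10,1) sits at N(5,1): its mass is displaced, not mixed, which is what makes the barycentric weights a natural coordinate for gradual transitions between activity states.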
Brain Activity Mapping from MEG Data via a Hierarchical Bayesian Algorithm with Automatic Depth Weighting
A recently proposed iterated alternating sequential (IAS) MEG inverse solver algorithm, based on the coupling of a hierarchical Bayesian model with a computationally efficient Krylov subspace linear solver, has been shown to perform well for both superficial and deep brain sources. However, a systematic study of its ability to correctly identify active brain regions is still missing. We propose novel statistical protocols to quantify the performance of MEG inverse solvers, focusing in particular on their accuracy and precision in identifying active brain regions. We use these protocols for a systematic study of the performance of the IAS MEG inverse solver, comparing it with three standard inversion methods: wMNE, dSPM, and sLORETA. To avoid the bias of anecdotal tests towards a particular algorithm, the proposed protocols are based on Monte Carlo sampling, generating an ensemble of activity patches in each brain region identified in a given atlas. The performance in correctly identifying the active areas is measured by how much, on average, the reconstructed activity is concentrated in the brain region of the simulated active patch. The analysis is based on Bayes factors, interpreting the estimated current activity as data for testing the hypothesis that the active brain region is correctly identified, against the hypothesis of any erroneous attribution. The methodology allows for a single or several simultaneous activity regions, without assuming that the number of active regions is known. The testing protocols suggest that the IAS solver performs well with both cortical and subcortical activity estimation.
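The concentration measure underlying the protocol can be illustrated with a toy score. This is a simplified stand-in of my own, not the paper's statistic: the full protocol builds Bayes factors over atlas regions, whereas this sketch just reports the share of reconstructed activity magnitude that falls inside the simulated active region.

```python
import numpy as np

def region_concentration(activity, region_labels, true_region):
    """Toy evaluation score: the fraction of total reconstructed activity
    magnitude that falls inside the simulated active region. `activity` is
    one value per source location, `region_labels` its atlas region, and
    `true_region` the region of the simulated patch. All names here are
    illustrative assumptions."""
    activity = np.abs(np.asarray(activity, dtype=float))
    mask = np.array([r == true_region for r in region_labels])
    return float(activity[mask].sum() / activity.sum())

# A reconstruction that puts 90% of its energy in the correct region "A":
score = region_concentration([0.9, 0.05, 0.05], ["A", "B", "C"], "A")
```

Averaging such a score over a Monte Carlo ensemble of simulated patches per region gives a region-wise performance map, which is the spirit of the comparison between IAS, wMNE, dSPM, and sLORETA described above.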
Multiple Subject Barycentric Discriminant Analysis (MUSUBADA): How to Assign Scans to Categories without Using Spatial Normalization
We present a new discriminant analysis (DA) method called Multiple Subject Barycentric Discriminant Analysis (MUSUBADA), suited for analyzing fMRI data because it handles datasets with multiple participants, each of whom provides a different number of variables (i.e., voxels) that are themselves grouped into regions of interest (ROIs). Like DA, MUSUBADA (1) assigns observations to predefined categories, (2) gives factorial maps displaying observations and categories, and (3) optimally assigns observations to categories. MUSUBADA handles cases with more variables than observations and can project portions of the data table (e.g., subtables, which can represent participants or ROIs) onto the factorial maps. Therefore MUSUBADA can analyze datasets with different numbers of voxels per participant and so does not require spatial normalization. MUSUBADA statistical inferences are implemented with cross-validation techniques (e.g., jackknife and bootstrap); its performance is evaluated with confusion matrices (for fixed and random models) and represented with prediction, tolerance, and confidence intervals. We present an example in which we predict the image categories (houses, shoes, chairs, and human, monkey, and dog faces) of images watched by participants whose brains were scanned. This example corresponds to a DA question in which the data table is made of subtables (one per subject) and has more variables than observations.
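The barycentric assignment step at the heart of the method can be sketched in isolation. A minimal toy version (the actual method works in a factorial space obtained by generalized SVD and fuses per-participant subtables with different numbers of voxels, all of which this sketch omits):

```python
import numpy as np

def barycentric_assign(X_train, y_train, X_test):
    """Represent each category by the barycenter (mean) of its training
    observations and assign each new observation to the category with the
    nearest barycenter. A toy stand-in for MUSUBADA's assignment step."""
    y = np.asarray(y_train)
    cats = sorted(set(y_train))
    B = np.stack([X_train[y == c].mean(axis=0) for c in cats])
    # Squared Euclidean distance from each test point to each barycenter
    d = ((X_test[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return [cats[i] for i in d.argmin(axis=1)]

# Hypothetical two-feature scans from two image categories:
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
y = ["house", "house", "face", "face"]
pred = barycentric_assign(X, y, np.array([[0.1, 0.0], [4.8, 5.2]]))
```

Jackknife and bootstrap validation of such assignments then yield the confusion matrices and confidence regions the abstract mentions.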