499 research outputs found

    LASS: a simple assignment model with Laplacian smoothing

    We consider the problem of learning soft assignments of N items to K categories given two sources of information: an item-category similarity matrix, which encourages items to be assigned to categories they are similar to (and not to be assigned to categories they are dissimilar to), and an item-item similarity matrix, which encourages similar items to have similar assignments. We propose a simple quadratic programming model that captures this intuition. We give necessary conditions for its solution to be unique, define an out-of-sample mapping, and derive a simple, effective training algorithm based on the alternating direction method of multipliers. The model predicts reasonable assignments from even a few similarity values, and can be seen as a generalization of semisupervised learning. It is particularly useful when items naturally belong to multiple categories, as for example when annotating documents with keywords or pictures with tags, with partially tagged items, or when the categories have complex interrelations (e.g. hierarchical) that are unknown. Comment: 20 pages, 4 figures. A shorter version appears in AAAI 2014.
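    The model above is a quadratic program over row-stochastic assignment matrices: a linear item-category affinity term plus a Laplacian smoothness penalty. The following is a minimal sketch of that objective, assuming symmetric similarities and solving with projected gradient descent instead of the paper's ADMM algorithm; the function names, step size, and iteration count are illustrative.

```python
import numpy as np

def project_rows_to_simplex(Z):
    # Euclidean projection of each row of Z onto the probability simplex
    U = np.sort(Z, axis=1)[:, ::-1]
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, Z.shape[1] + 1)
    rho = (U - css / idx > 0).sum(axis=1)
    theta = css[np.arange(Z.shape[0]), rho - 1] / rho
    return np.maximum(Z - theta[:, None], 0.0)

def lass_sketch(G, W, lam=1.0, step=0.01, iters=500):
    """G: (N, K) item-category similarities; W: (N, N) symmetric item-item
    similarities. Minimizes -tr(G^T Z) + lam * tr(Z^T L Z) over row-stochastic Z."""
    L = np.diag(W.sum(axis=1)) - W            # graph Laplacian of the item graph
    Z = np.full(G.shape, 1.0 / G.shape[1])    # start from uniform soft assignments
    for _ in range(iters):
        grad = -G + 2.0 * lam * (L @ Z)       # gradient of the quadratic objective
        Z = project_rows_to_simplex(Z - step * grad)
    return Z
```

    Each row of the result is a soft assignment over the K categories; the Laplacian term pulls items that are similar under W toward similar assignment rows.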

    Towards Swarm Calculus: Urn Models of Collective Decisions and Universal Properties of Swarm Performance

    Methods of general applicability are sought in swarm intelligence with the aim of gaining new insights about natural swarms and of developing design methodologies for artificial swarms. An ideal solution would be a `swarm calculus' that allows one to calculate key features of swarms, such as expected swarm performance and robustness, based on only a few parameters. To work towards this ideal, one needs to find methods and models with high degrees of generality. In this paper, we report two models that might be examples of exceptional generality. First, we present an abstract model that describes swarm performance as a function of swarm density, based on the dichotomy between cooperation and interference. Typical swarm experiments are given as examples to show how the model fits several different results. Second, we give an abstract model of collective decision making that is inspired by urn models. The effects of a positive feedback probability that increases over time in a decision-making system are understood with the help of a parameter that controls the feedback based on the swarm's current consensus. Several applicable methods, such as the description as a Markov process, the calculation of splitting probabilities and mean first passage times, and measurements of positive feedback, are discussed, and applications to artificial and natural swarms are reported.
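    To make the urn picture concrete, here is a toy simulation of a consensus-dependent urn: n marbles hold opinion A or B, and the positive-feedback probability is a function of the current consensus level. This is a hedged illustration of the general mechanism, not the paper's exact model; the feedback function and parameter values are assumptions.

```python
import random

def simulate_urn(n=100, steps=10_000, feedback=lambda s: 0.4 + 0.6 * s):
    """Toy urn model of collective decision making. feedback(s) gives the
    positive-feedback probability at consensus level s = |2a/n - 1|,
    where a counts the marbles holding opinion A."""
    a = n // 2                              # start undecided (50/50 split)
    history = [a]
    for _ in range(steps):
        s = abs(2.0 * a / n - 1.0)          # current consensus level in [0, 1]
        drew_a = random.random() < a / n    # draw one marble at random
        if random.random() < feedback(s):
            a += 1 if drew_a else -1        # positive feedback: drawn opinion recruits
        else:
            a += -1 if drew_a else 1        # negative feedback: drawn marble defects
        a = max(0, min(n, a))               # keep the count feasible
        history.append(a)
    return history
```

    Because the state a changes by at most one marble per step, the process is a birth-death Markov chain, for which splitting probabilities and mean first passage times have standard closed-form expressions.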

    Semantic spaces

    Any natural language can be considered a tool for producing large databases (consisting of texts, written or discursive). This tool in turn requires other large databases for its description (dictionaries, grammars, etc.). Nowadays, the notion of a database is associated with computer processing and computer memory. However, a natural language also resides in human brains and functions in human communication, from the interpersonal to the intergenerational. In this survey/research paper we discuss mathematical, in particular geometric, constructions which help to bridge these two worlds. In particular, we consider the Vector Space Model of semantics based on frequency matrices, as used in Natural Language Processing. We investigate the underlying geometries, formulated in terms of Grassmannians, projective spaces, and flag varieties. We formulate the relation between vector space models and semantic spaces based on semic axes in terms of the projectability of subvarieties in Grassmannians and projective spaces. We interpret Latent Semantics as a geometric flow on Grassmannians. We also discuss how to formulate Gärdenfors' notion of "meeting of minds" in our geometric setting. Comment: 32 pages, TeX, 1 eps figure.
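    As a concrete anchor for the frequency-matrix model the survey starts from, the sketch below builds a small term-document matrix and extracts a low-rank latent subspace by truncated SVD; keeping the top-k singular directions selects a k-dimensional subspace, i.e. a point on a Grassmannian. The toy corpus and the choice k = 2 are illustrative assumptions.

```python
import numpy as np

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are animals"]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document frequency matrix: rows are terms, columns are documents
F = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Latent semantics via truncated SVD: keep the top-k singular directions
U, s, Vt = np.linalg.svd(F, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T      # documents in the k-dim latent space

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(doc_vecs[0], doc_vecs[1]))     # similarity of the first two documents
```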

    k-MLE: A fast algorithm for learning statistical mixture models

    We describe k-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families, such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique, which monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering k-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of k-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of k-MLE can be implemented using any k-means heuristic, such as Lloyd's celebrated batched update or Hartigan's greedy swap update. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of its cluster's points, and to reiterate the mixture parameter and mixture weight updates until convergence. Hard EM is interpreted as a special case of k-MLE in which the component update and the weight update are performed successively in the inner loop. To initialize k-MLE, we propose k-MLE++, a careful initialization of k-MLE that probabilistically guarantees a global bound on the best possible complete likelihood. Comment: 31 pages. Extends a preliminary paper presented at IEEE ICASSP 2012.
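    The following is a minimal 1-D Gaussian instance of the hard-assignment loop described above: assign each point to its most likely weighted component, refit each component by maximum likelihood, and set the weights to the relative cluster proportions. The random initialization, iteration count, and variance floor are illustrative assumptions; the paper treats general exponential families and adds the k-MLE++ initialization.

```python
import numpy as np

def k_mle_gaussian_1d(X, k, iters=50, seed=0):
    """Sketch of the k-MLE loop for a 1-D Gaussian mixture. X: 1-D data array."""
    rng = np.random.default_rng(seed)
    n = len(X)
    mu = rng.choice(X, size=k, replace=False)    # naive init (paper: k-MLE++)
    var = np.full(k, X.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # Hard assignment: most likely weighted component per point
        ll = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
              - (X[:, None] - mu) ** 2 / (2 * var))
        z = ll.argmax(axis=1)
        for j in range(k):                       # per-cluster MLE update
            pts = X[z == j]
            if len(pts) > 0:
                mu[j], var[j] = pts.mean(), pts.var() + 1e-9
        w = np.bincount(z, minlength=k) / n      # cross-entropy weight update
        w = np.maximum(w, 1e-12)                 # guard against empty clusters
    return w, mu, var
```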

    Non-Parametric and Regularized Dynamical Wasserstein Barycenters for Time-Series Analysis

    We consider probabilistic time-series models for systems that gradually transition among a finite number of states. We are particularly motivated by applications such as human activity analysis, where the observed time-series contains segments representing distinct activities, such as running or walking, as well as segments characterized by continuous transition among these states. Accordingly, the dynamical Wasserstein barycenter (DWB) model introduced by Cheng et al. in 2021 [1] associates with each state, which we call a pure state, its own probability distribution, and models the continuous transitions with the dynamics of the barycentric weights that combine the pure-state distributions via the Wasserstein barycenter. Here, focusing on the univariate case where Wasserstein distances and barycenters can be computed in closed form, we extend [1] by discussing two challenges associated with learning a DWB model and two improvements. First, we highlight the issue of uniqueness in identifying the model parameters. Second, we discuss the challenge of estimating a dynamically evolving distribution given a limited number of samples. The uncertainty associated with this estimation may cause a model's learned dynamics not to reflect the gradual transitions characteristic of the system. The first improvement introduces a regularization framework that addresses this uncertainty by imposing temporal smoothness on the dynamics of the barycentric weights while leveraging the understanding of the non-uniqueness of the problem. This is done without defining an entire stochastic model for the dynamics of the system as in [1]. Our second improvement lifts the Gaussian assumption on the pure-state distributions in [1] by proposing a quantile-based non-parametric representation. We pose model estimation in a variational framework and propose a finite approximation to the infinite-dimensional problem.
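    In the univariate case the Wasserstein-2 barycenter has a closed form through quantile functions: the barycenter's quantile function is the weighted average of the pure states' quantile functions. The sketch below illustrates only that closed form, on two assumed pure states; the sample data, weights, and quantile grid are illustrative, and this is not the paper's full DWB estimation procedure.

```python
import numpy as np

def w2_barycenter_1d(samples, weights, grid=np.linspace(0.01, 0.99, 99)):
    """Univariate Wasserstein-2 barycenter via quantile averaging.
    samples: list of K 1-D sample arrays; weights: (K,) barycentric weights."""
    Q = np.array([np.quantile(s, grid) for s in samples])   # (K, len(grid))
    return weights @ Q                                      # barycenter quantiles

rng = np.random.default_rng(0)
pure = [rng.normal(0, 1, 1000), rng.normal(5, 2, 1000)]     # two pure states
halfway = w2_barycenter_1d(pure, np.array([0.5, 0.5]))      # mid-transition state
```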

    Brain Activity Mapping from MEG Data via a Hierarchical Bayesian Algorithm with Automatic Depth Weighting

    A recently proposed iterated alternating sequential (IAS) MEG inverse solver algorithm, based on the coupling of a hierarchical Bayesian model with a computationally efficient Krylov subspace linear solver, has been shown to perform well for both superficial and deep brain sources. However, a systematic study of its ability to correctly identify active brain regions is still missing. We propose novel statistical protocols to quantify the performance of MEG inverse solvers, focusing in particular on their accuracy and precision at identifying active brain regions. We use these protocols for a systematic study of the performance of the IAS MEG inverse solver, comparing it with three standard inversion methods: wMNE, dSPM, and sLORETA. To avoid the bias of anecdotal tests towards a particular algorithm, the proposed protocols are based on Monte Carlo sampling, generating an ensemble of activity patches in each brain region identified in a given atlas. The performance in correctly identifying the active areas is measured by how much, on average, the reconstructed activity is concentrated in the brain region of the simulated active patch. The analysis is based on Bayes factors, interpreting the estimated current activity as data for testing the hypothesis that the active brain region is correctly identified, versus the hypothesis of any erroneous attribution. The methodology allows the presence of a single activity region or of several simultaneous ones, without assuming that the number of active regions is known. The testing protocols suggest that the IAS solver performs well with both cortical and subcortical activity estimation.
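    The hierarchical Bayesian model underlying IAS alternates a conditionally Gaussian MAP solve for the source amplitudes with a closed-form update of their prior variances. The sketch below is a heavily simplified caricature of that alternation under assumed constants; in particular, the variance update here is a stand-in, not the paper's gamma-hyperprior update with automatic depth weighting, and a dense solve replaces the Krylov subspace solver.

```python
import numpy as np

def ias_caricature(A, b, iters=20, theta0=1e-3, noise=1e-2):
    """A: (m, n) lead field; b: (m,) MEG data. Alternates a weighted
    Tikhonov (MAP) solve for x with a simple variance update for theta."""
    m, n = A.shape
    theta = np.full(n, theta0)
    for _ in range(iters):
        # x-step: MAP estimate of the sources given current variances theta
        x = np.linalg.solve(A.T @ A / noise**2 + np.diag(1.0 / theta),
                            A.T @ b / noise**2)
        # theta-step: grow the prior variance where the source is active
        theta = theta0 + 0.5 * x**2
    return x, theta
```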

    Multiple Subject Barycentric Discriminant Analysis (MUSUBADA): How to Assign Scans to Categories without Using Spatial Normalization

    We present a new discriminant analysis (DA) method called Multiple Subject Barycentric Discriminant Analysis (MUSUBADA), suited for analyzing fMRI data because it handles datasets with multiple participants, each of whom provides a different number of variables (i.e., voxels) that are themselves grouped into regions of interest (ROIs). Like DA, MUSUBADA (1) assigns observations to predefined categories, (2) gives factorial maps displaying observations and categories, and (3) optimally assigns observations to categories. MUSUBADA handles cases with more variables than observations and can project portions of the data table (e.g., subtables, which can represent participants or ROIs) onto the factorial maps. Therefore MUSUBADA can analyze datasets with different numbers of voxels per participant and so does not require spatial normalization. MUSUBADA's statistical inferences are implemented with cross-validation techniques (e.g., jackknife and bootstrap), its performance is evaluated with confusion matrices (for fixed and random models), and its results are represented with prediction, tolerance, and confidence intervals. We present an example in which we predict the categories (houses, shoes, chairs, and human, monkey, and dog faces) of images watched by participants whose brains were scanned. This example corresponds to a DA question in which the data table is made of subtables (one per subject) and has more variables than observations.
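    At its core, barycentric discriminant analysis represents each category by the barycenter of its observations and assigns a new observation to the closest barycenter. The sketch below shows only that core assignment step in a single common space; the multi-subject, multi-table machinery of MUSUBADA (subtable projections, jackknife/bootstrap inference) is not reproduced, and all names are illustrative.

```python
import numpy as np

def barycentric_assign(X, y, X_new):
    """X: (n, p) training observations; y: (n,) category labels;
    X_new: (m, p) observations to classify."""
    cats = np.unique(y)
    B = np.array([X[y == c].mean(axis=0) for c in cats])    # category barycenters
    d2 = ((X_new[:, None, :] - B[None, :, :]) ** 2).sum(-1) # squared distances
    return cats[d2.argmin(axis=1)]                          # nearest barycenter
```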