
    Learning Low-Dimensional Signal Models

    Sampling, coding, and streaming even the most essential data, e.g., in medical imaging and weather-monitoring applications, produce a data deluge that severely stresses the available analog-to-digital converter, communication bandwidth, and digital-storage resources. Surprisingly, while the ambient data dimension is large in many problems, the relevant information in the data can reside in a much lower dimensional space. This observation has led to several important theoretical and algorithmic developments under different low-dimensional modeling frameworks, such as compressive sensing (CS), matrix completion, and general factor-model representations. These approaches have enabled new measurement systems, tools, and methods for information extraction from dimensionality-reduced or incomplete data. A key aspect of maximizing the potential of such techniques is to develop appropriate data models. In this article, we investigate this challenge from the perspective of nonparametric Bayesian analysis.
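    To make the compressive sensing idea above concrete, here is a minimal sketch (not the article's method; all parameter values are illustrative): a k-sparse signal of ambient dimension n is recovered from m << n random Gaussian measurements by iterative soft-thresholding (ISTA) applied to the standard l1-regularized least-squares problem.

```python
# Minimal compressive-sensing sketch (illustrative only):
# recover a k-sparse x from m << n random measurements via ISTA for
#   min_x 0.5 * ||Phi x - y||^2 + lam * ||x||_1
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 256, 80, 8                            # ambient dim, measurements, sparsity

x_true = np.zeros(n)                            # k-sparse ground truth
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random sensing matrix
y = Phi @ x_true                                # compressed measurements

lam = 0.01
step = 1.0 / np.linalg.norm(Phi, 2) ** 2        # 1/L with L = ||Phi||_2^2
x = np.zeros(n)
for _ in range(500):
    z = x - step * (Phi.T @ (Phi @ x - y))      # gradient step on the quadratic
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

    The sketch mirrors the abstract's central observation: although the ambient dimension n is large, the low-dimensional (sparse) structure lets far fewer measurements suffice.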

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning, and have been applied in other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other.
    Comment: 24 pages, 8 figures, 1 table. Accepted for publication in Advances in Astronomy, special issue "Robotic Astronomy".

    Autoregressive process parameters estimation from Compressed Sensing measurements and Bayesian dictionary learning

    The main contribution of this thesis is the introduction of new techniques that allow signal processing operations to be performed on signals represented by means of compressed sensing. Exploiting autoregressive modeling of the original signal, we obtain a compact yet representative description of the signal that can be estimated directly in the compressed domain. This is the key concept on which the applications we introduce rely: thanks to the proposed framework, it is possible to gain information about the original signal from the compressed sensing measurements alone. This is done by means of autoregressive modeling, which describes a signal through a small number of parameters. We develop a method to estimate these parameters from the compressed measurements by using an ad-hoc sensing matrix design and two coupled estimators suited to different scenarios. This enables centralized and distributed estimation of the covariance matrix of a process from compressed sensing measurements, efficiently and at low communication cost. Next, we use this characterization of the original signal through a few autoregressive parameters to improve compressive imaging. In particular, we use the parameters as a proxy for the complexity of a block of a given image. This allows us to introduce a novel compressive imaging system in which the number of allocated measurements is adapted to each block depending on its complexity, i.e., its spatial smoothness. The result is that a careful allocation of the measurements improves the recovery process, reaching higher recovery quality at the same compression ratio than state-of-the-art compressive image recovery techniques. Interestingly, the parameters we estimate directly in the compressed domain not only improve recovery but can also serve as feature vectors for classification. In fact, we also propose to use these parameters as general feature vectors that enable classification in the compressed domain. Remarkably, this method reaches classification performance comparable with that obtained in the original domain, but at a lower cost in terms of dataset storage.

    In the second part of this work, we focus on sparse representations, since a better sparsifying dictionary can improve the compressed sensing recovery performance. At first, we focus on the original domain, with no dimensionality reduction by compressed sensing. In particular, we develop a Bayesian technique that performs dictionary learning in a fully automated fashion. In more detail, by exploiting the uncertainties arising from atom selection in the sparse representation step, this technique outperforms state-of-the-art dictionary learning techniques. We then apply it to image denoising and inpainting tasks with excellent results. Next, we move to the compressed domain, where a better dictionary is expected to provide improved recovery. We show how the Bayesian dictionary learning model can be adapted to the compressive case and which assumptions must be made when considering random projections. Lastly, numerical experiments confirm the superiority of this technique over other compressive dictionary learning techniques.
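    The thesis's ad-hoc sensing design and coupled estimators are not reproduced here, but the core idea of estimating autoregressive parameters directly in the compressed domain can be illustrated with a simplified covariance-matching sketch. Assumptions (mine, not the thesis's): an AR(1) model with known noise scale, a fixed Gaussian sensing matrix, and many independent signal windows, so that the sample covariance of the measurements satisfies E[y y^T] = Phi Sigma(rho) Phi^T with Sigma the Toeplitz AR(1) covariance.

```python
# Simplified sketch: fit the AR(1) coefficient rho from compressed
# measurements by matching Phi Sigma(rho) Phi^T to the sample covariance
# of the measurements (illustrative; the thesis's estimators differ).
import numpy as np

rng = np.random.default_rng(1)
n, m, n_windows = 64, 24, 2000
rho_true, sigma = 0.8, 1.0          # sigma assumed known for simplicity

def ar1_window(n, rho, sigma, rng):
    """One stationary AR(1) realization: x_t = rho * x_{t-1} + w_t."""
    x = np.empty(n)
    x[0] = rng.standard_normal() * sigma / np.sqrt(1 - rho**2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + sigma * rng.standard_normal()
    return x

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # fixed sensing matrix
Y = np.stack([Phi @ ar1_window(n, rho_true, sigma, rng)
              for _ in range(n_windows)])
S_y = Y.T @ Y / n_windows                        # sample covariance of y

def ar1_cov(n, rho, sigma):
    """Toeplitz AR(1) covariance: Sigma_jk = sigma^2 rho^|j-k| / (1-rho^2)."""
    idx = np.arange(n)
    return (sigma**2 / (1 - rho**2)) * rho ** np.abs(idx[:, None] - idx[None, :])

# grid search on rho, matching covariances in Frobenius norm
grid = np.linspace(-0.99, 0.99, 199)
errs = [np.linalg.norm(Phi @ ar1_cov(n, r, sigma) @ Phi.T - S_y) for r in grid]
print("estimated rho:", grid[int(np.argmin(errs))])   # close to 0.8
```

    Note that the AR parameter is recovered without ever reconstructing the original windows, which is what makes compressed-domain covariance estimation and classification attractive.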

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday, August 27th, to Friday, August 29th, 2014. The workshop was conveniently located in "The Arsenal" building, within walking distance of both hotels and the town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low-dimensional subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.
    Comment: 69 pages, 24 extended abstracts. iTWIST'14 website: http://sites.google.com/site/itwist1

    Doctor of Philosophy

    Latent structures play a vital role in many data analysis tasks. By providing compact yet expressive representations, such structures can offer useful insights into the complex and high-dimensional datasets encountered in domains such as computational biology, computer vision, and natural language processing. Specifying the right complexity of these latent structures for a given problem is an important modeling decision. Instead of using models with an a priori fixed complexity, it is desirable to have models that can adapt their complexity as the data warrant. Nonparametric Bayesian models are motivated precisely by this desideratum, offering a flexible modeling paradigm for data without limiting the model complexity a priori. The flexibility comes from the model's ability to adjust its complexity adaptively with data. This dissertation is about nonparametric Bayesian learning of two specific types of latent structures: (1) low-dimensional latent features underlying high-dimensional observed data, where the latent features could exhibit interdependencies, and (2) latent task structures that capture how a set of learning tasks relate to each other, a notion critical to the paradigm of multitask learning, where the goal is to solve multiple learning tasks jointly in order to borrow information across similar tasks. Another focus of this dissertation is on designing efficient approximate inference algorithms for nonparametric Bayesian models. Specifically, for the nonparametric Bayesian latent feature model, where the goal is to infer the binary-valued latent feature assignment matrix for a given set of observations, the dissertation proposes two approximate inference methods. The first is a search-based algorithm to find the maximum a posteriori (MAP) solution for the latent feature assignment matrix. The second is a sequential Monte Carlo approximate inference algorithm that processes the data one example at a time while being space-efficient in terms of the storage required to represent the posterior distribution of the latent feature assignment matrix.
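    The abstract does not name a specific prior, but the standard nonparametric Bayesian prior over binary latent feature matrices of unbounded width is the Indian Buffet Process (IBP). The sketch below samples that generic prior to show how the number of features adapts with the data size; it is an illustration of the model family, not the dissertation's MAP-search or sequential Monte Carlo inference methods.

```python
# Generative sketch of the Indian Buffet Process prior over binary latent
# feature matrices Z (rows = observations, columns = features; the number
# of columns is unbounded a priori and grows with the data).
import numpy as np

def sample_ibp(n_rows, alpha, rng):
    """Draw Z ~ IBP(alpha) for n_rows observations ("customers")."""
    counts = []                      # counts[k] = rows using feature k so far
    rows = []
    for i in range(1, n_rows + 1):
        # reuse existing feature k with probability counts[k] / i
        row = [1 if rng.random() < c / i else 0 for c in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # then sample Poisson(alpha / i) brand-new features for this row
        new = rng.poisson(alpha / i)
        row += [1] * new
        counts += [1] * new
        rows.append(row)
    Z = np.zeros((n_rows, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(n_rows=10, alpha=2.0, rng=np.random.default_rng(2))
print(Z)   # column count was not fixed a priori; it adapted to the data
```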