Search CORE

79 research outputs found

Sparse Modeling for Image and Vision Processing

Author: Ecole Normale Supérieure
Francis Bach
Francis Bach
Hal Id Hal
Jean Ponce
Jean Ponce
Julien Mairal
Julien Mairal
Sparse Modeling Image
Vision Processing
Publication venue
Publication date: 01/01/2014
Field of study

In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection---that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Visio

arXiv.org e-Print Archive

CiteSeerX

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Sparse representation frameworks for inference problems in visual sensor networks

Author: Coşar Serhan
Publication venue
Publication date: 01/11/2013
Field of study

Visual sensor networks (VSNs) form a new research area that merges computer vision and sensor networks. VSNs consist of small visual sensor nodes called camera nodes, which integrate an image sensor, an embedded processor, and a wireless transceiver. Having multiple cameras in a wireless network poses unique and challenging problems that do not exist either in computer vision or in sensor networks. Due to the resource constraints of the camera nodes, such as battery power and bandwidth, it is crucial to perform data processing and collaboration efficiently. This thesis presents a number of sparse-representation based methods to be used in the context of surveillance tasks in VSNs. Performing surveillance tasks, such as tracking, recognition, etc., in a communication-constrained VSN environment is extremely challenging. Compressed sensing is a technique for acquiring and reconstructing a signal from small amount of measurements utilizing the prior knowledge that the signal has a sparse representation in a proper space. The ability of sparse representation tools to reconstruct signals from small amount of observations fits well with the limitations in VSNs for processing, communication, and collaboration. Hence, this thesis presents novel sparsity-driven methods that can be used in action recognition and human tracking applications in VSNs. A sparsity-driven action recognition method is proposed by casting the classification problem as an optimization problem. We solve the optimization problem by enforcing sparsity through ł1 regularization and perform action recognition. We have demonstrated the superiority of our method when observations are low-resolution, occluded, and noisy. To the best of our knowledge, this is the first action recognition method that uses sparse representation. In addition, we have proposed an adaptation of this method for VSN resource constraints. We have also performed an analysis of the role of sparsity in classi cation for two different action recognition problems. We have proposed a feature compression framework for human tracking applications in visual sensor networks. In this framework, we perform decentralized tracking: each camera extracts useful features from the images it has observed and sends them to a fusion node which collects the multi-view image features and performs tracking. In tracking, extracting features usually results a likelihood function. To reduce communication in the network, we compress the likelihoods by first splitting them into blocks, and then transforming each block to a proper domain and taking only the most significant coefficients in this representation. To the best of our knowledge, compression of features computed in the context of tracking in a VSN has not been proposed in previous works. We have applied our method for indoor and outdoor tracking scenarios. Experimental results show that our approach can save up to 99.6% of the bandwidth compared to centralized approaches that compress raw images to decrease the communication. We have also shown that our approach outperforms existing decentralized approaches. Furthermore, we have extended this tracking framework and proposed a sparsitydriven approach for human tracking in VSNs. We have designed special overcomplete dictionaries that exploit the specific known geometry of the measurement scenario and used these dictionaries for sparse representation of likelihoods. By obtaining dictionaries that match the structure of the likelihood functions, we can represent likelihoods with few coefficients, and thereby decrease the communication in the network. This is the first method in the literature that uses sparse representation to compress likelihood functions and applies this idea for VSNs. We have tested our approach for indoor and outdoor tracking scenarios and demonstrated that our approach can achieve bandwidth reduction better than our feature compression framework. We have also presented that our approach outperforms existing decentralized and distributed approaches

Sabanci University Research Database

A Methodology for Extracting Human Bodies from Still Images

Author: Tsitsoulis Athanasios
Publication venue: CORE Scholar
Publication date: 01/01/2013
Field of study

Monitoring and surveillance of humans is one of the most prominent applications of today and it is expected to be part of many future aspects of our life, for safety reasons, assisted living and many others. Many efforts have been made towards automatic and robust solutions, but the general problem is very challenging and remains still open. In this PhD dissertation we examine the problem from many perspectives. First, we study the performance of a hardware architecture designed for large-scale surveillance systems. Then, we focus on the general problem of human activity recognition, present an extensive survey of methodologies that deal with this subject and propose a maturity metric to evaluate them. One of the numerous and most popular algorithms for image processing found in the field is image segmentation and we propose a blind metric to evaluate their results regarding the activity at local regions. Finally, we propose a fully automatic system for segmenting and extracting human bodies from challenging single images, which is the main contribution of the dissertation. Our methodology is a novel bottom-up approach relying mostly on anthropometric constraints and is facilitated by our research in the fields of face, skin and hands detection. Experimental results and comparison with state-of-the-art methodologies demonstrate the success of our approach

CORE

Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery

Author: Ortiz Enrique
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2014
Field of study

With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities, while accurately rejecting those of no interest. Recent advancements in face recognition research has seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its wide-spread applicability. The contributions of this dissertation are three-fold: 1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs equally to standard SRC with a speedup of 100-250x. 2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive computationally, so we propose a new algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and employing the knowledge that the face track frames belong to the same individual. Employing MSSRC results in a speedup of 5x on average over SRC on a frame-by-frame basis. 3. Finally, we make the observation that MSSRC sometimes assigns inconsistent identities to the same individual in a scene that could be corrected based on their visual similarity. Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks, while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Recommended from our members

Learning-based Optimization for Signal and Image Processing

Author: Liu Jialin
Publication venue: eScholarship, University of California
Publication date: 01/01/2020
Field of study

Incorporating machine learning techniques into optimization problems and solvers attracts increasing attention. Given a particular type of optimization problem that needs to be solved repeatedly, machine learning techniques can find some features for this category of optimization and develop algorithms with excellent performance. This thesis deals with algorithms and convergence analysis in learning-based optimization in three aspects: learning dictionaries, learning optimization solvers and learning regularizers.Learning dictionaries for sparse coding is significant for signal processing. Convolutional sparse coding is a form of sparse coding with a structured, translation invariant dictionary. Most convolutional dictionary learning algorithms to date operate in the batch mode, requiring simultaneous access to all training images during the learning process, which results in very high memory usage, and severely limits the training data size that can be used. I proposed two online convolutional dictionary learning algorithms that offered far better scaling of memory and computational cost than batch methods and provided a rigorous theoretical analysis of these methods.Learning fast solvers for optimization is a rising research topic. In recent years, unfolding iterative algorithms as neural networks has become an empirical success in solving sparse recovery problems. However, its theoretical understanding is still immature, which prevents us from fully utilizing the power of neural networks. I studied unfolded ISTA (Iterative Shrinkage Thresholding Algorithm) for sparse signal recovery and established its convergence. Based on the properties of parameters required by convergence, the model can be significantly simplified and, consequently, has much less training cost and better recovery performance.Learning regularizers or priors improves the performance of optimization solvers, especially for signal and image processing tasks. Plug-and-play (PnP) is a non-convex framework that integrates modern priors, such as BM3D or deep learning-based denoisers, into ADMM or other proximal algorithms. Although PnP has been recently studied extensively with great empirical success, theoretical analysis addressing even the most basic question of convergence has been insufficient. In this thesis, the theoretical convergence of PnP-FBS and PnP-ADMM was established, without using diminishing stepsizes, under a certain Lipschitz condition on the denoisers. Furthermore, real spectral normalization was proposed for training deep learning-based denoisers to satisfy the proposed Lipschitz condition

eScholarship - University of California

Skeletonization methods for image and volume inpainting

Author: Sobiecki Andre
Publication venue: 'University of Groningen Press'
Publication date: 01/01/2016
Field of study

Proceedings - University of Groningen

Skeletonization methods for image and volume inpainting

Author: Sobiecki Andre
Publication venue: 'University of Groningen Press'
Publication date: 01/01/2016
Field of study

ARTS repository - University of Groningen

Dynamics and correlations in sparse signal acquisition

Author: Charles Adam Shabti
Publication venue: Georgia Institute of Technology
Publication date: 08/06/2015
Field of study

One of the most important parts of engineered and biological systems is the ability to acquire and interpret information from the surrounding world accurately and in time-scales relevant to the tasks critical to system performance. This classical concept of efficient signal acquisition has been a cornerstone of signal processing research, spawning traditional sampling theorems (e.g. Shannon-Nyquist sampling), efficient filter designs (e.g. the Parks-McClellan algorithm), novel VLSI chipsets for embedded systems, and optimal tracking algorithms (e.g. Kalman filtering). Traditional techniques have made minimal assumptions on the actual signals that were being measured and interpreted, essentially only assuming a limited bandwidth. While these assumptions have provided the foundational works in signal processing, recently the ability to collect and analyze large datasets have allowed researchers to see that many important signal classes have much more regularity than having finite bandwidth. One of the major advances of modern signal processing is to greatly improve on classical signal processing results by leveraging more specific signal statistics. By assuming even very broad classes of signals, signal acquisition and recovery can be greatly improved in regimes where classical techniques are extremely pessimistic. One of the most successful signal assumptions that has gained popularity in recet hears is notion of sparsity. Under the sparsity assumption, the signal is assumed to be composed of a small number of atomic signals from a potentially large dictionary. This limit in the underlying degrees of freedom (the number of atoms used) as opposed to the ambient dimension of the signal has allowed for improved signal acquisition, in particular when the number of measurements is severely limited. While techniques for leveraging sparsity have been explored extensively in many contexts, typically works in this regime concentrate on exploring static measurement systems which result in static measurements of static signals. Many systems, however, have non-trivial dynamic components, either in the measurement system's operation or in the nature of the signal being observed. Due to the promising prior work leveraging sparsity for signal acquisition and the large number of dynamical systems and signals in many important applications, it is critical to understand whether sparsity assumptions are compatible with dynamical systems. Therefore, this work seeks to understand how dynamics and sparsity can be used jointly in various aspects of signal measurement and inference. Specifically, this work looks at three different ways that dynamical systems and sparsity assumptions can interact. In terms of measurement systems, we analyze a dynamical neural network that accumulates signal information over time. We prove a series of bounds on the length of the input signal that drives the network that can be recovered from the values at the network nodes~[1--9]. We also analyze sparse signals that are generated via a dynamical system (i.e. a series of correlated, temporally ordered, sparse signals). For this class of signals, we present a series of inference algorithms that leverage both dynamics and sparsity information, improving the potential for signal recovery in a host of applications~[10--19]. As an extension of dynamical filtering, we show how these dynamic filtering ideas can be expanded to the broader class of spatially correlated signals. Specifically, explore how sparsity and spatial correlations can improve inference of material distributions and spectral super-resolution in hyperspectral imagery~[20--25]. Finally, we analyze dynamical systems that perform optimization routines for sparsity-based inference. We analyze a networked system driven by a continuous-time differential equation and show that such a system is capable of recovering a large variety of different sparse signal classes~[26--30].Ph.D

Scholarly Materials And Research @ Georgia Tech