Sparse Modeling for Image and Vision Processing

Author: Francis Bach
Jean Ponce
Julien Mairal
Publication date: 01/01/2014

In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
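The idea of representing a signal as a linear combination of a few dictionary elements can be illustrated with a minimal sketch. It assumes an l1-penalized least-squares formulation solved by iterative soft-thresholding (ISTA), which is one common choice and not necessarily the one emphasized in the monograph. The dictionary `D`, the penalty `lam`, and the synthetic signal are hypothetical values chosen for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of t * ||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(a, x, D, lam):
    """Value of 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def sparse_code_ista(x, D, lam=0.1, n_iter=200):
    """Approximately minimize the objective above over the code a."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)            # gradient of the quadratic term
        a = soft_threshold(a - step * grad, step * lam)
    return a

# Hypothetical data: a random dictionary with unit-norm atoms and a signal
# built from three of its 50 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
x = D @ a_true

a_hat = sparse_code_ista(x, D, lam=0.05, n_iter=500)
print("nonzero coefficients:", np.count_nonzero(a_hat))
print("objective:", objective(a_hat, x, D, 0.05))
```

In dictionary learning, a step like this alternates with an update of `D` itself, so that the atoms adapt to the data instead of being fixed in advance.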

arXiv.org e-Print Archive

CiteSeerX

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

Detection and classification of non-stationary signals using sparse representations in adaptive dictionaries

Author: Moody Daniela I.
Publication date: 01/01/2012

Automatic classification of non-stationary radio frequency (RF) signals is of particular interest in persistent surveillance and remote sensing applications. Such signals are often acquired in noisy, cluttered environments, and may be characterized by complex or unknown analytical models, making feature extraction and classification difficult. This thesis proposes an adaptive classification approach for poorly characterized targets and backgrounds based on sparse representations in non-analytical dictionaries learned from data. Conventional analytical orthogonal dictionaries, e.g., Short Time Fourier and Wavelet Transforms, can be suboptimal for classification of non-stationary signals, as they provide a rigid tiling of the time-frequency space, and are not specifically designed for a particular signal class. They generally do not lead to sparse decompositions (i.e., with very few non-zero coefficients), and use in classification requires separate feature selection algorithms. Pursuit-type decompositions in analytical overcomplete (non-orthogonal) dictionaries yield sparse representations, by design, and work well for signals that are similar to the dictionary elements. The pursuit search, however, has a high computational cost, and the method can perform poorly in the presence of realistic noise and clutter. One such overcomplete analytical dictionary method is also analyzed in this thesis for comparative purposes. The main thrust of the thesis is learning discriminative RF dictionaries directly from data, without relying on analytical constraints or additional knowledge about the signal characteristics. A pursuit search is used over the learned dictionaries to generate sparse classification features in order to identify time windows that contain a target pulse. Two state-of-the-art dictionary learning methods are compared, the K-SVD algorithm and Hebbian learning, in terms of their classification performance as a function of dictionary training parameters. 
Additionally, a novel hybrid dictionary algorithm is introduced, demonstrating better performance and higher robustness to noise. The issue of dictionary dimensionality is explored, and this thesis demonstrates that undercomplete learned dictionaries are suitable for non-stationary RF classification. Results on simulated data sets with varying background clutter and noise levels are presented. Lastly, unsupervised classification with undercomplete learned dictionaries is also demonstrated in satellite imagery analysis.
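The pursuit search mentioned in the abstract can be sketched with plain matching pursuit, the simplest member of that family. This is an illustrative assumption: the thesis does not state here which pursuit variant it uses. The dictionary below is random and undercomplete (8 atoms in 16 dimensions), standing in for a learned one, and all names and sizes are hypothetical.

```python
import numpy as np

def matching_pursuit(x, D, n_nonzero=3):
    """Greedy pursuit over unit-norm atoms (columns of D).

    At each step, pick the atom most correlated with the residual,
    record its coefficient, and subtract its contribution.
    """
    residual = np.array(x, dtype=float)
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        corr = D.T @ residual
        k = int(np.argmax(np.abs(corr)))
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return coeffs, residual

# Hypothetical stand-in for a learned, undercomplete dictionary.
rng = np.random.default_rng(1)
D = rng.standard_normal((16, 8))
D /= np.linalg.norm(D, axis=0)

# A noisy time window made of two atoms.
x = 2.0 * D[:, 2] - 1.0 * D[:, 5] + 0.05 * rng.standard_normal(16)

coeffs, residual = matching_pursuit(x, D, n_nonzero=3)
print("selected atoms:", np.flatnonzero(coeffs))
print("residual norm:", np.linalg.norm(residual))
```

The coefficient vector `coeffs` is the kind of sparse feature that could then be passed to a classifier to decide whether a time window contains a target pulse.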

Digital Repository at the University of Maryland

FACE RECOGNITION AND VERIFICATION IN UNCONSTRAINED ENVIRONMENTS

Author: Guo Huimin
Publication date: 01/01/2012

Face recognition has been a long-standing problem in computer vision. General face recognition is challenging because of large appearance variability due to factors including pose, ambient lighting, expression, size of the face, age, and distance from the camera. There are very accurate techniques to perform face recognition in controlled environments, especially when large numbers of samples are available for each face (individual). However, face identification under uncontrolled (unconstrained) environments or with limited training data is still an unsolved problem. There are two face recognition tasks: face identification (who is who in a probe face set, given a gallery face set) and face verification (same or not, given two faces). In this work, we study both face identification and verification in unconstrained environments. First, we propose a face verification framework that combines Partial Least Squares (PLS) and the One-Shot similarity model [1]. The idea is to describe a face with a large feature set combining shape, texture and color information. PLS regression is applied to perform multi-channel feature weighting on this large feature set. Finally, the PLS regression is used to compute the similarity score of an image pair by One-Shot learning (using a fixed negative set). Second, we study face identification with image sets, where the gallery and probe are sets of face images of an individual. We model a face set by its covariance matrix (COV), which is a natural second-order statistic of a sample set. By exploring an efficient metric for symmetric positive-definite (SPD) matrices, i.e., the Log-Euclidean Distance (LED), we derive a kernel function that explicitly maps the covariance matrix from the Riemannian manifold to Euclidean space. Then, discriminative learning is performed on the COV manifold: the learning aims to maximize the between-class COV distance and minimize the within-class COV distance.
Sparse representation and dictionary learning have been widely used in face recognition, especially when large numbers of samples are available for each face (individual). Sparse coding is promising since it provides a more stable and discriminative face representation. In the last part of our work, we explore sparse coding and dictionary learning for the face verification application. More specifically, in one approach, we apply sparse representations to face verification in two ways via a fixed reference set as dictionary. In the other approach, we propose a dictionary learning framework with explicit pairwise constraints, which unifies the discriminative dictionary learning for pair matching (face verification) and classification (face recognition) problems.
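The covariance descriptor and Log-Euclidean Distance described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the feature vectors are random stand-ins for per-image face features, the small ridge `eps` is added only to keep the covariance positive definite, and the function names are hypothetical. The matrix logarithm plays the role of the mapping from the SPD manifold to a Euclidean space, where the ordinary Frobenius norm applies.

```python
import numpy as np

def spd_log(C):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T        # V diag(log w) V^T

def set_covariance(X, eps=1e-6):
    """Covariance of a set of feature vectors (rows of X), regularized to stay SPD."""
    C = np.cov(X, rowvar=False)
    return C + eps * np.eye(C.shape[0])

def log_euclidean_distance(C1, C2):
    """Frobenius norm of the difference of the matrix logarithms."""
    return np.linalg.norm(spd_log(C1) - spd_log(C2), "fro")

# Hypothetical image sets: 30 images each, 5 features per image.
rng = np.random.default_rng(0)
set_a = rng.standard_normal((30, 5))
set_b = 3.0 * rng.standard_normal((30, 5))

C_a = set_covariance(set_a)
C_b = set_covariance(set_b)
d_ab = log_euclidean_distance(C_a, C_b)
print("LED between the two sets:", d_ab)
```

Because `spd_log(C)` is an ordinary symmetric matrix, it can be flattened into a vector and fed to any Euclidean learner, which is one way to read the explicit mapping mentioned in the abstract.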

CiteSeerX

Digital Repository at the University of Maryland

Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery

Author: Ortiz Enrique
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2014

With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities, while accurately rejecting those of no interest. Recent advancements in face recognition research have seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its widespread applicability. The contributions of this dissertation are three-fold: 1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs on par with standard SRC with a speedup of 100-250x. 2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive computationally, so we propose a new algorithm, Mean Sequence SRC (MSSRC), that performs video face recognition using a joint optimization leveraging all of the available video data and employing the knowledge that the face track frames belong to the same individual. Employing MSSRC results in a speedup of 5x on average over SRC on a frame-by-frame basis. 3. Finally, we make the observation that MSSRC sometimes assigns inconsistent identities to the same individual in a scene that could be corrected based on their visual similarity.
Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks, while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone.
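The propagation of predictions over an affinity graph can be illustrated with a random walk with restart. This is a sketch under assumptions, not the dissertation's exact formulation: the affinity matrix `W`, the initial scores `F0`, and the restart weight `alpha` are hypothetical toy values. Three similar face tracks and one dissimilar track are scored against two identities, and the third track starts with a weak prediction that disagrees with its neighbors.

```python
import numpy as np

def propagate_scores(W, F0, alpha=0.5, n_iter=50):
    """Random walk with restart: F <- alpha * P F + (1 - alpha) * F0."""
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    F = F0.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1.0 - alpha) * F0
    return F

# Hypothetical affinities: tracks 0-2 look alike, track 3 is only weakly related.
W = np.array([
    [0.0, 0.9, 0.9, 0.1],
    [0.9, 0.0, 0.9, 0.1],
    [0.9, 0.9, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
])

# Hypothetical initial class scores (rows: tracks, columns: identities).
F0 = np.array([
    [0.90, 0.10],
    [0.80, 0.20],
    [0.45, 0.55],   # weak prediction that disagrees with similar tracks
    [0.10, 0.90],
])

F = propagate_scores(W, F0)
print("labels before:", F0.argmax(axis=1))
print("labels after: ", F.argmax(axis=1))
```

In this toy setup the weak prediction for track 2 is overturned by its strongly predicted neighbors, while the confident prediction for the dissimilar track 3 is retained.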

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

Author: Laurent Jacques
Laurent Duval
Caroline Chaux
Gabriel Peyré
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and sometimes its invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding. Comment: 65 pages, 33 figures, 303 references
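Sparsity in a transformed domain can be illustrated with the simplest multiscale decomposition, the orthonormal Haar wavelet transform in one dimension. This is only a sketch chosen for brevity: it is neither redundant nor oriented, unlike the dictionaries surveyed in the paper, and the test signal is a hypothetical piecewise-constant "edge".

```python
import numpy as np

def haar_step(signal):
    """One level of the orthonormal Haar transform: pairwise averages and details."""
    pairs = signal.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail

def haar_transform(signal):
    """Full multiscale Haar decomposition of a signal whose length is a power of two."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        approx, d = haar_step(approx)
        details.append(d)
    # Coarsest approximation first, then details from coarse to fine.
    return np.concatenate([approx] + details[::-1])

# Hypothetical piecewise-constant signal with a single edge in the middle.
signal = np.array([4.0] * 8 + [1.0] * 8)
coeffs = haar_transform(signal)

print("nonzero samples:     ", np.count_nonzero(signal))
print("nonzero coefficients:", np.count_nonzero(coeffs))
```

All 16 samples are nonzero, yet only two Haar coefficients are: the overall average and the single coarse-scale detail that captures the edge. Oriented and redundant dictionaries aim for the same kind of compaction on contours and textures in images.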

arXiv.org e-Print Archive

CiteSeerX

Base de publications de l'université Paris-Dauphine

Crossref

DIAL UCLouvain

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM