
    Domain-Specific Face Synthesis for Video Face Recognition from a Single Sample Per Person

    The performance of still-to-video FR systems can decline significantly because faces captured in an unconstrained operational domain (OD) over multiple video cameras have a different underlying data distribution than faces captured under controlled conditions in the enrollment domain (ED) with a still camera. This is particularly true when individuals are enrolled in the system using a single reference still. To improve the robustness of these systems, it is possible to augment the reference set by generating synthetic faces based on the original still. However, without knowledge of the OD, many synthetic images must be generated to account for all possible capture conditions; FR systems may therefore require complex implementations and yield lower accuracy when training on many less relevant images. This paper introduces an algorithm for domain-specific face synthesis (DSFS) that exploits the representative intra-class variation information available from the OD. Prior to operation, a compact set of faces from unknown persons appearing in the OD is selected through clustering in the capture-condition space. The domain-specific variations of these face images are projected onto the reference stills by integrating an image-based face relighting technique inside a 3D reconstruction framework. A compact set of synthetic faces is generated that resembles the individuals of interest under the capture conditions relevant to the OD. In a particular implementation based on sparse representation classification, the synthetic faces generated with the DSFS are employed to form a cross-domain dictionary that accounts for structured sparsity. Experimental results reveal that augmenting the reference gallery set of FR systems using the proposed DSFS approach can provide a higher level of accuracy compared to state-of-the-art approaches, with only a moderate increase in computational complexity.
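
    The sparse representation classification step can be illustrated with a short sketch. The snippet below is a minimal illustration, not the paper's implementation: it uses a plain lasso penalty rather than the structured-sparsity term described above, and the dictionary of synthetic faces, the `alpha` value, and the class labels are all assumed for the example. A probe face is assigned to the class whose dictionary atoms best reconstruct it:

    ```python
    # Minimal sparse-representation-classification (SRC) sketch, assuming a
    # cross-domain dictionary D whose columns are vectorized synthetic faces
    # generated from the reference stills, with one class label per column.
    import numpy as np
    from sklearn.linear_model import Lasso

    def src_classify(D, labels, probe, alpha=0.01):
        """Classify `probe` by minimum class-wise reconstruction residual.

        D      : (d, n) dictionary, columns are unit-norm face vectors
        labels : (n,) class label of each dictionary column
        probe  : (d,) vectorized probe face
        """
        # Sparse coding: probe ~= D @ x with an l1 penalty on x.
        coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        coder.fit(D, probe)
        x = coder.coef_

        # Keep only one class's coefficients at a time and compare residuals.
        residuals = {}
        for c in np.unique(labels):
            x_c = np.where(labels == c, x, 0.0)
            residuals[c] = np.linalg.norm(probe - D @ x_c)
        return min(residuals, key=residuals.get)
    ```

    Replacing the lasso with a group-sparse solver would restore the structured-sparsity behaviour the paper describes; the residual-based decision rule is unchanged.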

    Contribution to supervised representation learning: algorithms and applications.

    In this thesis, we focus on supervised learning methods for pattern categorization. In this context, it remains a major challenge to establish efficient relationships between the discriminant properties of the extracted features and the inter-class sparsity structure.

    Our first attempt to address this problem was to develop a method called "Robust Discriminant Analysis with Feature Selection and Inter-class Sparsity" (RDA_FSIS). This method performs feature selection and extraction simultaneously. The targeted projection transformation focuses on the most discriminative original features while guaranteeing that the extracted (or transformed) features belonging to the same class share a common sparse structure, which contributes to small intra-class distances.

    In a further study of this approach, some improvements were introduced in terms of the optimization criterion and the applied optimization process. We proposed an improved version of the original RDA_FSIS called "Enhanced Discriminant Analysis with Class Sparsity using Gradient Method" (EDA_CS). The improvement is twofold: on the one hand, in the alternating optimization, we update the linear transformation and tune it with the gradient descent method, resulting in a more efficient and less complex solution than the closed form adopted in RDA_FSIS. On the other hand, the method can be used as a fine-tuning technique for many feature extraction methods. The main feature of this approach is that it is a gradient-descent-based refinement applied to a closed-form solution, which makes it suitable for combining several extraction methods and can thus improve the performance of the classification process.

    In line with the above methods, we proposed a hybrid linear feature extraction scheme called "feature extraction using gradient descent with hybrid initialization" (FE_GD_HI). This method, based on a unified criterion, is able to take advantage of several powerful linear discriminant methods. The linear transformation is computed using a gradient descent method. The strength of this approach is that it is generic, in the sense that it allows fine-tuning of the hybrid solution provided by different methods.

    Finally, we proposed a new, efficient ensemble learning approach that aims to estimate an improved data representation. The proposed method is called "ICS Based Ensemble Learning for Image Classification" (EM_ICS). Instead of using multiple classifiers on the transformed features, we aim to estimate multiple extracted feature subsets, obtained from multiple learned linear embeddings. Multiple feature subsets were used to estimate the transformations, which were ranked using multiple feature selection techniques. The derived extracted feature subsets were concatenated into a single data representation vector with strong discriminative properties.

    Experiments conducted on various benchmark datasets, ranging from face images, handwritten digit images, and object images to text datasets, showed promising results that outperformed existing state-of-the-art and competing methods.
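
    The gradient-descent refinement idea can be sketched as follows. This is a rough illustration under assumed choices, not the thesis's exact objective: a Fisher-style trace criterion stands in for the actual optimization criterion, and the step size, trade-off weight, and QR re-orthonormalization are illustrative.

    ```python
    # Sketch of gradient-descent refinement of a linear projection W, in the
    # spirit of EDA_CS: start from a closed-form solution (e.g. LDA) and tune
    # W with gradient steps. The Fisher-style loss below is an illustrative
    # stand-in for the thesis's criterion, not the published objective.
    import numpy as np

    def scatter_matrices(X, y):
        """Within-class (Sw) and between-class (Sb) scatter of rows of X."""
        mu = X.mean(axis=0)
        Sw = np.zeros((X.shape[1], X.shape[1]))
        Sb = np.zeros_like(Sw)
        for c in np.unique(y):
            Xc = X[y == c]
            mc = Xc.mean(axis=0)
            Sw += (Xc - mc).T @ (Xc - mc)
            d = (mc - mu)[:, None]
            Sb += len(Xc) * (d @ d.T)
        return Sw, Sb

    def refine_projection(W0, Sw, Sb, lr=1e-3, mu=1.0, n_steps=200):
        """Minimize tr(W^T Sw W) - mu * tr(W^T Sb W) by gradient descent."""
        W = W0.copy()
        for _ in range(n_steps):
            grad = 2.0 * (Sw @ W) - 2.0 * mu * (Sb @ W)
            W -= lr * grad
            W, _ = np.linalg.qr(W)   # keep the projection's columns orthonormal
        return W
    ```

    Initializing `W0` from any closed-form extractor (LDA, PCA, or a combination of several) mirrors the fine-tuning role the thesis assigns to the gradient step.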

    ROBUST DEEP LEARNING METHODS FOR SOLVING INVERSE PROBLEMS IN MEDICAL IMAGING

    The medical imaging field has a long history of incorporating machine learning algorithms to address inverse problems in image acquisition and analysis. With the impressive successes of deep neural networks on natural images, we seek to answer the obvious question: do these successes also transfer to the medical image domain? The answer may seem straightforward on the surface. Tasks like image-to-image transformation, segmentation, detection, etc., have direct applications for medical images. For example, metal artifact reduction for Computed Tomography (CT) and reconstruction from undersampled k-space signal for Magnetic Resonance (MR) imaging can be formulated as an image-to-image transformation; lesion/tumor detection and segmentation are obvious applications for higher-level vision tasks. While these tasks may be similar in formulation, many practical constraints and requirements exist in solving them for medical images. Patient data is highly sensitive and usually only accessible from individual institutions. This creates constraints on the available ground truth, dataset size, and computational resources in these institutions to train performant models. Due to the mission-critical nature of healthcare applications, requirements such as performance robustness and speed are also stringent. As such, the big-data, dense-computation, supervised learning paradigm in mainstream deep learning is often insufficient to address these situations. In this dissertation, we investigate ways to benefit from the powerful representational capacity of deep neural networks while still satisfying the above-mentioned constraints and requirements. The first part of this dissertation focuses on adapting supervised learning to account for variations such as different medical image modality, image quality, architecture designs, tasks, etc. The second part focuses on improving model robustness on unseen data through domain adaptation, which ameliorates performance degradation due to distribution shifts. The last part focuses on self-supervised learning and learning from synthetic data with a focus on tomographic imaging; this is essential in many situations where the desired ground truth may not be accessible.
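
    To make the image-to-image formulation concrete, here is a toy sketch of how an undersampled MR acquisition produces the artifact-laden input such a network would learn to restore. The Cartesian random row mask, the fully sampled centre band, and the 25% sampling ratio are illustrative assumptions, not choices from the dissertation:

    ```python
    # Toy illustration of the undersampled-k-space problem mentioned above:
    # a network's input would be the artifact-laden zero-filled reconstruction,
    # its target the fully sampled image. The mask design is purely illustrative.
    import numpy as np

    def undersample(image, keep_fraction=0.25, seed=0):
        """Return the zero-filled reconstruction of a randomly undersampled image."""
        rng = np.random.default_rng(seed)
        kspace = np.fft.fftshift(np.fft.fft2(image))
        # Keep a random subset of k-space rows, plus the centre band, which
        # holds most of the image energy; zero out the rest.
        rows = rng.random(image.shape[0]) < keep_fraction
        rows[image.shape[0] // 2 - 8 : image.shape[0] // 2 + 8] = True
        mask = np.zeros_like(kspace, dtype=bool)
        mask[rows, :] = True
        zero_filled = np.fft.ifft2(np.fft.ifftshift(kspace * mask))
        return np.abs(zero_filled)
    ```

    Pairs of `(undersample(x), x)` are exactly the kind of input/target data an image-to-image model would be trained on when fully sampled acquisitions are available.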

    Configurable analog hardware for neuromorphic Bayesian inference and least-squares solutions

    Sparse approximation is a Bayesian inference problem with a wide range of signal processing applications, such as the Compressed Sensing recovery used in medical imaging. Previous sparse coding implementations relied on digital algorithms whose power consumption and performance scale poorly with problem size, rendering them unsuitable for portable applications and a bottleneck in high-speed applications. A novel analog architecture, implementing the Locally Competitive Algorithm (LCA), was designed and programmed onto Field Programmable Analog Arrays (FPAAs), using floating-gate transistors to set the analog parameters. A network of 6 coefficients was demonstrated to converge to similar values as a digital sparse approximation algorithm, but with better power and performance scaling. A rate-encoded spiking algorithm was then developed and shown to converge to similar values as the LCA. A second novel architecture was designed and programmed on an FPAA, implementing the spiking version of the LCA with integrate-and-fire neurons. A network of 18 neurons converged to similar values as a digital sparse approximation algorithm, with even better performance and power efficiency than the non-spiking network. Novel algorithms were created to increase floating-gate programming speed by more than two orders of magnitude and to reduce programming error from device mismatch. A new FPAA chip was designed and tested, allowing rapid interfacing and additional improvements in accuracy. Finally, a neuromorphic chip was designed, containing 400 integrate-and-fire neurons and capable of converging on a sparse approximation solution in 10 microseconds, over 1000 times faster than the best digital solution.
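
    For reference, the non-spiking LCA dynamics that the analog hardware implements can be simulated in a few lines. This is a software sketch of the standard LCA formulation; the time constant, threshold, step size, and dictionary are illustrative choices, and the analog circuit realizes the same dynamics in continuous time rather than by Euler integration:

    ```python
    # Software sketch of the (non-spiking) Locally Competitive Algorithm:
    # leaky integrator states u driven by the feed-forward input and inhibited
    # through the dictionary's Gram matrix, with a soft-threshold nonlinearity
    # producing the sparse code a. Parameters here are illustrative.
    import numpy as np

    def lca(Phi, y, lam=0.1, tau=0.01, dt=0.001, n_steps=2000):
        """Solve min_a 0.5*||y - Phi a||^2 + lam*||a||_1 via LCA dynamics."""
        n = Phi.shape[1]
        gram = Phi.T @ Phi - np.eye(n)   # lateral inhibition weights
        drive = Phi.T @ y                # feed-forward input
        u = np.zeros(n)                  # membrane-like internal states
        for _ in range(n_steps):
            a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft threshold
            # du/dt = (1/tau) * (drive - u - (Phi^T Phi - I) a)
            u += (dt / tau) * (drive - u - gram @ a)
        return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)
    ```

    A 6-coefficient network like the first FPAA demonstration corresponds to `Phi` with 6 columns; the all-to-all `gram` inhibition is what the analog crossbar implements directly.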

    Dynamics and correlations in sparse signal acquisition

    One of the most important capabilities of engineered and biological systems is the ability to acquire and interpret information from the surrounding world accurately and on time-scales relevant to the tasks critical to system performance. This classical concept of efficient signal acquisition has been a cornerstone of signal processing research, spawning traditional sampling theorems (e.g. Shannon-Nyquist sampling), efficient filter designs (e.g. the Parks-McClellan algorithm), novel VLSI chipsets for embedded systems, and optimal tracking algorithms (e.g. Kalman filtering). Traditional techniques have made minimal assumptions about the actual signals being measured and interpreted, essentially assuming only a limited bandwidth. While these assumptions provided the foundational works in signal processing, the more recent ability to collect and analyze large datasets has allowed researchers to see that many important signal classes have much more regularity than merely finite bandwidth. One of the major advances of modern signal processing is to greatly improve on classical results by leveraging more specific signal statistics. By assuming even very broad classes of signals, signal acquisition and recovery can be greatly improved in regimes where classical techniques are extremely pessimistic. One of the most successful signal assumptions to gain popularity in recent years is the notion of sparsity. Under the sparsity assumption, the signal is assumed to be composed of a small number of atomic signals from a potentially large dictionary. This limit on the underlying degrees of freedom (the number of atoms used), as opposed to the ambient dimension of the signal, has allowed for improved signal acquisition, in particular when the number of measurements is severely limited.

    While techniques for leveraging sparsity have been explored extensively in many contexts, work in this regime typically concentrates on static measurement systems which result in static measurements of static signals. Many systems, however, have non-trivial dynamic components, either in the measurement system's operation or in the nature of the signal being observed. Given the promising prior work leveraging sparsity for signal acquisition and the large number of dynamical systems and signals in important applications, it is critical to understand whether sparsity assumptions are compatible with dynamical systems. This work therefore seeks to understand how dynamics and sparsity can be used jointly in various aspects of signal measurement and inference. Specifically, it looks at three different ways that dynamical systems and sparsity assumptions can interact. In terms of measurement systems, we analyze a dynamical neural network that accumulates signal information over time, and we prove a series of bounds on the length of the input signal driving the network that can be recovered from the values at the network nodes~[1--9]. We also analyze sparse signals that are generated via a dynamical system (i.e. a series of correlated, temporally ordered, sparse signals). For this class of signals, we present a series of inference algorithms that leverage both dynamics and sparsity information, improving the potential for signal recovery in a host of applications~[10--19]. As an extension of dynamical filtering, we show how these dynamic filtering ideas can be expanded to the broader class of spatially correlated signals; specifically, we explore how sparsity and spatial correlations can improve inference of material distributions and spectral super-resolution in hyperspectral imagery~[20--25]. Finally, we analyze dynamical systems that perform optimization routines for sparsity-based inference: a networked system driven by a continuous-time differential equation which we show is capable of recovering a large variety of different sparse signal classes~[26--30].
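
    The joint dynamics-plus-sparsity inference idea can be sketched with a simple proximal-gradient loop. This is an illustrative stand-in for the cited dynamic filtering algorithms, not a reproduction of them: the dynamics matrix `F`, the penalty weights, and the ISTA step size are all assumed for the example.

    ```python
    # Sketch of sparsity-plus-dynamics inference in the spirit of dynamic
    # filtering: at each time step, an l1-regularized estimate is pulled
    # toward a prediction F @ x_prev supplied by the signal dynamics.
    import numpy as np

    def soft(v, t):
        """Soft-thresholding operator, the proximal map of the l1 norm."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def dynamic_ista(A, y, x_prev, F, lam=0.1, kappa=0.5, n_iters=500):
        """min_x 0.5||y - A x||^2 + lam ||x||_1 + 0.5 kappa ||x - F x_prev||^2"""
        pred = F @ x_prev          # prediction from the dynamics model
        x = pred.copy()
        # Step size from the Lipschitz constant of the smooth terms.
        L = np.linalg.norm(A, 2) ** 2 + kappa
        for _ in range(n_iters):
            grad = A.T @ (A @ x - y) + kappa * (x - pred)
            x = soft(x - grad / L, lam / L)
        return x
    ```

    Running this estimator over a sequence of measurements `y_t`, feeding each estimate back as `x_prev`, gives a crude dynamic filter: the quadratic term plays the role of the Kalman prediction penalty while the l1 term enforces sparsity at each step.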