26 research outputs found

    STRUCTURED SPARSITY DRIVEN LEARNING: THEORY AND ALGORITHMS

    Ph.D., Doctor of Philosophy

    Generative-Discriminative Low Rank Decomposition for Medical Imaging Applications

    In this thesis, we propose a method for extracting biomarkers from medical images to support early diagnosis of abnormalities. The surge in demand for biomarkers and the growing availability of medical images in recent years call for accurate, repeatable, and interpretable approaches to extracting meaningful imaging features. However, extracting such information from medical images is challenging because the number of pixels (voxels) in a typical image is on the order of millions, while even a large medical imaging dataset rarely exceeds a few hundred samples. Nevertheless, depending on the nature of an abnormality, only a parsimonious subset of voxels is typically relevant to the disease; therefore, various notions of sparsity are exploited in this thesis to improve the generalization performance of the prediction task. We propose a novel discriminative dimensionality reduction method that yields good classification performance on various datasets without compromising the clinical interpretability of the results. This is achieved by combining the modelling strength of the generative learning framework with the classification performance of the discriminative learning paradigm. Clinical interpretability can be viewed as an additional evaluation measure and is also helpful in designing methods that account for clinical priors, such as the association of certain brain areas with a particular cognitive task or the connectivity of brain regions via neural fibres. We formulate our method as a large-scale optimization problem that solves a constrained matrix factorization. The scale of this matrix factorization makes off-the-shelf solvers computationally prohibitive; therefore, we designed an efficient algorithm based on the proximal method to address the computational bottleneck of the optimization problem.
Our formulation is readily extended to different scenarios, such as cases where a large cohort of subjects has uncertain or no class labels (semi-supervised learning) or where each subject has a battery of imaging channels (multi-channel). We show that by using various notions of sparsity as feasible sets of the optimization problem, we can encode different forms of prior knowledge, ranging from brain parcellation to brain connectivity.
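As a rough illustration of what a proximal method looks like in a sparsity-regularized factorization setting, the sketch below solves a much simpler problem than the one in the thesis: with a fixed basis `B`, it finds sparse coefficients `C` minimizing 0.5*||X - B C||_F^2 + lam*||C||_1 by proximal gradient descent (ISTA). The names `soft_threshold`, `sparse_codes`, and `objective`, the l1 penalty, and the fixed basis are all assumptions made for illustration; the abstract does not specify the thesis's actual constraints or update rules.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1: element-wise shrinkage toward zero."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def objective(X, B, C, lam):
    """Value of 0.5 * ||X - B C||_F^2 + lam * ||C||_1."""
    return 0.5 * np.linalg.norm(X - B @ C, "fro") ** 2 + lam * np.abs(C).sum()

def sparse_codes(X, B, lam=0.1, n_iter=200):
    """Proximal gradient (ISTA) for the l1-penalized least-squares problem.

    Hypothetical helper: B is a fixed (non-zero) basis, X holds one sample
    per column, and the returned C holds the sparse coefficients.
    """
    # Step size 1/L, where L = ||B||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(B, 2) ** 2
    C = np.zeros((B.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = B.T @ (B @ C - X)                          # gradient of the smooth term
        C = soft_threshold(C - step * grad, step * lam)   # proximal (shrinkage) step
    return C
```

A full factorization would alternate updates like this between the basis and the coefficients, with the sparsity structure chosen to reflect the prior knowledge described above.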

    Learning by correlation for computer vision applications: from Kernel methods to deep learning

    Learning to spot analogies and differences within and across visual categories is an arguably powerful approach in machine learning and pattern recognition, directly inspired by human cognition. In this thesis, we investigate a variety of approaches that are primarily driven by correlation and tackle several computer vision applications.

    Image-set, Temporal and Spatiotemporal Representations of Videos for Recognizing, Localizing and Quantifying Actions

    This dissertation addresses the problem of learning video representations, defined here as transforming a video so that its essential structure is made more visible or accessible for action recognition and quantification. In the literature, a video can be represented by a set of images, by modeling motion or temporal dynamics, or by a 3D graph with pixels as nodes. This dissertation contributes a set of models to localize, track, segment, recognize, and assess actions: (1) image-set models that aggregate subset features given by regularizing normalized CNNs; (2) image-set models based on inter-frame principal recovery and sparse coding of residual actions; (3) temporally local models, with spatially global motion estimated by robust feature matching and local motion estimated by action detection with an added motion model; (4) spatiotemporal models (a 3D graph and a 3D CNN) that treat time as a spatial dimension; and (5) supervised hashing that jointly learns embedding and quantization. State-of-the-art performance is achieved for tasks such as quantifying facial pain and human diving.
The primary conclusions of this dissertation are as follows: (i) an image set can capture facial actions that concern collective representation; (ii) sparse and low-rank representations can untangle expression, identity, and pose cues, and can be learned via an image-set model as well as a linear model; (iii) the norm is related to recognizability, and similarity metrics and loss functions matter; (iv) combining the MIL-based boosting tracker with the particle filter motion model yields a good trade-off between appearance similarity and motion consistency; (v) segmenting an object locally makes it amenable to assigning shape priors, and it is feasible to learn knowledge such as shape priors online from Web data with weak supervision; (vi) representing videos as 3D graphs works locally in both space and time, and 3D CNNs work effectively when given temporally meaningful clips; (vii) rich labeled images or videos help to learn better hash functions after learning binary embedded codes than random projections do. In addition, the models proposed for videos can be adapted to other sequential images, such as volumetric medical images, which are not covered in this dissertation.

    Superresolution Reconstruction for Magnetic Resonance Spectroscopic Imaging Exploiting Low-Rank Spatio-Spectral Structure

    Magnetic resonance spectroscopic imaging (MRSI) is a rapidly developing medical imaging modality, capable of conferring both spatial and spectral information content, and has become a powerful clinical tool. The ability to non-invasively observe spatial maps of metabolite concentrations, for instance, in the human brain, can offer functional, as well as pathological insights, perhaps even before structural aberrations or behavioral symptoms are evinced. Despite its lofty clinical prospects, MRSI has traditionally remained encumbered by a number of practical limitations. Of primary concern are the vastly reduced concentrations of tissue metabolites when compared to those of water, which forms the basis for conventional MR imaging. Moreover, the protracted exam durations required by MRSI routinely approach the limits for patient compliance. Taken in conjunction, the above considerations effectively circumscribe the data collection process, ultimately translating to coarse image resolutions that are of diminished clinical utility. Such shortcomings are compounded by spectral contamination artifacts due to the system point spread function, which arise as a natural consequence when reconstructing non-band-limited data by the inverse Fourier transform. These artifacts are especially pronounced near regions characterized by substantial discrepancies in signal intensity, for example, the interface between normal brain and adipose tissue, where the metabolite signals are inundated by the dominant lipid resonances. In recent years, concerted efforts have been made to develop alternative, non-Fourier MRSI reconstruction strategies that aim to surmount the aforementioned limitations. In this dissertation, we build upon the burgeoning medley of innovative and promising techniques, proffering a novel superresolution reconstruction framework predicated on the recent interest in low-rank signal modeling, along with state-of-the-art regularization methods.
The proposed framework is founded upon a number of key tenets. First, we posit that the underlying spatio-spectral distribution of the investigated object admits a bilinear representation, whereby spatial and spectral signal components can be effectively separated. We further maintain that the dimensionality of the subspace spanned by the components is, in principle, bounded by a modest number of observable metabolites. Second, we assume that local susceptibility effects represent the primary sources of signal corruption that tend to disallow such representations. Finally, we assert that the spatial components belong to a class of real-valued, non-negative, and piecewise linear functions, compelled in part through the use of a total variation regularization penalty. After demonstrating superior spatial and spectral localization properties in both numerical and physical phantom data when compared against standard Fourier methods, we proceed to evaluate reconstruction performance in typical in vivo settings, whereby the method is extended in order to promote the recovery of signal variations throughout the MRSI slice thickness. Aside from the various technical obstacles, one of the cardinal prospective challenges for high-resolution MRSI reconstruction is the shortage of reliable ground truth data suitable for validation, which prompts reservations about the resulting experimental outcomes. [...
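The bilinear (low-rank) assumption in the abstract above can be illustrated with a toy example. The sketch below builds a synthetic voxel-by-frequency matrix as the product of a few non-negative spatial maps and spectral profiles, adds noise, and projects back onto a rank-limited subspace with a truncated SVD. This is not the dissertation's reconstruction method: the array sizes, the noise level, the random spectra, and the helper `low_rank_approx` are all assumptions chosen only to show why bounding the rank by the number of metabolites is a useful constraint.

```python
import numpy as np

def low_rank_approx(M, rank):
    """Best rank-`rank` approximation of M in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)
n_voxels, n_freqs, n_metabolites = 64, 128, 3        # hypothetical sizes

# Bilinear model: each row (voxel) is a non-negative mix of a few spectra.
spatial = rng.random((n_voxels, n_metabolites))              # non-negative spatial maps
spectral = rng.standard_normal((n_metabolites, n_freqs))     # stand-in spectral profiles
clean = spatial @ spectral                                   # rank <= n_metabolites

noisy = clean + 0.05 * rng.standard_normal(clean.shape)      # full-rank once noise is added
denoised = low_rank_approx(noisy, n_metabolites)

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before: {err_noisy:.2f}, after rank-{n_metabolites} projection: {err_denoised:.2f}")
```

In this toy setting the rank constraint alone discards most of the noise; a framework like the one described would additionally impose spatial priors such as non-negativity and total variation.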