5 research outputs found

    Hyperfeatures - Multilevel Local Coding for Visual Recognition

    Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and they have good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics of features at scales larger than their local input patches. We present a new multilevel visual representation, `hyperfeatures', that is designed to remedy this. The basis of the work is the familiar notion that to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments, a process that can be formalized as comparison (vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. This process converts collections of local image descriptor vectors into slightly less local histogram vectors: higher-level but spatially coarser descriptors. Our central observation is that it can therefore be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or `semantic' image properties. This repeated nonlinear `folding' is essentially different from that of hierarchical models such as Convolutional Neural Networks and HMAX, being based on repeated comparison to local prototypes and accumulation of co-occurrence statistics rather than on repeated convolution and rectification. We formulate the hyperfeatures model and study its performance under several different image coding methods including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Discriminant Analysis. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
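
    As a concrete, heavily simplified illustration of one hyperfeature level, the sketch below hard-quantizes local descriptors against a codebook and then pools the membership vectors into local co-occurrence histograms, which become the descriptors for the next level. The function names, the hard nearest-centre assignment, the radius-based pooling, and the assumption that per-level codebooks have already been learned offline (e.g. by k-means) are illustrative choices, not the paper's exact formulation.

        import numpy as np

        def vector_quantize(descriptors, codebook):
            """Hard-assign each local descriptor to its nearest codebook centre.

            descriptors: (N, D) local descriptors at one level
            codebook:    (K, D) centres learned offline (e.g. by k-means)
            returns:     (N, K) one-hot membership vectors
            """
            d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            memberships = np.zeros((descriptors.shape[0], codebook.shape[0]))
            memberships[np.arange(descriptors.shape[0]), d2.argmin(1)] = 1.0
            return memberships

        def pool_neighbourhoods(memberships, positions, radius):
            """Accumulate membership vectors over local spatial neighbourhoods.

            positions: (N, 2) image coordinates of the descriptors
            returns:   (N, K) normalized local co-occurrence histograms, i.e. the
                       higher-level but spatially coarser descriptors
            """
            histograms = np.empty_like(memberships)
            for i, p in enumerate(positions):
                nearby = np.linalg.norm(positions - p, axis=1) <= radius
                h = memberships[nearby].sum(0)
                histograms[i] = h / max(h.sum(), 1e-12)
            return histograms

        def hyperfeature_levels(descriptors, positions, codebooks, radius=16.0):
            """Iterate quantization + pooling once per level.

            codebooks[l] must have the same dimensionality as the features
            produced at level l; each level pools over a larger area.
            """
            levels, feats = [], descriptors
            for codebook in codebooks:
                feats = pool_neighbourhoods(vector_quantize(feats, codebook),
                                            positions, radius)
                levels.append(feats)
                radius *= 2.0
            return levels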

    3D data fusion by depth refinement and pose recovery

    Refining depth maps from different sources into a single refined depth map, and aligning rigid point clouds from different views, are two core techniques in 3D data fusion. Existing depth fusion algorithms do not provide a general framework for obtaining a highly accurate depth map. Furthermore, existing rigid point cloud registration algorithms do not always align noisy point clouds robustly and accurately, especially in the presence of many outliers and large occlusions. In this thesis, we present a general depth fusion framework based on supervised, semi-supervised, and unsupervised adversarial network approaches, and show that the fused depth maps are more accurate than the source depth maps. We develop a new rigid point cloud registration algorithm that aligns two uncertainty-based Gaussian mixture models representing the structures of the two point clouds, and show that it registers rigid point clouds more accurately and over a larger range of perturbations. Subsequently, the new supervised depth fusion algorithm and the new rigid point cloud registration algorithm are integrated into the ROS system of a real gardening robot (TrimBot) for practical use in real environments. All the proposed algorithms have been evaluated on multiple existing datasets to show their superiority over prior work in the field.
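
    The thesis aligns two uncertainty-based Gaussian mixture models; that formulation is not given in the abstract, so the sketch below only shows the simpler, generic idea of likelihood-driven rigid alignment against a mixture fitted to the target cloud. The function name register_rigid_gmm, the rotation-vector parameterization, and the Nelder-Mead optimizer are assumptions made for illustration, not the method described in the thesis.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.spatial.transform import Rotation
        from sklearn.mixture import GaussianMixture

        def register_rigid_gmm(source, target, n_components=16):
            """Estimate a rigid transform (R, t) aligning `source` to `target`
            by maximizing the likelihood of the transformed source points
            under a GMM fitted to the target point cloud.

            source, target: (N, 3) and (M, 3) point arrays
            returns:        R (3x3 rotation matrix), t (3-vector)
            """
            gmm = GaussianMixture(n_components=n_components,
                                  covariance_type="full").fit(target)

            def negative_log_likelihood(params):
                rotvec, t = params[:3], params[3:]
                R = Rotation.from_rotvec(rotvec).as_matrix()
                # score_samples returns per-point log densities under the GMM.
                return -gmm.score_samples(source @ R.T + t).sum()

            # Start from the identity transform (zero rotation vector, zero shift).
            res = minimize(negative_log_likelihood, np.zeros(6), method="Nelder-Mead")
            return Rotation.from_rotvec(res.x[:3]).as_matrix(), res.x[3:]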

    Quantification of tumor heterogeneity using PET/MRI and machine learning

    Despite a broad understanding that solid tumors exhibit significant tissue heterogeneity, clinical trials have seen little development in techniques that aid in characterizing cancer. Needle biopsies capture only a partial view of the tumor profile and cannot comprehensively reflect spatio-temporal phenotypic changes. Recent multimodal, multiparametric imaging techniques could provide further valuable insights if the complementary imaging information is sufficiently analyzed. In this work I therefore developed and applied machine learning methods to multiparametric positron emission tomography (PET) and magnetic resonance imaging (MRI) datasets, acquired from mice bearing subcutaneous tumors, to obtain a precise spatio-temporal characterization of intratumor heterogeneity.
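
    The abstract does not spell out the specific machine learning methods used, so the snippet below only illustrates one common way to quantify intratumor heterogeneity from co-registered multiparametric maps: cluster voxel-wise PET/MRI feature vectors into "habitats" and summarize their mixture with an entropy score. The function name, the k-means choice, and the entropy summary are assumptions for illustration only.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler

        def tumour_habitats(parameter_maps, tumour_mask, n_clusters=4):
            """Cluster voxel-wise multiparametric features into habitats.

            parameter_maps: dict of name -> 3D array (e.g. PET uptake, ADC, T2),
                            all co-registered to the same voxel grid
            tumour_mask:    boolean 3D array selecting tumour voxels
            returns:        (label_volume, heterogeneity) where heterogeneity is
                            the Shannon entropy of the habitat fractions
            """
            # One feature vector per tumour voxel, standardized per parameter.
            features = np.stack([m[tumour_mask] for m in parameter_maps.values()],
                                axis=1)
            features = StandardScaler().fit_transform(features)

            labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)

            # Map cluster labels back into the image volume (-1 outside the tumour).
            label_volume = np.full(tumour_mask.shape, -1, dtype=int)
            label_volume[tumour_mask] = labels

            fractions = np.bincount(labels, minlength=n_clusters) / labels.size
            p = fractions[fractions > 0]
            return label_volume, float(-(p * np.log(p)).sum())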

    Toward Solving Groupwise Medical Image Analysis Problems with Deep Learning

    Image regression, atlas building, and multi-atlas segmentation are three groupwise medical image analysis problems extended from image registration. These three problems are challenging because of the difficulty of establishing spatial correspondences and the associated high computational cost. In particular, most previous methods are computationally costly because they are optimization-based approaches, so fast and accurate alternatives are highly desirable. This dissertation addresses the following problems concerning the three groupwise medical image analysis problems: (1) fast and reliable geodesic regression for image time series; (2) joint atlas building and diffeomorphic registration learning; (3) efficient and accurate label fusion for multi-atlas segmentation; and (4) spatially localized probability calibration for semantic segmentation networks. Specifically, the contributions of this thesis are as follows: (1) A fast predictive simple geodesic regression approach is proposed to capture the frequently subtle deformation trends of longitudinal image data. (2) A new deep learning model that jointly builds an atlas and learns the diffeomorphic registrations in both the atlas-to-image and the image-to-atlas directions is developed. (3) A novel deep learning label fusion method (VoteNet) that locally identifies sets of trustworthy atlases is presented, and several ways to improve performance under the VoteNet-based multi-atlas segmentation framework are explored. (4) A learning-based local temperature scaling method that predicts a separate temperature scale for each pixel/voxel is designed; the resulting post-processing approach is accuracy preserving and is theoretically guaranteed to be effective.
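
    Contribution (4) is concrete enough to sketch: a small network predicts one positive temperature per pixel, and the frozen segmenter's logits are divided by it, which rescales confidences without changing the per-pixel argmax (hence accuracy preserving). The layer sizes and the choice to condition only on the logit map are assumptions of this PyTorch sketch, not the dissertation's exact architecture.

        import torch
        import torch.nn as nn

        class LocalTemperatureScaling(nn.Module):
            """Post-hoc calibration: predict a per-pixel temperature map and
            divide the frozen segmentation logits by it (use Conv3d for volumes)."""

            def __init__(self, n_classes, hidden=32):
                super().__init__()
                self.temperature_net = nn.Sequential(
                    nn.Conv2d(n_classes, hidden, kernel_size=3, padding=1),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(hidden, 1, kernel_size=3, padding=1),
                    nn.Softplus(),          # temperatures must stay positive
                )

            def forward(self, logits):
                # logits: (B, n_classes, H, W) from a frozen segmentation network
                t = self.temperature_net(logits) + 1e-6   # (B, 1, H, W)
                return logits / t                         # same argmax, rescaled confidences

        # Typical use: freeze the segmenter and train only the temperature network
        # with cross-entropy on a held-out calibration set, e.g.
        #   calibrated_logits = LocalTemperatureScaling(n_classes=4)(segmenter(image))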