
    Semantic Visual Localization

    Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.
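    To give a flavour of descriptor learning with scene completion as an auxiliary task, here is a minimal PyTorch sketch: an encoder compresses a partial semantic voxel grid into a descriptor, and a decoder is trained to reconstruct the completed volume. The architecture, class count, input size, and loss below are illustrative assumptions, not the paper's actual generative model.

```python
# Hypothetical sketch: descriptor learning with semantic scene completion
# as an auxiliary task. All shapes and names are illustrative assumptions.
import torch
import torch.nn as nn

class SemanticDescriptorNet(nn.Module):
    def __init__(self, n_classes: int = 12, desc_dim: int = 128):
        super().__init__()
        # Encoder: partial semantic voxel grid -> compact descriptor
        self.encoder = nn.Sequential(
            nn.Conv3d(n_classes, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(64, desc_dim, 8),  # global bottleneck: 8^3 -> 1^3
        )
        # Decoder: descriptor -> completed semantic volume (auxiliary task)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(desc_dim, 64, 8), nn.ReLU(),
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(32, n_classes, 4, stride=2, padding=1),
        )

    def forward(self, partial_volume):
        desc = self.encoder(partial_volume)   # (B, desc_dim, 1, 1, 1)
        completed = self.decoder(desc)        # semantic completion logits
        return desc.flatten(1), completed

# The completion loss pushes the descriptor to encode high-level
# geometry and semantics, making it tolerant to missing observations.
net = SemanticDescriptorNet()
x = torch.randn(2, 12, 32, 32, 32)            # toy partial input
desc, completed = net(x)
target = torch.randint(0, 12, (2, 32, 32, 32))  # toy completed labels
loss = nn.functional.cross_entropy(completed, target)
```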

    Generative Modeling in Structural-Hankel Domain for Color Image Inpainting

    In recent years, some researchers have focused on using a single image to obtain a large number of samples through multi-scale features. This study proposes a brand-new idea that requires only ten or even fewer samples to construct a low-rank structural-Hankel matrices-assisted score-based generative model (SHGM) for the color image inpainting task. During the prior learning process, a certain number of internal-middle patches are first extracted from several images, and the structural-Hankel matrices are then constructed from these patches. To better apply the score-based generative model to learn the internal statistical distribution within patches, the large-scale Hankel matrices are finally folded into higher-dimensional tensors for prior learning. During the iterative inpainting process, SHGM views the inpainting problem as a conditional generation procedure in a low-rank environment. As a result, the intermediate restored image is acquired by alternately performing the stochastic differential equation solver, the alternating direction method of multipliers, and data consistency steps. Experimental results demonstrate the remarkable performance and diversity of SHGM.
    Comment: 11 pages, 10 figures
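    As a rough illustration of the Hankel embedding step, the sketch below builds a Hankel matrix from a flattened patch using SciPy, so that structured patches yield approximately low-rank matrices. The window size and function name are assumptions for illustration; the paper's exact structural-Hankel construction and tensor folding are not specified here.

```python
# Hedged sketch: embedding an image patch into a Hankel matrix.
import numpy as np
from scipy.linalg import hankel

def patch_to_hankel(patch: np.ndarray, window: int) -> np.ndarray:
    """Embed a flattened patch into a Hankel matrix.

    Each column is a sliding window over the signal, so smooth
    (structured) patches produce approximately low-rank matrices.
    """
    s = patch.ravel()
    # First column: s[0..window-1]; last row: s[window-1..end]
    return hankel(s[:window], s[window - 1:])

patch = np.random.rand(8, 8)           # toy 8x8 patch
H = patch_to_hankel(patch, window=16)  # 16 x 49 Hankel matrix
print(H.shape, np.linalg.matrix_rank(H))
```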

    Machine learning for efficient recognition of anatomical structures and abnormalities in biomedical images

    Three studies have been carried out to investigate new approaches to efficient image segmentation and anomaly detection.

    The first study investigates the use of deep learning in patch-based segmentation. Current approaches to patch-based segmentation use low-level features such as the sum of squared differences between patches. We argue that better segmentation can be achieved by harnessing the power of deep neural networks. Currently these networks make extensive use of convolutional layers. However, we argue that in the context of patch-based segmentation, convolutional layers have little advantage over the canonical artificial neural network architecture: a patch is small and does not need decomposition, and thus will not benefit from convolution. Instead, we make use of the canonical architecture, in which neurons only compute dot products, but also incorporate modern techniques of deep learning. The resulting classifier is much faster and less memory-hungry than convolution-based networks. In a test application to the segmentation of the hippocampus in human brain MR images, we significantly outperformed prior art with a median Dice score of up to 90.98% at near real-time speed (<1 s).

    The second study is an investigation into mouse phenotyping, and develops a high-throughput framework to detect morphological abnormality in mouse embryo micro-CT images. Existing work in this line is centred on either the detection of phenotype-specific features or comparative analytics. The former approach lacks generality, and the latter can often fail, for example, when the abnormality is not associated with severe volume variation. Both approaches often require image segmentation as a prerequisite, which is very challenging when applied to embryo phenotyping. A new approach to this problem, in which non-rigid registration is combined with robust principal component analysis (RPCA), is proposed. The new framework is able to efficiently perform abnormality detection in a batch of images. It is sensitive to both volumetric and non-volumetric variations, and does not require image segmentation. In a validation study, it successfully distinguished the abnormal VSD and polydactyly phenotypes from the normal at 85.19% and 88.89% specificity, respectively, with 100% sensitivity in both cases.

    The third study investigates the RPCA technique in more depth. RPCA is an extension of PCA that tolerates certain levels of data distortion during feature extraction, and is able to decompose images into regular and singular components. It has previously been applied to many computer vision problems (e.g., video surveillance), attaining excellent performance. However, these applications commonly rest on a critical condition: in the majority of images being processed, there is a background with very little variation. By contrast, in biomedical imaging there is significant natural variation across different images, resulting from inter-subject variability and physiological movements. Non-rigid registration can go some way towards reducing this variance, but cannot eliminate it entirely. To address this problem, we propose a modified framework (RPCA-P) that is able to incorporate natural-variation priors and adjust outlier tolerance locally, so that voxels associated with structures of higher variability are compensated with a higher tolerance in regularity estimation. In an experimental study on the same mouse embryo micro-CT data, RPCA-P notably improved the detection specificity to 94.12% for the VSD and 90.97% for the polydactyly, while maintaining 100% sensitivity.
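    To make the decomposition concrete, below is a minimal, generic principal component pursuit sketch via the inexact augmented Lagrangian method, which splits a data matrix into a low-rank (regular) part and a sparse (singular) part. It illustrates standard RPCA only, not the thesis's RPCA-P variant, which additionally adapts the outlier tolerance per voxel; the parameter choices are common defaults rather than values from the thesis.

```python
# Generic RPCA (principal component pursuit) sketch, inexact ALM flavour.
import numpy as np

def svt(X, tau):
    """Singular value thresholding: shrink singular values by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def shrink(X, tau):
    """Elementwise soft-thresholding (prox of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0)

def rpca(D, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Decompose D into low-rank L (regular) + sparse S (singular)."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))          # standard PCP weight
    mu = mu or 0.25 * m * n / (np.abs(D).sum() + 1e-12)
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(max_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)          # update low-rank part
        S = shrink(D - L + Y / mu, lam / mu)       # update sparse part
        residual = D - L - S
        Y += mu * residual                         # dual ascent step
        if np.linalg.norm(residual) <= tol * np.linalg.norm(D):
            break
    return L, S

# Each column could hold one registered, vectorized image; the sparse S
# then flags voxels deviating from the shared low-rank anatomy.
D = np.random.rand(100, 20)
L, S = rpca(D)
```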

    Acute Angle Repositioning in Mobile C-Arm Using Image Processing and Deep Learning

    During surgery, medical practitioners rely on the mobile C-Arm medical x-ray system (C-Arm) and its fluoroscopic functions not only to perform the surgery but also to validate the outcome. Currently, technicians reposition the C-Arm arbitrarily through estimation and guesswork. In cases where the positioning and repositioning of the C-Arm are critical for surgical assessment, uncertainties in the angular position of the C-Arm components hinder surgical performance. This thesis proposes an integrated approach to automatically reposition C-Arms during critically acute movements in orthopedic surgery. Robot vision and control with deep learning are used to determine the necessary angles of rotation for the desired C-Arm repositioning. More specifically, a convolutional neural network is trained to detect and classify internal bodily structures. Image generation using the fast Fourier transform and Monte Carlo simulation is included to improve the robustness of the neural network's training. Matching control points between a reference x-ray image and a test x-ray image allows for the determination of the projective transformation relating the images. From the projective transformation matrix, the tilt and orbital angles of rotation of the C-Arm are calculated. Key results indicate that the proposed method successfully repositions mobile C-Arms to a desired position within 8.9% error for the tilt and 3.5% error for the orbit. As a result, the guesswork entailed in fine C-Arm repositioning is replaced by a more precise, repeatable method. Ultimately, confidence in C-Arm positioning and repositioning is reinforced, and surgical performance with the C-Arm is improved.
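    As an illustration of the control-point step, the sketch below estimates a homography from matched points with OpenCV and reads candidate rotation angles out of its decomposition. The intrinsic matrix K, the Euler-angle convention, and the mapping to the C-Arm's tilt/orbit axes are assumptions for illustration, not details taken from the thesis.

```python
# Hedged sketch: projective transform from matched control points, then
# rotation angles from a homography decomposition (OpenCV).
import cv2
import numpy as np

def estimate_rotation(ref_pts, test_pts, K):
    """Estimate a homography and extract candidate rotation angles."""
    H, mask = cv2.findHomography(ref_pts, test_pts, cv2.RANSAC, 5.0)
    # Decompose H into candidate (R, t, normal) solutions
    n_sol, Rs, ts, normals = cv2.decomposeHomographyMat(H, K)
    # Take the first candidate and read Euler angles (ZYX convention);
    # a real system must select the physically plausible solution.
    R = Rs[0]
    tilt = np.degrees(np.arctan2(R[2, 1], R[2, 2]))  # rotation about x
    orbit = np.degrees(np.arcsin(-R[2, 0]))          # rotation about y
    return tilt, orbit

# Toy usage with synthetic correspondences and an assumed intrinsic K
K = np.array([[1000.0, 0, 320], [0, 1000.0, 240], [0, 0, 1]])
ref = np.float32([[10, 10], [300, 15], [290, 200], [15, 210], [150, 100]])
tst = ref + np.float32([[2, 1], [3, -1], [1, 2], [2, 2], [1, 1]])
print(estimate_rotation(ref, tst, K))
```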