764 research outputs found

    Guided Depth Super-Resolution by Deep Anisotropic Diffusion

    Performing super-resolution of a depth image using the guidance from an RGB image is a problem that concerns several fields, such as robotics, medical imaging, and remote sensing. While deep learning methods have achieved good results in this problem, recent work highlighted the value of combining modern methods with more formal frameworks. In this work, we propose a novel approach that combines guided anisotropic diffusion with a deep convolutional network and advances the state of the art for guided depth super-resolution. The edge transferring/enhancing properties of the diffusion are boosted by the contextual reasoning capabilities of modern networks, and a strict adjustment step guarantees perfect adherence to the source image. We achieve unprecedented results on three commonly used benchmarks for guided depth super-resolution. The performance gain over other methods is largest at large scaling factors, such as x32. Code for the proposed method will be made available to promote reproducibility of our results.
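
    A minimal sketch of the classical ingredient named above, guided anisotropic diffusion: here the conductivities come from a fixed Perona-Malik-style edge-stopping function applied to the RGB guide, whereas the paper learns them with a network and adds an adjustment step for consistency with the source depth. The function name, parameters, and nearest-neighbour initialisation are illustrative assumptions.

    import numpy as np

    def guided_anisotropic_diffusion(depth_lr, guide_rgb, scale=8,
                                     iters=200, lam=0.24, kappa=0.03):
        """Upsample a low-res depth map along the edges of a high-res RGB guide.
        guide_rgb: float array in [0, 1], shape (H, W, 3), H = scale * rows of depth_lr."""
        H, W = guide_rgb.shape[:2]
        # nearest-neighbour initialisation of the high-res depth estimate
        depth = np.kron(depth_lr, np.ones((scale, scale)))[:H, :W]
        gray = guide_rgb.mean(axis=2)

        # conductivities from the guide's gradients: diffusion is damped across RGB edges
        gx = np.diff(gray, axis=1, append=gray[:, -1:])
        gy = np.diff(gray, axis=0, append=gray[-1:, :])
        cx = np.exp(-(gx / kappa) ** 2)
        cy = np.exp(-(gy / kappa) ** 2)

        for _ in range(iters):
            dx = np.diff(depth, axis=1, append=depth[:, -1:])
            dy = np.diff(depth, axis=0, append=depth[-1:, :])
            fx, fy = cx * dx, cy * dy                      # edge-aware fluxes
            div = (fx - np.roll(fx, 1, axis=1)) + (fy - np.roll(fy, 1, axis=0))
            depth = depth + lam * div                      # explicit diffusion step
        return depth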

    Efficient Long-Short Temporal Attention Network for Unsupervised Video Object Segmentation

    Unsupervised Video Object Segmentation (VOS) aims at identifying the contours of primary foreground objects in videos without any prior knowledge. However, previous methods do not fully use spatio-temporal context and fail to tackle this challenging task in real time. This motivates us to develop an efficient Long-Short Temporal Attention network (termed LSTA) for the unsupervised VOS task from a holistic view. Specifically, LSTA consists of two dominant modules, i.e., Long Temporal Memory and Short Temporal Attention. The former captures the long-term global pixel relations of the past frames and the current frame, modeling constantly present objects by encoding appearance patterns. The latter reveals the short-term local pixel relations of one nearby frame and the current frame, modeling moving objects by encoding motion patterns. To speed up inference, an efficient projection and a locality-based sliding window are adopted to achieve nearly linear time complexity for the two lightweight modules, respectively. Extensive empirical studies on several benchmarks demonstrate the promising performance and high efficiency of the proposed method.
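
    The "efficient projection with nearly linear time complexity" can be illustrated by kernelised linear attention, which uses associativity to avoid forming the N x N attention matrix. The ELU+1 feature map and all names below are assumptions for illustration, not necessarily LSTA's exact design.

    import numpy as np

    def linear_attention(Q, K, V):
        """Q, K: (N, d) queries/keys; V: (N, dv) values. Cost is O(N d dv), not O(N^2)."""
        phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))   # ELU(x) + 1, strictly positive
        Qp, Kp = phi(Q), phi(K)
        KV = Kp.T @ V                      # (d, dv), aggregated once over all positions
        Z = Qp @ Kp.sum(axis=0)            # (N,) normaliser
        return (Qp @ KV) / Z[:, None]

    # toy usage: 4096 "pixels", 32-dim keys/queries, 64-dim values
    rng = np.random.default_rng(0)
    Q, K = rng.normal(size=(4096, 32)), rng.normal(size=(4096, 32))
    V = rng.normal(size=(4096, 64))
    out = linear_attention(Q, K, V)        # shape (4096, 64)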

    Salient Object Detection Techniques in Computer Vision-A Survey.

    Detection and localization of image regions that attract immediate human visual attention is currently an intensive area of research in computer vision. The capability to automatically identify and segment such salient image regions has immediate consequences for applications in computer vision, computer graphics, and multimedia. A large number of salient object detection (SOD) methods have been devised to effectively mimic the capability of the human visual system to detect salient regions in images. These methods can be broadly divided into two categories based on their feature engineering mechanism: conventional and deep learning-based. In this survey, most of the influential advances in image-based SOD from both the conventional and deep learning-based categories are reviewed in detail. Relevant saliency modeling trends with key issues, core techniques, and the scope for future research are discussed in the context of difficulties often faced in salient object detection. Results are presented for various challenging cases on some large-scale public datasets. Different metrics considered for assessing the performance of state-of-the-art salient object detection models are also covered. Some future directions for SOD are presented towards the end.
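
    As an illustration of the kind of metrics such surveys cover, the sketch below computes two common SOD scores: mean absolute error and an adaptive-threshold F-beta measure with the customary beta^2 = 0.3 weighting. The function name and the thresholding choice are illustrative assumptions.

    import numpy as np

    def sod_metrics(saliency, gt, beta2=0.3):
        """saliency: float map in [0, 1]; gt: binary ground-truth mask of the same shape."""
        mae = np.abs(saliency - gt).mean()
        pred = saliency >= 2 * saliency.mean()          # widely used adaptive threshold
        tp = np.logical_and(pred, gt).sum()
        precision = tp / max(pred.sum(), 1)
        recall = tp / max(gt.sum(), 1)
        f_beta = (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-8)
        return mae, f_beta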

    Medical Image Registration Using Deep Neural Networks

    Registration is a fundamental problem in medical image analysis wherein images are transformed spatially to align corresponding anatomical structures in each image. Recently, the development of learning-based methods, which exploit deep neural networks and can outperform classical iterative methods, has received considerable interest from the research community. This interest is due in part to the substantially reduced computational requirements that learning-based methods have during inference, which makes them particularly well suited to real-time registration applications. Despite these successes, learning-based methods can perform poorly when applied to images from different modalities where intensity characteristics can vary greatly, such as in magnetic resonance and ultrasound imaging. Moreover, registration performance is often demonstrated on well-curated datasets that closely match the distribution of the training data. This makes it difficult to determine whether demonstrated performance accurately represents the generalization and robustness required for clinical use. This thesis presents learning-based methods which address the aforementioned difficulties by utilizing intuitive point-set-based representations, user interaction, and meta-learning-based training strategies. Primarily, this is demonstrated with a focus on the non-rigid registration of 3D magnetic resonance imaging to sparse 2D transrectal ultrasound images to assist in the delivery of targeted prostate biopsies. While conventional systematic prostate biopsy methods can require many samples to be taken to confidently produce a diagnosis, tumor-targeted approaches have shown improved patient, diagnostic, and disease management outcomes with fewer samples. However, the available intraoperative transrectal ultrasound imaging alone is insufficient for accurate targeted guidance. As such, this exemplar application is used to illustrate the effectiveness of sparse, interactively acquired ultrasound imaging for real-time, interventional registration. The presented methods are found to improve registration accuracy relative to the state of the art, with substantially lower computation time, while requiring only a fraction of the data at inference. As a result, these methods are particularly attractive given their potential for real-time registration in interventional applications.
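
    As a much simpler illustration of what a point-set-based representation enables, the sketch below computes the closed-form rigid (Kabsch/Procrustes) alignment of two corresponding point sets; the thesis itself addresses non-rigid, learning-based registration, which this does not attempt to reproduce.

    import numpy as np

    def kabsch_align(src, dst):
        """Least-squares rigid transform (R, t) mapping src onto dst.
        src, dst: (N, 3) arrays of corresponding points."""
        src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
        H = (src - src_c).T @ (dst - dst_c)             # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = dst_c - R @ src_c
        return R, t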

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal relies on planar homography to propose regions of interest in which to find objects, and on recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) compared to a two-stage video object detection method used as a baseline, at the cost of a small time overhead (120 ms) and a loss in precision (0.92).
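
    A sketch of the two ingredients named above, under assumed interfaces: a previous detection's box is propagated into the current frame with the inter-frame homography, and per-class scores are fused over time with a recursive Bayesian update. Function names and the toy usage are illustrative only.

    import numpy as np

    def propagate_box(box, H):
        """Warp an axis-aligned box (x1, y1, x2, y2) with a 3x3 inter-frame homography."""
        x1, y1, x2, y2 = box
        corners = np.array([[x1, y1, 1], [x2, y1, 1], [x2, y2, 1], [x1, y2, 1]], float)
        warped = (H @ corners.T).T
        warped = warped[:, :2] / warped[:, 2:3]          # back from homogeneous coordinates
        return np.array([*warped.min(axis=0), *warped.max(axis=0)])

    def bayes_update(prior, likelihood):
        """Recursively fuse a new per-class detection score into the class belief."""
        posterior = prior * likelihood
        return posterior / posterior.sum()

    # toy usage: identity camera motion, three object classes
    roi = propagate_box((10, 20, 60, 80), np.eye(3))
    belief = bayes_update(np.full(3, 1 / 3), np.array([0.7, 0.2, 0.1]))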

    (An overview of) Synergistic reconstruction for multimodality/multichannel imaging methods

    Imaging is omnipresent in modern society, with imaging devices based on a zoo of physical principles, probing a specimen across different wavelengths, energies and time. Recent years have seen a change in the imaging landscape, with more and more imaging devices combining that which previously was used separately. Motivated by these hardware developments, an ever-increasing set of mathematical ideas is appearing regarding how data from different imaging modalities or channels can be synergistically combined in the image reconstruction process, exploiting structural and/or functional correlations between the multiple images. Here we review these developments, give pointers to important challenges, and provide an outlook as to how the field may develop in the forthcoming years. This article is part of the theme issue 'Synergistic tomographic image reconstruction: part 1'.
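
    A toy sketch of the synergistic idea for two 1D channels: each channel has its own least-squares data fit, and a quadratic coupling term encourages their finite-difference gradients (edges) to agree. The quadratic coupling is an illustrative stand-in for the structural priors reviewed in the article, and all names and step sizes are assumptions.

    import numpy as np

    def joint_reconstruct(A1, y1, A2, y2, mu=1.0, step=1e-3, iters=5000):
        """Gradient descent on ||A1 x1 - y1||^2 + ||A2 x2 - y2||^2 + mu ||D x1 - D x2||^2."""
        n = A1.shape[1]
        D = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]        # forward finite-difference operator
        x1, x2 = np.zeros(n), np.zeros(n)
        for _ in range(iters):
            g1 = A1.T @ (A1 @ x1 - y1) + mu * D.T @ (D @ x1 - D @ x2)
            g2 = A2.T @ (A2 @ x2 - y2) + mu * D.T @ (D @ x2 - D @ x1)
            x1, x2 = x1 - step * g1, x2 - step * g2     # step must suit the operator norms
        return x1, x2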