
    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allowing the network to reach compelling results. Finally, we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitative and qualitative evaluations on real and synthetic data demonstrate state-of-the-art results in many challenging scenes. Comment: Accepted by ECCV 2018 as an oral presentation; main paper plus supplementary material.
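
    A minimal sketch of the window-based cost aggregation with adaptive support weights named above. This is not the ActiveStereoNet implementation: the function name, the bilateral-style weighting (intensity similarity times spatial proximity) and the parameter values are assumptions, chosen only to show how such an aggregation smooths a per-pixel cost while remaining edge-preserving.

        import numpy as np

        def adaptive_support_aggregate(cost, guide, radius=4, sigma_i=0.1, sigma_s=3.0):
            """Aggregate a per-pixel cost over a local window with adaptive
            support weights (neighbours with similar guide intensity and small
            spatial offset contribute more, so intensity edges are preserved).

            cost  : (H, W) float array, per-pixel matching/reconstruction cost
            guide : (H, W) float array, guidance image (e.g. the input IR image)
            """
            acc = np.zeros(cost.shape, dtype=np.float64)
            norm = np.zeros(cost.shape, dtype=np.float64)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Circular shift; border wrap-around is ignored in this sketch.
                    shifted_cost = np.roll(cost, (dy, dx), axis=(0, 1))
                    shifted_guide = np.roll(guide, (dy, dx), axis=(0, 1))
                    # Bilateral-style adaptive support weight.
                    w = (np.exp(-np.abs(guide - shifted_guide) / sigma_i)
                         * np.exp(-(dy * dy + dx * dx) / (2.0 * sigma_s ** 2)))
                    acc += w * shifted_cost
                    norm += w
            return acc / (norm + 1e-8)

    Applied to a per-pixel reconstruction loss, neighbouring pixels with similar guide intensity reinforce each other while contributions across intensity edges are suppressed, which is the smoothing-without-blurring behaviour the abstract describes.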

    iCub World: Friendly Robots Help Building Good Vision Data-Sets

    CVPR 2013 Workshop: Ground Truth - What is a good dataset?, Portland, USA (June 28, 2013). In this paper we present and start analyzing the iCub World data-set, an object recognition data-set we acquired using a Human-Robot Interaction (HRI) scheme and the iCub humanoid robot platform. Our setup allows for rapid acquisition and annotation of data with corresponding ground truth. While more constrained in its scope -- the iCub world is essentially a robotics research lab -- we demonstrate how the proposed data-set poses challenges to current recognition systems. The iCubWorld data-set is publicly available and can be downloaded from: http://www.iit.it/en/projects/data-sets.html

    Low Compute and Fully Parallel Computer Vision with HashMatch

    Numerous computer vision problems such as stereo depth estimation, object-class segmentation and foreground/background segmentation can be formulated as per-pixel image labeling tasks. Given one or many images as input, the desired output of these methods is usually a spatially smooth assignment of labels. The large number of such computer vision problems has led to significant research efforts, with the state of the art moving from CRF-based approaches to deep CNNs and, more recently, hybrids of the two. Although these approaches have significantly advanced the state of the art, the vast majority have focused solely on improving quantitative results and are not designed for low-compute scenarios. In this paper, we present a new general framework for a variety of computer vision labeling tasks, called HashMatch. Our approach is designed to be both fully parallel, i.e. each pixel is processed independently, and low-compute, with a model complexity an order of magnitude less than existing CNN- and CRF-based approaches. We evaluate HashMatch extensively on several problems such as disparity estimation, image retrieval, feature approximation and background subtraction, for which HashMatch achieves high computational efficiency while producing high quality results.
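
    The abstract does not spell out HashMatch's model, so the sketch below only illustrates the general idea it names: fully parallel, low-compute, per-pixel labeling via hashing, where each pixel is hashed independently from a few pairwise intensity tests on its local patch and the binary code indexes a small label lookup table. The function names, the random binary tests and the lookup-table formulation are assumptions for illustration, not the paper's method.

        import numpy as np

        def make_random_tests(num_bits=16, radius=5, seed=0):
            # Each test compares two pixel offsets inside a (2*radius+1)^2 patch.
            rng = np.random.default_rng(seed)
            return rng.integers(-radius, radius + 1, size=(num_bits, 4))

        def hash_labels(image, tests, label_table):
            """Assign a label to every pixel from a binary hash of its patch.

            image       : (H, W) grayscale image
            tests       : (B, 4) offsets (y1, x1, y2, x2) defining B binary tests
            label_table : (2**B,) lookup table mapping hash codes to labels
            """
            radius = int(np.abs(tests).max())
            padded = np.pad(image, radius, mode="edge")
            H, W = image.shape
            codes = np.zeros((H, W), dtype=np.int64)
            for b, (y1, x1, y2, x2) in enumerate(tests):
                a = padded[radius + y1:radius + y1 + H, radius + x1:radius + x1 + W]
                c = padded[radius + y2:radius + y2 + H, radius + x2:radius + x2 + W]
                # Each comparison contributes one bit; pixels are independent,
                # so the whole computation is trivially parallel.
                codes |= (a > c).astype(np.int64) << b
            return label_table[codes]

    For disparity estimation under these assumptions, label_table would map each of the 2**B codes to a candidate disparity; in practice such tests and tables would be learned rather than drawn at random.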

    Monitoring antimalarial safety and tolerability in clinical trials: A case study from Uganda

    BACKGROUND: New antimalarial regimens, including artemisinin-based combination therapies (ACTs), have been adopted widely as first-line treatment for uncomplicated malaria. Although these drugs appear to be safe and well-tolerated, experience with their use in Africa is limited and continued assessment of safety is a priority. However, no standardized guidelines for evaluating drug safety and tolerability in malaria studies exist. A system for monitoring adverse events in antimalarial trials conducted in Uganda was developed. Here the reporting system is described, and difficulties faced in analysing and interpreting the safety results are illustrated, using data from the trials. CASE DESCRIPTION: Between 2002 and 2007, eleven randomized, controlled clinical trials were conducted to compare the efficacy, safety, and tolerability of different antimalarial regimens for treatment of uncomplicated malaria in Uganda. The approach to adverse event monitoring was similar in all studies. A total of 5,614 treatments were evaluated in 4,876 patients. Differences in baseline characteristics and patterns of adverse event reporting were noted between the sites, which limited the ability to pool and analyse data. Clinical failure following antimalarial treatment confounded associations between treatment and adverse events that were also common symptoms of malaria, particularly in areas of lower transmission intensity. DISCUSSION AND EVALUATION: Despite prospectively evaluating for adverse events, limitations in the monitoring system were identified. New standardized guidelines for monitoring safety and tolerability in antimalarial trials are needed, which should address how to detect events of greatest importance, including serious events, those with a causal relationship to the treatment, those which impact on adherence, and events not previously reported. CONCLUSION: Although the World Health Organization has supported the development of pharmacovigilance systems in African countries deploying ACTs, additional guidance on adverse event monitoring in antimalarial clinical trials is needed, similar to the standardized recommendations available for assessment of drug efficacy.

    Refining Geometry from Depth Sensors using IR Shading Images

    We propose a method to refine the geometry of 3D meshes from a consumer-level depth camera, e.g. Kinect, by exploiting shading cues captured from an infrared (IR) camera. A major benefit of using an IR camera instead of an RGB camera is that the captured IR images are narrow-band images that filter out most undesired ambient light, which makes our system robust against natural indoor illumination. Moreover, many natural objects with colorful textures in the visible spectrum appear to have a uniform albedo in the IR spectrum. Based on our analysis of the IR projector light of the Kinect, we define a near light source IR shading model that describes the captured intensity as a function of surface normals, albedo, lighting direction, and distance between light source and surface points. To resolve the ambiguity in our model between the normals and distances, we utilize an initial 3D mesh from the Kinect fusion and multi-view information to reliably estimate surface details that were not captured and reconstructed by the Kinect fusion. Our approach operates directly on the mesh model for geometry refinement. We ran experiments on geometries captured by both the Kinect I and Kinect II, as the depth acquisition in the Kinect I is based on a structured-light technique and that of the Kinect II on time-of-flight technology. The effectiveness of our approach is demonstrated through several challenging real-world examples. We have also performed a user study to evaluate the quality of the mesh models before and after our refinements.
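
    As a concrete reading of the near light source IR shading model described above (captured intensity as a function of surface normal, albedo, lighting direction, and light-to-surface distance), the sketch below uses a generic near point-light Lambertian form, roughly intensity = albedo * max(0, n . l) / d^2. The function name and the inverse-square attenuation are assumptions; the paper's actual model may include additional projector-specific terms.

        import numpy as np

        def near_light_ir_shading(points, normals, albedo, light_pos):
            """Generic near point-light Lambertian shading.

            points    : (N, 3) surface points
            normals   : (N, 3) unit surface normals
            albedo    : (N,)  per-point IR albedo (often near-uniform in IR)
            light_pos : (3,)  position of the IR light source
            """
            to_light = light_pos[None, :] - points            # vectors to the light
            dist = np.linalg.norm(to_light, axis=1)           # light-surface distances
            l_dir = to_light / dist[:, None]                  # unit lighting directions
            n_dot_l = np.clip(np.sum(normals * l_dir, axis=1), 0.0, None)
            # Captured intensity: albedo times cosine term, attenuated by 1/d^2.
            return albedo * n_dot_l / (dist ** 2)

    The normal-versus-distance ambiguity the abstract mentions is visible in this form: different combinations of the cosine term and the 1/d^2 falloff can yield the same intensity, which is why the initial mesh and multi-view information are used to disambiguate.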