    R^3SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

    Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power.
    Comment: Accepted to FPT 2018 as an oral presentation; 8 pages, 6 figures, 4 tables.
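
    For readers unfamiliar with SGM itself, the sketch below shows the standard single-path cost-aggregation recurrence from the original SGM formulation; it is illustrative background only, not the paper's MGM-inspired, FPGA-oriented scheme, and the cost volume and the P1/P2 penalty values are assumed placeholders.

```python
import numpy as np

def sgm_aggregate_left_to_right(cost, p1=10, p2=120):
    """Aggregate a matching-cost volume along one scanline direction.

    cost: (H, W, D) array of per-pixel matching costs over D disparities.
    p1:   small penalty for a disparity change of +/-1 between neighbours.
    p2:   large penalty for any bigger disparity jump.
    Returns the aggregated costs for the left-to-right path only; full SGM
    sums such costs over several directions before taking argmin over d.
    """
    h, w, d = cost.shape
    agg = np.empty_like(cost, dtype=np.float64)
    agg[:, 0, :] = cost[:, 0, :]  # first column: no predecessor on this path
    for x in range(1, w):
        prev = agg[:, x - 1, :]                      # (H, D) costs at previous pixel
        prev_min = prev.min(axis=1, keepdims=True)   # cheapest disparity at prev pixel
        same = prev                                  # keep the same disparity
        up = np.roll(prev, -1, axis=1) + p1          # came from disparity d+1
        down = np.roll(prev, 1, axis=1) + p1         # came from disparity d-1
        up[:, -1] = np.inf                           # no d+1 neighbour at the top
        down[:, 0] = np.inf                          # no d-1 neighbour at the bottom
        jump = prev_min + p2                         # any larger disparity jump
        best = np.minimum(np.minimum(same, up), np.minimum(down, jump))
        # Subtract prev_min so path costs stay bounded, as in the original SGM paper.
        agg[:, x, :] = cost[:, x, :] + best - prev_min
    return agg
```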

    Automatic spine identification in abdominal CT slices using image partition forests

    The identification of key features (e.g. organs and tumours) in medical scans (CT, MRI, etc.) is a vital first step in many other image analysis applications, but it is by no means easy to identify such features automatically. Using statistical properties of image regions alone, it is not always possible to distinguish between different features with overlapping greyscale distributions. To do so, it helps to make use of additional knowledge that may have been acquired (e.g. from a medic) about a patient's anatomy. One important form this external knowledge can take is localization information: this allows a program to narrow down its search to a particular region of the image, or to decide how likely a feature candidate is to be correct (e.g. it would be worrisome were the aorta identified as running through the middle of a kidney). To make use of this information, however, it is necessary to identify a suitable frame of reference in which it can be specified. This frame should ideally be based on rigid structures, e.g. the spine and ribs. In this paper, we present a method for automatically identifying cross-sections of the spine in image partition forests of axial abdominal CT slices as a first step towards defining a robust coordinate system for localization.
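
    As background on the data structure involved, a partition forest stores a stack of image partitions in which each layer's regions nest inside those of the layer above. The toy Python structure below is a hypothetical illustration of that idea, not the authors' implementation; the region ids and layers are invented for the example.

```python
class PartitionForest:
    """A hierarchy of image partitions: layer 0 holds the finest regions
    (e.g. individual pixels or watershed regions), and each higher layer
    groups the regions of the layer below into larger ones."""

    def __init__(self, leaf_regions):
        # Each layer maps region_id -> set of child region ids in the layer below.
        # Layer 0 regions are leaves, so they map to empty sets.
        self.layers = [{r: set() for r in leaf_regions}]

    def add_layer(self, grouping):
        """grouping maps each new region id to the ids it merges from the
        current top layer; together the groups must cover that layer exactly."""
        top = set(self.layers[-1])
        merged = set().union(*grouping.values())
        assert merged == top, "new layer must partition the previous one"
        self.layers.append({rid: set(children) for rid, children in grouping.items()})

    def leaves_of(self, layer, region):
        """Return all layer-0 regions contained in `region` at `layer`."""
        if layer == 0:
            return {region}
        result = set()
        for child in self.layers[layer][region]:
            result |= self.leaves_of(layer - 1, child)
        return result
```

    For instance, PartitionForest([0, 1, 2, 3]) followed by add_layer({10: {0, 1}, 11: {2, 3}}) builds a two-layer forest in which leaves_of(1, 10) returns {0, 1}.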

    Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade

    Camera pose estimation is an important problem in computer vision, with applications as diverse as simultaneous localisation and mapping, virtual/augmented reality and navigation. Common techniques match the current image against keyframes with known poses coming from a tracker, directly regress the pose, or establish correspondences between keypoints in the current image and points in the scene in order to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but have traditionally needed to be trained offline on the target scene, preventing relocalisation in new environments. Recently, we showed how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. The adapted forests achieved relocalisation performance that was on par with that of offline forests, and our approach was able to estimate the camera pose in close to real time, which made it desirable for systems that require online relocalisation. In this paper, we present an extension of this work that achieves significantly better relocalisation performance whilst running fully in real time. To achieve this, we make several changes to the original approach: (i) instead of simply accepting the camera pose hypothesis produced by RANSAC without question, we make it possible to score the final few hypotheses it considers using a geometric approach and select the most promising one; (ii) we chain several instantiations of our relocaliser (with different parameter settings) together in a cascade, allowing us to try faster but less accurate relocalisation first, only falling back to slower, more accurate relocalisation as necessary; and (iii) we tune the parameters of our cascade, and the individual relocalisers it contains, to achieve effective overall performance. Taken together, these changes allow us to significantly improve upon the performance our original state-of-the-art method was able to achieve on the well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional contributions, we present a novel way of visualising the internal behaviour of our forests, and use the insights gleaned from this to show how to entirely circumvent the need to pre-train a forest on a generic scene.
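
    The cascade idea in (ii) is easy to state in code. The sketch below is a minimal, hypothetical rendering of it: the relocalisers, their ordering and the score thresholds are placeholders for the paper's tuned forest-based relocalisers and geometric hypothesis scoring.

```python
from typing import Callable, Optional, Sequence, Tuple
import numpy as np

Pose = np.ndarray  # 4x4 rigid-body transform (assumed representation)

def relocalise_cascade(
    frame: object,
    relocalisers: Sequence[Callable[[object], Tuple[Optional[Pose], float]]],
    score_thresholds: Sequence[float],
) -> Optional[Pose]:
    """Try relocalisers ordered fastest-first; each returns a pose hypothesis
    and a geometric quality score (higher = better). Fall back to the next,
    slower stage only if the score misses the current stage's threshold."""
    best_pose, best_score = None, -np.inf
    for relocalise, threshold in zip(relocalisers, score_thresholds):
        pose, score = relocalise(frame)
        if pose is not None and score > best_score:
            best_pose, best_score = pose, score
        if best_score >= threshold:
            break  # good enough: stop before running slower stages
    return best_pose
```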

    Calibrating Deep Neural Networks using Focal Loss

    Miscalibration -- a mismatch between a model's confidence and its correctness -- of Deep Neural Networks (DNNs) makes their predictions hard to rely on. Ideally, we want networks to be accurate, calibrated and confident. We show that, as opposed to the standard cross-entropy loss, focal loss (Lin et al., 2017) allows us to learn models that are already very well calibrated. When combined with temperature scaling, whilst preserving accuracy, it yields state-of-the-art calibrated models. We provide a thorough analysis of the factors causing miscalibration, and use the insights we glean from this to justify the empirically excellent performance of focal loss. To facilitate the use of focal loss in practice, we also provide a principled approach to automatically select the hyperparameter involved in the loss function. We perform extensive experiments on a variety of computer vision and NLP datasets, and with a wide variety of network architectures, and show that our approach achieves state-of-the-art accuracy and calibration in almost all cases.
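
    For concreteness, focal loss (Lin et al., 2017) replaces the cross-entropy term -log(p_t) with -(1 - p_t)^gamma * log(p_t), down-weighting examples the model already classifies confidently. The PyTorch sketch below uses a fixed gamma; the paper's principled hyperparameter selection and the optional temperature scaling are not reproduced here.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    """Multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy; larger gamma
    down-weights confidently-correct examples. gamma = 2.0 is a common
    default, not the paper's automatically selected value.

    logits:  (N, C) unnormalised class scores.
    targets: (N,) integer class labels.
    """
    log_p = F.log_softmax(logits, dim=-1)                       # (N, C) log-probabilities
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)   # log-prob of the true class
    pt = log_pt.exp()                                           # prob of the true class
    return (-((1.0 - pt) ** gamma) * log_pt).mean()
```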

    Struck: Structured Output Tracking with Kernels

    Adaptive tracking-by-detection methods are widely used in computer vision for tracking arbitrary objects. Current approaches treat the tracking problem as a classification task and use online learning techniques to update the object model. However, for these updates to happen, one needs to convert the estimated object position into a set of labelled training examples, and it is not clear how best to perform this intermediate step. Furthermore, the objective for the classifier (label prediction) is not explicitly coupled to the objective for the tracker (estimation of object position). In this paper, we present a framework for adaptive visual object tracking based on structured output prediction. By explicitly allowing the output space to express the needs of the tracker, we avoid the need for an intermediate classification step. Our method uses a kernelised structured output support vector machine (SVM), which is learned online to provide adaptive tracking. To allow our tracker to run at high frame rates, we (a) introduce a budgeting mechanism that prevents the unbounded growth in the number of support vectors that would otherwise occur during tracking, and (b) show how to implement tracking on the GPU. Experimentally, we show that our algorithm is able to outperform state-of-the-art trackers on various benchmark videos. Additionally, we show that we can easily incorporate additional features and kernels into our framework, which results in increased tracking performance.
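
    The structured-output view can be made concrete with a small sketch: instead of classifying patches and labelling them, the tracker scores candidate translations y directly with a discriminant F(y) = sum_i beta_i * k(phi(y), phi_i) and outputs the argmax. The Python below is a hypothetical illustration assuming a linear kernel and precomputed features; the paper's online learning, sampling and budgeting are omitted.

```python
import numpy as np

def predict_translation(features_by_offset, support):
    """Structured prediction step in the spirit of Struck.

    features_by_offset: dict mapping a candidate offset y to the feature
        vector extracted at the previous box shifted by y (assumed input).
    support: list of (beta, feature) support-vector pairs maintained online
        by the learner (assumed input).
    Returns the offset with the highest discriminant score, so the output
    space is the set of translations rather than binary patch labels.
    """
    def score(phi):
        # Linear kernel for simplicity; the paper also uses non-linear kernels.
        return sum(beta * float(np.dot(phi, phi_sv)) for beta, phi_sv in support)

    return max(features_by_offset, key=lambda y: score(features_by_offset[y]))
```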

    Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes

    Long-term camera re-localization is an important task with numerous computer vision and robotics applications. Whilst various outdoor benchmarks exist that target lighting, weather and seasonal changes, far less attention has been paid to appearance changes that occur indoors. This has led to a mismatch between popular indoor benchmarks, which focus on static scenes, and indoor environments that are of interest for many real-world applications. In this paper, we adapt 3RScan - a recently introduced indoor RGB-D dataset designed for object instance re-localization - to create RIO10, a new long-term camera re-localization benchmark focused on indoor scenes. We propose new metrics for evaluating camera re-localization and explore how state-of-the-art camera re-localizers perform according to these metrics. We also examine in detail how different types of scene change affect the performance of different methods, based on novel ways of detecting such changes in a given RGB-D frame. Our results clearly show that long-term indoor re-localization is an unsolved problem. Our benchmark and tools are publicly available at waldjohannau.github.io/RIO10.
    Comment: ECCV 2020; project website: https://waldjohannau.github.io/RIO10

    Acute phase response in two consecutive experimentally induced E. coli intramammary infections in dairy cows

    Background: The acute phase proteins haptoglobin (Hp), serum amyloid A (SAA) and lipopolysaccharide binding protein (LBP) have been suggested as suitable inflammatory markers for bovine mastitis. The aim of the study was to investigate acute phase markers, along with clinical parameters, in two consecutive intramammary challenges with Escherichia coli, and to evaluate the possible carry-over effect when the same animals are used in an experimental model.
    Methods: Mastitis was induced with a dose of 1500 cfu of E. coli in one quarter of six cows, and the inoculation was repeated in another quarter after an interval of 14 days. Concentrations of Hp, SAA and LBP were determined in serum and milk.
    Results: In both challenges, all cows became infected and developed clinical mastitis within 12 hours of inoculation. Clinical disease and the acute phase response were generally milder in the second challenge. Concentrations of SAA in milk started to increase 12 hours after inoculation and peaked at 60 hours after the first challenge and at 44 hours after the second. Concentrations of SAA in serum increased more slowly, peaked at the same times as in milk, and were about one third of those in milk. Hp started to increase in milk on a similar timescale and peaked at 36–44 hours; in serum, Hp peaked at 60–68 hours at concentrations twice as high as in milk. LBP concentrations in milk and serum started to increase after 12 hours and peaked at 36 hours, being higher in milk. The concentrations of acute phase proteins in serum and milk in this E. coli infection model were much higher than those recorded in experiments using Gram-positive pathogens, indicating the severe inflammation induced by E. coli.
    Conclusion: Acute phase proteins would be useful as mastitis indicators and for assessing the severity of mastitis. If repeated experimental intramammary challenge of the same animals with E. coli is used in cross-over studies, the interval between challenges should be longer than 2 weeks, owing to the carry-over effect of the first infection.