Straight to Shapes: Real-time Detection of Encoded Shapes
Current object detection approaches predict bounding boxes, but these provide
little instance-specific information beyond location, scale and aspect ratio.
In this work, we propose to directly regress to objects' shapes in addition to
their bounding boxes and categories. It is crucial to find an appropriate shape
representation that is compact and decodable, and in which objects can be
compared for higher-order concepts such as view similarity, pose variation and
occlusion. To achieve this, we use a denoising convolutional auto-encoder to
establish an embedding space, and place the decoder after a fast end-to-end
network trained to regress directly to the encoded shape vectors. This yields
what to the best of our knowledge is the first real-time shape prediction
network, running at ~35 FPS on a high-end desktop. With higher-order shape
reasoning well-integrated into the network pipeline, the network shows the
useful practical quality of generalising to unseen categories similar to the
ones in the training set, something that most existing approaches fail to
handle.
Comment: 16 pages including appendix; Published at CVPR 201
RSGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Stereo depth estimation is used for many computer vision applications. Though
many popular methods strive solely for depth quality, for real-time mobile
applications (e.g. prosthetic glasses or micro-UAVs), speed and power
efficiency are equally, if not more, important. Many real-world systems rely on
Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but
power efficiency is hard to achieve with conventional hardware, making the use
of embedded devices such as FPGAs attractive for low-power applications.
However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so
most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA
context, the accuracy of SGM has been improved by More Global Matching (MGM),
which also helps tackle the streaking artifacts that afflict SGM. In this
paper, we propose a novel, resource-efficient method that is inspired by MGM's
techniques for improving depth quality, but which can be implemented to run in
real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI
and Middlebury), we show that in comparison to other real-time capable stereo
approaches, we can achieve a state-of-the-art balance between accuracy, power
efficiency and speed, making our approach highly desirable for use in real-time
systems with limited power.
Comment: Accepted at FPT 2018 as an oral presentation; 8 pages, 6 figures, 4
tables
InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure
Volumetric models have become a popular representation for 3D scenes in
recent years. One breakthrough leading to their popularity was KinectFusion,
which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM
has since also been tackled with very similar approaches. Representing the
reconstruction volumetrically as a TSDF leads to most of the simplicity and
efficiency that can be achieved with GPU implementations of these systems.
However, this representation is memory-intensive and limits applicability to
small-scale reconstructions. Several avenues have been explored to overcome
this. With the aim of summarizing them and providing for a fast, flexible 3D
reconstruction pipeline, we propose a new, unifying framework called InfiniTAM.
The idea is that steps like camera tracking, scene representation and
integration of new data can easily be replaced and adapted to the user's needs.
This report describes the technical implementation details of InfiniTAM v3,
the third version of our InfiniTAM system. We have added various new features,
as well as making numerous enhancements to the low-level code that
significantly improve our camera tracking performance. The new features that we
expect to be of most interest are (i) a robust camera tracking module; (ii) an
implementation of Glocker et al.'s keyframe-based random ferns camera
relocaliser; (iii) a novel approach to globally-consistent TSDF-based
reconstruction, based on dividing the scene into rigid submaps and optimising
the relative poses between them; and (iv) an implementation of Keller et al.'s
surfel-based reconstruction approach.
Comment: This article largely supersedes arxiv:1410.0925 (it describes version
3 of the InfiniTAM framework
Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade
Camera pose estimation is an important problem in computer vision. Common
techniques either match the current image against keyframes with known poses,
directly regress the pose, or establish correspondences between keypoints in
the image and points in the scene to estimate the pose. In recent years,
regression forests have become a popular alternative to establish such
correspondences. They achieve accurate results, but have traditionally needed
to be trained offline on the target scene, preventing relocalisation in new
environments. Recently, we showed how to circumvent this limitation by adapting
a pre-trained forest to a new scene on the fly. The adapted forests achieved
relocalisation performance that was on par with that of offline forests, and
our approach was able to estimate the camera pose in close to real time. In
this paper, we present an extension of this work that achieves significantly
better relocalisation performance whilst running fully in real time. To achieve
this, we make several changes to the original approach: (i) instead of
accepting the camera pose hypothesis without question, we make it possible to
score the final few hypotheses using a geometric approach and select the most
promising; (ii) we chain several instantiations of our relocaliser together in
a cascade, allowing us to try faster but less accurate relocalisation first,
only falling back to slower, more accurate relocalisation as necessary; and
(iii) we tune the parameters of our cascade to achieve effective overall
performance. These changes allow us to significantly improve upon the
performance our original state-of-the-art method was able to achieve on the
well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional
contributions, we present a way of visualising the internal behaviour of our
forests and show how to entirely circumvent the need to pre-train a forest on a
generic scene.
Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin
assert joint first authorship
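The cascade idea in point (ii) of the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `relocalise_cascade` function, the `(pose, score)` return convention and the per-stage score thresholds are all hypothetical names and assumptions introduced here, and the geometric scoring from point (i) is stood in for by whatever score each stage returns.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types: a pose is just a tuple of floats here, and each
# relocaliser maps a frame to a (pose, score) pair, higher score = better.
Pose = Tuple[float, ...]
Relocaliser = Callable[[object], Tuple[Pose, float]]


def relocalise_cascade(
    frame: object,
    stages: Sequence[Relocaliser],
    thresholds: Sequence[float],
) -> Optional[Tuple[Pose, float]]:
    """Try each stage in order (fastest first) and stop at the first result
    whose score reaches that stage's threshold. The final stage's result is
    always returned, since there is nothing slower to fall back to.
    The thresholding rule is an assumption made for illustration."""
    if len(stages) != len(thresholds):
        raise ValueError("need exactly one threshold per stage")

    result: Optional[Tuple[Pose, float]] = None
    last_index = len(stages) - 1
    for index, (relocalise, threshold) in enumerate(zip(stages, thresholds)):
        pose, score = relocalise(frame)
        result = (pose, score)
        if score >= threshold or index == last_index:
            return result
    return result  # only reached when `stages` is empty (returns None)
```

With stages ordered from fastest to slowest, a frame that the fast stage handles well never reaches the slow stage, which is where the speed gain described in the abstract would come from. How the thresholds are chosen corresponds to the parameter tuning mentioned in point (iii) and is not modelled here.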
Calibrating Deep Neural Networks using Focal Loss
Miscalibration -- a mismatch between a model's confidence and its correctness
-- of Deep Neural Networks (DNNs) makes their predictions hard to rely on.
Ideally, we want networks to be accurate, calibrated and confident. We show
that, as opposed to the standard cross-entropy loss, focal loss (Lin et al.,
2017) allows us to learn models that are already very well calibrated. When
combined with temperature scaling, whilst preserving accuracy, it yields
state-of-the-art calibrated models. We provide a thorough analysis of the
factors causing miscalibration, and use the insights we glean from this to
justify the empirically excellent performance of focal loss. To facilitate the
use of focal loss in practice, we also provide a principled approach to
automatically select the hyperparameter involved in the loss function. We
perform extensive experiments on a variety of computer vision and NLP datasets,
and with a wide variety of network architectures, and show that our approach
achieves state-of-the-art accuracy and calibration in almost all cases.
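For readers unfamiliar with the two ingredients named in the abstract above, here is a minimal, dependency-free sketch of the focal loss of Lin et al. (2017) for a single example, FL = -(1 - p_t)^gamma * log(p_t), alongside temperature-scaled softmax. The function names, the default gamma = 3.0 and the example logits are assumptions chosen for illustration; the paper's procedure for selecting the hyperparameter automatically is not reproduced.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature (T > 1 softens the
    distribution; T = 1 is the ordinary softmax)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits: Sequence[float], target: int, gamma: float = 3.0) -> float:
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t), where p_t is
    the predicted probability of the true class. gamma = 3.0 is an
    illustrative default, not a value taken from the abstract."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Standard cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


if __name__ == "__main__":
    example_logits = [2.0, 0.5, -1.0]  # made-up logits for a 3-class problem
    print("cross-entropy:", cross_entropy(example_logits, 0))
    print("focal loss   :", focal_loss(example_logits, 0))
    print("softmax, T=2 :", softmax(example_logits, temperature=2.0))
```

Because the factor (1 - p_t)^gamma shrinks towards zero as p_t approaches 1, examples the model already classifies confidently contribute less to the loss than they would under cross-entropy. Temperature scaling is applied after training: a single scalar T rescales the logits, which leaves the predicted class, and hence accuracy, unchanged.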
Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes
Long-term camera re-localization is an important task with numerous computer
vision and robotics applications. Whilst various outdoor benchmarks exist that
target lighting, weather and seasonal changes, far less attention has been paid
to appearance changes that occur indoors. This has led to a mismatch between
popular indoor benchmarks, which focus on static scenes, and indoor
environments that are of interest for many real-world applications. In this
paper, we adapt 3RScan - a recently introduced indoor RGB-D dataset designed
for object instance re-localization - to create RIO10, a new long-term camera
re-localization benchmark focused on indoor scenes. We propose new metrics for
evaluating camera re-localization and explore how state-of-the-art camera
re-localizers perform according to these metrics. We also examine in detail how
different types of scene change affect the performance of different methods,
based on novel ways of detecting such changes in a given RGB-D frame. Our
results clearly show that long-term indoor re-localization is an unsolved
problem. Our benchmark and tools are publicly available at
waldjohannau.github.io/RIO10
Comment: ECCV 2020, project website https://waldjohannau.github.io/RIO1
Acute phase response in two consecutive experimentally induced E. coli intramammary infections in dairy cows
Background: The acute phase proteins haptoglobin (Hp), serum amyloid A (SAA) and lipopolysaccharide binding protein (LBP) have been suggested to be suitable inflammatory markers for bovine mastitis. The aim of the study was to investigate acute phase markers along with clinical parameters in two consecutive intramammary challenges with Escherichia coli and to evaluate the possible carry-over effect when the same animals are used in an experimental model.
Methods: Mastitis was induced with a dose of 1500 cfu of E. coli in one quarter of six cows, and the inoculation was repeated in another quarter after an interval of 14 days. Concentrations of Hp, SAA and LBP were determined in serum and milk.
Results: In both challenges all cows became infected and developed clinical mastitis within 12 hours of inoculation. Clinical disease and the acute phase response were generally milder in the second challenge. Concentrations of SAA in milk started to increase 12 hours after inoculation and peaked at 60 hours after the first challenge and at 44 hours after the second challenge. Concentrations of SAA in serum increased more slowly and peaked at the same times as in milk; concentrations in serum were about one third of those in milk. Hp started to increase in milk similarly and peaked at 36–44 hours. In serum, the concentration of Hp peaked at 60–68 hours and was twice as high as in milk. LBP concentrations in milk and serum started to increase after 12 hours and peaked at 36 hours, being higher in milk. The concentrations of acute phase proteins in serum and milk in the E. coli infection model were much higher than those recorded in experiments using Gram-positive pathogens, indicating the severe inflammation induced by E. coli.
Conclusion: Acute phase proteins would be useful as indicators of mastitis and for assessing its severity. If repeated experimental intramammary induction of the same animals with E. coli is used in cross-over studies, the interval between challenges should be longer than 2 weeks, due to the carry-over effect from the first infection.