5,795 research outputs found
SBNet: Sparse Blocks Network for Fast Inference
Conventional deep convolutional neural networks (CNNs) apply convolution
operators uniformly in space across all feature maps for hundreds of layers -
this incurs a high computational cost for real-time applications. For many
problems such as object detection and semantic segmentation, we are able to
obtain a low-cost computation mask, either from a priori problem knowledge, or
from a low-resolution segmentation network. We show that such computation masks
can be used to reduce computation in the high-resolution main network. Variants
of sparse activation CNNs have previously been explored on small-scale tasks
and showed no degradation in terms of object classification accuracy, but often
measured gains in terms of theoretical FLOPs without realizing a practical
speed-up when compared to highly optimized dense convolution implementations.
In this work, we leverage the sparsity structure of computation masks and
propose a novel tiling-based sparse convolution algorithm. We verified the
effectiveness of our sparse CNN on LiDAR-based 3D object detection, and we
report significant wall-clock speed-ups compared to dense convolution without
noticeable loss of accuracy.Comment: 10 pages, CVPR 201
MISR stereoscopic image matchers: techniques and results
The Multi-angle Imaging SpectroRadiometer (MISR) instrument, launched in December 1999 on the NASA EOS Terra satellite, produces images in the red band at 275-m resolution, over a swath width of 360 km, for the nine camera angles 70.5/spl deg/, 60/spl deg/, 45.6/spl deg/, and 26.1/spl deg/ forward, nadir, and 26.1/spl deg/, 45.6/spl deg/, 60/spl deg/, and 70.5/spl deg/ aft. A set of accurate and fast algorithms was developed for automated stereo matching of cloud features to obtain cloud-top height and motion over the nominal six-year lifetime of the mission. Accuracy and speed requirements necessitated the use of a combination of area-based and feature-based stereo-matchers with only pixel-level acuity. Feature-based techniques are used for cloud motion retrieval with the off-nadir MISR camera views, and the motion is then used to provide a correction to the disparities used to measure cloud-top heights which are derived from the innermost three cameras. Intercomparison with a previously developed "superstereo" matcher shows that the results are very comparable in accuracy with much greater coverage and at ten times the speed. Intercomparison of feature-based and area-based techniques shows that the feature-based techniques are comparable in accuracy at a factor of eight times the speed. An assessment of the accuracy of the area-based matcher for cloud-free scenes demonstrates the accuracy and completeness of the stereo-matcher. This trade-off has resulted in the loss of a reliable quality metric to predict accuracy and a slightly high blunder rate. Examples are shown of the application of the MISR stereo-matchers on several difficult scenes which demonstrate the efficacy of the matching approach
Vector quantization
During the past ten years Vector Quantization (VQ) has developed from a theoretical possibility promised by Shannon's source coding theorems into a powerful and competitive technique for speech and image coding and compression at medium to low bit rates. In this survey, the basic ideas behind the design of vector quantizers are sketched and some comments made on the state-of-the-art and current research efforts
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Bi-temporal 3D active appearance models with applications to unsupervised ejection fraction estimation
Rapid and unsupervised quantitative analysis is of utmost importance to ensure clinical acceptance of many examinations using cardiac magnetic resonance imaging (MRI). We present a framework that aims at fulfilling these goals for the application of left ventricular ejection fraction estimation in four-dimensional MRI. The theoretical foundation of our work is the generative two-dimensional Active Appearance Models by Cootes et al., here extended to bi-temporal, three-dimensional models. Further issues treated include correction of respiratory induced slice displacements, systole detection, and a texture model pruning strategy. Cross-validation carried out on clinical-quality scans of twelve volunteers indicates that ejection fraction and cardiac blood pool volumes can be estimated automatically and rapidly with accuracy on par with typical inter-observer variability. \u
- …