SBNet: Sparse Blocks Network for Fast Inference
Conventional deep convolutional neural networks (CNNs) apply convolution
operators uniformly in space across all feature maps for hundreds of layers -
this incurs a high computational cost for real-time applications. For many
problems such as object detection and semantic segmentation, we are able to
obtain a low-cost computation mask, either from a priori problem knowledge, or
from a low-resolution segmentation network. We show that such computation masks
can be used to reduce computation in the high-resolution main network. Variants
of sparse activation CNNs have previously been explored on small-scale tasks
and showed no degradation in terms of object classification accuracy, but often
measured gains in terms of theoretical FLOPs without realizing a practical
speed-up when compared to highly optimized dense convolution implementations.
In this work, we leverage the sparsity structure of computation masks and
propose a novel tiling-based sparse convolution algorithm. We verified the
effectiveness of our sparse CNN on LiDAR-based 3D object detection, and we
report significant wall-clock speed-ups compared to dense convolution without
noticeable loss of accuracy. Comment: 10 pages, CVPR 201
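The abstract above does not spell out the tiling-based algorithm, so the following is only a minimal sketch of one plausible reading: reduce the computation mask to a list of active blocks, gather each active block together with a one-pixel halo, run an ordinary dense 3x3 convolution on the gathered tile, and scatter the result back. The function names, the block size parameter `bs`, and the choice to leave inactive blocks at zero are assumptions made for illustration; they are not taken from the paper.

```python
def dense_conv3x3(x, k):
    """Zero-padded 3x3 cross-correlation over a 2D list of lists."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + 1][dj + 1]
            out[i][j] = s
    return out


def active_blocks(mask, bs):
    """Return top-left corners of bs x bs blocks that contain any nonzero mask pixel."""
    h, w = len(mask), len(mask[0])
    blocks = []
    for bi in range(0, h, bs):
        for bj in range(0, w, bs):
            if any(mask[i][j]
                   for i in range(bi, min(bi + bs, h))
                   for j in range(bj, min(bj + bs, w))):
                blocks.append((bi, bj))
    return blocks


def sparse_block_conv3x3(x, mask, k, bs):
    """Hypothetical gather / convolve / scatter over active blocks only."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]  # assumption: inactive blocks stay zero
    for bi, bj in active_blocks(mask, bs):
        bh, bw = min(bs, h - bi), min(bs, w - bj)
        # Gather: the block plus a one-pixel halo, zero-filled outside the image.
        tile = [[x[i][j] if 0 <= i < h and 0 <= j < w else 0.0
                 for j in range(bj - 1, bj + bw + 1)]
                for i in range(bi - 1, bi + bh + 1)]
        conv = dense_conv3x3(tile, k)
        # Scatter: copy back only the tile interior, where the halo made the result exact.
        for i in range(bh):
            for j in range(bw):
                out[bi + i][bj + j] = conv[i + 1][j + 1]
    return out
```

Within active blocks this sketch reproduces the dense result exactly, because the halo supplies the neighbours a 3x3 kernel needs; the work skipped is proportional to the number of inactive blocks.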
PraNet: Parallel Reverse Attention Network for Polyp Segmentation
Colonoscopy is an effective technique for detecting colorectal polyps, which
are highly related to colorectal cancer. In clinical practice, segmenting
polyps from colonoscopy images is of great importance since it provides
valuable information for diagnosis and surgery. However, accurate polyp
segmentation is a challenging task, for two major reasons: (i) the same type of
polyps has a diversity of size, color and texture; and (ii) the boundary
between a polyp and its surrounding mucosa is not sharp. To address these
challenges, we propose a parallel reverse attention network (PraNet) for
accurate polyp segmentation in colonoscopy images. Specifically, we first
aggregate the features in high-level layers using a parallel partial decoder
(PPD). Based on the combined feature, we then generate a global map as the
initial guidance area for the following components. In addition, we mine the
boundary cues using a reverse attention (RA) module, which is able to establish
the relationship between areas and boundary cues. Thanks to the recurrent
cooperation mechanism between areas and boundaries, our PraNet is capable of
calibrating any misaligned predictions, improving the segmentation accuracy.
Quantitative and qualitative evaluations on five challenging datasets across
six metrics show that our PraNet improves the segmentation accuracy
significantly, and presents a number of advantages in terms of
generalizability and real-time segmentation efficiency. Comment: Accepted to MICCAI 202
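The abstract names a reverse attention (RA) module without giving its formula. As a rough illustration only, the sketch below assumes the commonly used form of reverse attention: the coarse prediction is passed through a sigmoid, inverted (1 minus the probability), and multiplied element-wise into every feature channel, so that regions already predicted confidently as foreground are suppressed and the remaining regions, including boundaries, are emphasised. The function name, argument layout, and the omission of upsampling are assumptions.

```python
import math


def sigmoid(z):
    """Logistic function for a single float."""
    return 1.0 / (1.0 + math.exp(-z))


def reverse_attention(features, coarse_logits):
    """Weight each feature channel by (1 - sigmoid(coarse prediction)).

    features:      list of C channels, each an H x W list of lists
    coarse_logits: H x W list of lists, assumed already resized to H x W
    """
    weights = [[1.0 - sigmoid(v) for v in row] for row in coarse_logits]
    return [[[f * w for f, w in zip(frow, wrow)]
             for frow, wrow in zip(channel, weights)]
            for channel in features]
```

With this weighting, a pixel whose coarse logit is strongly positive contributes almost nothing to the refined features, while an uncertain pixel (logit near zero) keeps about half of its activation.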
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Automatic Rural Road Centerline Extraction from Aerial Images for a Forest Fire Support System
In recent decades, Portugal has been severely affected by forest fires, which have caused
massive environmental and social damage. A well-structured and precise
mapping of rural roads is critical to helping firefighters mitigate these events. The
traditional process of extracting rural road centerlines from aerial images is extremely
time-consuming and tedious, because the mapping operator has to manually label the road
area and extract the road centerline.
A frequent challenge in extracting rural road centerlines is the high
environmental complexity and the road occlusions caused by vehicles, shadows, wild
vegetation, and trees, which produce heterogeneous segments that can be further improved. This
dissertation proposes an approach to automatically detect rural road segments and to
extract the road centerlines from aerial images.
The proposed method has two main steps. In the first step, an architecture based
on a deep learning model (DeepLabV3+) is used to extract the road feature maps and
detect the rural roads. The second step begins with an optimization
that improves road connections and removes small white objects from the image
predicted by the neural network. Finally, a morphological approach is proposed to extract
the rural road centerlines from the previously detected roads by using thinning algorithms
such as the Zhang-Suen and Guo-Hall methods.
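The post-processing described above can be sketched in plain Python as follows. This is a minimal illustration, not the dissertation's implementation: `drop_small_components` is a hypothetical helper that removes 8-connected foreground components below an assumed `min_size`, and `zhang_suen` is a direct transcription of the standard two-subiteration Zhang-Suen thinning rules applied to a binary mask stored as a list of lists of 0/1 values. The connection-improvement step and the Guo-Hall variant are not covered.

```python
from collections import deque


def drop_small_components(img, min_size):
    """Clear 8-connected foreground components with fewer than min_size pixels."""
    h, w = len(img), len(img[0])
    out = [list(row) for row in img]
    seen = [[False] * w for _ in range(h)]
    for si in range(h):
        for sj in range(w):
            if not out[si][sj] or seen[si][sj]:
                continue
            comp, queue = [], deque([(si, sj)])
            seen[si][sj] = True
            while queue:  # breadth-first flood fill of one component
                i, j = queue.popleft()
                comp.append((i, j))
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < h and 0 <= nj < w
                                and out[ni][nj] and not seen[ni][nj]):
                            seen[ni][nj] = True
                            queue.append((ni, nj))
            if len(comp) < min_size:
                for i, j in comp:
                    out[i][j] = 0
    return out


def zhang_suen(img):
    """Thin a binary image with the two-subiteration Zhang-Suen rules."""
    h, w = len(img), len(img[0])
    # Pad with a zero border so neighbour lookups never leave the grid.
    g = [[0] * (w + 2)] + [[0] + list(r) + [0] for r in img] + [[0] * (w + 2)]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for i in range(1, h + 1):
                for j in range(1, w + 1):
                    if not g[i][j]:
                        continue
                    # Neighbours P2..P9, clockwise starting from north.
                    p = [g[i-1][j], g[i-1][j+1], g[i][j+1], g[i+1][j+1],
                         g[i+1][j], g[i+1][j-1], g[i][j-1], g[i-1][j-1]]
                    b = sum(p)  # number of foreground neighbours
                    # Number of 0 -> 1 transitions around the neighbourhood.
                    a = sum(1 for n in range(8)
                            if p[n] == 0 and p[(n + 1) % 8] == 1)
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((i, j))
            # Delete simultaneously, after the whole pass has been evaluated.
            for i, j in to_clear:
                g[i][j] = 0
            if to_clear:
                changed = True
    return [row[1:-1] for row in g[1:-1]]
```

A typical use would be `zhang_suen(drop_small_components(mask, min_size))`, where `mask` is the thresholded network output; production pipelines would more likely rely on an image-processing library than on nested Python loops.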
With these two stages automated, it is now possible to detect and extract road
centerlines from complex rural environments automatically and faster than with traditional
methods, and possibly to integrate that data into a Geographical Information System (GIS),
allowing the creation of real-time mapping applications.
In recent decades, Portugal has been severely affected by forest fires, which have
caused great environmental and social damage. Having a well-structured and precise
rural road mapping system is essential to help firefighters mitigate
this kind of event. The traditional processes for extracting rural road centerlines
from aerial images are extremely time-consuming and tedious. A frequent
challenge in extracting rural road centerlines is the high complexity of rural
environments and the obstruction of roads by vehicles, shadows, wild vegetation, and trees,
which yields heterogeneous segments that can be improved.
This dissertation proposes an approach to automatically detect rural roads
and to extract road centerlines from aerial images.
The proposed method focuses on two main steps: in the first step, an architecture
based on deep learning models (DeepLabV3+) is used
to detect the rural roads. In the second step, an intersection optimization is first
proposed, improving the connections between road centerlines, together with the
removal of small artifacts that introduce noise into the images predicted by the
neural network. Finally, a morphological approach is used to extract the centerlines
of the previously detected roads, using skeletonization algorithms such
as the Zhang-Suen and Guo-Hall algorithms.
By automating these steps, it is then possible to extract road centerlines from highly
complex rural environments automatically and faster than with
traditional methods, eventually allowing the data to be integrated into a Geographic
Information System (GIS) and enabling the creation of real-time mapping
applications