SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Vision-based vehicle detection approaches have achieved incredible success in
recent years with the development of deep convolutional neural networks (CNNs).
However, existing CNN-based algorithms suffer from the problem that
convolutional features are scale-sensitive in the object detection task, yet
traffic images and videos commonly contain vehicles with a large variance of
scales. In this paper, we delve into the source of scale sensitivity, and
reveal two key issues: 1) existing RoI pooling destroys the structure of small
scale objects, 2) the large intra-class distance for a large variance of scales
exceeds the representation capability of a single network. Based on these
findings, we present a scale-insensitive convolutional neural network (SINet)
for fast detection of vehicles with a large variance of scales. First, we present
a context-aware RoI pooling to maintain the contextual information and original
structure of small scale objects. Second, we present a multi-branch decision
network to minimize the intra-class distance of features. These lightweight
techniques add zero extra time complexity yet yield a prominent improvement in
detection accuracy. The proposed techniques can be plugged into any deep network
architecture while keeping it trainable end-to-end. Our SINet achieves
state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on
the KITTI benchmark and a new highway dataset, which contains a large variance
of scales and extremely small objects.
Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems
(T-ITS)
An Empirical Evaluation of Deep Learning on Highway Driving
Numerous groups have applied a variety of deep learning techniques to
computer vision problems in highway perception scenarios. In this paper, we
present a number of empirical evaluations of recent deep learning advances.
Computer vision, combined with deep learning, has the potential to bring about
a relatively inexpensive, robust solution to autonomous driving. To prepare
deep learning for industry uptake and practical applications, neural networks
will require large data sets that represent all possible driving environments
and scenarios. We collect a large data set of highway data and apply deep
learning and computer vision algorithms to problems such as car and lane
detection. We show how existing convolutional neural networks (CNNs) can be
used to perform lane and vehicle detection while running at frame rates
required for a real-time system. Our results lend credence to the hypothesis
that deep learning holds promise for autonomous driving.
Comment: Added a video for lane detection
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
Synthetic data approach for traffic sign recognition
Master's dissertation in Computer Science
Currently, Advanced Driver Assistance Systems (ADAS) have been gradually increasing their
presence in everyday life, thanks in part to their ability to recognize several distinct types
of objects in the road, namely, traffic signs. These systems employ Convolutional Neural
Networks (CNNs), a type of classification algorithm that relies on an enormous amount of
data in order to be effective. Current traffic sign datasets suffer from a scarcity of samples
due to the necessity of compiling and labeling them manually. Such a task is highly resource-
and time-consuming. Thus, researchers resort to other mechanisms to deal with this problem,
such as increasing the architectural complexity of the neural networks or performing data
augmentation.
This work addresses the data shortage issue by exploring the feasibility of developing a
synthetic dataset. Such a set would not require manually gathering and labelling thousands
of real-world traffic sign images, requiring only easily collectable information and no human
intervention.
The only data required is a set of templates for each sign, given that a particular sign may
have more than one template. This is required to cope with outdated pictograms that are
still present in streets and roads.
We apply several colour and geometric processing methods to the templates aiming to
achieve a look similar to real signs, from the CNN point of view. One such method is
the use of Perlin noise to both simulate shadows and avoid the clean and homogeneous
look that templates have.
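
The Perlin-noise step described above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `perlin_noise` and `apply_shadow`, the parameters `cells`, `strength` and `seed`, and the choice to darken the template multiplicatively are all assumptions made for illustration, and a template is assumed to be an H x W x C `uint8` array.

```python
import numpy as np

def perlin_noise(height, width, cells=4, seed=0):
    """2D gradient (Perlin-style) noise on a cells x cells lattice."""
    rng = np.random.default_rng(seed)
    # One random unit gradient vector per lattice corner.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(cells + 1, cells + 1))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)

    # Pixel coordinates expressed in lattice units, in [0, cells).
    ys = np.linspace(0.0, cells, height, endpoint=False)
    xs = np.linspace(0.0, cells, width, endpoint=False)
    y, x = np.meshgrid(ys, xs, indexing="ij")
    y0, x0 = y.astype(int), x.astype(int)
    fy, fx = y - y0, x - x0

    def corner(dy, dx):
        # Dot product of the corner gradient with the offset to the pixel.
        g = grads[y0 + dy, x0 + dx]
        return g[..., 0] * (fx - dx) + g[..., 1] * (fy - dy)

    def fade(t):
        # Perlin's smoothstep polynomial 6t^5 - 15t^4 + 10t^3.
        return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)

    u, v = fade(fx), fade(fy)
    top = corner(0, 0) * (1.0 - u) + corner(0, 1) * u
    bottom = corner(1, 0) * (1.0 - u) + corner(1, 1) * u
    return top * (1.0 - v) + bottom * v

def apply_shadow(template, strength=0.5, cells=4, seed=0):
    """Darken an H x W x C uint8 template with a smooth noise mask (assumed scheme)."""
    h, w = template.shape[:2]
    noise = perlin_noise(h, w, cells=cells, seed=seed)
    # Rescale the noise to [0, 1] so it can act as a shadow mask.
    span = max(float(noise.max() - noise.min()), 1e-12)
    mask = (noise - noise.min()) / span
    gain = 1.0 - strength * mask          # per-pixel brightness in [1 - strength, 1]
    shaded = template.astype(np.float64) * gain[..., None]
    return np.clip(shaded, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    flat_template = np.full((64, 64, 3), 200, dtype=np.uint8)  # stand-in for a sign template
    shaded = apply_shadow(flat_template, strength=0.5, cells=4, seed=1)
    print(shaded.min(), shaded.max())
```

A smaller `cells` value gives broad, soft shadows, while a larger one gives finer mottling; varying `seed` per sample would give each synthetic image a different pattern.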
Two use cases for synthetic data usage are presented: considering the synthetic dataset
as a standalone training set, and merging synthetic data with real samples when real data
is available. The first option provided results that not only clearly surpass any previous
attempt at using synthetic data for traffic sign recognition, but also, encouragingly,
place the accuracies obtained close to state-of-the-art results, with much simpler networks.
The second approach provided results on three distinct test datasets that consistently beat
state-of-the-art results, either in accuracy or in simplicity of the network.