1,876 research outputs found

    SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection

    Vision-based vehicle detection approaches have achieved incredible success in recent years with the development of deep convolutional neural networks (CNNs). However, existing CNN-based algorithms suffer from the problem that convolutional features are scale-sensitive in the object detection task, while traffic images and videos commonly contain vehicles with a large variance of scales. In this paper, we delve into the source of scale sensitivity and reveal two key issues: 1) existing RoI pooling destroys the structure of small-scale objects, and 2) the large intra-class distance for a large variance of scales exceeds the representation capability of a single network. Based on these findings, we present a scale-insensitive convolutional neural network (SINet) for fast detection of vehicles with a large variance of scales. First, we present a context-aware RoI pooling to maintain the contextual information and original structure of small-scale objects. Second, we present a multi-branch decision network to minimize the intra-class distance of features. These lightweight techniques bring zero extra time complexity but a prominent improvement in detection accuracy. The proposed techniques can be equipped with any deep network architecture and keep it trained end-to-end. Our SINet achieves state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on the KITTI benchmark and a new highway dataset, which contains a large variance of scales and extremely small objects. Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems (T-ITS).

    An Empirical Evaluation of Deep Learning on Highway Driving

    Numerous groups have applied a variety of deep learning techniques to computer vision problems in highway perception scenarios. In this paper, we present a number of empirical evaluations of recent deep learning advances. Computer vision, combined with deep learning, has the potential to bring about a relatively inexpensive, robust solution to autonomous driving. To prepare deep learning for industry uptake and practical applications, neural networks will require large data sets that represent all possible driving environments and scenarios. We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection. We show how existing convolutional neural networks (CNNs) can be used to perform lane and vehicle detection while running at the frame rates required for a real-time system. Our results lend credence to the hypothesis that deep learning holds promise for autonomous driving. Comment: Added a video for lane detection.

    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, and text detection, and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication.

    Synthetic data approach for traffic sign recognition

    Master's dissertation in Computer Science. Currently, Advanced Driver Assistance Systems (ADAS) have been gradually increasing their presence in everyday life, thanks in part to their ability to recognize several distinct types of objects on the road, namely traffic signs. These systems employ Convolutional Neural Networks (CNNs), a type of classification algorithm that relies on an enormous amount of data in order to be effective. Current traffic sign datasets suffer from a scarcity of samples due to the necessity of compiling and labeling them manually. Such a task is highly resource- and time-consuming. Thus, researchers resort to other mechanisms to deal with this problem, such as increasing the architectural complexity of the neural networks or performing data augmentation. This work addresses the data shortage issue by exploring the feasibility of developing a synthetic dataset. Such a set would not require manually gathering and labeling thousands of real-world traffic sign images, requiring only easily collectable information and no human intervention. The only data required is a set of templates for each sign, given that a particular sign may have more than one template. This is required to cope with outdated pictograms that are still present in streets and roads. We apply several colour and geometric processing methods to the templates, aiming to achieve a look similar to real signs from the CNN's point of view. One such method is the use of Perlin noise both to simulate shadows and to avoid the clean and homogeneous look that templates have. Two use cases for synthetic data are presented: considering the synthetic dataset as a standalone training set, and merging synthetic data with real samples when real data is available. The first option provided results that not only clearly surpass any previous attempt at using synthetic data for traffic sign recognition, but also place the accuracies obtained encouragingly close to state-of-the-art results, with much simpler networks. The second approach provided results on three distinct test datasets that consistently beat state-of-the-art results, either in accuracy or in simplicity of the network.