
    Detection and recognition of moving video objects: Kalman filtering with deep learning

    Get PDF
    © 2021. All rights reserved. Research in object recognition has recently shown that deep Convolutional Neural Networks (CNNs) provide a breakthrough in detection scores, especially in video applications. This paper presents an approach to object recognition in videos that combines a Kalman filter with a CNN. The Kalman filter is first applied for detection, removing the background and cropping the object. Kalman filtering serves three important functions: predicting the future location of the object, reducing noise and interference from incorrect detections, and associating multiple objects with tracks. After the moving object is detected and cropped, a CNN model predicts its category. The CNN model is trained on more than 1,000 images of humans, animals and other objects, with an architecture of ten layers. The first layer, the input image, has a size of 100 × 100. The convolutional layer contains 20 masks of size 5 × 5, followed by a normalization layer and then max-pooling. The proposed hybrid algorithm was applied to 8 different videos with a total duration of 15.4 minutes, containing 23,100 frames. In this experiment, recognition accuracy reached 100%, and the proposed system outperformed six existing algorithms.
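
    A minimal sketch of the Kalman-then-classify pipeline the abstract describes, assuming a constant-velocity motion model; the 100 × 100 crop size follows the abstract, but the tracker parameters and the classification step are illustrative placeholders, not the authors' trained system.

```python
import numpy as np

class ConstantVelocityKalman:
    """Tracks the (x, y) position of a moving object; state = [x, y, vx, vy]."""
    def __init__(self, x, y, process_noise=1e-2, measurement_noise=1.0):
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4)                         # state covariance
        self.F = np.array([[1, 0, 1, 0],           # transition: x += vx, y += vy
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],           # only position is measured
                           [0, 1, 0, 0]], dtype=float)
        self.Q = process_noise * np.eye(4)
        self.R = measurement_noise * np.eye(2)

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]                      # predicted (x, y)

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.state    # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)                # Kalman gain
        self.state = self.state + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

def crop_object(frame, center, size=100):
    """Crop a size x size patch around the predicted object centre."""
    x, y = int(center[0]), int(center[1])
    h = size // 2
    return frame[max(0, y - h):y + h, max(0, x - h):x + h]

# Usage: predict where the object will be, crop it, then hand the
# 100 x 100 patch to a CNN classifier (not shown here).
tracker = ConstantVelocityKalman(x=50, y=50)
frame = np.zeros((480, 640), dtype=np.uint8)       # dummy grayscale frame
predicted = tracker.predict()
patch = crop_object(frame, predicted)
tracker.update((52, 51))                           # correct with the new detection
```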

    Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection

    Full text link
    Efforts to automate the reconstruction of neural circuits from 3D electron microscopic (EM) brain images are critical for the field of connectomics. An important computation for reconstruction is the detection of neuronal boundaries. Images acquired by serial section EM, a leading 3D EM technique, are highly anisotropic, with inferior quality along the third dimension. For such images, the 2D max-pooling convolutional network has set the standard for performance at boundary detection. Here we achieve a substantial gain in accuracy through three innovations. First, following the trend towards deeper networks for object recognition, we use a much deeper network than previously employed for boundary detection. Second, we incorporate 3D as well as 2D filters, to enable computations that use 3D context. Finally, we adopt a recursively trained architecture in which a first network generates a preliminary boundary map that is provided as input, along with the original image, to a second network that generates a final boundary map. Backpropagation training is accelerated by ZNN, a new implementation of 3D convolutional networks that uses multicore CPU parallelism for speed. Our hybrid 2D-3D architecture could be more generally applicable to other types of anisotropic 3D images, including video, and our recursive framework to any image labeling problem.
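
    A compact sketch of the recursive 2D-3D idea, assuming PyTorch: 2D filters are expressed as Conv3d with a 1 × 3 × 3 kernel (no mixing along the anisotropic axis) and 3D filters as 3 × 3 × 3. The channel counts and depth are illustrative placeholders, not the paper's ZNN-trained architecture.

```python
import torch
import torch.nn as nn

class BoundaryNet(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=(1, 3, 3), padding=(0, 1, 1)),  # 2D context
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 16, kernel_size=(3, 3, 3), padding=1),                   # 3D context
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 1, kernel_size=1),                                       # boundary map
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.features(x)

class RecursiveBoundaryNet(nn.Module):
    """The first net produces a preliminary boundary map; the second net sees
    the original image stacked with that map and refines it."""
    def __init__(self):
        super().__init__()
        self.stage1 = BoundaryNet(in_channels=1)
        self.stage2 = BoundaryNet(in_channels=2)

    def forward(self, image):
        preliminary = self.stage1(image)
        refined = self.stage2(torch.cat([image, preliminary], dim=1))
        return preliminary, refined

# Usage on a dummy anisotropic EM volume: (batch, channel, z, y, x)
volume = torch.randn(1, 1, 8, 64, 64)
prelim, final = RecursiveBoundaryNet()(volume)
```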

    A Novel Hybrid CNN-AIS Visual Pattern Recognition Engine

    Full text link
    Machine learning methods are used today for most recognition problems. Convolutional Neural Networks (CNNs) have repeatedly proved successful for many image processing tasks, primarily because of their architecture. In this paper we propose applying CNNs to small datasets, for example personal albums or other similar settings where the size of the training dataset is a limitation, within the framework of a proposed hybrid CNN-AIS model. We use Artificial Immune System principles to enlarge the small training dataset. A Clonal Selection layer is added to the local filtering and max-pooling of the CNN architecture. The proposed architecture is evaluated on the standard MNIST dataset with restricted data size, and also on a small personal data sample belonging to two different classes. Experimental results show that the proposed hybrid CNN-AIS recognition engine works well when the size of the training data is limited.
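
    A rough sketch of clonal-selection-style augmentation for a small training set, in the spirit of the CNN-AIS idea: samples are cloned in proportion to an affinity score and each clone is lightly mutated. The affinity input and mutation schedule below are illustrative assumptions, not the paper's layer.

```python
import numpy as np

def clonal_expand(images, labels, affinity, max_clones=5, mutation_scale=0.05,
                  rng=None):
    """Return an expanded (images, labels) pair for a small dataset.

    images  : (N, H, W) float array in [0, 1]
    labels  : (N,) class labels
    affinity: (N,) scores in [0, 1]; higher-affinity samples get more clones
    """
    if rng is None:
        rng = np.random.default_rng(0)
    new_images, new_labels = [], []
    for img, lbl, aff in zip(images, labels, affinity):
        n_clones = 1 + int(round(aff * (max_clones - 1)))   # at least one copy
        for _ in range(n_clones):
            # Hypermutation: small Gaussian perturbation, stronger for
            # low-affinity clones (the inverse relation used in clonal selection).
            noise = rng.normal(0.0, mutation_scale * (1.0 - aff + 0.1), img.shape)
            new_images.append(np.clip(img + noise, 0.0, 1.0))
            new_labels.append(lbl)
    return np.stack(new_images), np.array(new_labels)

# Usage on a toy, album-sized dataset of 10 samples:
imgs = np.random.rand(10, 28, 28)
lbls = np.random.randint(0, 2, size=10)
aff = np.random.rand(10)
aug_imgs, aug_lbls = clonal_expand(imgs, lbls, aff)
```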

    Efficient Spiking and Artificial Neural Networks for Event Cameras

    Get PDF
    Event cameras have become attractive alternatives to regular frame-based cameras in many scenarios, from consumer electronics and surveillance to autonomous driving. Their novel sensor paradigm of asynchronously detecting brightness changes in a scene makes them faster, more energy-efficient and less susceptible to global illumination. Processing these event streams calls for algorithms that are as efficient as the camera itself, while being competitive with frame-based computer vision on tasks like object recognition and detection. This thesis contributes methods to obtain efficient neural networks for classification and object detection in event streams. We adopt ANN-to-SNN (artificial neural network to spiking neural network) conversion to handle sequential data like videos or event streams, improving the state of the art in accuracy and energy efficiency. We propose a novel network architecture, called hybrid SNN-ANN, to train a mixed SNN and ANN network using surrogate gradients. These hybrid networks are more efficient, even compared to trained and converted SNNs. To detect objects with only a small number of events, we propose a filter and a memory, both improving results during inference. Our networks advance the state of the art in event stream processing and contribute to the success of event cameras. Given suitable neuromorphic hardware, our spiking neural networks enable event cameras to be used in scenarios with a limited energy budget. Our proposed hybrid architecture can guide the design of novel hybrid neuromorphic devices that combine efficient sparse and dense processing.
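
    A compact sketch of the surrogate-gradient idea behind training spiking layers, assuming PyTorch: the spike is a hard threshold in the forward pass, with a smooth surrogate derivative in the backward pass, and a conventional ANN head reads out the spiking front-end. The layer sizes and the fast-sigmoid surrogate are illustrative choices, not the thesis architecture.

```python
import torch
import torch.nn as nn

class SpikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, membrane):
        ctx.save_for_backward(membrane)
        return (membrane > 0).float()                 # binary spike

    @staticmethod
    def backward(ctx, grad_output):
        (membrane,) = ctx.saved_tensors
        # Fast-sigmoid surrogate derivative around the threshold.
        surrogate = 1.0 / (1.0 + 10.0 * membrane.abs()) ** 2
        return grad_output * surrogate

class LIFLayer(nn.Module):
    """Leaky integrate-and-fire layer applied over a sequence of event frames."""
    def __init__(self, in_features, out_features, decay=0.9):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.decay = decay

    def forward(self, x_seq):                         # x_seq: (time, batch, features)
        mem = torch.zeros(x_seq.shape[1], self.fc.out_features)
        spikes = []
        for x_t in x_seq:
            mem = self.decay * mem + self.fc(x_t)     # leaky integration
            s = SpikeFn.apply(mem - 1.0)              # threshold at 1.0
            mem = mem * (1.0 - s)                     # reset where a spike fired
            spikes.append(s)
        return torch.stack(spikes)

# Hybrid idea: a spiking front-end feeding a conventional ANN read-out.
snn = LIFLayer(64, 32)
ann_head = nn.Linear(32, 10)
events = torch.rand(20, 4, 64)                        # (time, batch, features)
logits = ann_head(snn(events).mean(dim=0))            # rate-coded summary
```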

    Zero-Shot Object Detection by Hybrid Region Embedding

    Full text link
    Object detection is considered one of the most challenging problems in computer vision, since it requires correct prediction of both the classes and the locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD), where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, in which a convex combination of embeddings is used in conjunction with a detection framework. For the evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images, as well as a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.
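
    A small sketch of the convex-combination idea for zero-shot recognition, assuming NumPy: region scores over seen classes weight their semantic embeddings, and the resulting region embedding is matched to unseen class embeddings by cosine similarity. The embedding values here are random placeholders, not learned vectors, and the detection framework itself is omitted.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_unseen(region_scores, seen_embed, unseen_embed):
    """region_scores: (S,) detector scores over S seen classes
       seen_embed   : (S, D) semantic embeddings of seen classes
       unseen_embed : (U, D) semantic embeddings of unseen classes"""
    weights = softmax(region_scores)                  # convex combination weights
    region_embedding = weights @ seen_embed           # (D,) region embedding
    # Cosine similarity against unseen class embeddings.
    sims = (unseen_embed @ region_embedding) / (
        np.linalg.norm(unseen_embed, axis=1) * np.linalg.norm(region_embedding) + 1e-8)
    return int(np.argmax(sims)), sims

# Usage with toy data: 5 seen classes, 3 unseen classes, 50-dim embeddings.
rng = np.random.default_rng(0)
scores = rng.normal(size=5)
seen = rng.normal(size=(5, 50))
unseen = rng.normal(size=(3, 50))
best_unseen, similarities = predict_unseen(scores, seen, unseen)
```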