110,771 research outputs found
Detection and recognition of moving video objects: Kalman filtering with deep learning
© 2021. All rights reserved. Research in object recognition has lately found that Deep Convolutional Neural Networks (CNNs) provide a breakthrough in detection scores, especially in video applications. This paper presents an approach for object recognition in videos that combines a Kalman filter with a CNN. The Kalman filter is first applied for detection, removing the background and cropping the object. Kalman filtering serves three important functions: predicting the future location of the object, reducing noise and interference from incorrect detections, and associating multiple objects with tracks. After the moving object is detected and cropped, a CNN model predicts its category. The CNN model is built from more than 1000 images of humans, animals and other objects, with an architecture of ten layers. The first layer, the input image, has size 100 × 100. The convolutional layer contains 20 masks of size 5 × 5, followed by a ReLU layer to normalize the data, then max-pooling. The proposed hybrid algorithm was applied to 8 different videos with a total duration of 15.4 minutes, containing 23,100 frames. In this experiment, recognition accuracy reached 100%, and the proposed system outperforms six existing algorithms.
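The prediction-and-correction cycle the abstract attributes to the Kalman filter can be sketched with a standard constant-velocity model for a 2D object centroid. This is a minimal illustration, not the paper's implementation; the time step and noise covariances below are invented for the example.

```python
import numpy as np

class Kalman2D:
    """Constant-velocity Kalman filter for a 2D object centroid.

    Illustrative sketch: dt, process_var and meas_var are assumptions,
    not parameters from the paper.
    """

    def __init__(self, dt=1.0, process_var=1e-2, meas_var=1.0):
        # State vector: [x, y, vx, vy]
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)   # motion model
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)   # we observe position only
        self.Q = process_var * np.eye(4)                  # process noise
        self.R = meas_var * np.eye(2)                     # measurement noise
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        """Predict the object's next position (the 'future location' role)."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct the state with a detection; noisy detections are down-weighted."""
        y = np.asarray(z, dtype=float) - self.H @ self.x  # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = Kalman2D()
# Track an object moving +2 px/frame in x with clean detections.
for t in range(20):
    kf.predict()
    kf.update([2.0 * t, 0.0])
pred = kf.predict()  # one-step-ahead position estimate, roughly (40, 0)
```

In a multi-object setting, the same predicted positions would also drive the track association the abstract mentions, e.g. by assigning each new detection to the track whose prediction it is closest to.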
Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection
Efforts to automate the reconstruction of neural circuits from 3D electron
microscopic (EM) brain images are critical for the field of connectomics. An
important computation for reconstruction is the detection of neuronal
boundaries. Images acquired by serial section EM, a leading 3D EM technique,
are highly anisotropic, with inferior quality along the third dimension. For
such images, the 2D max-pooling convolutional network has set the standard for
performance at boundary detection. Here we achieve a substantial gain in
accuracy through three innovations. Following the trend towards deeper networks
for object recognition, we use a much deeper network than previously employed
for boundary detection. Second, we incorporate 3D as well as 2D filters, to
enable computations that use 3D context. Finally, we adopt a recursively
trained architecture in which a first network generates a preliminary boundary
map that is provided as input along with the original image to a second network
that generates a final boundary map. Backpropagation training is accelerated by
ZNN, a new implementation of 3D convolutional networks that uses multicore CPU
parallelism for speed. Our hybrid 2D-3D architecture could be more generally
applicable to other types of anisotropic 3D images, including video, and our
recursive framework to any image labeling problem.
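The 2D-versus-3D filter distinction for anisotropic stacks can be made concrete with a toy convolution: a filter of depth 1 operates within a single section, while a filter of depth 3 pools context across neighboring sections. The volume and kernel sizes below are invented for illustration and do not reflect the paper's architecture.

```python
import numpy as np

def conv3d_valid(vol, kernel):
    """Naive valid-mode 3D cross-correlation (loop-based, for clarity only)."""
    kz, ky, kx = kernel.shape
    Z, Y, X = vol.shape
    out = np.zeros((Z - kz + 1, Y - ky + 1, X - kx + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[z, y, x] = np.sum(vol[z:z+kz, y:y+ky, x:x+kx] * kernel)
    return out

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 16, 16))   # (sections, height, width): few, coarse sections

k2d = np.ones((1, 3, 3)) / 9.0           # depth-1 filter: acts within one section (2D)
k3d = np.ones((3, 3, 3)) / 27.0          # depth-3 filter: adds cross-section 3D context

feat2d = conv3d_valid(vol, k2d)          # shape (8, 14, 14): depth preserved
feat3d = conv3d_valid(feat2d, k3d)       # shape (6, 12, 12): depth shrinks with 3D context
```

The recursive scheme described above would then concatenate a preliminary boundary map with the raw image as input channels to a second such network; that stacking step is omitted here for brevity.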
A Novel Hybrid CNN-AIS Visual Pattern Recognition Engine
Machine learning methods are used today for most recognition problems.
Convolutional Neural Networks (CNNs) have time and again proved successful for many image processing tasks, primarily owing to their architecture. In this paper we propose to apply CNNs to small datasets, for example personal albums or similar settings where the size of the training dataset is a limitation, within the framework of a proposed hybrid CNN-AIS model. We use Artificial Immune System principles to compensate for the small training set. A layer of Clonal Selection is added to the local filtering and max-pooling of the CNN architecture. The proposed architecture is evaluated on the standard MNIST dataset with the data size restricted, and also on a small personal data sample belonging to two different classes. Experimental results show that the proposed hybrid CNN-AIS recognition engine works well when the size of the training data is limited.
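The clonal-selection idea of expanding a small population by cloning and mutating its members can be sketched as input-space augmentation: each training sample spawns several mutated clones. This is only an analogy to the AIS principle; the paper integrates Clonal Selection as a layer inside the CNN rather than as plain augmentation, and the clone count and mutation scale below are assumptions.

```python
import numpy as np

def clonal_expand(samples, n_clones=5, mutation_scale=0.05, seed=0):
    """Expand a small training set clonal-selection style.

    Each sample is cloned n_clones times and each clone is mutated by
    small Gaussian noise; both hyperparameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    clones = []
    for s in samples:
        for _ in range(n_clones):
            noise = rng.normal(0.0, mutation_scale, size=s.shape)
            clones.append(np.clip(s + noise, 0.0, 1.0))  # keep a valid pixel range
    return np.stack(clones)

small_set = np.random.default_rng(1).uniform(size=(10, 28, 28))  # e.g. 10 MNIST-sized images
augmented = clonal_expand(small_set)                              # 50 mutated clones
```

A full clonal-selection scheme would additionally rank clones by affinity (e.g. classifier confidence) and keep only the best, which is what ties the mechanism to selection rather than pure noise injection.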
Efficient Spiking and Artificial Neural Networks for Event Cameras
Event cameras have become attractive alternatives to regular frame-based cameras in many scenarios, ranging from consumer electronics and surveillance to autonomous driving. Their novel sensor paradigm of asynchronously detecting brightness changes in a scene makes them faster, more energy-efficient and less susceptible to global illumination. Processing these event streams calls for algorithms that are as efficient as the camera itself, while remaining competitive with frame-based computer vision on tasks like object recognition and detection. This thesis contributes methods for obtaining efficient neural networks for classification and object detection in event streams. We adopt ANN-to-SNN (artificial neural network to spiking neural network) conversion to handle sequential data such as videos or event streams, improving the state of the art in accuracy and energy efficiency. We propose a novel network architecture, called hybrid SNN-ANN, to train a mixed SNN and ANN network using surrogate gradients. These hybrid networks are more efficient, even compared to trained and converted SNNs. To detect objects with only a small number of events, we propose a filter and a memory, both of which improve results during inference. Our networks advance the state of the art in event stream processing and contribute to the success of event cameras. Given suitable neuromorphic hardware, our spiking neural networks enable event cameras to be used in scenarios with a limited energy budget. Our proposed hybrid architecture can guide the design of novel hybrid neuromorphic devices that combine efficient sparse and dense processing.
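The core intuition behind ANN-to-SNN conversion is that a ReLU activation can be approximated by the firing rate of an integrate-and-fire neuron driven by the same input. A minimal sketch of that correspondence, with an illustrative threshold and simulation length (not taken from the thesis):

```python
def if_spike_rate(input_current, threshold=1.0, steps=1000):
    """Firing rate of a non-leaky integrate-and-fire neuron over `steps` ticks.

    Uses reset-by-subtraction, the reset commonly used in ANN-to-SNN
    conversion because it preserves the residual membrane charge.
    """
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += input_current      # integrate the input each time step
        if v >= threshold:
            spikes += 1
            v -= threshold      # subtract, don't zero, the membrane potential
    return spikes / steps

# The spike rate tracks relu(x) for inputs in [0, 1]; negative input never fires.
rates = {x: if_spike_rate(x) for x in [-0.3, 0.0, 0.25, 0.5]}
```

Replacing every ReLU in a trained ANN with such a unit (after suitable weight/threshold normalization) is what lets the converted SNN trade latency for accuracy: longer simulation windows give rate estimates closer to the original activations.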
Zero-Shot Object Detection by Hybrid Region Embedding
Object detection is considered one of the most challenging problems in
computer vision, since it requires correct prediction of both the classes and
the locations of objects in images. In this study, we define a more difficult
scenario, namely zero-shot object detection (ZSD), where no visual training data
is available for some of the target object classes. We present a novel approach
to tackle this ZSD problem, in which a convex combination of embeddings is used
in conjunction with a detection framework. For the evaluation of ZSD methods, we
propose a simple dataset constructed from Fashion-MNIST images, as well as a
custom zero-shot split for the Pascal VOC detection challenge. The experimental
results suggest that our method yields promising results for ZSD.
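The convex-combination idea can be sketched in a few lines: softmax scores over *seen* classes weight those classes' embeddings, and the resulting vector is matched to the nearest *unseen* class embedding. The two-dimensional toy embeddings and class names below are invented for illustration, not taken from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical word embeddings for two seen and two unseen classes.
seen_emb = np.array([[1.0, 0.0],    # "cat"
                     [0.0, 1.0]])   # "truck"
unseen_emb = {"tiger": np.array([0.9, 0.1]),
              "bus":   np.array([0.1, 0.9])}

def zero_shot_label(seen_scores):
    """Label a detection with an unseen class via a convex combination."""
    w = softmax(seen_scores)         # convex weights over seen classes
    combined = w @ seen_emb          # convex combination of seen embeddings

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Nearest unseen class embedding by cosine similarity.
    return max(unseen_emb, key=lambda k: cos(combined, unseen_emb[k]))

label = zero_shot_label(np.array([4.0, 0.5]))  # detector strongly favors "cat"
```

Because the weights are a softmax, the combined vector stays inside the convex hull of the seen embeddings, which is what lets semantically nearby unseen classes (here, "tiger" near "cat") be recovered.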
Robots with Commonsense: Improving Object Recognition through Size and Spatial Awareness
To effectively assist us with our daily tasks, service robots need object recognition methods that perform robustly in dynamic environments. Our prior work has shown that augmenting Deep Learning (DL) methods with knowledge-based reasoning can drastically improve the reliability of object recognition systems. This paper proposes a novel method to equip DL-based object recognition with the ability to reason about the typical size and spatial relations of objects. Experiments in a real-world robotic scenario show that the proposed hybrid architecture significantly outperforms DL-only solutions.
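One simple way such size-aware reasoning can be fused with a CNN is to multiply each candidate label's confidence by the likelihood of the observed object size under that class's typical size. The Gaussian size priors, class names and numbers below are assumptions made up for this sketch; the paper's knowledge-based reasoning is richer than this.

```python
import numpy as np

# Hypothetical typical sizes (mean, std) in centimetres per class.
TYPICAL_SIZE_CM = {"mug": (10.0, 3.0), "fridge": (170.0, 20.0)}

def size_likelihood(size_cm, cls):
    """Unnormalized Gaussian likelihood of an observed size for a class."""
    mu, sigma = TYPICAL_SIZE_CM[cls]
    return float(np.exp(-0.5 * ((size_cm - mu) / sigma) ** 2))

def rescore(cnn_scores, observed_size_cm):
    """Fuse CNN confidences with the size prior and return the best label."""
    fused = {cls: p * size_likelihood(observed_size_cm, cls)
             for cls, p in cnn_scores.items()}
    return max(fused, key=fused.get)

# The CNN slightly prefers "fridge", but a 12 cm object is almost surely a mug:
best = rescore({"mug": 0.45, "fridge": 0.55}, observed_size_cm=12.0)
```

Spatial-relation priors (e.g. "mugs rest on tables") could enter the same fusion as additional multiplicative likelihood terms, which is one reading of how the hybrid architecture overrides implausible DL predictions.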