Biologically Inspired Approaches to Automated Feature Extraction and Target Recognition
Ongoing research at Boston University has produced computational models of biological vision and learning that embody a growing corpus of scientific data and predictions. Vision models perform long-range grouping and figure/ground segmentation, and memory models create attentionally controlled recognition codes that intrinsically combine bottom-up activation and top-down learned expectations. These two streams of research form the foundation of novel dynamically integrated systems for image understanding. Simulations using multispectral images illustrate road completion across occlusions in a cluttered scene and information fusion from input labels that are simultaneously inconsistent and correct. The CNS Vision and Technology Labs (cns.bu.edu/visionlab and cns.bu.edu/techlab) are further integrating science and technology through analysis, testing, and development of cognitive and neural models for large-scale applications, complemented by software specification and code distribution.
Air Force Office of Scientific Research (F40620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-001-1-2016); National Science Foundation (SBE-0354378; BCS-0235298); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency and the National Society of Siegfried Martens (NMA 501-03-1-2030, DGE-0221680); Department of Homeland Security graduate fellowship
Machine Analysis of Facial Expressions
No abstract
Person re-identification via efficient inference in fully connected CRF
In this paper, we address the person re-identification problem, i.e.,
retrieving instances from the gallery which are generated by the same person
as the given probe image. This is very challenging because the person's
appearance usually undergoes significant variations due to changes in
illumination, camera angle and view, background clutter, and occlusion over the
camera network. In this paper, we assume that the matched gallery images should
not only be similar to the probe, but also be similar to each other, under a
suitable metric. We express this assumption with a fully connected CRF model in
which each node corresponds to a gallery image and every pair of nodes is
connected by an edge. A label variable is associated with each node to indicate
whether the corresponding image is from the target person. We define a unary
potential for each node using existing feature calculation and matching
techniques, which reflects the similarity between the probe and the gallery
image, and define a pairwise potential for each edge in terms of a weighted
combination of Gaussian kernels, which encodes the appearance similarity
between a pair of gallery images. The specific form of the pairwise potential
allows us to exploit an efficient inference algorithm to calculate the marginal
distribution of each label variable for this densely connected CRF. We show the
superiority of our method by applying it to public datasets and comparing with
the state of the art. Comment: 7 pages, 4 figures
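The model described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses a naive O(n²) mean-field update instead of the efficient inference algorithm the abstract refers to, and the feature vectors, kernel weights, bandwidths, and the way the unary energy is derived from probe similarity are all assumptions made for illustration. The function names `gaussian_kernel_mix` and `match_marginals` are hypothetical.

```python
import math


def gaussian_kernel_mix(x, y, weights=(1.0,), bandwidths=(1.0,)):
    """Weighted combination of Gaussian kernels on the distance between x and y."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(w * math.exp(-d2 / (2.0 * s * s))
               for w, s in zip(weights, bandwidths))


def match_marginals(probe, gallery, iterations=10):
    """Approximate P(label_i = match) for each gallery item by mean field.

    Assumption: the unary energy is the negative log of a probe similarity
    clipped to (0, 1); a real system would use its own matching score.
    """
    n = len(gallery)
    eps = 1e-9
    unary = []  # (energy of "no match", energy of "match") per gallery item
    for g in gallery:
        s = min(gaussian_kernel_mix(probe, g), 1.0)
        unary.append((-math.log(max(1.0 - s, eps)), -math.log(max(s, eps))))

    # Pairwise weights: similar gallery items are encouraged to share a label.
    k = [[0.0 if i == j else gaussian_kernel_mix(gallery[i], gallery[j])
          for j in range(n)] for i in range(n)]

    # Initialise the marginals from the unary energies alone.
    q = [1.0 / (1.0 + math.exp(e1 - e0)) for e0, e1 in unary]

    for _ in range(iterations):
        new_q = []
        for i, (e0, e1) in enumerate(unary):
            # Messages from all other nodes, weighted by appearance similarity.
            s1 = -e1 + sum(k[i][j] * q[j] for j in range(n))
            s0 = -e0 + sum(k[i][j] * (1.0 - q[j]) for j in range(n))
            new_q.append(1.0 / (1.0 + math.exp(s0 - s1)))
        q = new_q  # synchronous update of all marginals
    return q
```

Calling `match_marginals((0.0, 0.0), [(0.1, 0.0), (0.0, 0.2), (3.0, 3.0)])` returns one approximate match probability per gallery item; the two items near the probe reinforce each other, while the distant item stays close to zero.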
Towards A Robust Arabic Speech Recognition System Based On Reservoir Computing
In this thesis we investigate the potential of developing a speech recognition system based on a recently introduced artificial neural network (ANN) technique, namely Reservoir Computing (RC). This technique has, in theory, a higher capability for modelling dynamic behaviour than feed-forward ANNs, owing to the recurrent connections between the nodes in the reservoir layer, which serves as a memory. We conduct this study on the Arabic language (one of the most spoken languages in the world and the official language in 26 countries) because there is a serious gap in the literature on speech recognition systems for Arabic, making the potential impact high. The investigation covers a variety of tasks, including the implementation of the first reservoir-based Arabic speech recognition system. In addition, a thorough evaluation of the developed system is conducted, including several comparisons to other state-of-the-art models found in the literature and to baseline models. The impact of feature extraction methods is studied in this work, and a new biologically inspired feature extraction technique, namely the Auditory Nerve feature, is applied to the speech recognition domain. Comparing different feature extraction methods requires access to the original recorded sound, which is not possible in the only publicly accessible Arabic corpus. We have developed the largest public Arabic corpus for isolated words, which contains roughly 10,000 samples. Our investigation has led us to develop two novel approaches based on reservoir computing, ESNSVMs (Echo State Networks with Support Vector Machines) and ESNEKMs (Echo State Networks with Extreme Kernel Machines). These aim to improve the performance of the conventional RC approach by proposing different readout architectures. These two approaches have been compared to the conventional RC approach and other state-of-the-art systems. Finally, these developed approaches have been evaluated in the presence of different types and levels of noise to examine their resilience to noise, which is crucial for real-world applications.
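The role of the reservoir as a memory can be illustrated with a minimal echo state update. This is only a sketch under stated assumptions, not the thesis's system: the reservoir size, the uniform weight initialisation, the row-sum scaling, and the names `make_reservoir` and `run_reservoir` are all hypothetical, and no readout is trained here (the thesis pairs the reservoir with linear, SVM, or kernel machine readouts).

```python
import math
import random


def make_reservoir(n_units, n_inputs, row_norm=0.9, seed=0):
    """Create fixed random input and recurrent weights.

    Each recurrent row is rescaled so its absolute values sum to row_norm.
    With row_norm < 1 the tanh update is a contraction in the max norm,
    which is a simple sufficient condition for fading memory.
    """
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_units)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)]
         for _ in range(n_units)]
    for row in w:
        total = sum(abs(v) for v in row)
        if total > 0.0:
            for j in range(n_units):
                row[j] *= row_norm / total
    return w_in, w


def run_reservoir(w_in, w, inputs, state=None):
    """Drive the reservoir with a sequence of input vectors; return all states."""
    n = len(w)
    x = list(state) if state is not None else [0.0] * n
    states = []
    for u in inputs:
        # New state depends on the current input and the previous state.
        x = [math.tanh(sum(w_in[i][k] * u[k] for k in range(len(u)))
                       + sum(w[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        states.append(x)
    return states
```

In a conventional RC setup, the recurrent weights stay fixed and only a readout is trained on the collected states. Running the same input from two different initial states shows the fading memory: the state trajectories converge, so the reservoir state reflects the recent input history and not the starting point.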
Object Detection Through Exploration With A Foveated Visual Field
We present a foveated object detector (FOD) as a biologically-inspired
alternative to the sliding window (SW) approach, which is the dominant method of
search in computer vision object detection. Similar to the human visual system,
the FOD has higher resolution at the fovea and lower resolution at the visual
periphery. Consequently, more computational resources are allocated at the
fovea and relatively fewer at the periphery. The FOD processes the entire
scene, uses retino-specific object detection classifiers to guide eye
movements, aligns its fovea with regions of interest in the input image and
integrates observations across multiple fixations. Our approach combines modern
object detectors from computer vision with a recent model of peripheral pooling
regions found at the V1 layer of the human visual system. We assessed various
eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD
performs on par with the SW detector while bringing significant computational
cost savings. Comment: An extended version of this manuscript was published in PLOS
Computational Biology (October 2017) at
https://doi.org/10.1371/journal.pcbi.100574
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.