
    Multi-Lane Perception Using Feature Fusion Based on GraphSLAM

    An extensive, precise and robust recognition and modeling of the environment is a key factor for the next generation of Advanced Driver Assistance Systems and for the development of autonomous vehicles. In this paper, a real-time approach for the perception of multiple lanes on highways is proposed. Lane markings detected by camera systems and observations of other traffic participants provide the input data for the algorithm. The information is accumulated and fused using GraphSLAM, and the result constitutes the basis for a multilane clothoid model. To allow the incorporation of additional information sources, input data is processed in a generic format. The method is evaluated by comparing real data, collected with an experimental vehicle on highways, to a ground-truth map. The results show that the ego lane and adjacent lanes are robustly detected with high quality up to a distance of 120 m. In comparison to series-production lane detection, an increased detection range for the ego lane and a continuous perception of neighboring lanes are achieved. The method can potentially be utilized for the longitudinal and lateral control of self-driving vehicles.
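
    The multilane model produced by the fusion is a clothoid per lane, which is simple enough to sketch. Below is a minimal NumPy illustration of sampling points along one clothoid centerline; the parameter names and the trapezoidal integration scheme are assumptions for illustration, and the GraphSLAM estimation of these parameters is not shown.

        import numpy as np

        def clothoid_points(theta0, c0, c1, length, n=200):
            """Sample (x, y) along a clothoid with initial heading theta0,
            initial curvature c0 and curvature rate c1."""
            s = np.linspace(0.0, length, n)
            theta = theta0 + c0 * s + 0.5 * c1 * s**2   # heading along arc length
            dx, dy = np.cos(theta), np.sin(theta)
            ds = s[1] - s[0]
            # trapezoidal integration of the unit tangent gives the position
            x = np.concatenate([[0.0], np.cumsum(0.5 * (dx[1:] + dx[:-1]) * ds)])
            y = np.concatenate([[0.0], np.cumsum(0.5 * (dy[1:] + dy[:-1]) * ds)])
            return x, y

        # e.g. a gently curving lane sampled out to the reported 120 m range
        x, y = clothoid_points(theta0=0.0, c0=1e-3, c1=-1e-5, length=120.0)

    Adjacent lanes could then be approximated by offsetting the centerline along its normal by one lane width.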

    The synthesis of visual recognition strategies

    A coherent automated manufacturing system needs to include CAD/CAM, computer vision, and object manipulation. Currently, most systems which support CAD/CAM do not provide for vision or manipulation, and similarly, vision and manipulation systems incorporate no explicit relation to CAD/CAM models. CAD/CAM systems have emerged which allow the designer to conceive and model an object and automatically manufacture it to the prescribed specifications. If recognition or manipulation is to be performed, existing vision systems rely on models generated in an ad hoc manner for the vision or recognition process. Although both vision and CAD/CAM systems rely on models of the objects involved, different modeling schemes are used in each case. A more unified system would allow vision models to be generated from the CAD database. The model generation should be guided by the class of object being constructed, the constraints of the vision algorithms used, and the constraints imposed by the robotic workcell environment (fixtures, sensors, manipulators and effectors). We are implementing a framework in which objects are designed using an existing CAGD system and recognition strategies (logical sensor specifications) are automatically synthesized and used for visual recognition and manipulation.
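
    As a purely hypothetical illustration of synthesizing a recognition strategy from a CAD description, the sketch below maps invented geometric attributes of a model to an ordered list of vision checks; the paper's logical sensor specifications are considerably richer, and none of these names come from the source.

        from dataclasses import dataclass

        @dataclass
        class CADModel:
            name: str
            num_planar_faces: int
            num_cylindrical_faces: int
            has_sharp_edges: bool

        def synthesize_strategy(model: CADModel) -> list:
            """Order vision checks so the cheapest discriminative cue
            for this object class is tried first."""
            strategy = []
            if model.has_sharp_edges:
                strategy.append("edge-based matching")        # cheap, suits polyhedra
            if model.num_cylindrical_faces > 0:
                strategy.append("ellipse/limb detection")     # curved surfaces
            if model.num_planar_faces >= 4:
                strategy.append("planar-patch verification")  # pose refinement
            return strategy

        print(synthesize_strategy(CADModel("bracket", 6, 0, True)))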

    Speaker Identification and Spoken Word Recognition in Noisy Environment Using Different Techniques

    In this work, an attempt is made to design ASR systems, implemented as computer programs, that perform speaker identification, spoken word recognition, and a combination of both in a general noisy environment. The automatic speech recognition system is designed for a limited vocabulary of Telugu language words/control commands. Experiments are conducted to find the combination of feature extraction technique and classifier model that performs best in a general noisy environment (home/office environments where noise is around 15-35 dB). A recently proposed feature extraction technique, Gammatone frequency coefficients, reported as the best fit to the human auditory system, is chosen for the experiments along with the more common feature extraction techniques MFCC and PLP as part of the front-end process (i.e., speech feature extraction). Two different artificial neural network classifiers, Learning Vector Quantization (LVQ) networks and Radial Basis Function (RBF) networks, along with Hidden Markov Models (HMMs), are chosen for the experiments as part of the back-end process (i.e., training/modeling the ASRs). The performance of the ASR systems designed from the 9 resulting combinations (3 feature extraction techniques and 3 classifier models) is analyzed in terms of spoken word recognition and speaker identification success rates, ASR design time, and recognition/identification response time. The test speech samples are recorded in general noisy conditions, i.e., in the presence of air conditioning noise, fan noise, computer keyboard noise and far-away cross-talk noise. The ASR systems are designed and analyzed programmatically in the MATLAB 2013(a) environment.
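
    The paper's systems were built in MATLAB; as a rough Python analogue of the front-end/back-end split described above, the sketch below extracts utterance-level MFCC features with librosa and notes where a classifier would slot in. The file handling, the mean pooling, and the MLP stand-in (instead of LVQ/RBF/HMM) are assumptions.

        import librosa
        from sklearn.neural_network import MLPClassifier  # stand-in for LVQ/RBF/HMM

        def mfcc_features(wav_path, sr=16000, n_mfcc=13):
            """Load a recording and return one mean-pooled MFCC vector."""
            y, sr = librosa.load(wav_path, sr=sr)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
            return mfcc.mean(axis=1)          # crude utterance-level pooling

        # Hypothetical usage with a feature matrix X and word labels:
        # clf = MLPClassifier(hidden_layer_sizes=(64,)).fit(X, labels)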

    Deep Maxout Networks applied to Noise-Robust Speech Recognition

    Proceedings of: IberSPEECH 2014 "VIII Jornadas en Tecnologías del Habla" and "IV Iberian SLTech Workshop", Las Palmas de Gran Canaria, Spain, November 19-21, 2014. Deep Neural Networks (DNN) have become very popular for acoustic modeling due to the improvements found over traditional Gaussian Mixture Models (GMM). However, not many works have addressed the robustness of these systems under noisy conditions. Recently, the machine learning community has proposed new methods to improve the accuracy of DNNs by using techniques such as dropout and maxout. In this paper, we investigate Deep Maxout Networks (DMN) for acoustic modeling in a noisy automatic speech recognition environment. Experiments show that DMNs substantially improve the recognition accuracy over DNNs and other traditional techniques in both clean and noisy conditions on the TIMIT dataset. This contribution has been supported by an Airbus Defense and Space Grant (Open Innovation - SAVIER) and Spanish Government-CICYT project 2011-26807/TEC.
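
    The maxout unit at the heart of a DMN takes the maximum over k linear pieces, so the activation shape itself is learned. Below is a minimal PyTorch sketch of such a layer; the layer sizes are illustrative, not those of the paper's models.

        import torch
        import torch.nn as nn

        class Maxout(nn.Module):
            """Linear layer with k pieces; the unit output is the max over pieces."""
            def __init__(self, in_features, out_features, k=3):
                super().__init__()
                self.k = k
                self.linear = nn.Linear(in_features, out_features * k)

            def forward(self, x):
                z = self.linear(x)                      # (batch, out_features * k)
                z = z.view(*x.shape[:-1], -1, self.k)   # (batch, out_features, k)
                return z.max(dim=-1).values             # max over the k pieces

        layer = Maxout(in_features=440, out_features=1024, k=3)
        out = layer(torch.randn(8, 440))                # -> shape (8, 1024)

    Maxout is typically trained together with dropout, the other technique mentioned above.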

    Human Detection Framework for Automated Surveillance Systems

    Vision-based systems for surveillance applications are widely used and have gained increasing research attention. Detecting people in an image stream is challenging because of their intra-class variability, the diversity of the backgrounds, and the conditions under which the images were acquired. Existing human detection solutions suffer in both effectiveness and efficiency: their accuracy is limited by high false positive and false negative rates, and they are too slow for online surveillance, leading to delays that are unsuitable for real-time monitoring. In this paper, a holistic framework is proposed for enhancing the performance of human detection in surveillance systems. The framework comprises the following stages: environment modeling, motion object detection, and human object recognition. In environment modeling, a modal algorithm is proposed for background initialization and extraction. To classify motion objects effectively, edge detection and a B-spline algorithm are used for shadow detection and removal, and an enhanced Lucas–Kanade optical flow is used to obtain the area of interest for object segmentation. Finally, morphological operations are applied to refine the segmentation. In the motion object recognition stage, each segmented blob is passed to the human detector, a complete learning-based system for detecting and localizing objects/humans in images using mixtures of deformable part models (the PFF detector). Results show enhancements in each phase of the proposed framework and in the overall performance of human detection in surveillance systems.
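
    Below is a compressed sketch of that stage structure using stock OpenCV components as stand-ins: MOG2 for the modal background algorithm, its built-in shadow label in place of the B-spline shadow removal, morphology for cleanup, and the default HOG person detector in place of the PFF part-based detector. The input file name is hypothetical.

        import cv2

        cap = cv2.VideoCapture("surveillance.avi")       # hypothetical input
        bg = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
        hog = cv2.HOGDescriptor()
        hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

        while True:
            ok, frame = cap.read()
            if not ok:
                break
            mask = bg.apply(frame)                       # environment modeling
            mask[mask == 127] = 0                        # drop MOG2's shadow label
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # clean blobs
            boxes, _ = hog.detectMultiScale(frame)       # human recognition stage
            for (x, y, w, h) in boxes:
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)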

    Recognising behaviours of multiple people with hierarchical probabilistic model and statistical data association

    Recognising behaviours of multiple people, especially high-level behaviours, is an important task in surveillance systems. When the reliable assignment of people to the set of observations is unavailable, this task becomes complicated. To solve it, we present an approach in which the hierarchical hidden Markov model (HHMM) is used for modelling the behaviour of each person and the joint probabilistic data association filter (JPDAF) is applied for data association. The main contributions of this paper lie in the integration of multiple HHMMs for recognising high-level behaviours of multiple people and the construction of Rao-Blackwellised particle filters (RBPF) for approximate inference. Preliminary experimental results in a real environment show the robustness of our integrated method in behaviour recognition and its advantage over the Kalman filter in tracking people.
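
    An RBPF marginalises part of the state analytically and samples the rest; as a simplified stand-in, the sketch below implements a plain bootstrap particle filter for a 1-D random-walk state, showing the predict/weight/resample loop the RBPF builds on. The motion model and noise parameters are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        def particle_filter(observations, n_particles=500, q=0.5, r=1.0):
            """Track a 1-D state under a random-walk motion model."""
            particles = rng.normal(0.0, 1.0, n_particles)
            estimates = []
            for z in observations:
                particles += rng.normal(0.0, q, n_particles)        # predict
                w = np.exp(-0.5 * ((z - particles) / r) ** 2)       # weight
                w /= w.sum()
                idx = rng.choice(n_particles, n_particles, p=w)     # resample
                particles = particles[idx]
                estimates.append(particles.mean())
            return np.array(estimates)

        est = particle_filter(np.cumsum(rng.normal(0, 0.5, 50)))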

    Building an enhanced vocabulary of the robot environment with a ceiling pointing camera

    Mobile robots are of great help for automatic monitoring tasks in different environments. One of the first tasks to address when creating such robotic systems is modeling the robot environment. This work proposes a pipeline to build an enhanced visual model of an indoor robot environment. Vision-based recognition approaches frequently use quantized feature spaces, commonly known as Bag of Words (BoW) or vocabulary representations. A drawback of standard BoW approaches is that semantic information is not considered as a criterion to create the visual words. To address this, this paper studies how to leverage the standard vocabulary construction process to obtain a more meaningful visual vocabulary of the robot work environment using image sequences. We take advantage of spatio-temporal constraints and prior knowledge about the position of the camera. The key contribution of our work is the definition of a new pipeline to create a model of the environment. This pipeline incorporates (1) tracking information into the process of vocabulary construction and (2) geometric cues into the appearance descriptors. Motivated by long-term robotic applications, such as the aforementioned monitoring tasks, we focus on a configuration where the robot camera points at the ceiling, which captures more stable regions of the environment. The experimental validation shows how our vocabulary models the environment in more detail than standard vocabulary approaches, without loss of recognition performance. We show different robotic tasks that could benefit from our visual vocabulary approach, such as place recognition or object discovery. For this validation, we use our publicly available dataset.
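
    For contrast with the proposed pipeline, here is a schematic sketch of the standard BoW vocabulary construction the paper starts from: local descriptors clustered with k-means. A comment marks where the paper's tracking information and geometric cues would enter; the frame file names and cluster count are assumptions.

        import cv2
        import numpy as np
        from sklearn.cluster import KMeans

        orb = cv2.ORB_create()
        descriptors = []
        for path in ["frame000.png", "frame001.png"]:    # hypothetical frames
            img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue
            _, des = orb.detectAndCompute(img, None)
            if des is not None:
                descriptors.append(des)

        # The paper augments this step: descriptors tracked across frames are
        # grouped before clustering, and geometric cues from the ceiling-facing
        # camera are appended to each appearance descriptor.
        X = np.vstack(descriptors).astype(np.float32)
        vocabulary = KMeans(n_clusters=64, n_init=10).fit(X).cluster_centers_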

    Sonar discrimination of cylinders from different angles using neural networks

    This paper describes an underwater object discrimination system applied to recognize cylinders of various compositions from different angles. The system is based on a new combination of simulated dolphin clicks, simulated auditory filters and artificial neural networks. The model demonstrates its potential on real data collected from four different cylinders in an environment where the angles were controlled, in order to evaluate the model's ability to recognize cylinders independently of angle.

    1. INTRODUCTION
    Dolphins possess an excellent sonar system for solving underwater target discrimination and recognition tasks in shallow water (see e.g. [2]). This has inspired research into new sonar systems based on biological knowledge, i.e. modeling the dolphin's discrimination capabilities (see e.g. [4] and [5]). The fact that the inner ear of the dolphin has many similarities with the human inner ear makes it tempting to use knowledge from simulations of the human auditory system when t..
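
    As a toy sketch of the classification stage only, the code below trains a small MLP on placeholder filter-bank features standing in for the simulated click/auditory-filter front end; the feature values are random and carry no acoustic meaning.

        import numpy as np
        from sklearn.neural_network import MLPClassifier

        rng = np.random.default_rng(1)
        n_filters = 32                                  # auditory filter channels
        X = rng.normal(size=(200, n_filters))           # placeholder echo features
        y = rng.integers(0, 4, size=200)                # four cylinder classes

        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
        clf.fit(X, y)
        print(clf.score(X, y))                          # training accuracy only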