74 research outputs found

    Increasing Accuracy Performance through Optimal Feature Extraction Algorithms

    This research developed models and techniques to improve the three key modules of popular recognition systems: preprocessing, feature extraction, and classification. Improvements were made in four key areas: processing speed, algorithm complexity, storage space, and accuracy. The focus was on face, traffic sign, and speaker recognition. In the preprocessing module of the facial and traffic sign recognition systems, improvements were made through grayscaling and anisotropic diffusion. In the feature extraction module, improvements were made in two ways: first, through the use of mixed transforms, and second, through a convolutional neural network (CNN) that best fits specific datasets. The mixed-transform system consists of various combinations of the Discrete Wavelet Transform (DWT) and the Discrete Cosine Transform (DCT), which have a reliable track record for image feature extraction. For the proposed CNN, a neuroevolution system was used to determine the characteristics and layout of a CNN that best extracts image features for a particular dataset. In the speaker recognition system, the improvement to the feature extraction module comprised a quantized spectral covariance matrix and a two-dimensional Principal Component Analysis (2DPCA) function. In the classification module, enhancements were made in visual recognition through the use of two neural networks: a multilayer sigmoid network and a convolutional neural network. Results show that the proposed improvements in the three modules increased accuracy and reduced algorithmic complexity, with corresponding reductions in storage space and processing time.
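
    As an illustration of the mixed-transform idea, the sketch below cascades a 2-D DWT with a 2-D DCT to produce a compact image feature vector. It is a minimal example of the general technique, not the thesis's exact configuration; the wavelet choice, the use of the approximation band, and the number of retained coefficients are all assumptions.

```python
import numpy as np
import pywt
from scipy.fftpack import dct

def mixed_transform_features(image, wavelet="haar", keep=64):
    """Cascade a 2-D DWT and a 2-D DCT to extract a compact feature vector.

    The wavelet, the choice of the approximation band, and `keep` are
    illustrative assumptions, not values from the thesis.
    """
    # Single-level 2-D DWT; keep the low-frequency approximation band.
    approx, _details = pywt.dwt2(image, wavelet)
    # Separable 2-D DCT of the approximation band.
    coeffs = dct(dct(approx, axis=0, norm="ortho"), axis=1, norm="ortho")
    # Retain the top-left (low-frequency) block as the feature vector.
    k = int(np.sqrt(keep))
    return coeffs[:k, :k].ravel()

# Example: a 64-dimensional feature vector from a random 128x128 "image".
features = mixed_transform_features(np.random.rand(128, 128))
print(features.shape)  # (64,)
```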

    Real-time vehicle detection using low-cost sensors

    Improving road safety and reducing the number of accidents is one of the top priorities for the automotive industry. As human driving behaviour is one of the top causes of road accidents, research is working towards removing control from the human driver by automating functions and ultimately introducing a fully Autonomous Vehicle (AV). A Collision Avoidance System (CAS) is one of the key safety systems for an AV, as it ensures all potential threats ahead of the vehicle are identified and appropriate action is taken. This research focuses on the task of vehicle detection, the basis of a CAS, and attempts to produce an effective vehicle detector from the data of a low-cost monocular camera. Developing a robust CAS based on low-cost sensors is crucial to bringing down the cost of safety systems and thereby increasing their adoption by end users. In this work, detectors are developed based on the two main approaches to vehicle detection using a monocular camera. The first is the traditional image processing approach, where visual cues are used to generate potential vehicle locations, which are then verified in a second stage. The second approach is based on a Convolutional Neural Network (CNN), a computationally expensive method that unifies the detection process in a single pipeline. The goal is to determine which method is more appropriate for real-time applications. Following the first approach, a vehicle detector based on the combination of HOG features and SVM classification is developed; it modifies the detection pipeline in an attempt to improve run-time performance. For the CNN-based approach, six different network models, each with a different structure and parameters, are developed and trained end to end on collected data to determine which combination produces the best results. The evaluation of the different vehicle detectors produced some interesting findings: the first approach did not yield a working detector, while the CNN-based approach produced a high-performing vehicle detector with an 85.87% average precision and a very low miss rate. The detector performed well in different operational environments (motorway, urban and rural roads), and the results were validated on an external dataset. Additional testing indicated the detector is suitable as a base for safety applications such as a CAS, with a run-time performance of 12 FPS and potential for further improvement.
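
    A minimal sketch of the first (HOG + SVM) approach is shown below, using scikit-image and scikit-learn. The window size, HOG parameters and training data are placeholders, not the values used in this research.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(window):
    # Describe a 64x64 grayscale window with a HOG descriptor.
    return hog(window, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Hypothetical training data: positive (vehicle) and negative windows.
pos = [np.random.rand(64, 64) for _ in range(20)]
neg = [np.random.rand(64, 64) for _ in range(20)]
X = np.array([hog_features(w) for w in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))

clf = LinearSVC().fit(X, y)

# At detection time, each sliding window over a frame would be scored.
score = clf.decision_function([hog_features(np.random.rand(64, 64))])
print("vehicle" if score[0] > 0 else "background")
```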

    Deep learning for texture and dynamic texture analysis

    Texture is a fundamental visual cue in computer vision which provides useful information about image regions. Dynamic Texture (DT) extends the analysis of texture to sequences of moving scenes. Classic approaches to texture and DT analysis are based on shallow hand-crafted descriptors, including local binary patterns and filter banks. Deep learning, and in particular Convolutional Neural Networks (CNNs), have significantly contributed to the field of computer vision in the last decade. These biologically inspired networks, trained with powerful algorithms, have largely improved the state of the art in various tasks such as digit, object and face recognition. This thesis explores the use of CNNs in texture and DT analysis, replacing classic hand-crafted filters with deep trainable filters. An introduction to deep learning is provided in the thesis, as well as a thorough review of texture and DT analysis methods. While CNNs present interesting features for the analysis of textures, such as a dense extraction of filter responses trained end to end, the deepest layers used in the decision rules commonly learn to detect large shapes and image layout instead of local texture patterns. A CNN architecture is therefore adapted to textures by using an orderless pooling of intermediate layers to discard the overall shape analysis, resulting in a reduced computational cost and improved accuracy. An application to biomedical texture images is proposed in which large tissue images are tiled and combined in a recognition scheme. An approach is also proposed for DT recognition using the developed CNNs on three orthogonal planes to combine spatial and temporal analysis. Finally, a fully convolutional network is adapted to texture segmentation based on the same idea of discarding the overall shape and by combining local shallow features with larger and deeper features.
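
    The sketch below illustrates the core architectural idea in PyTorch: cut a CNN at an intermediate convolutional layer and apply an orderless (global average) pooling so that the overall image layout is discarded and only local texture statistics reach the classifier. The backbone, cut-off point and layer sizes are illustrative assumptions, not the thesis architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

class TextureCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        vgg = models.vgg16(weights=None)
        # Keep only the early/intermediate conv layers (up to conv3_3 here).
        self.features = nn.Sequential(*list(vgg.features.children())[:17])
        self.pool = nn.AdaptiveAvgPool2d(1)   # orderless pooling over space
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.features(x)           # dense local filter responses
        x = self.pool(x).flatten(1)    # spatial arrangement discarded
        return self.fc(x)

model = TextureCNN()
out = model(torch.randn(2, 3, 224, 224))
print(out.shape)  # torch.Size([2, 10])
```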

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL models.
    Comment: 64 pages, 411 references. To appear in the Journal of Applied Remote Sensing.

    Road Sign Board Direction and Location Extraction and Recognition for Autonomous Vehicle.

    The problem of direction and location identification is very important for autonomous vehicle technologies, since navigation systems cannot cover all areas due to a lack of signal or to changes made to routes during maintenance or upgrades. This research focuses on recognizing road signs and extracting location names and directions from them; it also helps to better identify road exits and lane directions for improved route planning. In this paper we use YOLOv5 to identify the location and direction information on road sign boards. The system extracts the direction for each location listed on a sign and informs the car, because an autonomous car has no driver and must decide by itself which direction to take to reach the destination. The system continuously checks the frames of the video taken by the car's camera for road sign boards and analyses each image to find the direction of every location shown on the sign. The proposed system consists of a camera mounted on top of the front mirror of the vehicle and a computer that runs the recorded video through the system. In experiments, the YOLOv5 framework achieved its best performance of 98.76% mean average precision (mAP) at an Intersection over Union (IoU) threshold of 0.5, evaluated on our newly developed dataset, and 91.31% averaged over IoU thresholds ranging from 0.5 to 0.95.
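
    A minimal sketch of the detection loop is given below, loading a YOLOv5 model through torch.hub and scoring each video frame. The pretrained "yolov5s" weights, the video path and the class handling are placeholders; the paper trains a custom model on its own dataset.

```python
import cv2
import torch

# Load a YOLOv5 model via torch.hub (the small pretrained variant here;
# a custom checkpoint would be loaded with path="best.pt" instead).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

cap = cv2.VideoCapture("road_video.mp4")  # hypothetical recorded video
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # Each detection row: [x1, y1, x2, y2, confidence, class].
    for *box, conf, cls in results.xyxy[0].tolist():
        print(f"candidate class={int(cls)} conf={conf:.2f} box={box}")
cap.release()
```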

    Face recognition using improved deep learning neural network

    In recent years the importance of and need for computer vision systems has increased due to security demands, self-driving cars, cell phone logins, forensic identification, banking, etc. In security, the idea is to distinguish individuals correctly by utilizing facial recognition, iris recognition, or other means suitable for identification. Cell phones use face recognition to unlock the screen and to authorize actions. Face recognition systems perform tremendously well; however, they still face classification challenges. Their major challenge is the ability to identify or recognize individuals in images, and its causes include lighting (illumination) conditions, the place or environment where the image is taken (and hence the background of the image), pose, and facial gestures or expressions. This study investigates a possible solution: a method that combines Principal Component Analysis (PCA), K-Means clustering, and a Convolutional Neural Network (CNN) in a face recognition system. First, PCA is applied to reduce the dataset's dimensions, enabling a smaller network and faster training, removing redundancy, maintaining quality, and producing eigenfaces. Second, the PCA output is passed to K-Means clustering to select centres with better characteristics and produce the initial input data for the CNN. Last, the K-Means clustering output is taken as the input of the CNN and the network is trained. The system is trained and evaluated on the ORL dataset, which comprises 400 face images in 40 classes of 10 images each. The performance of this technique was tested against PCA, a Support Vector Machine (SVM), and K-Nearest Neighbours (KNN). After 90 epochs, this method achieved a 99% F1-score, 99% precision, and 99% recall in 463.934 seconds. It outperformed PCA, which obtained a 97% F1-score, and KNN, with an 84% F1-score, during the experiments. Therefore, this method proved to be efficient in identifying faces in images.
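
    The sketch below strings the three stages together on toy data: PCA for eigenface-style dimensionality reduction, K-Means to derive a clustered representation, and a small CNN classifier. The dimensions, the cluster-distance encoding and the network shape are illustrative assumptions rather than the dissertation's exact design.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical data: 400 flattened 32x32 face images, 40 identities.
X = np.random.rand(400, 1024).astype(np.float32)

# Stage 1: PCA removes redundancy and yields eigenface projections.
X_pca = PCA(n_components=64).fit_transform(X)

# Stage 2: distances to K-Means centres form the CNN's input representation.
km = KMeans(n_clusters=40, n_init=10).fit(X_pca)
X_km = km.transform(X_pca)            # shape (400, 40)

# Stage 3: a small 1-D CNN over the clustered features.
net = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 40, 40),
)
logits = net(torch.tensor(X_km, dtype=torch.float32).unsqueeze(1))
print(logits.shape)  # torch.Size([400, 40])
```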

    Learning Sparse Orthogonal Wavelet Filters

    The wavelet transform is a well-studied and well-understood analysis technique used in signal processing. In wavelet analysis, signals are represented by a sum of self-similar wavelet and scaling functions. Typically, the wavelet transform makes use of a fixed set of wavelet functions that are analytically derived. We propose a method for learning wavelet functions directly from data. We impose an orthogonality constraint on the functions so that the learned wavelets can be used to perform both analysis and synthesis. We accomplish this by using gradient descent and leveraging existing automatic differentiation frameworks. Our learned wavelets are able to capture the structure of the data by exploiting sparsity. We show that the learned wavelets have similar structure to traditional wavelets. Machine learning has proven to be a powerful tool in signal processing and computer vision. Recently, neural networks have become a popular and successful method used to solve a variety of tasks. However, much of this success is not well understood, and neural network models are often treated as black boxes. This thesis provides insight into the structure of neural networks. In particular, we consider the connection between convolutional neural networks and multiresolution analysis. We show that the wavelet transform shares similarities with current convolutional neural network architectures. We hope that viewing neural networks through the lens of multiresolution analysis may provide some useful insights. We begin the thesis by motivating our method for one-dimensional signals. We then show that we can easily extend the framework to multidimensional signals. Our learning method is evaluated on a variety of supervised and unsupervised tasks, such as image compression and audio classification. The tasks are chosen to compare the usefulness of the learned wavelets to traditional wavelets, as well as to provide a comparison to existing neural network architectures. The wavelet transform used in this thesis has some drawbacks and limitations, caused in part by the fact that we make use of separable real filters. We address these shortcomings by exploring an extension of the wavelet transform known as the dual-tree complex wavelet transform. Our wavelet learning model is extended into the dual-tree domain with few modifications, overcoming the limitations of our standard model. With this new model we are able to show that localized, oriented filters arise from natural images.
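
    A minimal sketch of the learning idea is shown below: a single length-8 low-pass filter is optimized by gradient descent with an L1 sparsity loss on the detail coefficients, plus soft penalties pushing the filter towards the standard orthogonality conditions. The filter length, penalty weights and toy signal are assumptions; the thesis's actual model and constraint handling may differ.

```python
import torch
import torch.nn.functional as F

h = torch.randn(8, requires_grad=True)   # learnable low-pass filter
opt = torch.optim.Adam([h], lr=1e-2)
signal = torch.sin(torch.linspace(0, 20, 256)).view(1, 1, -1)

for step in range(2000):
    # High-pass filter via the alternating-flip construction.
    g = torch.flip(h, [0]) * (-1.0) ** torch.arange(8)
    hi = F.conv1d(signal, g.view(1, 1, -1), stride=2)  # detail coefficients

    sparsity = hi.abs().mean()           # encourage sparse details
    # Orthogonality: <h, h shifted by 2k> = delta(k); sum(h) = sqrt(2).
    orth = sum((torch.dot(h[2 * k:], h[:8 - 2 * k]) - float(k == 0)) ** 2
               for k in range(4))
    norm = (h.sum() - 2 ** 0.5) ** 2

    loss = sparsity + 10.0 * orth + 10.0 * norm
    opt.zero_grad()
    loss.backward()
    opt.step()

print(h.detach())  # should approach an orthogonal wavelet filter
```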

    Vision-Based Traffic Sign Detection and Recognition Systems: Current Trends and Challenges

    The automatic traffic sign detection and recognition (TSDR) system is an important research topic in the development of advanced driver assistance systems (ADAS). Investigations of vision-based TSDR have received substantial interest in the research community, motivated mainly by three tasks: detection, tracking and classification. During the last decade, a substantial number of techniques have been reported for TSDR. This paper provides a comprehensive survey of traffic sign detection, tracking and classification. The details of the algorithms and methods, and their specifications for detection, tracking and classification, are investigated and summarized in tables along with the corresponding key references. A comparative study is provided for each section to evaluate TSDR data, performance metrics and their availability. Current issues and challenges of the existing technologies are illustrated, with brief suggestions and a discussion on the future progress of driver assistance system research. This review will hopefully lead to increased efforts towards the development of future vision-based TSDR systems.

    Analysis of facial expressions in children: Experiments based on the DB Child Affective Facial Expression (CAFE)

    Analysis of facial expressions in children aged 2 to 8 years, and identification of emotions.