134 research outputs found

    Object detection, recognition and classification using computer vision and artificial intelligence approaches

    Object detection and recognition have been used extensively in recent years to solve numerous challenges in different fields. Owing to the vital roles they play, object detection and recognition have enabled quantum leaps in many industries by helping to overcome serious challenges and obstacles. For example, worldwide security concerns have drawn attention to, and stimulated the use of, highly intelligent computer vision technology to provide security in different environments and in diverse terrains. In addition, some wildlife is at present exposed to danger and extinction worldwide. Therefore, early detection and recognition of potential threats to wildlife have become essential and timely. The use of computer vision and artificial intelligence to make a seemingly insecure world more secure has been widely accepted. Such technologies are used for monitoring, tracking, organising and analysing objects in a scene, and for countless other purposes. [Continues.]

    Geometrical-based lip-reading using template probabilistic multi-dimension dynamic time warping

    By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. In this paper, we present a geometrical-based automatic lip reading system that extracts the lip region from images using conventional techniques, while the contour itself is extracted using a novel combination of border following and convex hull approaches. Classification is carried out using an enhanced dynamic time warping technique that can operate in multiple dimensions, together with a template probability technique that compensates for differences in the way words are uttered in the training set. The performance of the new system has been assessed on recognition of the English digits 0 to 9 as available in the CUAVE database. The experimental results obtained with the new approach compare favorably with those of existing lip reading approaches, achieving a word recognition accuracy of up to 71% with the visual information obtained from estimates of lip height, width and their ratio.
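
The multi-dimensional dynamic time warping mentioned in this abstract can be illustrated with a minimal sketch. It assumes each video frame is reduced to a three-value feature vector (lip height, width, ratio) and uses the classic DTW recurrence with a nearest-template decision. It does not reproduce the paper's enhanced variant or its template probability technique; the function names, digit labels and sample values are hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-length feature vectors.

    The local cost is the Euclidean distance between whole vectors, so all
    dimensions are warped together (the "dependent" multi-dimensional form).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b repeats
                cost[i][j - 1],      # seq_b advances, seq_a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose closest template has the smallest DTW cost."""
    return min(
        templates,
        key=lambda label: min(dtw_distance(query, t) for t in templates[label]),
    )


# Hypothetical (height, width, ratio) tracks; not taken from CUAVE.
templates = {
    "one": [[(1.0, 2.0, 0.5), (2.0, 2.0, 1.0)]],
    "two": [[(5.0, 5.0, 1.0), (6.0, 5.0, 1.2)]],
}
query = [(1.0, 2.0, 0.5), (1.0, 2.0, 0.5), (2.0, 2.0, 1.0)]
predicted = classify(query, templates)
```

Because the warping path may repeat frames, the query above aligns with the "one" template at zero cost even though it is one frame longer, which is the property that makes DTW tolerant of differences in speaking rate.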

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional; the processing is done in real-time or takes hours and days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    Detection of Motorcycles in Urban Traffic Using Video Analysis: A Review

    Motorcycles are Vulnerable Road Users (VRU) and as such, together with bicycles and pedestrians, they are the traffic actors most affected by accidents in urban areas. Automatic video processing for urban surveillance cameras has the potential to effectively detect and track these road users. The present review focuses on algorithms used for detection and tracking of motorcycles, using the surveillance infrastructure provided by CCTV cameras. Given the importance of results achieved by Deep Learning theory in the field of computer vision, the use of such techniques for detection and tracking of motorcycles is also reviewed. The paper ends by describing the performance measures generally used and the publicly available datasets (introducing the Urban Motorbike Dataset (UMD) with quantitative evaluation results for different detectors), discussing the challenges ahead and presenting a set of conclusions with proposed future work in this evolving area.

    Mathematical modeling for partial object detection.

    From a computer vision point of view, the image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. In this dissertation, a mathematical model is designed for the detection of partially occluded faces captured in unconstrained real life conditions. The novelty of the proposed model comes from explicitly considering certain objects that commonly occlude faces and embedding them in the face model. This enables the detection of faces in difficult settings and provides more information to subsequent analysis in addition to the bounding box of the face. In the proposed Selective Part Models (SPM), the face is modelled as a collection of parts that can be selected from the visible regular facial parts and some of the occluding objects which commonly interact with faces such as sunglasses, caps, hands, shoulders, and other faces. With face detection being the first step in the face recognition pipeline, the proposed model not only detects partially occluded faces efficiently but also suggests the occluded parts to be excluded from the subsequent recognition step. The model was tested on several recent face detection databases and benchmarks and achieved state of the art performance. In addition, a detailed analysis of the performance with respect to different types of occlusion was provided. Moreover, a new database was collected for evaluating face detectors focusing on the partial occlusion problem. This dissertation highlights the importance of explicitly handling the partial occlusion problem in face detection and shows its efficiency in enhancing both the face detection performance and the subsequent recognition performance of partially occluded faces. The broader impact of the proposed detector extends beyond the common security applications to human robot interaction.
The humanoid robot Nao is used to help in teaching children with autism, and the proposed detector is used to achieve natural interaction between the robot and the children by detecting their faces, which can be used for recognition or, more interestingly, for adaptive interaction by analyzing their expressions.

    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures

    Brain Tumor Diagnosis Support System: A decision Fusion Framework

    An important factor in providing effective and efficient therapy for brain tumors is early and accurate detection, which can increase survival rates. Current image-based tumor detection and diagnosis techniques are heavily dependent on interpretation by neuro-specialists and/or radiologists, making the evaluation process time-consuming and prone to human error and subjectivity. In addition, widespread use of MR spectroscopy requires specialized processing and assessment of the data, as well as clear and rapid presentation of the results as images or maps for routine clinical interpretation of an exam. Automatic brain tumor detection and classification have the potential to offer greater efficiency and more accurate predictions. However, the accuracy of automatic detection and classification techniques tends to depend on the specific image modality and is well known to vary from technique to technique. For this reason, it would be prudent to examine the variations in the performance of these methods in order to obtain consistently high levels of accuracy. The goal of the proposed framework is to design, implement, and evaluate classification software that discerns various brain tumor types on magnetic resonance imaging (MRI) using textural features. This thesis introduces a brain tumor detection support system that involves the use of a variety of tumor classifiers. The system is designed as a decision fusion framework that enables these multiple classifiers to analyze medical images, such as those obtained from MRI. The fusion procedure is grounded in the Dempster-Shafer theory of evidence. Numerous experimental scenarios have been implemented to validate the efficiency of the proposed framework. Compared with alternative approaches, the outcomes show that the methodology developed in this thesis demonstrates higher accuracy and higher computational efficiency.
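
The Dempster-Shafer fusion step named in this abstract can be illustrated with a small sketch of Dempster's rule of combination. It assumes two classifiers each output a mass function over subsets of a two-class frame. The class names, the mass values and the function name `combine` are hypothetical and are not taken from the thesis, whose actual classifiers and fusion details are not given here.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Products whose focal sets do not intersect count as conflict and are
    normalised away.
    """
    combined = {}
    conflict = 0.0
    for set_b, mass_b in m1.items():
        for set_c, mass_c in m2.items():
            intersection = set_b & set_c
            if intersection:
                combined[intersection] = (
                    combined.get(intersection, 0.0) + mass_b * mass_c
                )
            else:
                conflict += mass_b * mass_c
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Hypothetical tumor classes and classifier outputs, for illustration only.
GLIOMA = frozenset({"glioma"})
MENINGIOMA = frozenset({"meningioma"})
EITHER = GLIOMA | MENINGIOMA  # mass on the whole frame expresses ignorance

classifier_a = {GLIOMA: 0.6, MENINGIOMA: 0.1, EITHER: 0.3}
classifier_b = {GLIOMA: 0.5, MENINGIOMA: 0.2, EITHER: 0.3}

fused = combine(classifier_a, classifier_b)
decision = max(fused, key=fused.get)
```

In this example both sources lean towards the same class, so the fused mass on that class is higher than either source assigned alone, while the mass left on the whole frame shrinks.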

    Artificial Intelligence Tools for Facial Expression Analysis.

    Inner emotions show visibly upon the human face and are understood as a basic guide to an individual’s inner world. It is, therefore, possible to determine a person’s attitudes and the effects of others’ behaviour on their deeper feelings through examining facial expressions. In real world applications, machines that interact with people need strong facial expression recognition. This recognition is seen to hold advantages for varied applications in affective computing, advanced human-computer interaction, security, stress and depression analysis, robotic systems, and machine learning. This thesis starts by proposing a benchmark of dynamic versus static methods for facial Action Unit (AU) detection. AU activation is a set of local individual facial muscle parts that occur in unison, constituting a natural facial expression event. Detecting AUs automatically can provide explicit benefits since it considers both static and dynamic facial features. For this research, AU occurrence activation detection was conducted by extracting features (static and dynamic) of both nominal hand-crafted and deep learning representation from each static image of a video. This confirmed the superior ability of a pretrained model, which yields a leap in performance. Next, temporal modelling was investigated to detect the underlying temporal variation phases using supervised and unsupervised methods from dynamic sequences. During these processes, the importance of stacking dynamic on top of static was discovered in encoding deep features for learning temporal information when combining the spatial and temporal schemes simultaneously. Also, this study found that fusing both spatial and temporal features will give more long term temporal pattern information. Moreover, we hypothesised that using an unsupervised method would enable the leaching of invariant information from dynamic textures.
Recently, cutting-edge developments have come from approaches based on Generative Adversarial Networks (GANs). In the second section of this thesis, we propose a model based on the adoption of an unsupervised DCGAN for facial feature extraction and classification to achieve the following: the creation of facial expression images under different arbitrary poses (frontal, multi-view, and in the wild), and the recognition of emotion categories and AUs, in an attempt to resolve the problem of recognising the static seven classes of emotion in the wild. Thorough cross-database experimentation with the proposed model demonstrates that this approach can improve the generalization results. Additionally, we showed that the features learnt by the DCGAN process are poorly suited to encoding facial expressions when observed under multiple views, or when trained from a limited number of positive examples. Finally, this research focuses on disentangling identity from expression for facial expression recognition. A novel technique was implemented for emotion recognition from a single monocular image. A large-scale dataset (Face vid) was created from facial image videos which were rich in variations and distribution of facial dynamics, appearance, identities, expressions, and 3D poses. This dataset was used to train a DCNN (ResNet) to regress the expression parameters from a 3D Morphable Model jointly with a back-end classifier.

    Development of a Cost-Efficient Multi-Target Classification System Based on FMCW Radar for Security Gate Monitoring

    Radar systems have a long history. Like many other great inventions, the origin of radar systems lies in warfare. Only in the last decade have radar systems found widespread civil use in industrial measurement scenarios and automotive safety applications. Due to their resilience against harsh environments, they are used instead of or in addition to optical or ultrasonic systems. Radar sensors hold excellent capabilities to estimate distance and motion accurately, penetrate non-metallic objects, and remain unaffected by weather conditions. These capabilities make these devices extremely flexible in their applications. Electromagnetic waves centered at frequencies around 24 GHz offer high precision target measurements, compact antenna and circuitry design, and lower atmospheric absorption than higher frequency-based systems. This thesis studies non-cooperative automatic radar multi-target detection and classification. A prototype of a radar system with a new microwave-radar-based technique for short-range detection and classification of multiple human and vehicle targets passing through a road gate is presented. It allows identifying different types of targets, i.e., pedestrians, motorcycles, cars, and trucks. The developed system is based on a low-cost 24 GHz off-the-shelf FMCW radar, combined with an embedded Raspberry Pi PC for data acquisition and transmission to a remote processing PC, which takes care of detection and classification. This approach, which can find applications in both security and infrastructure surveillance, relies upon the processing of the scattered-field data acquired by the radar.
The developed method is based on an ad-hoc processing chain to accomplish the automatic target recognition task, which consists of blocks performing clutter and leakage removal with a frame subtraction technique, clustering with a DBSCAN approach, tracking with an algorithm based on the α-β filter to follow the targets during traversal, feature extraction, and finally classification of targets with a classification scheme based on support vector machines. The approach is validated in real experimental scenarios, showing its capabilities in correctly detecting multiple targets belonging to different classes (i.e., pedestrians, cars, motorcycles, and trucks). The approach has been validated with experimental data acquired in different scenarios, showing good identification capabilities.
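
The tracking block of this chain can be illustrated with a minimal sketch of a textbook one-dimensional α-β filter. It assumes a single target, a fixed frame interval `dt` and range-only measurements. The gains, the function name and the simulated target are hypothetical; the thesis does not state its filter settings here, and the other blocks (frame subtraction, DBSCAN, SVM) are not shown.

```python
def alpha_beta_track(measurements, dt, alpha=0.5, beta=0.1, x0=0.0, v0=0.0):
    """Run a textbook alpha-beta filter over a list of position measurements.

    Returns one (position, velocity) estimate per measurement.
    """
    x, v = x0, v0
    estimates = []
    for z in measurements:
        # Predict the position one frame ahead with a constant-velocity model.
        x_pred = x + v * dt
        # The residual is the gap between the measurement and the prediction.
        residual = z - x_pred
        # Correct position and velocity with the fixed gains alpha and beta.
        x = x_pred + alpha * residual
        v = v + (beta / dt) * residual
        estimates.append((x, v))
    return estimates


# Hypothetical target moving away from the radar at 2 m/s, sampled at 10 Hz.
dt = 0.1
true_speed = 2.0
ranges = [true_speed * dt * k for k in range(1, 201)]
track = alpha_beta_track(ranges, dt)
final_position, final_velocity = track[-1]
```

With noise-free constant-velocity input, the estimates settle on the true range and speed; with real radar detections the gains trade responsiveness against smoothing, which is the design choice a tracker of this kind has to make.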

    Knowledge Modelling and Learning through Cognitive Networks

    One of the most promising developments in modelling knowledge is cognitive network science, which aims to investigate cognitive phenomena driven by the networked, associative organization of knowledge. For example, investigating the structure of semantic memory via semantic networks has illuminated how memory recall patterns influence phenomena such as creativity, memory search, learning, and more generally, knowledge acquisition, exploration, and exploitation. In parallel, neural network models for artificial intelligence (AI) are also becoming more widespread as inferential models for understanding which features drive language-related phenomena such as meaning reconstruction, stance detection, and emotional profiling. Whereas cognitive networks map explicitly which entities engage in associative relationships, neural networks perform an implicit mapping of correlations in cognitive data as weights, which are obtained after training over labelled data and whose interpretation is not immediately evident to the experimenter. This book aims to bring together quantitative, innovative research that focuses on modelling knowledge through cognitive and neural networks to gain insight into mechanisms driving cognitive processes related to knowledge structuring, exploration, and learning. The book comprises a variety of publication types, including reviews and theoretical papers, empirical research, computational modelling, and big data analysis. All papers here share a commonality: they demonstrate how the application of network science and AI can extend and broaden cognitive science in ways that traditional approaches cannot.