
    A Robust Multiple Feature Approach To Endpoint Detection In Car Environment Based On Advanced Classifiers

    In this paper we propose an endpoint detection system based on several features extracted from each speech frame, followed by a robust classifier (AdaBoost or Bagging of decision trees, or a multilayer perceptron) and a finite state automaton (FSA). We compare the use of four different classifiers in this task. The FSA module consisted of a 4-state decision logic that filtered out false alarms. The look-ahead of the proposed method was 7 frames, the number of frames that maximized the accuracy of the system. The system was tested with real signals recorded inside a car, with signal-to-noise ratios ranging from 6 dB to 30 dB. Finally, we present experimental results demonstrating that the system yields robust endpoint detection.
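    The decision logic can be illustrated with a small sketch. The following Python fragment is not the paper's implementation; the state names and the onset/hangover lengths are illustrative assumptions. It shows how a 4-state automaton can smooth per-frame classifier labels into endpoint decisions, discarding isolated false alarms:

```python
# Minimal sketch of a 4-state endpoint-decision automaton that smooths
# per-frame speech/non-speech labels. States and hangover lengths are
# illustrative assumptions, not the paper's actual parameters.

SILENCE, MAYBE_SPEECH, SPEECH, MAYBE_SILENCE = range(4)

def endpoint_fsa(frame_labels, onset_frames=3, hangover_frames=5):
    """Yield a smoothed speech/non-speech decision per frame.

    frame_labels: iterable of booleans from the frame classifier
    onset_frames: consecutive speech frames required to confirm onset
    hangover_frames: consecutive silence frames required to confirm offset
    """
    state, count = SILENCE, 0
    for is_speech in frame_labels:
        if state == SILENCE:
            if is_speech:
                state, count = MAYBE_SPEECH, 1
        elif state == MAYBE_SPEECH:
            if is_speech:
                count += 1
                if count >= onset_frames:   # enough evidence: confirm onset
                    state = SPEECH
            else:
                state = SILENCE             # isolated frame: false alarm
        elif state == SPEECH:
            if not is_speech:
                state, count = MAYBE_SILENCE, 1
        else:  # MAYBE_SILENCE
            if is_speech:
                state = SPEECH              # short gap: keep the segment
            else:
                count += 1
                if count >= hangover_frames:
                    state = SILENCE
        yield state in (SPEECH, MAYBE_SILENCE)
```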

    Detecting emotions from speech using machine learning techniques

    D.Phil. (Electronic Engineering)

    Automatic Pipeline Surveillance Air-Vehicle

    This thesis presents the development of a vision-based system for aerial pipeline Right-of-Way surveillance using optical/infrared sensors mounted on Unmanned Aerial Vehicles (UAVs). The aim of the research is to develop a highly automated, on-board system for detecting and following pipelines while simultaneously detecting any third-party interference. The proposed approach of using a UAV platform could potentially reduce the cost of monitoring and surveying pipelines compared to manned aircraft. The main contributions of this thesis are the development of the image-analysis algorithms, the overall system architecture, and validation of the system in hardware in a scaled-down test environment. To evaluate the performance of the system, the algorithms were coded in the Python programming language. A small-scale test rig of the pipeline structure, together with expected third-party interference, was set up to simulate the operational environment and to capture/record data for algorithm testing and validation. The pipeline endpoints are identified by transforming the 16-bit depth data of the explored environment into 3D point-cloud world coordinates. Then, using the Random Sample Consensus (RANSAC) approach, the foreground and background are separated based on the transformed 3D point cloud, extracting the plane that corresponds to the ground. Simultaneously, the boundaries of the explored environment are detected from the 16-bit depth data using a Canny detector. These boundaries, once transformed into a 3D point cloud, are then filtered based on the real height of the pipeline, using the Euclidean distance of each boundary point relative to the previously extracted ground plane, for fast and accurate measurements. The filtered boundaries, transformed back into 16-bit depth data, are used to detect the straight lines of the object boundary (Hough lines) using a Hough transform. The pipeline is verified by estimating a centre-line segment from the 3D point cloud of each pair of Hough line segments (transformed into 3D). The pipeline point cloud is then filtered for linearity within the pipeline width, using Euclidean distance in the foreground point cloud, and the detected centre-line segment is extended along the filtered point cloud to match the exact pipeline segment. Third-party interference is detected based on four parameters, namely: foreground depth data; pipeline depth data; pipeline endpoint locations in the 3D point cloud; and Right-of-Way distance. The techniques include detection, classification, and localization algorithms. Finally, a waypoint-based navigation system was implemented for the air vehicle to fly over course waypoints generated online, with a heading-angle demand to follow the pipeline structure in real time based on the online identification of the pipeline endpoints relative to the camera frame.
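    As a rough illustration of the ground-plane extraction and height-based boundary filtering steps, here is a minimal NumPy sketch of RANSAC plane fitting; the thresholds, iteration counts and function names are assumptions for illustration, not the thesis's actual values:

```python
import numpy as np

def ransac_ground_plane(points, n_iters=200, dist_thresh=0.02, rng=None):
    """Fit a ground plane to an Nx3 point cloud with RANSAC.

    Returns ((normal, d), inlier_mask) for the plane n.x + d = 0.
    Thresholds are illustrative, not the thesis's actual values.
    """
    rng = rng or np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                      # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)   # point-to-plane distances
        inliers = dist < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers

def filter_by_height(points, plane, pipe_height, tol=0.05):
    """Keep boundary points whose height above the ground plane matches
    the known pipeline height (within a tolerance)."""
    normal, d = plane
    height = points @ normal + d             # signed distance to the plane
    return points[np.abs(np.abs(height) - pipe_height) < tol]
```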

    Robust and real-time hand detection and tracking in monocular video

    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their device in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to interact directly with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation is video-based gesture recognition, hand detection and hand tracking. Gesture-based interaction allows the user to interact with the computer in a natural manner, exploring a virtual reality using nothing but their own body language. In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements. Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to follow the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure. Human hands are highly articulated objects, consisting of finger parts that are connected by joints. As a result, the appearance of a hand can vary greatly depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model, and therefore cannot be used to robustly detect human hands. We therefore developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template-based approach, we probabilistically model the spatial relations between different hand parts and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the complete hand's position. To comply with the real-time constraints, we developed techniques to speed up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state-of-the-art in object detection, while reducing computational complexity by a factor of 1,000.
Furthermore, we showed that our algorithm can also be used to detect other articulated objects, such as persons or animals, and is therefore not restricted to the task of hand detection. Once a hand has been detected, a tracking algorithm can be used to continuously track its position over time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness, and that can also be applied in domains other than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employ methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state-of-the-art in hand tracking, while providing a much lower computational complexity. One of the methods used by our probabilistic tracking algorithm is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in the literature are either too slow to be used in real-time applications, or are not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements and is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene. The above methods are combined into a hand tracking framework which can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture-based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating-image display has been showcased at several national and international events and trade shows. The research described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator.
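    To give a concrete flavour of how flow vectors can feed a tracker, the following OpenCV sketch estimates sparse optical flow with pyramidal Lucas-Kanade. This is a generic stand-in for illustration only, not the illumination-invariant, regularised flow method developed in the dissertation:

```python
import cv2
import numpy as np

def sparse_flow(prev_gray, next_gray, max_corners=200):
    """Estimate sparse optical flow between two grayscale frames.

    Pyramidal Lucas-Kanade via OpenCV; a stand-in illustration,
    not the dissertation's illumination-invariant method.
    """
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:
        return np.empty((0, 2)), np.empty((0, 2))
    nxt, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, pts, None,
        winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    return pts[good].reshape(-1, 2), nxt[good].reshape(-1, 2)

# The flow vectors (new positions minus old positions) can then drive
# the motion model of a probabilistic tracker, e.g. as velocity observations.
```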

    The 6th Conference of PhD Students in Computer Science


    People tracking with drones in intelligent spaces

    Recent technological progress made over the last decades in the field of Computer Vision has introduced new methods and algorithms with ever increasing performance. Particularly, the emergence of machine learning algorithms has enabled class-based object detection on live video feeds. Alongside these advances, Unmanned Aerial Vehicles (more commonly known as drones) have also experienced advancements in both hardware miniaturization and software optimization. Thanks to these improvements, drones have emerged from their military background and are now used both by the general public and by the scientific community for applications as distinct as aerial photography and environmental monitoring. This dissertation aims to take advantage of these recent technological advancements and apply state-of-the-art machine learning algorithms in order to create an Unmanned Aerial Vehicle (UAV) based network architecture capable of performing real-time people tracking through image detection. To perform object detection, two distinct machine learning algorithms are presented: the first uses a Support Vector Machine (SVM) based approach, while the second uses a Convolutional Neural Network (CNN) based architecture. Both methods were evaluated using an image dataset created for the purposes of this dissertation's work. The evaluations of the object detectors showed that the CNN-based method was the best both in terms of required processing time and detection accuracy, and therefore the most suitable method for our implementation. The developed network architecture was tested in a live scenario, with the results showing that the system is capable of tracking people at average walking speeds.
    M.Sc. (Electronics and Telecommunications Engineering)
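    As a hedged illustration of the detection stage, the sketch below uses torchvision's pretrained Faster R-CNN to find people in a frame. The dissertation trained its own CNN on a custom dataset, so this stands in only for the detect-then-track interface, not the actual model:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Stand-in person detector using torchvision's pretrained Faster R-CNN;
# illustrative only, not the dissertation's custom-trained CNN.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_people(frame_rgb, score_thresh=0.8):
    """Return bounding boxes of detected persons in an RGB frame."""
    with torch.no_grad():
        out = model([to_tensor(frame_rgb)])[0]
    # COCO class id 1 corresponds to "person" in torchvision detection models
    keep = (out["labels"] == 1) & (out["scores"] > score_thresh)
    return out["boxes"][keep].cpu().numpy()
```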

    Integrating in silico models and read-across methods for predicting toxicity of chemicals: A step-wise strategy

    In silico methods and models are increasingly used for predicting properties of chemicals for hazard identification and hazard characterisation in the absence of experimental toxicity data. Many in silico models are available and can be used individually or in an integrated fashion. Whilst such models offer major benefits to toxicologists, risk assessors and the global scientific community, the lack of a consistent framework for the integration of in silico results can lead to uncertainty and even contradictions across models and users, even for the same chemicals. In this context, a range of methods for integrating in silico results have been proposed on a statistical or case-specific basis. Read-across constitutes another strategy for deriving reference points or points of departure for hazard characterisation of untested chemicals from the available experimental data for structurally similar compounds, mostly using expert judgement. Recently, a number of software systems have been developed to support experts in this task by providing a formalised and structured procedure. Such a procedure could also facilitate further integration of the results generated from in silico models and read-across. This article discusses a framework on weight of evidence published by EFSA to identify a stepwise approach for the systematic integration of results or values obtained from these "non-testing methods". Key criteria and best practices for selecting and evaluating individual in silico models are also described, together with the means of combining the results, taking into account any limitations, and identifying strategies that are likely to provide consistent results.
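    As a toy illustration only, and not EFSA's actual weight-of-evidence scheme, the integration step might be sketched as a reliability-weighted consensus over model predictions; the function and weights below are assumptions for illustration:

```python
import numpy as np

def integrate_predictions(predictions, reliabilities):
    """Combine per-model predictions for one chemical into a consensus value.

    predictions:   predicted values from several in silico models
    reliabilities: expert-assigned weights in [0, 1] from model evaluation
    """
    w = np.asarray(reliabilities, dtype=float)
    p = np.asarray(predictions, dtype=float)
    return float((w * p).sum() / w.sum())   # reliability-weighted consensus

# e.g. three hypothetical QSAR models predicting the same endpoint:
# integrate_predictions([2.1, 2.4, 1.9], [0.9, 0.6, 0.8])
```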

    An on-line VAD based on Multi-Normalisation Scoring (MNS) of observation likelihoods

    Preprint of the article published online on 31 May 2018. Voice activity detection (VAD) is an essential task in expert systems that rely on oral interfaces. The VAD module detects the presence of human speech and separates speech segments from silences and non-speech noises. The most popular current on-line VAD systems are based on adaptive parameters which seek to cope with varying channel and noise conditions. The main disadvantages of this approach are the need for some initialisation time to properly adjust the parameters to the incoming signal, and uncertain performance in the case of poor estimation of the initial parameters. In this paper we propose a novel on-line VAD based only on previous training, which does not introduce any delay. The technique is based on a strategy that we have called Multi-Normalisation Scoring (MNS). It consists of obtaining a vector of multiple observation likelihood scores from normalised mel-cepstral coefficients previously computed from different databases. A classifier is then used to label the incoming observation likelihood vector. Encouraging results have been obtained with a Multi-Layer Perceptron (MLP). This technique can generalise to unseen noise levels and types. A validation experiment against two current standard ITU-T VAD algorithms demonstrates the good performance of the method: lower classification error rates are obtained for non-speech frames, while results for speech frames are similar. This work was partially supported by the EU (ERDF) under grant TEC2015-67163-C2-1-R (RESTORE) (MINECO/ERDF, EU) and by the Basque Government under grant KK-2017/00043 (BerbaOla).
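    A minimal sketch of the MNS idea, assuming per-database normalisation statistics and pre-trained speech/non-speech Gaussian mixture models; all names and model choices here are illustrative, not the paper's implementation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier

def mns_vector(mfcc_frame, norm_stats, gmm_pairs):
    """Build the MNS feature vector: one likelihood score per normalisation.

    norm_stats: list of (mean, std) pairs, one per training database
    gmm_pairs:  list of (speech_gmm, nonspeech_gmm) GaussianMixture pairs
    """
    scores = []
    for (mu, sigma), (sp, ns) in zip(norm_stats, gmm_pairs):
        z = ((mfcc_frame - mu) / sigma).reshape(1, -1)  # per-database normalisation
        scores.append(sp.score(z) - ns.score(z))        # log-likelihood ratio
    return np.array(scores)

# An MLP trained offline then labels each score vector, so no on-line
# parameter adaptation (and hence no initialisation delay) is needed:
# clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
# clf.fit(train_vectors, train_labels)
# is_speech = clf.predict(mns_vector(frame, norm_stats, gmm_pairs).reshape(1, -1))
```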