12 research outputs found

    Coopération de réseaux de caméras ambiantes et de vision embarquée sur robot mobile pour la surveillance de lieux publics

    There is currently a growing demand for deploying mobile robots in public places. To meet this demand, several researchers have deployed prototype robotic systems in public places such as hospitals, supermarkets, museums, and office environments. As robots leave their isolated industrial settings and begin to interact with humans in a shared workspace, one major concern that must not be neglected is safe interaction. For a mobile robot to behave interactively in a safe and acceptable way, it needs to know the presence, location, and movements of the people around it in order to better understand and anticipate their intentions and actions. This thesis aims to contribute in this direction by focusing on perception modalities for detecting and tracking people in the vicinity of a mobile robot. As a first contribution, it presents an optimized automated visual people detection system that explicitly takes into account the computational demand expected on the robot. Various comparative experiments are conducted to clearly highlight the improvements this detector brings, including its effects on the robot's reactivity during on-line missions. As a second contribution, the thesis proposes and validates a cooperative framework for fusing information from wall-mounted ambient cameras and from sensors mounted on the mobile robot in order to better track people in the vicinity. The same framework is also validated by fusing data from the different sensors on the mobile robot when external perception is absent. Finally, we demonstrate the improvements brought by the developed perceptual modalities by deploying them on our robotic platform and illustrating the robot's ability to perceive people in supposed public places and respect their personal space during navigation.

    This thesis deals with the detection and tracking of people in a surveilled public place. It proposes to include a mobile robot in classical surveillance systems that are based on sensors fixed in the environment. The mobile robot brings two important benefits: (1) it acts as a mobile sensor with perception capabilities, and (2) it can be used as a means of action for service provision. In this context, as a first contribution, the thesis presents an optimized visual people detector based on Binary Integer Programming that explicitly takes the stipulated computational demand into consideration. Homogeneous and heterogeneous pools of features are investigated under this framework, thoroughly tested, and compared with state-of-the-art detectors. The experimental results clearly highlight the improvements brought by the different detectors learned with this framework, including their effect on the robot's reactivity during on-line missions. As a second contribution, the thesis proposes and validates a cooperative framework to fuse information from wall-mounted cameras and sensors on the mobile robot to better track people in the vicinity. Finally, we demonstrate the improvements brought by the developed perceptual modalities by deploying them on our robotic platform and illustrating the robot's ability to perceive people in supposed public areas and respect their personal space during navigation.
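
    The abstract above does not give the Binary Integer Programming formulation, so the following is only a minimal sketch of the general idea of choosing features under a computation budget. It assumes a knapsack-style objective (maximize total detection gain subject to a total cost limit); the function name `select_features`, the gain and cost values, and the exhaustive search are all illustrative assumptions, not the thesis's method.

```python
from itertools import product


def select_features(gains, costs, budget):
    """Pick a 0/1 vector maximizing total gain with total cost <= budget.

    Hypothetical formulation: gains[i] is the assumed benefit of feature i,
    costs[i] its assumed computation cost. Exhaustive search is only
    practical for a handful of features; a real solver would be used otherwise.
    """
    best_value = 0.0
    best_choice = tuple(0 for _ in gains)
    for choice in product((0, 1), repeat=len(gains)):
        cost = sum(c for c, x in zip(costs, choice) if x)
        value = sum(g for g, x in zip(gains, choice) if x)
        if cost <= budget and value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value


# Toy example with three candidate features and a budget of 5 cost units.
choice, value = select_features(gains=[5, 4, 3], costs=[4, 3, 2], budget=5)
print(choice, value)
```

    In this toy case the two cheaper features together beat the single most valuable one, which is the kind of trade-off a budget-aware selection is meant to expose.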

    Visual Clutter Study for Pedestrian Using Large Scale Naturalistic Driving Data

    Some pedestrian crashes occur because the driver perceives the pedestrian late or with difficulty. Recognizing pedestrians while driving is a complex cognitive activity. Visual clutter analysis can be used to study the factors that affect human visual search efficiency and to help design advanced driver assistance systems for better decision making and user experience. In this thesis, we propose a pedestrian perception evaluation model that quantitatively analyzes pedestrian perception difficulty using naturalistic driving data. An efficient detection framework was developed to locate pedestrians within large-scale naturalistic driving data. Visual clutter analysis was used to study the factors that may affect the driver's ability to perceive a pedestrian's appearance. The candidate factors were explored in an exploratory study using naturalistic driving data, and a bottom-up, image-based pedestrian clutter metric was proposed to quantify pedestrian perception difficulty in such data. Based on the proposed bottom-up clutter metric and a top-down estimator based on pedestrian appearance, a Bayesian probabilistic pedestrian perception evaluation model was then constructed to simulate the pedestrian perception process.

    Pedestrian and Vehicle Detection in Autonomous Vehicle Perception Systems—A Review

    Autonomous Vehicles (AVs) have the potential to solve many traffic problems, such as accidents, congestion and pollution. However, there are still challenges to overcome; for instance, AVs need to accurately perceive their environment to navigate safely in busy urban scenarios. The aim of this paper is to review recent articles on computer vision techniques that can be used to build an AV perception system. AV perception systems need to accurately detect non-static objects and predict their behaviour, as well as to detect static objects and recognise the information they provide. This paper focuses in particular on the computer vision techniques used to detect pedestrians and vehicles. There have been many papers and reviews on pedestrian and vehicle detection so far. However, most of them reviewed pedestrian or vehicle detection separately. This review aims to present an overview of AV systems in general, and then to review and investigate several computer vision detection techniques for pedestrians and vehicles. The review concludes that both traditional and Deep Learning (DL) techniques have been used for pedestrian and vehicle detection; however, DL techniques have shown the best results. Although good detection results have been achieved for pedestrians and vehicles, current algorithms still struggle to detect small, occluded, and truncated objects. In addition, there is limited research on how to improve detection performance in difficult light and weather conditions. Most of the algorithms have been tested on well-recognised datasets such as Caltech and KITTI; however, these datasets have their own limitations. Therefore, this paper recommends that future work be carried out on newer, more challenging datasets, such as PIE and BDD100K.
    EPSRC DTP PhD studentship

    Efficient Pedestrian Detection in Urban Traffic Scenes

    Pedestrians are important participants in urban traffic environments and are therefore a key category of objects for autonomous cars. Automatic pedestrian detection is an essential task for protecting pedestrians from collisions. In this thesis, we investigate and develop novel approaches by interpreting spatial and temporal characteristics of pedestrians in three different aspects: shape, cognition and motion. The distinctive upright shape of the human body, especially the geometry of the head and shoulder area, is the characteristic that best discriminates pedestrians from other object categories. Inspired by the success of Haar-like features for detecting human faces, which also exhibit a uniform shape structure, we propose Haar-like features designed specifically for pedestrians. Tailored to a pre-defined statistical pedestrian shape model, Haar-like templates with multiple modalities are designed to describe local differences in the shape structure. Cognition theories aim to explain how human visual systems process input visual signals accurately and quickly. By emulating the center-surround mechanism in human visual systems, we design multi-channel, multi-direction and multi-scale contrast features, and boost them to respond to the appearance of pedestrians. In this way, our detector can be regarded as a top-down saliency system. In the last part of this thesis, we exploit the temporal characteristics of moving pedestrians and employ motion information for feature design, as well as for selecting regions of interest (ROIs). Motion segmentation on optical flow fields enables us to select the blobs most likely to contain moving pedestrians; a combination of Histogram of Oriented Gradients (HOG) and motion self-difference features further enables robust detection. We test our three approaches on image and video data captured in urban traffic scenes, which are rather challenging due to dynamic and complex backgrounds. 
The achieved results demonstrate that our approaches reach and surpass state-of-the-art performance, and can also be employed for other applications, such as indoor robotics or public surveillance.
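
    As a rough illustration of the Histogram of Oriented Gradients (HOG) idea mentioned above, the sketch below computes one magnitude-weighted orientation histogram for a single image cell. It is a simplified, assumption-laden example: the function name `orientation_histogram`, the 9 unsigned bins, central differences, and the absence of block normalization or bin interpolation are choices made for brevity, not details taken from the thesis.

```python
import math


def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grey levels. Border pixels are skipped because
    central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against a rounding result of exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A vertical edge: all gradient energy falls into the first (0 degree) bin.
vertical_edge = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
print(orientation_histogram(vertical_edge))
```

    A full HOG descriptor would concatenate many such cell histograms after normalizing them over overlapping blocks.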

    Monokulare Blickrichtungsschätzung zur berührungslosen Mensch-Maschine-Interaktion

    This thesis deals with contactless human-machine interaction, interpreted here as interaction by recognizing the user's gaze direction using simple hardware. The research focuses on extracting, from 2D image data, the information needed to determine the gaze direction, namely the precise position of the irises and the three-dimensional position of the head, from which the gaze direction is determined.

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, the algorithm often needs to operate under challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles.

    The first problem addresses passenger detection and tracking for public transport buses, investigating changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model in a weighted Bayesian framework to detect passengers. To track multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using an SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions.

    In the second problem, we present a system for pedestrian detection in scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT homography is applied to cluster frames by structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination. This produces clusters of images that differ in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct a background model for each image cluster, which is in turn used to detect candidate foreground pixels. Finally, pedestrians are detected using a hierarchical template matching approach.

    In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) technique (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches are: a) a new histogram feature, formed by the weighted sum of the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook-based HOG feature with the branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook-based HOGB approach.

    In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations.

    The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
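
    The kernel density estimation (KDE) background model mentioned in this abstract can be illustrated with a minimal per-pixel sketch. The version below is an assumption-based simplification: it uses a single grey-level value per pixel and a Gaussian kernel, whereas the abstract refers to colour and gradient; the function names, the bandwidth of 10 and the threshold of 1e-3 are illustrative values, not taken from the thesis.

```python
import math


def kde_background_probability(value, samples, bandwidth=10.0):
    """Gaussian kernel density estimate of `value` given background samples.

    `samples` are past observations of the same pixel that are assumed to
    show background only.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples)
    return norm * total / len(samples)


def is_foreground(value, samples, bandwidth=10.0, threshold=1e-3):
    """Flag the pixel as candidate foreground when the density is low."""
    return kde_background_probability(value, samples, bandwidth) < threshold


history = [100, 102, 98, 101, 99]   # hypothetical background grey levels
print(is_foreground(100, history))  # value close to the history
print(is_foreground(200, history))  # value far from the history
```

    A value near the stored samples gets a high density and is kept as background, while a value far from all samples gets a density close to zero and is marked as a candidate foreground pixel.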

    CAMBADA@Home: deteção e seguimento de humanos

    Master's in Electronics and Telecommunications Engineering

    This work presents an approach to the people detection and tracking problem using an RGB-D camera. Solutions to this problem already exist, but some rely on background extraction or similar techniques, which require the camera to be stationary. With the proposed method, detection and tracking can be performed in real time while the camera is moving. The aim of this project is to implement a people detection and tracking system for the CAMBADA@Home service robot, enabling the development of further human-robot interaction applications. The system described here performs object detection, classification and multiple person tracking. In the first stage, regions of interest (ROIs) are segmented by analysing the depth image histogram and then applying a flood fill algorithm. In the next stage, each region is classified as human or non-human using a template matching technique based on the RPROP gradient descent algorithm, with support for multiple templates. The third and last stage tracks multiple persons using a unique identifier assignment method based on histogram comparison, together with pose and location estimation. The results obtained in unconstrained environments are encouraging, with high detection rates, and, in general, the algorithms for pose and location estimation perform as expected. Furthermore, the CAMBADA@Home project was awarded first place in the Free Bots Challenge, held during the national robotics championship, Robótica 2013, where the robot proved capable of performing autonomous tours in an unknown environment while detecting and tracking the people it came across.
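
    The flood fill step used for ROI segmentation can be sketched as follows. This is a minimal example under stated assumptions: it grows 4-connected regions on a small depth grid by joining neighbours whose depth differs by at most a tolerance, treats non-positive depth as invalid, and omits the depth histogram analysis that the abstract also mentions. The function name `flood_fill_regions` and the tolerance of 0.1 are hypothetical.

```python
from collections import deque


def flood_fill_regions(depth, tolerance=0.1):
    """Label 4-connected regions of similar depth.

    Returns (labels, count), where labels[y][x] is 0 for invalid pixels
    (depth <= 0) and a region number starting at 1 otherwise.
    """
    height, width = len(depth), len(depth[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for sy in range(height):
        for sx in range(width):
            if labels[sy][sx] or depth[sy][sx] <= 0:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and depth[ny][nx] > 0
                            and abs(depth[ny][nx] - depth[y][x]) <= tolerance):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy depth image in metres: a near object, a far object, one invalid pixel.
depth_image = [
    [1.0, 1.0, 3.0],
    [1.0, 0.0, 3.0],
    [1.05, 3.0, 3.0],
]
labels, count = flood_fill_regions(depth_image)
print(count)
print(labels)
```

    Each labelled region would then be a candidate ROI to pass on to a classification stage such as the template matching described in the abstract.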

    Detección jerárquica de grupos de personas

    The widespread deployment of video cameras in society makes it infeasible to monitor and analyse the huge amounts of video captured. For this reason, video analysis algorithms have become very important. Current people detection algorithms achieve optimal performance in controlled environments, but in crowded scenes, where people frequently occlude one another, existing algorithms do not perform acceptably. The main objective of this project is to develop a people detection algorithm whose distinguishing feature is hierarchical detection, with the aim of improving on existing algorithms in environments with a high density of people. The main idea is that detection should not rely solely on information about individual people, but should also use information from the detection of multiple people to improve the results in this type of scene. In addition, the algorithm uses information about the person's physique, which can be defined as a whole or by selecting only some of its parts, such as head, shoulder, trunk, etc. The proposed algorithm has been evaluated on reference video sequences, and the results show that people detection performance has improved as a result of the implemented enhancements.