38 research outputs found

    Suivi Multi-Locuteurs avec des Informations Audio-Visuelles pour la Perception des Robots

    Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information about its surroundings and enables the robot to give feedback. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, and what they are talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. We use different modalities of the robot’s perception system to achieve this goal. Like seeing and hearing for a human being, audio and visual information are the critical cues for a robot in a conversational scenario. The advances in computer vision and audio processing of the last decade have revolutionized robot perception abilities. In this thesis, we make the following contributions. We first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework gives closed-form, tractable solutions, which makes the tracking process efficient. The framework is first applied to visual multiple-person tracking. Birth and death processes are built jointly with the framework to deal with the varying number of people in the scene. Furthermore, we exploit the complementarity of vision and robot motor information. On the one hand, the robot’s active motion can be integrated into the visual tracking system to stabilize the tracking. On the other hand, visual information can be used to perform motor servoing. Moreover, audio and visual information are combined in the variational framework to estimate the smooth trajectories of speaking people and to infer the acoustic status of a person: speaking or silent. In addition, we apply the model to acoustic-only speaker localization and tracking. Online dereverberation techniques are applied first, followed by the tracking system.
Finally, a variant of the acoustic speaker tracking model based on the von Mises distribution is proposed, which is specifically adapted to directional data. All the proposed methods are validated on datasets specific to each application.
Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information about its environment, which allows the robot to react accordingly. In a conversational scenario, a group of people may talk in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, and what they are talking about. This thesis focuses on the first two questions, namely speaker tracking and diarization. We use different modalities of the robot’s perception system to meet this objective. As for humans, hearing and sight are essential for a robot in a conversational scenario. Advances in computer vision and audio processing over the last decade have revolutionized the perception capabilities of robots. In this thesis, we make the following contributions: we first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework provides closed-form solutions, making the tracking process very efficient. This approach is first applied to the visual tracking of multiple people. Birth and death processes are designed to fit the proposed probabilistic model in order to handle a varying number of people. In addition, we exploit the complementarity of vision and robot motor information: on the one hand, the robot’s active motion can be integrated into the visual tracking system to stabilize it; on the other hand, visual information can be used to perform motor servoing.
Subsequently, audio and visual information are combined in the variational model to smooth the trajectories and to infer the acoustic status of a person: speaking or silent. To test a scenario in which visual information is absent, we try the model for speaker localization and tracking based on acoustic information only. Dereverberation techniques are applied first, and their output is fed to the tracking system. Finally, a variant of the speaker tracking model based on the von Mises distribution is proposed, which is better suited to directional data. All the proposed methods are validated on datasets specific to each application.
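
As a hedged illustration of why a von Mises model suits directional data such as a speaker's direction of arrival, the sketch below evaluates the standard von Mises density on the circle. The function name and the parameter values are illustrative assumptions and are not taken from the thesis.

```python
import math
import numpy as np

def von_mises_pdf(theta: float, mu: float, kappa: float) -> float:
    """Von Mises density: exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa))."""
    # np.i0 is the modified Bessel function of the first kind, order 0.
    normaliser = 2.0 * math.pi * float(np.i0(kappa))
    return math.exp(kappa * math.cos(theta - mu)) / normaliser

# Illustrative values (assumptions): mean direction just below +pi, concentration 4.
mu, kappa = math.pi - 0.05, 4.0

# An angle just above -pi is only 0.1 rad away from mu on the circle,
# so it receives the same density as the angle mu + 0.1.
wrapped = von_mises_pdf(-math.pi + 0.05, mu, kappa)
nearby = von_mises_pdf(mu + 0.1, mu, kappa)
print(wrapped, nearby)
```

A Gaussian applied to the raw angle difference would treat these two directions as almost 2π apart, which is the kind of wrap-around error a circular distribution avoids.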

    Neural-Kalman Schemes for Non-Stationary Channel Tracking and Learning

    This Thesis focuses on channel tracking in Orthogonal Frequency-Division Multiplexing (OFDM), a widely used method of data transmission in wireless communications, when abrupt changes occur in the channel. In highly mobile applications, new dynamics appear that might make channel tracking non-stationary, e.g. channels might vary with location, and location rapidly varies with time. Simple examples might be the different channel dynamics a train receiver faces when it is close to a station vs. crossing a bridge vs. entering a tunnel, or a car receiver on a route that grows more traffic-dense. Some of these dynamics can be modelled as channel taps dying or being reborn, and so tap birth-death detection is of the essence. In order to improve the quality of communications, we delved into mathematical methods to detect such abrupt changes in the channel, such as the mathematical areas of Sequential Analysis/Abrupt Change Detection and Random Set Theory (RST), as well as the engineering advances in Neural Network schemes. This knowledge helped us find a solution to the problem of abrupt change detection by informing and inspiring the creation of low-complexity implementations for real-world channel tracking. In particular, two such novel trackers were created: the Simplified Maximum A Posteriori (SMAP) and the Neural-Network-switched Kalman Filtering (NNKF) schemes. The SMAP is a computationally inexpensive, threshold-based abrupt-change detector. It applies the following three heuristics for tap birth-death detection: a) detect death if the tap gain jumps to approximately zero (memoryless detection); b) detect death if the tap gain has slowly converged to approximately zero (memory detection); c) detect birth if the tap gain is far from zero. The precise parameters for these three simple rules can be approximated with simple theoretical derivations and then fine-tuned through extensive simulations.
The status detector for each tap, using only these three computationally inexpensive threshold comparisons, achieves an error reduction matching that of a close-to-perfect path death/birth detection, as shown in simulations. This estimator was shown to greatly reduce channel tracking error in the target Signal-to-Noise Ratio (SNR) range at a very small computational cost, thus outperforming previously known systems. The underlying RST framework for the SMAP was then extended to combined death/birth and SNR detection when the SNR is dynamic and may drift. We analyzed how different quasi-ideal SNR detectors affect the SMAP-enhanced Kalman tracker's performance. Simulations showed that the SMAP is robust to SNR drift, although it was also shown to benefit from accurate SNR detection. The core idea behind the second novel tracker, the NNKF, is similar to the SMAP, but now the tap birth/death detection is performed via an artificial neural network (NN). Simulations show that the proposed NNKF estimator provides extremely good performance, practically identical to a detector with 100% accuracy. These proposed Neural-Kalman schemes can work as novel trackers for multipath channels, since they are robust to wide variations in the probabilities of tap birth and death. Such robustness suggests a single, low-complexity NNKF could be reusable over different tap indices and communication environments. Furthermore, a different kind of abrupt change was proposed and analyzed: energy shifts from one channel tap to adjacent taps (partial tap lateral hops). This Thesis also discusses how to model, detect and track such changes, providing a geometric justification for this and additional non-stationary dynamics in vehicular situations, such as road scenarios where reflections on trucks and vans are involved, or the visual appearance/disappearance of drone swarms.
An extensive literature review of empirically-backed abrupt-change dynamics in channel modelling/measuring campaigns is included. For this generalized framework of abrupt channel changes that includes partial tap lateral hopping, a neural detector for lateral hops with large energy transfers is introduced. Simulation results suggest the proposed NN architecture might be a feasible lateral hop detector, suitable for integration in NNKF schemes. Finally, the newly found understanding of abrupt changes and the interactions between Kalman filters and neural networks is leveraged to analyze the neural consequences of abrupt changes and briefly sketch a novel, abrupt-change-derived stochastic model for neural intelligence, extract some neurofinancial consequences of unstereotyped abrupt dynamics, and propose a new portfolio-building mechanism in finance: Highly Leveraged Abrupt Bets Against Failing Experts (HLABAFEOs). Some communication-engineering-relevant topics, such as a Bayesian stochastic stereotyper for hopping Linear Gauss-Markov (LGM) models, are discussed in the process. The forecasting problem in the presence of expert disagreements is illustrated with a hopping LGM model and a novel structure for a Bayesian stereotyper is introduced that might eventually solve such problems through bio-inspired, neuroscientifically-backed mechanisms, like dreaming and surprise (biological Neural-Kalman). A generalized framework for abrupt changes and expert disagreements was introduced with the novel concept of Neural-Kalman Phenomena. This Thesis suggests mathematical (Neural-Kalman Problem Category Conjecture), neuro-evolutionary and social reasons why Neural-Kalman Phenomena might exist and found significant evidence for their existence in the areas of neuroscience and finance.
Apart from providing specific examples, practical guidelines and historical (out)performance for some HLABAFEO investing portfolios, this multidisciplinary research suggests that a Neural-Kalman architecture for ever-granular stereotyping, providing a practical solution for continual learning in the presence of unstereotyped abrupt dynamics, would be extremely useful in communications and other continual learning tasks.
Doctoral Programme in Multimedia and Communications of the Universidad Carlos III de Madrid and the Universidad Rey Juan Carlos. President: Luis Castedo Ribas. Secretary: Ana García Armada. Committee member: José Antonio Portilla Figuera.
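
The three SMAP threshold rules listed in the abstract above can be sketched as a small per-tap state machine. This is only one plausible reading of rules (a) to (c): the class name, the threshold values and the window length are hypothetical and are not taken from the thesis, which derives and fine-tunes its own parameters.

```python
class TapStatusDetector:
    """Threshold-based alive/dead status for one channel tap (illustrative only)."""

    def __init__(self, jump_thr=0.05, slow_thr=0.15, slow_window=5, birth_thr=0.30):
        # All four parameters are hypothetical placeholders.
        self.jump_thr = jump_thr        # rule (a): abrupt drop to ~0
        self.slow_thr = slow_thr        # rule (b): looser "near zero" bound
        self.slow_window = slow_window  # rule (b): consecutive samples required
        self.birth_thr = birth_thr      # rule (c): gain clearly far from zero
        self.alive = True
        self._near_zero_run = 0

    def update(self, gain_estimate: complex) -> bool:
        """Consume one tap-gain estimate and return the updated alive flag."""
        magnitude = abs(gain_estimate)
        if self.alive:
            # Memory for rule (b): count consecutive near-zero samples.
            if magnitude < self.slow_thr:
                self._near_zero_run += 1
            else:
                self._near_zero_run = 0
            if magnitude < self.jump_thr:
                self.alive = False      # rule (a): memoryless death detection
            elif self._near_zero_run >= self.slow_window:
                self.alive = False      # rule (b): death after slow convergence
        elif magnitude > self.birth_thr:
            self.alive = True           # rule (c): birth detection
        if not self.alive:
            self._near_zero_run = 0
        return self.alive


detector = TapStatusDetector()
gains = [1.0, 0.01, 0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
statuses = [detector.update(g) for g in gains]
print(statuses)
```

In a tracker of the kind described, such a flag would typically decide whether the Kalman filter for that tap keeps updating its gain or treats the tap as absent; that coupling is not shown here.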

    Bayesian multi-target tracking: application to total internal reflection fluorescence microscopy

    This thesis focuses on the problem of automated tracking of tiny cellular and sub-cellular structures, known as particles, in sequences acquired with the total internal reflection fluorescence microscopy (TIRFM) imaging technique. Our primary biological motivation is to develop an automated system for tracking the sub-cellular structures involved in exocytosis (an intracellular mechanism), which is helpful for studying the possible causes of the defects in diseases such as diabetes and obesity. However, all methods proposed in this thesis are generalized to be applicable to a wide range of particle tracking applications. A reliable multi-particle tracking method should be capable of tracking numerous similar objects in the presence of high levels of noise, high target density, and complex motions and interactions. In this thesis, we choose the Bayesian filtering framework as our main approach to deal with this problem. We focus on approaches that work on detections. Therefore, we first propose a method that robustly detects the particles in noisy TIRFM sequences with an inhomogeneous and time-varying background. In order to evaluate our detection and tracking methods on sequences with known and reliable ground truth, we also present a framework for generating realistic synthetic TIRFM data. To obtain a reliable multi-particle tracking method for TIRFM sequences, we suggest a framework that combines two robust Bayesian filters, the interacting multiple model and joint probabilistic data association (IMM-JPDA) filters. The performance of our particle tracking method is compared against that of several popular and state-of-the-art particle tracking approaches on both synthetic and real sequences. Although our approach performs well in tracking particles, it can be very computationally demanding for applications with dense targets and poor detections.
To obtain a computationally cheap, but reliable, multi-particle tracking method, we investigate the performance of a recent multi-target Bayesian filter based on random finite set theory, the probability hypothesis density (PHD) filter, on our application. To this end, we propose a general framework for tracking particles using this filter. Moreover, we assess the performance of our proposed PHD filter on both synthetic and real sequences with high levels of noise and particle density. We compare its results in terms of both accuracy and processing time against our IMM-JPDA filter. Finally, we suggest a framework for tracking particles in a challenging problem where the noise characteristics and the background intensity of the sequences change during the acquisition process, which makes the detection profile and clutter rate time-variant. To deal with this, we propose a bootstrap filter using another type of random finite set based Bayesian filter, the cardinalized PHD (CPHD) filter, composed of an estimator and a tracker. The estimator adaptively estimates the meta-parameters required by the tracker, such as the clutter rate and the detection probability, while the tracker estimates the state of the targets. We evaluate the performance of our bootstrap filter on both synthetic and real sequences under these time-varying conditions. Moreover, its performance is compared against that of our other particle trackers as well as state-of-the-art particle tracking approaches.

    Tracking and Fusion Methods for Extended Targets Parameterized by Center, Orientation, and Semi-axes

    Improvements in sensor technology, e.g., the development of automotive Radio Detection and Ranging (RADAR) or Light Detection and Ranging (LIDAR) sensors, which provide a more detailed view of the sensor’s environment, have introduced new opportunities but also new challenges to target tracking. In classic target tracking, targets are assumed to be points. However, this assumption is no longer valid if targets occupy more than one sensor resolution cell, creating the need for extended targets, which model the shape in addition to the kinematic parameters. Different shape models are possible, and this thesis focuses on an elliptical shape, parameterized with center, orientation, and semi-axes lengths. This parameterization can be used to model rectangles as well. Furthermore, this thesis is concerned with multi-sensor fusion for extended targets, which can be used to improve the target tracking by providing information gathered from different sensors or perspectives. We also consider estimation of extended targets, i.e., to account for uncertainties, the target is modeled by a probability density, so we need to find a so-called point estimate. Extended target tracking poses a variety of challenges due to the spatial extent, which need to be handled even for basic shapes like ellipses and rectangles. Among these challenges is the choice of the target model, e.g., how the measurements are distributed across the shape. Additional challenges arise for sensor fusion, as it is unclear how to best consider the geometric properties when combining two extended targets. Finally, the extent needs to be involved in the estimation. Traditional methods often use simple uniform distributions across the shape, which do not properly portray reality, while more complex methods require the use of optimization techniques or large amounts of data. In addition, for traditional estimation, metrics such as the Euclidean distance between state vectors are used.
However, they might no longer be valid because they do not consider the geometric properties of the targets’ shapes, e.g., rotating an ellipse by 180 degrees results in the same ellipse, but the Euclidean distance between the two state vectors is not 0. In multi-sensor fusion, the same holds, i.e., simply combining the corresponding elements of the state vectors can lead to counter-intuitive fusion results. In this work, we compare different elliptic trackers and discuss more complex measurement distributions across the shape’s surface or contour. Furthermore, we discuss the problems which can occur when fusing extended target estimates from different sensors and how to handle them by providing a transformation into a special density. We then proceed to discuss how a different metric, namely the Gaussian Wasserstein (GW) distance, can be used to improve target estimation. We define an estimator and propose an approximation based on an extension of the square root distance. It can be applied to the posterior densities of the aforementioned trackers to incorporate the unique properties of ellipses in the estimation process. We also discuss how this can be applied to rectangular targets. Finally, we evaluate and discuss our approaches. We show the benefits of more complex target models in simulations and on real data, and we demonstrate our estimation and fusion approaches compared to classic methods on simulated data. 2022-01-2
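
The 180-degree ambiguity mentioned in the abstract above can be checked numerically. The sketch below assumes that an ellipse with orientation θ and semi-axes a and b is represented by the shape matrix R(θ) diag(a², b²) R(θ)ᵀ, and compares the Euclidean distance between two state vectors with the Gaussian Wasserstein distance between the corresponding Gaussians. The function names and numeric values are illustrative and do not come from the thesis.

```python
import numpy as np

def shape_matrix(theta, a, b):
    """Symmetric positive-definite matrix of an ellipse (assumed representation)."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return rot @ np.diag([a**2, b**2]) @ rot.T

def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix via eigendecomposition."""
    mat = 0.5 * (mat + mat.T)  # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gaussian_wasserstein(m1, s1, m2, s2):
    """GW distance: sqrt(|m1-m2|^2 + tr(s1 + s2 - 2 (s1^1/2 s2 s1^1/2)^1/2))."""
    root = sqrtm_psd(s1)
    cross = sqrtm_psd(root @ s2 @ root)
    d2 = np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross)
    return float(np.sqrt(max(d2, 0.0)))

# State vectors [x, y, orientation, semi-axis a, semi-axis b]; values are made up.
x1 = np.array([0.0, 0.0, 0.3, 2.0, 1.0])
x2 = np.array([0.0, 0.0, 0.3 + np.pi, 2.0, 1.0])  # same ellipse, rotated by 180 degrees

euclidean = float(np.linalg.norm(x1 - x2))
gw = gaussian_wasserstein(x1[:2], shape_matrix(*x1[2:]), x2[:2], shape_matrix(*x2[2:]))
print(euclidean, gw)
```

The Euclidean distance between the two state vectors equals π even though both describe the same ellipse, whereas the GW distance is zero up to round-off because it only sees the center and the shape matrix.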

    Estimation and control of multi-object systems with high-fidelity sensor models: A labelled random finite set approach

    Principled and novel multi-object tracking algorithms are proposed that can optimally process realistic sensor data by accommodating complex observational phenomena such as merged measurements and extended targets. Additionally, a sensor control scheme based on a tractable, information-theoretic objective is proposed, the goal of which is to optimise tracking performance in multi-object scenarios. The concept of labelled random finite sets is adopted in the development of these new techniques.

    Aircraft state estimation using cameras and passive radar

    Multiple target tracking (MTT) is a fundamental task in many application domains. It is a difficult problem to solve in general, so applications make use of domain-specific and problem-specific knowledge to approach the problem by solving subtasks separately. This work puts forward an MTT framework (MTTF) which is based on the Bayesian recursive estimator (BRE). The MTTF extends a particle filter (PF) to handle multiple targets and adds a probabilistic graphical model (PGM) data association stage to compute the mapping from detections to trackers. The MTTF was applied to the problem of passively monitoring airspace. Two applications were built: a passive radar MTT module and a comprehensive visual object tracking (VOT) system. Both applications require a solution to the MTT problem, for which the MTTF was utilized. The VOT system performed well on real data recorded at the University of Cape Town (UCT) as part of this investigation. The system was able to detect and track aircraft flying within the region of interest (ROI). The VOT system consisted of a single camera, an image processing module, the MTTF module and an evaluation module. The world coordinate frame target localization was within ±3.2 km and these results are presented on Google Earth. The image plane target localization has an average reprojection error of ±17.3 pixels. The VOT system achieved an average area under the curve value of 0.77 for all receiver operating characteristic curves. These performance figures are typical over the roughly 1 hr of video recordings taken from the UCT site. The passive radar application was tested on simulated data. The MTTF module was designed to connect to an existing passive radar system developed by Peralex Electronics Pty Ltd. The MTTF module estimated the number of targets in the scene and localized them within a 2D local world Cartesian coordinate system.
The investigations encompass numerous areas of research as well as practical aspects of software engineering and systems design.