13 research outputs found

    Enhanced particle PHD filtering for multiple human tracking

    PhD Thesis. Video-based single human tracking has found wide application, but multiple human tracking is more challenging, and enhanced processing techniques are required to estimate the positions and number of targets in each frame. In this thesis, the particle probability hypothesis density (PHD) filter is therefore the focus due to its ability to estimate both localization and cardinality information related to multiple human targets. To improve the tracking performance of the particle PHD filter, a number of enhancements are proposed. The Student's-t distribution is employed within the state and measurement models of the PHD filter to replace the Gaussian distribution because of its heavier tails, and thereby better predict particles with larger amplitudes. Moreover, the variational Bayesian approach is utilized to estimate the relationship between the measurement noise covariance matrix and the state model, and a joint multi-dimensional Student's-t distribution is exploited. In order to obtain more observable measurements, a backward retrodiction step is employed to increase the measurement set, building upon the concept of a smoothing algorithm. As a further improvement, an adaptive step is used to combine the forward filtering and backward retrodiction filtering operations through the similarities of measurements achieved over discrete time. As such, the errors in the delayed measurements generated by false alarms and environment noise are avoided. In the final work, information describing human behaviour, captured in a social force model, is employed to aid particle sampling in the prediction step of the particle PHD filter. A novel social force model is proposed based on the exponential function. Furthermore, a Markov chain Monte Carlo (MCMC) step is utilized to resample the predicted particles, and the acceptance ratio is calculated from the results of the social force model to achieve more robust prediction. Then, a one-class support vector machine (OCSVM), trained on human features, is applied in the measurement model of the PHD filter to mitigate noise from the environment and to achieve better tracking performance. The proposed improvements of the particle PHD filters are evaluated with benchmark datasets such as the CAVIAR, PETS2009 and TUD datasets, assessed with quantitative and global evaluation measures, and compared with state-of-the-art techniques to confirm the improvement of multiple human tracking performance.
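
The abstract above motivates the Student's-t distribution by its heavier tails. As a rough illustration only (not the thesis's filter), the sketch below compares how often standard Gaussian samples and Student's-t samples exceed a magnitude of 3. The degrees of freedom (`nu=3`), the threshold, the sample count, and the helper names are all assumptions chosen for the example.

```python
import math
import random


def sample_student_t(rng, nu):
    """Draw one Student's-t sample with nu degrees of freedom.

    Uses the standard construction t = z / sqrt(v / nu), where z is standard
    normal and v is chi-square distributed with nu degrees of freedom.
    """
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) is Gamma(nu/2, scale=2)
    return z / math.sqrt(v / nu)


def tail_fraction(samples, threshold):
    """Fraction of samples whose magnitude exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000               # illustrative sample count (assumption)

gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
student = [sample_student_t(rng, nu=3.0) for _ in range(n)]

gaussian_tail = tail_fraction(gaussian, 3.0)
student_tail = tail_fraction(student, 3.0)
print(f"|x| > 3: Gaussian {gaussian_tail:.4f}, Student's-t (nu=3) {student_tail:.4f}")
```

In a particle filter, drawing process noise from the heavier-tailed distribution places more particles far from the predicted mean, which is the property the abstract refers to when it mentions particles with larger amplitudes.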

    Multi-Object Tracking with Interacting Vehicles and Road Map Information

    In many applications, tracking multiple objects is crucial for perceiving the current environment. Most present multi-object tracking algorithms assume that objects move independently of other dynamic objects and of the static environment. In many traffic situations, however, objects interact with each other and are restricted to drivable areas, so the assumption of independent object motion does not hold. This paper proposes an approach that adapts a multi-object tracking system to model both the interaction between vehicles and the current road geometry. To this end, the prediction step of a Labeled Multi-Bernoulli filter is extended to model interaction between objects using the Intelligent Driver Model. Furthermore, to take road map information into account, an approximation of a highly precise road map is used. The results show that in scenarios where the assumption of a standard motion model is violated, the tracking system adapted with the proposed method achieves higher accuracy and robustness in its track estimates.
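
The Intelligent Driver Model named in the abstract above has a well-known closed-form acceleration law. The sketch below implements that standard formula and a simple Euler prediction step for a following vehicle. It is not the paper's Labeled Multi-Bernoulli prediction; the parameter defaults, the `predict_follower` helper, and the time step are illustrative assumptions.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Standard Intelligent Driver Model acceleration.

    v     : speed of the following vehicle (m/s)
    gap   : bumper-to-bumper distance to the leader (m), must be > 0
    dv    : approach rate, v - v_leader (m/s), positive when closing in
    v0, T, a_max, b, s0, delta : desired speed, time headway, maximum
        acceleration, comfortable deceleration, minimum gap, and exponent
        (default values here are assumptions for illustration)
    """
    # Desired dynamic gap; the interaction term is clipped at zero
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def predict_follower(position, v, leader_position, leader_v, dt=0.1):
    """One Euler step for a follower whose motion depends on its leader."""
    a = idm_acceleration(v, leader_position - position, v - leader_v)
    v_next = max(0.0, v + a * dt)  # vehicles do not reverse in this sketch
    return position + v * dt, v_next


# A follower closing in on a slower leader brakes instead of moving independently.
print(predict_follower(position=0.0, v=20.0, leader_position=10.0, leader_v=10.0))
```

The point of the example is the dependency in `predict_follower`: the predicted state of one object uses the state of another, which is exactly what an independent motion model cannot express.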

    Novel data association methods for online multiple human tracking

    PhD Thesis. Video-based multiple human tracking has played a crucial role in many applications such as intelligent video surveillance, human behavior analysis, and health-care systems. The detection-based tracking framework has become the dominant paradigm in this research field, and the major task is to accurately perform the data association between detections across the frames. However, online multiple human tracking, which relies only on the detections given up to the present time for the data association, becomes more challenging with noisy detections, missed detections, and occlusions. To address these problems, three novel data association methods for online multiple human tracking are presented in this thesis: online group-structured dictionary learning, enhanced detection reliability, and multi-level cooperative fusion. The first proposed method aims to address noisy detections and occlusions. In this method, sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering is the core element for accomplishing the tracking task, where the measurements are produced by the detection-based tracking framework. To enhance the measurement model, a novel adaptive gating strategy is developed to aid the classification of measurements. In addition, online group-structured dictionary learning with a maximum voting method is proposed to robustly estimate the target birth intensity. It enables new-born targets in the tracking process to be accurately initialized from noisy sensor measurements. To improve the adaptability of the group-structured dictionary to target appearance changes, the simultaneous codeword optimization (SimCO) algorithm is employed for the dictionary update. The second proposed method relates to accurate measurement selection, which further refines the noisy detections prior to the tracking pipeline. In order to achieve more reliable measurements in the Gaussian mixture (GM)-PHD filtering process, a global-to-local enhanced confidence rescoring strategy is proposed by exploiting the classification power of a mask region-convolutional neural network (R-CNN). Then, an improved pruning algorithm, namely soft-aggregated non-maximal suppression (Soft-ANMS), is devised to further enhance the selection step. In addition, to avoid the misuse of ambiguous measurements in the tracking process, person re-identification (ReID) features driven by convolutional neural networks (CNNs) are integrated to model the target appearances. The third proposed method focuses on addressing the issues of missed detections and occlusions. This method integrates two human detectors with different characteristics (full-body and body-parts) in the GM-PHD filter, and investigates their complementary benefits for tracking multiple targets. For each detector domain, a novel discriminative correlation matching (DCM) model for integration in the feature-level fusion is proposed, and together with spatio-temporal information is used to reduce the ambiguous identity associations in the GM-PHD filter. Moreover, a robust fusion center is proposed within the decision-level fusion to mitigate the sensitivity to missed detections in the fusion process, thereby improving the fusion performance and tracking consistency. The effectiveness of these proposed methods is investigated using the MOTChallenge benchmark, which is a framework for the standardized evaluation of multiple object tracking methods. Detailed evaluations on challenging video datasets, as well as comparisons with recent state-of-the-art techniques, confirm the improved multiple human tracking performance.
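
The abstract above mentions a gating strategy for classifying measurements. The thesis's adaptive gate is not described here, so the sketch below shows only the conventional fixed elliptical (Mahalanobis) gate as background: a measurement is kept when its squared Mahalanobis distance to the predicted measurement falls below a chi-square threshold. The 99% gate probability, the 2-D measurement space, and all function names are assumptions.

```python
import math


def mahalanobis_sq(z, z_pred, S):
    """Squared Mahalanobis distance for 2-D points; S is a 2x2 covariance."""
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def gate(measurements, z_pred, S, gate_probability=0.99):
    """Keep the measurements that fall inside the elliptical gate."""
    # For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
    # so the threshold has a closed form.
    threshold = -2.0 * math.log(1.0 - gate_probability)
    return [z for z in measurements if mahalanobis_sq(z, z_pred, S) <= threshold]


# Hypothetical predicted measurement, innovation covariance, and detections
z_pred = (0.0, 0.0)
S = [[4.0, 0.0], [0.0, 4.0]]
detections = [(1.0, 1.0), (10.0, 0.0), (6.0, 0.0)]
print(gate(detections, z_pred, S))
```

An adaptive variant would change the threshold or covariance over time; how the thesis does this is not stated in the abstract, so it is left out.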

    Context-aware home monitoring system for Parkinson's disease patients : ambient and wearable sensing for freezing of gait detection

    Thesis carried out under joint supervision (cotutelle): Universitat Politècnica de Catalunya and Technische Universiteit Eindhoven. This PhD Thesis has been developed in the framework of, and according to, the rules of the Erasmus Mundus Joint Doctorate on Interactive and Cognitive Environments EMJD ICE [FPA no. 2010-0012]. Freezing of gait (FOG) is a symptom of Parkinson's disease (PD). It is characterized by brief episodes of inability to step, or by extremely short steps that typically occur on gait initiation or on turning while walking. The consequences of FOG are impaired mobility and a higher risk of falls, which have a direct effect on the quality of life of the individual. No completely effective pharmacological treatment for FOG exists. However, external stimuli, such as lines on the floor or rhythmic sounds, can focus the attention of a person who experiences a FOG episode and help her initiate gait. The optimal effectiveness of this approach, known as cueing, is achieved through timely activation of a cueing device upon the accurate detection of a FOG episode. Therefore, robust and accurate FOG detection is the main problem that needs to be solved when developing a suitable assistive technology solution for this specific user group. This thesis proposes the use of the activity and spatial context of a person as the means to improve the detection of FOG episodes during monitoring at home. The thesis describes the design, algorithm implementation and evaluation of a distributed home system for FOG detection based on multiple cameras and a single inertial gait sensor worn at the waist of the patient. Through detailed observation of collected home data of 17 PD patients, we realized that a novel solution for FOG detection could be achieved by using contextual information about the patient's position, orientation, basic posture and movement on a semantically annotated two-dimensional (2D) map of the indoor environment. We envisioned the future context-aware system as a network of Microsoft Kinect cameras placed in the patient's home that interacts with a wearable inertial sensor on the patient (a smartphone). Since the hardware platform of the system consists of commercial off-the-shelf hardware, the majority of the system development effort involved the production of software modules (for position tracking, orientation tracking, and activity recognition) that run on top of the middleware operating system in the home gateway server. The main component of the system that had to be developed is the Kinect application for tracking the position and height of multiple people, based on input in the form of 3D point cloud data. Besides position tracking, this software module also provides mapping and semantic annotation of FOG-specific zones on the scene in front of the Kinect. One instance of the vision tracking application is supposed to run for every Kinect sensor in the system, yielding a potentially high number of simultaneous tracks. At any moment, the system has to track one specific person, the patient. To enable tracking of the patient between different non-overlapping cameras in the distributed system, a new re-identification approach based on appearance model learning with a one-class Support Vector Machine (SVM) was developed. Evaluation of the re-identification method was conducted on a 16-person dataset in a laboratory environment. Since the patient's orientation in the indoor space was recognized as an important part of the context, the system needed the ability to estimate the orientation of the person, expressed in the frame of the 2D scene on which the patient is tracked by the camera. We devised a method to fuse position tracking information from the vision system and inertial data from the smartphone in order to obtain the patient's 2D pose estimate on the scene map. Additionally, a method for estimating the position of the smartphone on the waist of the patient was proposed. Position and orientation estimation accuracy were evaluated on a 12-person dataset. Finally, with positional, orientation and height information available, a new seven-class activity classification was realized using a hierarchical classifier that combines a height-based posture classifier with translational and rotational SVM movement classifiers. Each of the SVM movement classifiers and the joint hierarchical classifier were evaluated in a laboratory experiment with 8 healthy persons. The final context-based FOG detection algorithm uses activity information and spatial context information to confirm or disprove FOG detected by the current state-of-the-art FOG detection algorithm (which uses only wearable sensor data). A dataset with home data of 3 PD patients was produced using two Kinect cameras and a smartphone in synchronized recording. The new context-based FOG detection algorithm and the wearable-only FOG detection algorithm were both evaluated on the home dataset and their results were compared. The context-based algorithm substantially reduces false positive detections, which is reflected in higher specificity. In some cases, the context-based algorithm also eliminates true positive detections, reducing sensitivity to a lesser extent. The final comparison of the two algorithms on the basis of their sensitivity and specificity shows the improvement in overall FOG detection achieved with the new context-aware home system. Postprint (published version).
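
The abstract above compares a wearable-only detector with a context-based detector in terms of sensitivity and specificity. The sketch below shows how those two measures are computed and how a context check that vetoes detections can raise specificity while lowering sensitivity. The activity labels, the `PLAUSIBLE` set, the veto rule, and the eight synthetic windows are all invented for illustration; they are neither the thesis's classes nor its results.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for parallel lists of booleans."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    tn = sum(not t and not p for t, p in zip(truth, predicted))
    fp = sum(not t and p for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical activities during which a FOG episode is considered plausible
PLAUSIBLE = {"walking", "turning", "standing"}


def context_confirms(wearable_detects_fog, activity):
    """Keep a wearable detection only when the activity context supports it."""
    return wearable_detects_fog and activity in PLAUSIBLE


# Synthetic windows: (ground truth FOG, wearable detection, recognized activity)
windows = [
    (True, True, "walking"),
    (True, True, "turning"),
    (True, True, "sitting"),   # activity misrecognized, so a true FOG is vetoed
    (False, True, "sitting"),  # wearable false alarm removed by context
    (False, True, "lying"),    # wearable false alarm removed by context
    (False, False, "walking"),
    (False, False, "sitting"),
    (False, False, "standing"),
]

truth = [w[0] for w in windows]
wearable_only = [w[1] for w in windows]
context_based = [context_confirms(w[1], w[2]) for w in windows]

wearable_metrics = sensitivity_specificity(truth, wearable_only)
context_metrics = sensitivity_specificity(truth, context_based)
print("wearable-only:", wearable_metrics)
print("context-based:", context_metrics)
```

Because the context check can only remove detections, it can never add true positives; any gain in specificity therefore comes with a risk of lost sensitivity, which matches the trade-off the abstract describes.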

    A Review on Human Activity Recognition Using Vision-Based Method


    Novel methods for posture-based human action recognition and activity anomaly detection

    PhD Thesis. Artificial Intelligence (AI) for Human Action Recognition (HAR) and Human Activity Anomaly Detection (HAAD) is an active and exciting research field. Video-based HAR aims to classify human actions and video-based HAAD aims to detect abnormal human activities within data. However, a human is an extremely complex subject and a non-rigid object in the video, which poses great challenges for Computer Vision and Signal Processing. Relevant application fields are surveillance and public monitoring, assisted living, robotics, human-to-robot interaction, prosthetics, gaming, video captioning, and sports analysis. The focus of this thesis is on posture-related HAR and HAAD. The aim is to design computationally efficient, machine and deep learning-based HAR and HAAD methods which can run in multiple-human monitoring scenarios. This thesis firstly contributes two novel 3D Histogram of Oriented Gradient (3D-HOG) driven frameworks for silhouette-based HAR. The limitations of the 3D-HOG state of the art, e.g. unweighted processing of local body areas and unstable performance over different training rounds, are addressed. The proposed methods achieve more accurate results than the baseline, outperforming the state-of-the-art. Experiments are conducted on publicly available datasets, alongside newly recorded data. This thesis also contributes a new algorithm for human pose-based HAR. In particular, the proposed human pose-based HAR was among the first few simultaneous attempts conducted at the time. The proposed HAR algorithm, named ActionXPose, is based on Convolutional Neural Networks and Long Short-Term Memory. It turns out to be more reliable and computationally advantageous when compared to human silhouette-based approaches. ActionXPose's flexibility also allows cross-dataset processing and more robustness to occlusion scenarios. Extensive evaluation on publicly available datasets demonstrates the efficacy of ActionXPose over the state-of-the-art. Moreover, newly recorded data, i.e. the Intelligent Sensing Lab Dataset (ISLD), is also contributed and exploited to further test ActionXPose in real-world, non-cooperative scenarios. The last set of contributions in this thesis concerns pose-driven, combined HAR and HAAD algorithms. Motivated by the ActionXPose achievements, this thesis contributes a new algorithm to simultaneously extract deep-learning-based features from human poses, RGB Regions of Interest (ROIs) and detected object positions. The proposed method outperforms the state-of-the-art in both HAR and HAAD. The HAR performance is extensively tested on publicly available datasets, including the contributed ISLD dataset. Moreover, to compensate for the lack of data in the field, this thesis also contributes three new datasets for human-posture and object-position related HAAD, i.e. the BMbD, M-BMdD and JBMOPbD datasets.

    Advances in knowledge discovery and data mining Part II

    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II