784 research outputs found

    TennisSense: a platform for extracting semantic information from multi-camera tennis data

    Get PDF
    In this paper, we introduce TennisSense, a technology platform for the digital capture, analysis and retrieval of tennis training and matches. Our algorithms for extracting useful metadata from the overhead court camera are described and evaluated. We track the tennis ball using motion images for ball candidate detection and then link ball candidates into locally linear tracks. From these tracks we can infer when serves and rallies take place. Using background subtraction and hysteresis-type blob tracking, we track the tennis players positions. The performance of both modules is evaluated using ground-truthed data. The extracted metadata provides valuable information for indexing and efficient browsing of hours of multi-camera tennis footage and we briefly illustrative how this data is used by our tennis-coach playback interface

    Attention estimation by simultaneous analysis of viewer and view

    Get PDF
    Abstract — This paper introduces a system for estimating the attention of a driver wearing a first person view camera using salient objects to improve gaze estimation. A challenging data set of pedestrians crossing intersections has been captured using Google Glass worn by a driver. A challenge unique to first person view from cars is that the interior of the car can take up a large part of the image. The proposed system automatically filters out the dashboard of the car, along with other parts of the instrumentation. The remaining area is used as a region of interest for a pedestrian detector. Two cameras looking at the driver are used to determine the direction of the driver’s gaze, by examining the eye corners and the center of the iris. This coarse gaze estimation is then linked to the detected pedestrians to determine which pedestrian the driver is focused on at any given time. I

    Vision-based toddler tracking at home

    Get PDF
    This paper presents a vision-based toddler tracking system for detecting risk factors of a toddler's fall within the home environment. The risk factors have environmental and behavioral aspects and the research in this paper focuses on the behavioral aspects. Apart from common image processing tasks such as background subtraction, the vision-based toddler tracking involves human classification, acquisition of motion and position information, and handling of regional merges and splits. The human classification is based on dynamic motion vectors of the human body. The center of mass of each contour is detected and connected with the closest center of mass in the next frame to obtain position, speed, and directional information. This tracking system is further enhanced by dealing with regional merges and splits due to multiple object occlusions. In order to identify the merges and splits, two directional detections of closest region centers are conducted between every two successive frames. Merges and splits of a single object due to errors in the background subtraction are also handled. The tracking algorithms have been developed, implemented and tested

    Sensing, interpreting, and anticipating human social behaviour in the real world

    Get PDF
    Low-level nonverbal social signals like glances, utterances, facial expressions and body language are central to human communicative situations and have been shown to be connected to important high-level constructs, such as emotions, turn-taking, rapport, or leadership. A prerequisite for the creation of social machines that are able to support humans in e.g. education, psychotherapy, or human resources is the ability to automatically sense, interpret, and anticipate human nonverbal behaviour. While promising results have been shown in controlled settings, automatically analysing unconstrained situations, e.g. in daily-life settings, remains challenging. Furthermore, anticipation of nonverbal behaviour in social situations is still largely unexplored. The goal of this thesis is to move closer to the vision of social machines in the real world. It makes fundamental contributions along the three dimensions of sensing, interpreting and anticipating nonverbal behaviour in social interactions. First, robust recognition of low-level nonverbal behaviour lays the groundwork for all further analysis steps. Advancing human visual behaviour sensing is especially relevant as the current state of the art is still not satisfactory in many daily-life situations. While many social interactions take place in groups, current methods for unsupervised eye contact detection can only handle dyadic interactions. We propose a novel unsupervised method for multi-person eye contact detection by exploiting the connection between gaze and speaking turns. Furthermore, we make use of mobile device engagement to address the problem of calibration drift that occurs in daily-life usage of mobile eye trackers. Second, we improve the interpretation of social signals in terms of higher level social behaviours. In particular, we propose the first dataset and method for emotion recognition from bodily expressions of freely moving, unaugmented dyads. Furthermore, we are the first to study low rapport detection in group interactions, as well as investigating a cross-dataset evaluation setting for the emergent leadership detection task. Third, human visual behaviour is special because it functions as a social signal and also determines what a person is seeing at a given moment in time. Being able to anticipate human gaze opens up the possibility for machines to more seamlessly share attention with humans, or to intervene in a timely manner if humans are about to overlook important aspects of the environment. We are the first to propose methods for the anticipation of eye contact in dyadic conversations, as well as in the context of mobile device interactions during daily life, thereby paving the way for interfaces that are able to proactively intervene and support interacting humans.Blick, Gesichtsausdrücke, Körpersprache, oder Prosodie spielen als nonverbale Signale eine zentrale Rolle in menschlicher Kommunikation. Sie wurden durch vielzählige Studien mit wichtigen Konzepten wie Emotionen, Sprecherwechsel, Führung, oder der Qualität des Verhältnisses zwischen zwei Personen in Verbindung gebracht. Damit Menschen effektiv während ihres täglichen sozialen Lebens von Maschinen unterstützt werden können, sind automatische Methoden zur Erkennung, Interpretation, und Antizipation von nonverbalem Verhalten notwendig. Obwohl die bisherige Forschung in kontrollierten Studien zu ermutigenden Ergebnissen gekommen ist, bleibt die automatische Analyse nonverbalen Verhaltens in weniger kontrollierten Situationen eine Herausforderung. Darüber hinaus existieren kaum Untersuchungen zur Antizipation von nonverbalem Verhalten in sozialen Situationen. Das Ziel dieser Arbeit ist, die Vision vom automatischen Verstehen sozialer Situationen ein Stück weit mehr Realität werden zu lassen. Diese Arbeit liefert wichtige Beiträge zur autmatischen Erkennung menschlichen Blickverhaltens in alltäglichen Situationen. Obwohl viele soziale Interaktionen in Gruppen stattfinden, existieren unüberwachte Methoden zur Augenkontakterkennung bisher lediglich für dyadische Interaktionen. Wir stellen einen neuen Ansatz zur Augenkontakterkennung in Gruppen vor, welcher ohne manuelle Annotationen auskommt, indem er sich den statistischen Zusammenhang zwischen Blick- und Sprechverhalten zu Nutze macht. Tägliche Aktivitäten sind eine Herausforderung für Geräte zur mobile Augenbewegungsmessung, da Verschiebungen dieser Geräte zur Verschlechterung ihrer Kalibrierung führen können. In dieser Arbeit verwenden wir Nutzerverhalten an mobilen Endgeräten, um den Effekt solcher Verschiebungen zu korrigieren. Neben der Erkennung verbessert diese Arbeit auch die Interpretation sozialer Signale. Wir veröffentlichen den ersten Datensatz sowie die erste Methode zur Emotionserkennung in dyadischen Interaktionen ohne den Einsatz spezialisierter Ausrüstung. Außerdem stellen wir die erste Studie zur automatischen Erkennung mangelnder Verbundenheit in Gruppeninteraktionen vor, und führen die erste datensatzübergreifende Evaluierung zur Detektion von sich entwickelndem Führungsverhalten durch. Zum Abschluss der Arbeit präsentieren wir die ersten Ansätze zur Antizipation von Blickverhalten in sozialen Interaktionen. Blickverhalten hat die besondere Eigenschaft, dass es sowohl als soziales Signal als auch der Ausrichtung der visuellen Wahrnehmung dient. Somit eröffnet die Fähigkeit zur Antizipation von Blickverhalten Maschinen die Möglichkeit, sich sowohl nahtloser in soziale Interaktionen einzufügen, als auch Menschen zu warnen, wenn diese Gefahr laufen wichtige Aspekte der Umgebung zu übersehen. Wir präsentieren Methoden zur Antizipation von Blickverhalten im Kontext der Interaktion mit mobilen Endgeräten während täglicher Aktivitäten, als auch während dyadischer Interaktionen mittels Videotelefonie

    Building an Understanding of Human Activities in First Person Video using Fuzzy Inference

    Get PDF
    Activities of Daily Living (ADL’s) are the activities that people perform every day in their home as part of their typical routine. The in-home, automated monitoring of ADL’s has broad utility for intelligent systems that enable independent living for the elderly and mentally or physically disabled individuals. With rising interest in electronic health (e-Health) and mobile health (m-Health) technology, opportunities abound for the integration of activity monitoring systems into these newer forms of healthcare. In this dissertation we propose a novel system for describing ’s based on video collected from a wearable camera. Most in-home activities are naturally defined by interaction with objects. We leverage these object-centric activity definitions to develop a set of rules for a Fuzzy Inference System (FIS) that uses video features and the identification of objects to identify and classify activities. Further, we demonstrate that the use of FIS enhances the reliability of the system and provides enhanced explainability and interpretability of results over popular machine-learning classifiers due to the linguistic nature of fuzzy systems

    Context-aware home monitoring system for Parkinson's disease patietns : ambient and werable sensing for freezing of gait detection

    Get PDF
    Tesi en modalitat de cotutela: Universitat Politècnica de Catalunya i Technische Universiteit Eindhoven. This PhD Thesis has been developed in the framework of, and according to, the rules of the Erasmus Mundus Joint Doctorate on Interactive and Cognitive Environments EMJD ICE [FPA no. 2010-0012]Parkinson’s disease (PD). It is characterized by brief episodes of inability to step, or by extremely short steps that typically occur on gait initiation or on turning while walking. The consequences of FOG are aggravated mobility and higher affinity to falls, which have a direct effect on the quality of life of the individual. There does not exist completely effective pharmacological treatment for the FOG phenomena. However, external stimuli, such as lines on the floor or rhythmic sounds, can focus the attention of a person who experiences a FOG episode and help her initiate gait. The optimal effectiveness in such approach, known as cueing, is achieved through timely activation of a cueing device upon the accurate detection of a FOG episode. Therefore, a robust and accurate FOG detection is the main problem that needs to be solved when developing a suitable assistive technology solution for this specific user group. This thesis proposes the use of activity and spatial context of a person as the means to improve the detection of FOG episodes during monitoring at home. The thesis describes design, algorithm implementation and evaluation of a distributed home system for FOG detection based on multiple cameras and a single inertial gait sensor worn at the waist of the patient. Through detailed observation of collected home data of 17 PD patients, we realized that a novel solution for FOG detection could be achieved by using contextual information of the patient’s position, orientation, basic posture and movement on a semantically annotated two-dimensional (2D) map of the indoor environment. We envisioned the future context-aware system as a network of Microsoft Kinect cameras placed in the patient’s home that interacts with a wearable inertial sensor on the patient (smartphone). Since the hardware platform of the system constitutes from the commercial of-the-shelf hardware, the majority of the system development efforts involved the production of software modules (for position tracking, orientation tracking, activity recognition) that run on top of the middle-ware operating system in the home gateway server. The main component of the system that had to be developed is the Kinect application for tracking the position and height of multiple people, based on the input in the form of 3D point cloud data. Besides position tracking, this software module also provides mapping and semantic annotation of FOG specific zones on the scene in front of the Kinect. One instance of vision tracking application is supposed to run for every Kinect sensor in the system, yielding potentially high number of simultaneous tracks. At any moment, the system has to track one specific person - the patient. To enable tracking of the patient between different non-overlapped cameras in the distributed system, a new re-identification approach based on appearance model learning with one-class Support Vector Machine (SVM) was developed. Evaluation of the re-identification method was conducted on a 16 people dataset in a laboratory environment. Since the patient orientation in the indoor space was recognized as an important part of the context, the system necessitated the ability to estimate the orientation of the person, expressed in the frame of the 2D scene on which the patient is tracked by the camera. We devised method to fuse position tracking information from the vision system and inertial data from the smartphone in order to obtain patient’s 2D pose estimation on the scene map. Additionally, a method for the estimation of the position of the smartphone on the waist of the patient was proposed. Position and orientation estimation accuracy were evaluated on a 12 people dataset. Finally, having available positional, orientation and height information, a new seven-class activity classification was realized using a hierarchical classifier that combines height-based posture classifier with translational and rotational SVM movement classifiers. Each of the SVM movement classifiers and the joint hierarchical classifier were evaluated in the laboratory experiment with 8 healthy persons. The final context-based FOG detection algorithm uses activity information and spatial context information in order to confirm or disprove FOG detected by the current state-of-the-art FOG detection algorithm (which uses only wearable sensor data). A dataset with home data of 3 PD patients was produced using two Kinect cameras and a smartphone in synchronized recording. The new context-based FOG detection algorithm and the wearable-only FOG detection algorithm were both evaluated with the home dataset and their results were compared. The context-based algorithm very positively influences the reduction of false positive detections, which is expressed through achieved higher specificity. In some cases, context-based algorithm also eliminates true positive detections, reducing sensitivity to the lesser extent. The final comparison of the two algorithms on the basis of their sensitivity and specificity, shows the improvement in the overall FOG detection achieved with the new context-aware home system.Esta tesis propone el uso de la actividad y el contexto espacial de una persona como medio para mejorar la detección de episodios de FOG (Freezing of gait) durante el seguimiento en el domicilio. La tesis describe el diseño, implementación de algoritmos y evaluación de un sistema doméstico distribuido para detección de FOG basado en varias cámaras y un único sensor de marcha inercial en la cintura del paciente. Mediante de la observación detallada de los datos caseros recopilados de 17 pacientes con EP, nos dimos cuenta de que se puede lograr una solución novedosa para la detección de FOG mediante el uso de información contextual de la posición del paciente, orientación, postura básica y movimiento anotada semánticamente en un mapa bidimensional (2D) del entorno interior. Imaginamos el futuro sistema de consciencia del contexto como una red de cámaras Microsoft Kinect colocadas en el hogar del paciente, que interactúa con un sensor de inercia portátil en el paciente (teléfono inteligente). Al constituirse la plataforma del sistema a partir de hardware comercial disponible, los esfuerzos de desarrollo consistieron en la producción de módulos de software (para el seguimiento de la posición, orientación seguimiento, reconocimiento de actividad) que se ejecutan en la parte superior del sistema operativo del servidor de puerta de enlace de casa. El componente principal del sistema que tuvo que desarrollarse es la aplicación Kinect para seguimiento de la posición y la altura de varias personas, según la entrada en forma de punto 3D de datos en la nube. Además del seguimiento de posición, este módulo de software también proporciona mapeo y semántica. anotación de zonas específicas de FOG en la escena frente al Kinect. Se supone que una instancia de la aplicación de seguimiento de visión se ejecuta para cada sensor Kinect en el sistema, produciendo un número potencialmente alto de pistas simultáneas. En cualquier momento, el sistema tiene que rastrear a una persona específica - el paciente. Para habilitar el seguimiento del paciente entre diferentes cámaras no superpuestas en el sistema distribuido, se desarrolló un nuevo enfoque de re-identificación basado en el aprendizaje de modelos de apariencia con one-class Suport Vector Machine (SVM). La evaluación del método de re-identificación se realizó con un conjunto de datos de 16 personas en un entorno de laboratorio. Dado que la orientación del paciente en el espacio interior fue reconocida como una parte importante del contexto, el sistema necesitaba la capacidad de estimar la orientación de la persona, expresada en el marco de la escena 2D en la que la cámara sigue al paciente. Diseñamos un método para fusionar la información de seguimiento de posición del sistema de visión y los datos de inercia del smartphone para obtener la estimación de postura 2D del paciente en el mapa de la escena. Además, se propuso un método para la estimación de la posición del Smartphone en la cintura del paciente. La precisión de la estimación de la posición y la orientación se evaluó en un conjunto de datos de 12 personas. Finalmente, al tener disponible información de posición, orientación y altura, se realizó una nueva clasificación de actividad de seven-class utilizando un clasificador jerárquico que combina un clasificador de postura basado en la altura con clasificadores de movimiento SVM traslacional y rotacional. Cada uno de los clasificadores de movimiento SVM y el clasificador jerárquico conjunto se evaluaron en el experimento de laboratorio con 8 personas sanas. El último algoritmo de detección de FOG basado en el contexto utiliza información de actividad e información de texto espacial para confirmar o refutar el FOG detectado por el algoritmo de detección de FOG actual. El algoritmo basado en el contexto influye muy positivamente en la reducción de las detecciones de falsos positivos, que se expresa a través de una mayor especificidadPostprint (published version

    Context-aware home monitoring system for Parkinson's disease patietns : ambient and werable sensing for freezing of gait detection

    Get PDF
    Parkinson’s disease (PD). It is characterized by brief episodes of inability to step, or by extremely short steps that typically occur on gait initiation or on turning while walking. The consequences of FOG are aggravated mobility and higher affinity to falls, which have a direct effect on the quality of life of the individual. There does not exist completely effective pharmacological treatment for the FOG phenomena. However, external stimuli, such as lines on the floor or rhythmic sounds, can focus the attention of a person who experiences a FOG episode and help her initiate gait. The optimal effectiveness in such approach, known as cueing, is achieved through timely activation of a cueing device upon the accurate detection of a FOG episode. Therefore, a robust and accurate FOG detection is the main problem that needs to be solved when developing a suitable assistive technology solution for this specific user group. This thesis proposes the use of activity and spatial context of a person as the means to improve the detection of FOG episodes during monitoring at home. The thesis describes design, algorithm implementation and evaluation of a distributed home system for FOG detection based on multiple cameras and a single inertial gait sensor worn at the waist of the patient. Through detailed observation of collected home data of 17 PD patients, we realized that a novel solution for FOG detection could be achieved by using contextual information of the patient’s position, orientation, basic posture and movement on a semantically annotated two-dimensional (2D) map of the indoor environment. We envisioned the future context-aware system as a network of Microsoft Kinect cameras placed in the patient’s home that interacts with a wearable inertial sensor on the patient (smartphone). Since the hardware platform of the system constitutes from the commercial of-the-shelf hardware, the majority of the system development efforts involved the production of software modules (for position tracking, orientation tracking, activity recognition) that run on top of the middle-ware operating system in the home gateway server. The main component of the system that had to be developed is the Kinect application for tracking the position and height of multiple people, based on the input in the form of 3D point cloud data. Besides position tracking, this software module also provides mapping and semantic annotation of FOG specific zones on the scene in front of the Kinect. One instance of vision tracking application is supposed to run for every Kinect sensor in the system, yielding potentially high number of simultaneous tracks. At any moment, the system has to track one specific person - the patient. To enable tracking of the patient between different non-overlapped cameras in the distributed system, a new re-identification approach based on appearance model learning with one-class Support Vector Machine (SVM) was developed. Evaluation of the re-identification method was conducted on a 16 people dataset in a laboratory environment. Since the patient orientation in the indoor space was recognized as an important part of the context, the system necessitated the ability to estimate the orientation of the person, expressed in the frame of the 2D scene on which the patient is tracked by the camera. We devised method to fuse position tracking information from the vision system and inertial data from the smartphone in order to obtain patient’s 2D pose estimation on the scene map. Additionally, a method for the estimation of the position of the smartphone on the waist of the patient was proposed. Position and orientation estimation accuracy were evaluated on a 12 people dataset. Finally, having available positional, orientation and height information, a new seven-class activity classification was realized using a hierarchical classifier that combines height-based posture classifier with translational and rotational SVM movement classifiers. Each of the SVM movement classifiers and the joint hierarchical classifier were evaluated in the laboratory experiment with 8 healthy persons. The final context-based FOG detection algorithm uses activity information and spatial context information in order to confirm or disprove FOG detected by the current state-of-the-art FOG detection algorithm (which uses only wearable sensor data). A dataset with home data of 3 PD patients was produced using two Kinect cameras and a smartphone in synchronized recording. The new context-based FOG detection algorithm and the wearable-only FOG detection algorithm were both evaluated with the home dataset and their results were compared. The context-based algorithm very positively influences the reduction of false positive detections, which is expressed through achieved higher specificity. In some cases, context-based algorithm also eliminates true positive detections, reducing sensitivity to the lesser extent. The final comparison of the two algorithms on the basis of their sensitivity and specificity, shows the improvement in the overall FOG detection achieved with the new context-aware home system.Esta tesis propone el uso de la actividad y el contexto espacial de una persona como medio para mejorar la detección de episodios de FOG (Freezing of gait) durante el seguimiento en el domicilio. La tesis describe el diseño, implementación de algoritmos y evaluación de un sistema doméstico distribuido para detección de FOG basado en varias cámaras y un único sensor de marcha inercial en la cintura del paciente. Mediante de la observación detallada de los datos caseros recopilados de 17 pacientes con EP, nos dimos cuenta de que se puede lograr una solución novedosa para la detección de FOG mediante el uso de información contextual de la posición del paciente, orientación, postura básica y movimiento anotada semánticamente en un mapa bidimensional (2D) del entorno interior. Imaginamos el futuro sistema de consciencia del contexto como una red de cámaras Microsoft Kinect colocadas en el hogar del paciente, que interactúa con un sensor de inercia portátil en el paciente (teléfono inteligente). Al constituirse la plataforma del sistema a partir de hardware comercial disponible, los esfuerzos de desarrollo consistieron en la producción de módulos de software (para el seguimiento de la posición, orientación seguimiento, reconocimiento de actividad) que se ejecutan en la parte superior del sistema operativo del servidor de puerta de enlace de casa. El componente principal del sistema que tuvo que desarrollarse es la aplicación Kinect para seguimiento de la posición y la altura de varias personas, según la entrada en forma de punto 3D de datos en la nube. Además del seguimiento de posición, este módulo de software también proporciona mapeo y semántica. anotación de zonas específicas de FOG en la escena frente al Kinect. Se supone que una instancia de la aplicación de seguimiento de visión se ejecuta para cada sensor Kinect en el sistema, produciendo un número potencialmente alto de pistas simultáneas. En cualquier momento, el sistema tiene que rastrear a una persona específica - el paciente. Para habilitar el seguimiento del paciente entre diferentes cámaras no superpuestas en el sistema distribuido, se desarrolló un nuevo enfoque de re-identificación basado en el aprendizaje de modelos de apariencia con one-class Suport Vector Machine (SVM). La evaluación del método de re-identificación se realizó con un conjunto de datos de 16 personas en un entorno de laboratorio. Dado que la orientación del paciente en el espacio interior fue reconocida como una parte importante del contexto, el sistema necesitaba la capacidad de estimar la orientación de la persona, expresada en el marco de la escena 2D en la que la cámara sigue al paciente. Diseñamos un método para fusionar la información de seguimiento de posición del sistema de visión y los datos de inercia del smartphone para obtener la estimación de postura 2D del paciente en el mapa de la escena. Además, se propuso un método para la estimación de la posición del Smartphone en la cintura del paciente. La precisión de la estimación de la posición y la orientación se evaluó en un conjunto de datos de 12 personas. Finalmente, al tener disponible información de posición, orientación y altura, se realizó una nueva clasificación de actividad de seven-class utilizando un clasificador jerárquico que combina un clasificador de postura basado en la altura con clasificadores de movimiento SVM traslacional y rotacional. Cada uno de los clasificadores de movimiento SVM y el clasificador jerárquico conjunto se evaluaron en el experimento de laboratorio con 8 personas sanas. El último algoritmo de detección de FOG basado en el contexto utiliza información de actividad e información de texto espacial para confirmar o refutar el FOG detectado por el algoritmo de detección de FOG actual. El algoritmo basado en el contexto influye muy positivamente en la reducción de las detecciones de falsos positivos, que se expresa a través de una mayor especificida

    Context-aware home monitoring system for Parkinson's disease patients : ambient and wearable sensing for freezing of gait detection

    Get PDF
    Tesi en modalitat de cotutela: Universitat Politècnica de Catalunya i Technische Universiteit Eindhoven. This PhD Thesis has been developed in the framework of, and according to, the rules of the Erasmus Mundus Joint Doctorate on Interactive and Cognitive Environments EMJD ICE [FPA no. 2010-0012]Parkinson’s disease (PD). It is characterized by brief episodes of inability to step, or by extremely short steps that typically occur on gait initiation or on turning while walking. The consequences of FOG are aggravated mobility and higher affinity to falls, which have a direct effect on the quality of life of the individual. There does not exist completely effective pharmacological treatment for the FOG phenomena. However, external stimuli, such as lines on the floor or rhythmic sounds, can focus the attention of a person who experiences a FOG episode and help her initiate gait. The optimal effectiveness in such approach, known as cueing, is achieved through timely activation of a cueing device upon the accurate detection of a FOG episode. Therefore, a robust and accurate FOG detection is the main problem that needs to be solved when developing a suitable assistive technology solution for this specific user group. This thesis proposes the use of activity and spatial context of a person as the means to improve the detection of FOG episodes during monitoring at home. The thesis describes design, algorithm implementation and evaluation of a distributed home system for FOG detection based on multiple cameras and a single inertial gait sensor worn at the waist of the patient. Through detailed observation of collected home data of 17 PD patients, we realized that a novel solution for FOG detection could be achieved by using contextual information of the patient’s position, orientation, basic posture and movement on a semantically annotated two-dimensional (2D) map of the indoor environment. We envisioned the future context-aware system as a network of Microsoft Kinect cameras placed in the patient’s home that interacts with a wearable inertial sensor on the patient (smartphone). Since the hardware platform of the system constitutes from the commercial of-the-shelf hardware, the majority of the system development efforts involved the production of software modules (for position tracking, orientation tracking, activity recognition) that run on top of the middle-ware operating system in the home gateway server. The main component of the system that had to be developed is the Kinect application for tracking the position and height of multiple people, based on the input in the form of 3D point cloud data. Besides position tracking, this software module also provides mapping and semantic annotation of FOG specific zones on the scene in front of the Kinect. One instance of vision tracking application is supposed to run for every Kinect sensor in the system, yielding potentially high number of simultaneous tracks. At any moment, the system has to track one specific person - the patient. To enable tracking of the patient between different non-overlapped cameras in the distributed system, a new re-identification approach based on appearance model learning with one-class Support Vector Machine (SVM) was developed. Evaluation of the re-identification method was conducted on a 16 people dataset in a laboratory environment. Since the patient orientation in the indoor space was recognized as an important part of the context, the system necessitated the ability to estimate the orientation of the person, expressed in the frame of the 2D scene on which the patient is tracked by the camera. We devised method to fuse position tracking information from the vision system and inertial data from the smartphone in order to obtain patient’s 2D pose estimation on the scene map. Additionally, a method for the estimation of the position of the smartphone on the waist of the patient was proposed. Position and orientation estimation accuracy were evaluated on a 12 people dataset. Finally, having available positional, orientation and height information, a new seven-class activity classification was realized using a hierarchical classifier that combines height-based posture classifier with translational and rotational SVM movement classifiers. Each of the SVM movement classifiers and the joint hierarchical classifier were evaluated in the laboratory experiment with 8 healthy persons. The final context-based FOG detection algorithm uses activity information and spatial context information in order to confirm or disprove FOG detected by the current state-of-the-art FOG detection algorithm (which uses only wearable sensor data). A dataset with home data of 3 PD patients was produced using two Kinect cameras and a smartphone in synchronized recording. The new context-based FOG detection algorithm and the wearable-only FOG detection algorithm were both evaluated with the home dataset and their results were compared. The context-based algorithm very positively influences the reduction of false positive detections, which is expressed through achieved higher specificity. In some cases, context-based algorithm also eliminates true positive detections, reducing sensitivity to the lesser extent. The final comparison of the two algorithms on the basis of their sensitivity and specificity, shows the improvement in the overall FOG detection achieved with the new context-aware home system.Esta tesis propone el uso de la actividad y el contexto espacial de una persona como medio para mejorar la detección de episodios de FOG (Freezing of gait) durante el seguimiento en el domicilio. La tesis describe el diseño, implementación de algoritmos y evaluación de un sistema doméstico distribuido para detección de FOG basado en varias cámaras y un único sensor de marcha inercial en la cintura del paciente. Mediante de la observación detallada de los datos caseros recopilados de 17 pacientes con EP, nos dimos cuenta de que se puede lograr una solución novedosa para la detección de FOG mediante el uso de información contextual de la posición del paciente, orientación, postura básica y movimiento anotada semánticamente en un mapa bidimensional (2D) del entorno interior. Imaginamos el futuro sistema de consciencia del contexto como una red de cámaras Microsoft Kinect colocadas en el hogar del paciente, que interactúa con un sensor de inercia portátil en el paciente (teléfono inteligente). Al constituirse la plataforma del sistema a partir de hardware comercial disponible, los esfuerzos de desarrollo consistieron en la producción de módulos de software (para el seguimiento de la posición, orientación seguimiento, reconocimiento de actividad) que se ejecutan en la parte superior del sistema operativo del servidor de puerta de enlace de casa. El componente principal del sistema que tuvo que desarrollarse es la aplicación Kinect para seguimiento de la posición y la altura de varias personas, según la entrada en forma de punto 3D de datos en la nube. Además del seguimiento de posición, este módulo de software también proporciona mapeo y semántica. anotación de zonas específicas de FOG en la escena frente al Kinect. Se supone que una instancia de la aplicación de seguimiento de visión se ejecuta para cada sensor Kinect en el sistema, produciendo un número potencialmente alto de pistas simultáneas. En cualquier momento, el sistema tiene que rastrear a una persona específica - el paciente. Para habilitar el seguimiento del paciente entre diferentes cámaras no superpuestas en el sistema distribuido, se desarrolló un nuevo enfoque de re-identificación basado en el aprendizaje de modelos de apariencia con one-class Suport Vector Machine (SVM). La evaluación del método de re-identificación se realizó con un conjunto de datos de 16 personas en un entorno de laboratorio. Dado que la orientación del paciente en el espacio interior fue reconocida como una parte importante del contexto, el sistema necesitaba la capacidad de estimar la orientación de la persona, expresada en el marco de la escena 2D en la que la cámara sigue al paciente. Diseñamos un método para fusionar la información de seguimiento de posición del sistema de visión y los datos de inercia del smartphone para obtener la estimación de postura 2D del paciente en el mapa de la escena. Además, se propuso un método para la estimación de la posición del Smartphone en la cintura del paciente. La precisión de la estimación de la posición y la orientación se evaluó en un conjunto de datos de 12 personas. Finalmente, al tener disponible información de posición, orientación y altura, se realizó una nueva clasificación de actividad de seven-class utilizando un clasificador jerárquico que combina un clasificador de postura basado en la altura con clasificadores de movimiento SVM traslacional y rotacional. Cada uno de los clasificadores de movimiento SVM y el clasificador jerárquico conjunto se evaluaron en el experimento de laboratorio con 8 personas sanas. El último algoritmo de detección de FOG basado en el contexto utiliza información de actividad e información de texto espacial para confirmar o refutar el FOG detectado por el algoritmo de detección de FOG actual. El algoritmo basado en el contexto influye muy positivamente en la reducción de las detecciones de falsos positivos, que se expresa a través de una mayor especificida
    corecore