
    Hierarchical Hidden Markov Model in Detecting Activities of Daily Living in Wearable Videos for Studies of Dementia

    This paper presents a method for indexing activities of daily living in videos obtained from wearable cameras. In the context of dementia diagnosis, the videos are recorded at patients' homes and later reviewed by medical practitioners. Since the videos may last up to two hours, a tool for efficient navigation in terms of activities of interest is crucial for the doctors. The specific recording mode yields difficult video data: a single sequence shot in which strong motion and sharp lighting changes frequently appear. Our work introduces an automatic motion-based segmentation of the video and a video structuring approach in terms of activities using a hierarchical two-level Hidden Markov Model. We define our description space over motion and visual characteristics of the video and audio channels. Experiments on real data recorded at the homes of several patients show the difficulty of the task and the promising results of our approach.
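
    To make the two-level structure concrete, here is a minimal sketch (not the authors' implementation) of how such a hierarchy can be decoded: at the bottom level, one Gaussian HMM per activity scores each motion-based segment; at the top level, a Viterbi pass over an activity transition matrix recovers the activity sequence. The activity labels, feature dimensionality, and hmmlearn usage are illustrative assumptions.

```python
# Minimal two-level HMM sketch: per-activity HMMs score segments (bottom
# level); Viterbi over an activity transition matrix (top level).
import numpy as np
from hmmlearn import hmm  # assumed available

ACTIVITIES = ["making_coffee", "reading", "phone_call"]  # hypothetical labels

def train_bottom_level(training_segments):
    """training_segments: dict activity -> list of (T_i, D) feature arrays."""
    models = {}
    for act, segs in training_segments.items():
        X = np.vstack(segs)
        lengths = [len(s) for s in segs]
        m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
        m.fit(X, lengths)
        models[act] = m
    return models

def decode_top_level(models, segments, trans, prior):
    """Viterbi over activities; the emission score of each motion-based
    segment is its log-likelihood under each activity's bottom-level HMM."""
    n, k = len(segments), len(ACTIVITIES)
    loglik = np.array([[models[a].score(s) for a in ACTIVITIES]
                       for s in segments])
    delta = np.full((n, k), -np.inf)
    psi = np.zeros((n, k), dtype=int)
    delta[0] = np.log(prior) + loglik[0]
    for t in range(1, n):
        scores = delta[t - 1][:, None] + np.log(trans)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + loglik[t]
    path = [int(delta[-1].argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return [ACTIVITIES[i] for i in reversed(path)]
```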

    Estimation of posture and prediction of the elderly getting out of bed using a body pressure sensor

    We propose an IoT support system that estimates the posture of a care recipient on a bed from body-pressure data measured by a sheet-type body pressure sensor, and detects postures related to leaving the bed in real time. In addition, we propose a method that predicts getting out of bed before the care recipient assumes a bed-exit posture, by considering the state transitions. An intervention experiment showed that, using body-pressure features as explanatory variables and applying machine learning, 16 types of on-bed postures could be identified with an F-measure of 0.7 or more. In an experiment without intervention, we applied a hidden Markov model and computed the transition probability to each hidden state both when the care recipient got out of bed and when they did not. As a result, there was a difference of about 0.1 in the transition probability of the state related to raising the upper body.
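
    A hedged sketch of the two stages described above, assuming a generic feature matrix extracted from the sheet sensor; the classifier choice, the 16 posture labels, and the counting-based transition estimate are illustrative, not the paper's exact pipeline.

```python
# Stage 1: supervised posture recognition from body-pressure features.
# Stage 2: maximum-likelihood transition probabilities between posture
# states, estimated by counting consecutive pairs in labeled sequences.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def classify_postures(X_train, y_train, X_test, y_test):
    """Train a posture classifier and return per-class F-scores
    (the paper reports F >= 0.7 for 16 posture classes)."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    return f1_score(y_test, clf.predict(X_test), average=None)

def transition_matrix(state_sequences, n_states):
    """Estimate P(state_b | state_a) by counting consecutive pairs."""
    counts = np.zeros((n_states, n_states))
    for seq in state_sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1)  # avoid division by zero
```

    Comparing the matrices estimated from bed-exit and non-bed-exit sequences then amounts to subtracting the two, which is where a gap such as the reported difference of about 0.1 for the upper-body-raising state would show up.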

    Markov modelling on human activity recognition

    Human Activity Recognition (HAR) is a research topic of considerable interest in the machine learning community. Understanding the activities a person is performing, and the context in which they perform them, is hugely important in multiple applications, including medical research, security and patient monitoring. Improvements in smartphone and inertial sensor technologies have led to the implementation of activity recognition systems based on these devices, either by themselves or by combining their information with other sensors. Since humans perform their daily activities sequentially in a specific order, there is temporal information in the physical activities that characterizes the different human behaviour patterns. However, the most popular approach in HAR is to assume that the data are conditionally independent, segmenting the data into windows and extracting the most relevant features from each segment. In this thesis we employ the temporal information explicitly, feeding the raw data provided by the wearable sensors to the training models. We study how to implement Markov models for a long-term-monitoring HAR system with wearable sensors, and we address the open problems that arise when processing and training the data, combining different sensors, and performing long-term monitoring with battery-powered devices.

    Using the sensor signals directly for recognition can cause problems due to misplacement of the sensors on the body. We propose an orientation-correction algorithm based on quaternions that processes the signals and finds a common reference frame for all of them, independently of the sensors' position or orientation. This algorithm yields better activity recognition when its output is fed to the classification algorithm, compared with similar approaches, and the quaternion transformations allow a faster implementation.

    Hidden Markov Models (HMMs) are among the most popular algorithms for modelling time-series data, and the parameters of the model are usually trained with the Baum-Welch algorithm. However, this algorithm converges to local maxima, and the multiple initializations needed to avoid them make it computationally expensive for large datasets. We propose employing the theory of spectral learning to develop a discriminative HMM that avoids the problems of the Baum-Welch algorithm, outperforming it in both complexity and computational cost.

    When we implement a HAR system with several sensors, we must consider how to combine the information they provide. Data fusion can be performed either at the signal level or at the classification level. At the classification level, the usual approach is to combine the decisions of multiple classifiers on the body to obtain the performed activities. However, in the simple case of two classifiers, which can be a practical implementation of a HAR system, the combination reduces to selecting the most discriminative sensor, and no performance improvement is obtained over the single-sensor implementation. In this thesis we propose employing the soft outputs of the classifiers in the combination, and we develop a method that considers the Markovian structure of the ground truth to capture the dynamics of the activities. We show that this method improves the recognition of activities with respect to other combination methods and with respect to signal-level fusion.
    Finally, long-term-monitoring HAR systems with wearable sensors must address the energy-efficiency problem inherent to battery-powered devices. The most common approach to improving the energy efficiency of such devices is to reduce the amount of data acquired by the wearable sensors. We introduce a general framework for the energy efficiency of a system with multiple sensors under several energy restrictions. We propose a sensing strategy that optimizes the temporal data acquisition by computing the uncertainty of the activities given the data and adapting the acquisition actively. Furthermore, we develop a sensor selection algorithm based on Bayesian experimental design to obtain the best configuration of sensors for accurate activity recognition, allowing a further improvement in energy efficiency by limiting the number of sensors employed in the acquisition.
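
    As an illustration of the first contribution, the following is a minimal sketch, assuming each sensor reports unit quaternions describing its orientation (as most inertial measurement units do); rotating every sample into a shared frame is the core of an orientation-correction step, though the thesis' actual algorithm may differ in how the reference frame is chosen.

```python
# Quaternion-based orientation correction: express every sensor reading in
# a common reference frame so sensor placement no longer matters.
import numpy as np

def q_mult(q, r):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def q_conj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def rotate(v, q):
    """Rotate 3-vector v by unit quaternion q: v' = q v q*."""
    p = np.concatenate(([0.0], v))
    return q_mult(q_mult(q, p), q_conj(q))[1:]

def to_common_frame(samples, orientations):
    """samples: (T, 3) sensor readings; orientations: (T, 4) unit
    quaternions estimated by the device. Returns readings expressed in
    the shared reference frame, ready for the classifier."""
    return np.array([rotate(v, q) for v, q in zip(samples, orientations)])
```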

    Portuguese sign language recognition via computer vision and depth sensor

    Sign languages are used worldwide by a multitude of individuals. They are mostly used by the deaf communities and their teachers, or by people associated with them through ties of friendship or family. Signers are a minority of citizens, often segregated, and over the years not much attention has been given to this form of communication, even by the scientific community. In fact, in Computer Science there is some, but limited, research and development in this area. In the particular case of Portuguese Sign Language (PSL) that fact is more evident and, to our knowledge, there is not yet an efficient system to perform the automatic recognition of PSL signs. With the advent and wide spread of devices such as depth sensors, there are new possibilities to address this problem. In this thesis we have specified, developed, tested and preliminarily evaluated solutions that we believe will bring valuable contributions to the problem of Automatic Gesture Recognition applied to Sign Languages, such as Portuguese Sign Language. In the context of this work, Computer Vision techniques were adapted to the case of depth sensors. A gesture taxonomy appropriate to this problem was proposed, and techniques for feature extraction, representation, storage and classification were presented. Two novel algorithms to solve the problem of real-time recognition of isolated static poses were specified, developed, tested and evaluated. Two further algorithms for the recognition of isolated dynamic gestures (one of them novel) were also specified, developed, tested and evaluated. The analyzed results compare well with the literature.
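
    Since the abstract does not detail the two novel algorithms, the sketch below is purely illustrative: a template-matching classifier over a hypothetical grid-averaged depth descriptor, standing in for real-time recognition of isolated static poses; the descriptor, grid size, and rejection threshold are all assumptions.

```python
# Illustrative static-pose recognition from a depth sensor: downsample a
# normalized depth patch around the hand into a fixed-size descriptor and
# match it against per-sign templates.
import numpy as np

def depth_descriptor(depth_patch, grid=(8, 8)):
    """Average a depth patch over a coarse grid so templates from
    different hands and distances remain comparable."""
    h, w = depth_patch.shape
    gh, gw = grid
    cells = depth_patch[: h - h % gh, : w - w % gw].reshape(
        gh, h // gh, gw, w // gw)
    d = cells.mean(axis=(1, 3)).ravel()
    return (d - d.mean()) / (d.std() + 1e-8)

def classify_pose(descriptor, templates, threshold=0.5):
    """templates: dict sign -> list of descriptors. Returns the
    best-matching sign, or None when no template is close enough
    (i.e. the pose is treated as unknown)."""
    best, best_dist = None, np.inf
    for sign, descs in templates.items():
        for t in descs:
            dist = np.linalg.norm(descriptor - t)
            if dist < best_dist:
                best, best_dist = sign, dist
    return best if best_dist < threshold else None
```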

    Wearable computing and contextual awareness

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 1999. Includes bibliographical references (leaves 231-248). By Thad Eugene Starner.
    Computer hardware continues to shrink in size and increase in capability. This trend has allowed the prevailing concept of a computer to evolve from the mainframe to the minicomputer to the desktop. Just as the physical hardware changes, so does the use of the technology, tending towards more interactive and personal systems. Currently, another physical change is underway, placing computational power on the user's body. These wearable machines encourage new applications that were formerly infeasible and, correspondingly, will result in new usage patterns. This thesis suggests that the fundamental improvement offered by wearable computing is an increased sense of user context. I hypothesize that on-body systems can sense the user's context with little or no assistance from environmental infrastructure. These body-centered systems, which "see" as the user sees and "hear" as the user hears, provide a unique "first-person" viewpoint of the user's environment. By exploiting the models recovered by these systems, interfaces are created which require minimal directed action or attention from the user. In addition, more traditional applications are augmented by the contextual information recovered by these systems. To investigate these issues, I provide perceptually sensible tools for recovering and modeling user context in a mobile, everyday environment. These tools include a downward-facing, camera-based system for establishing the location of the user; a tag-based object recognition system for augmented reality; and several on-body gesture recognition systems to identify various user tasks in constrained environments. To address the practicality of contextually aware wearable computers, issues of power recovery, heat dissipation, and weight distribution are examined. In addition, I have encouraged a community of wearable computer users at the Media Lab through the design, management, and support of hardware and software infrastructure. This unique community provides a heightened awareness of the use and social issues of wearable computing. As much as possible, the lessons from this experience are conveyed in the thesis.

    A cognitive ego-vision system for interactive assistance

    With increasing computational power and decreasing size, computers are now wearable and mobile, and they are becoming companions in people's everyday lives. Personal digital assistants and mobile phones equipped with adequate software attract a lot of public interest, although the functionality they provide in terms of assistance is little more than a mobile database for appointments, addresses, to-do lists and photos. Compared to the assistance a human can provide, such systems can hardly be called real assistants. The motivation to construct more human-like assistance systems that develop a certain level of cognitive capability leads to the exploration of two central paradigms in this work. The first paradigm is termed cognitive vision systems. Such systems take human cognition as a design principle for the underlying concepts and develop learning and adaptation capabilities that make them more flexible in their application. They are embodied, active, and situated. Second, the ego-vision paradigm is introduced as a very tight interaction scheme between a user and a computer system that particularly eases close collaboration and assistance between the two. Ego-vision systems (EVS) take the user's (visual) perspective and integrate the human into the system's processing loop by means of shared perception and augmented reality. EVSs adopt techniques from cognitive vision to identify objects, interpret actions, and understand the user's visual perception, and they articulate their knowledge and interpretations by augmenting the user's own view. These two paradigms are studied as rather general concepts, but always with the goal of realizing more flexible assistance systems that closely collaborate with their users.
    This work provides three major contributions. First, a definition and explanation of ego-vision as a novel paradigm is given, and the benefits and challenges of this paradigm are discussed. Second, a configuration of different approaches that permit an ego-vision system to perceive its environment and its user is presented, in terms of object and action recognition, head-gesture recognition, and mosaicing. These account for the specific challenges identified for ego-vision systems, whose perception capabilities are based on wearable sensors only. Finally, a visual active memory (VAM) is introduced as a flexible conceptual architecture for cognitive vision systems in general, and for assistance systems in particular. It adopts principles of human cognition to develop a representation for the information stored in this memory. So-called memory processes continuously analyze, modify, and extend the content of the VAM, and the functionality of the integrated system emerges from the coordinated interplay of these memory processes. An integrated assistance system applying the approaches and concepts outlined above is implemented on the basis of the visual active memory. The system architecture is discussed, and some exemplary processing paths in the system are presented. The system assists users in object-manipulation tasks and has reached a maturity level that allows user studies to be conducted. Quantitative results for the different integrated memory processes are presented, as well as an assessment of the interactive system by means of these user studies.
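
    The following conceptual sketch illustrates the visual active memory idea under stated assumptions: a shared store of timestamped memory elements, with independent memory processes that analyze (write hypotheses), extend (derive interpretations), and modify (forget) its content. The element schema, the process names, and the detect_objects detector are hypothetical.

```python
# Conceptual VAM sketch: memory processes cooperate only through a shared
# store; system behaviour emerges from their interplay.
import time

class VisualActiveMemory:
    def __init__(self):
        self.elements = []  # e.g. {"type": "object", "label": ..., "t": ...}

    def insert(self, element):
        element["t"] = time.time()
        self.elements.append(element)

    def query(self, etype):
        return [e for e in self.elements if e["type"] == etype]

def detect_objects(frame):
    """Hypothetical stand-in for a real object detector."""
    return []

def object_recognition_process(vam, frame):
    """Analyze: write object hypotheses for the current egocentric frame."""
    for label in detect_objects(frame):
        vam.insert({"type": "object", "label": label})

def action_interpretation_process(vam):
    """Extend: derive higher-level interpretations from stored hypotheses."""
    objects = {e["label"] for e in vam.query("object")}
    if {"cup", "kettle"} <= objects:
        vam.insert({"type": "action", "label": "preparing_drink"})

def forgetting_process(vam, max_age=30.0):
    """Modify: drop hypotheses too old to be relevant."""
    now = time.time()
    vam.elements = [e for e in vam.elements if now - e["t"] < max_age]
```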

    Human activity recognition for pervasive interaction

    This thesis addresses the challenge of computing food-preparation context in the kitchen. The automatic recognition of fine-grained human activities and food ingredients is realized through pervasive sensing, which we achieve by instrumenting kitchen objects such as knives, spoons, and chopping boards with sensors. Context recognition in the kitchen lies at the heart of a broad range of real-world applications. In particular, activity and food-ingredient recognition in the kitchen is an essential component of situated services, such as automatic prompting services for cognitively impaired kitchen users and digital situated support for healthier-eating interventions. Previous work, however, has addressed the activity recognition problem by exploring high-level human activities using wearable sensing (i.e. sensors worn on the human body) or using technologies that raise privacy concerns (i.e. computer vision). Although such approaches have yielded significant results for a number of activity recognition problems, they are not applicable to our domain of investigation, for which we argue that the technology itself must be genuinely "invisible", thereby allowing users to perform their activities in a completely natural manner. In this thesis we describe the development of pervasive sensing technologies and algorithms for fine-grained human activity and food-ingredient recognition in the kitchen. After reviewing previous work on food and activity recognition, we present three systems that constitute increasingly sophisticated approaches to the challenge of kitchen context recognition. Two of these systems, Slice&Dice and Class-Based Threshold Dynamic Time Warping (CBT-DTW), recognize fine-grained food-preparation activities. Slice&Dice is a proof-of-concept application, whereas CBT-DTW is a real-time application that also addresses the problem of recognising unknown activities. The final system, KitchenSense, is a real-time context-recognition framework that deals with the recognition of a more complex set of activities, and includes the recognition of food ingredients and events in the kitchen. For each system, we describe the prototyping of the pervasive sensing technologies and algorithms, as well as the real-world experiments and empirical evaluations that validate the proposed solutions. This work was supported by the Vietnamese government's 322 project, executed by the Vietnamese Ministry of Education and Training.
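
    As a rough illustration of the CBT-DTW idea, the sketch below classifies a query series by its nearest template under DTW and rejects it as unknown when the distance exceeds a per-class threshold; the templates, thresholds, and distance metric are placeholders rather than the thesis' trained values.

```python
# Class-based threshold DTW: nearest-template classification with a
# per-class rejection threshold so unseen activities map to "unknown".
import numpy as np

def dtw(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two series;
    samples may be scalars or feature vectors."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(np.atleast_1d(a[i - 1] - b[j - 1]))
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def cbt_dtw_classify(query, templates, thresholds):
    """templates: dict activity -> list of reference series;
    thresholds: dict activity -> max accepted DTW distance for that class."""
    best, best_dist = None, np.inf
    for act, refs in templates.items():
        d = min(dtw(query, r) for r in refs)
        if d < best_dist:
            best, best_dist = act, d
    if best is not None and best_dist <= thresholds[best]:
        return best
    return "unknown"
```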

    Signal Processing and Machine Learning Techniques Towards Various Real-World Applications

    Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Beyond engineering, ML techniques have spread across many fields of study, such as health care, medicine, diagnostics, social science, finance and economics. These techniques require data to train the algorithms, model a complex system, and make predictions based on that model. The development of sophisticated sensors has made it easier to collect the large volumes of data needed to form and test hypotheses with ML. The promising results obtained using ML have opened up new research opportunities across various fields, and this dissertation is a manifestation of that. Several unique studies are presented here, from each of which valuable inferences have been drawn about a real-world complex system. Each study has its own motivation and relevance to the real world, and an ensemble of signal processing (SP) and ML techniques is explored in each. This dissertation details the systematic approach taken and discusses the results achieved in each study. The inferences drawn play a vital role in areas of science and technology and are worth further investigation. The dissertation also provides a set of useful SP and ML tools for researchers in various fields of interest. Doctoral dissertation, Electrical Engineering.

    Self-adaptation via concurrent multi-action evaluation for unknown context

    Context-aware computing has attracted growing attention in recent years. Generally, there are several ways for a context-aware system to select a course of action for a particular change of context. One way is for the system developers to encompass all possible context changes in the domain knowledge. Other methods include system inference and adaptive learning, whereby the system executes one action, evaluates the outcome, and self-adapts or self-learns based on it. However, in situations where a system encounters unknown contexts, this iterative approach becomes unfeasible as the size of the action space increases. Providing efficient solutions to this problem has been the main goal of this research project. Based on the developed abstract model, the designed methodology replaces the implementation and evaluation of a single action with multiple actions implemented and evaluated concurrently. This parallel evaluation of actions significantly reduces the time taken to select the action best suited to an unknown context, compared to the iterative approach. The designed and implemented framework efficiently carries out concurrent multi-action evaluation when an unknown context is encountered and finds the best course of action. Two concrete implementations of the framework were carried out, demonstrating its usability and adaptability across multiple domains. The first implementation, in the domain of database performance tuning, demonstrated the ability of the concurrent multi-action evaluation technique to tune a database when performance regresses for an unknown reason. The second implementation demonstrated the framework's ability to correctly determine the threshold price to be used in a name-your-own-price channel when an unknown context is encountered. In conclusion, the research introduces a new paradigm of self-adaptation for context-aware applications. Within the existing body of work, concurrent multi-action evaluation is classified under the abstract concept of experiment-based self-adaptation techniques.
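
    A minimal sketch of the concurrent multi-action evaluation loop, assuming each candidate action can be applied to an isolated replica of the system and scored; the callable-action interface and the candidate actions are domain-specific assumptions (e.g. alternative configurations in the database-tuning case).

```python
# Evaluate all candidate actions in parallel for the unknown context and
# keep the best-scoring one, instead of trying actions one at a time.
from concurrent.futures import ThreadPoolExecutor

def select_best_action(actions, context, max_workers=8):
    """actions: list of callables, each applying itself to an isolated
    replica of the system under `context` and returning a performance
    score (higher is better). This interface is an assumption."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda act: act(context), actions))
    best = max(range(len(actions)), key=scores.__getitem__)
    return actions[best], scores[best]

# Hypothetical usage in the database-tuning domain:
# candidates = [try_index_on_column_a, try_larger_buffer_pool]
# best_action, score = select_best_action(candidates, current_workload)
```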