
    Real-time human body detection and tracking for augmented reality mobile applications

    Nowadays, more and more cultural experiences are enhanced by mobile applications, including those that use Augmented Reality (AR). These applications have grown in number of users, largely supported by the increased computing power of recent processors, the popularity of mobile devices (with high-definition cameras and global positioning systems, GPS), and the widespread availability of internet connections. With this context in mind, the Mobile Five Senses Augmented Reality System for Museums (M5SAR) project aims to develop an AR system to act as a guide in cultural and historical events and in museums, complementing or replacing the traditional guidance given by guides or maps. The work described in this thesis is part of the M5SAR project. The complete system consists of a mobile application and a physical device, to be attached to the mobile device, which together aim to explore the five human senses: sight, hearing, touch, smell and taste. The main objectives of the M5SAR project are (a) to detect museum pieces (for example, paintings and statues (Pereira et al., 2017)), (b) to detect museum walls/environments (Veiga et al., 2017), and (c) to detect human shapes in order to superimpose Augmented Reality content (?). This thesis presents an approach to the last objective, combining information from human body joints with clothes overlapping methods. Current systems related to clothes overlapping that allow the user to move freely are based on three-dimensional (3D) sensors, e.g., the Kinect sensor (Erra et al., 2018), which are not portable. The contribution of this thesis is to present a portable solution, based on the mobile phone's (RGB) camera, that allows the user to move freely while performing full-body clothes overlapping.
In recent years, the capability of Convolutional Neural Networks (CNN) has been proven in a wide variety of computer vision tasks, such as object classification and detection, and face and text recognition (Amos et al., 2016; Ren et al., 2015a). One of the areas where CNNs are used is human pose estimation in real environments (Insafutdinov et al., 2017; Pishchulin et al., 2016). Recently, two popular CNN frameworks for human shape detection and segmentation have stood out: OpenPose (Cao et al., 2017; Wei et al., 2016) and Mask R-CNN (He et al., 2017). However, experimental tests showed that the original implementations are not suitable for mobile devices. Nevertheless, these frameworks are the basis for more recent implementations that do enable use on mobile devices. One approach that achieves full-body pose estimation and segmentation is Mask R-CNN2Go (Jindal, 2018), based on the original Mask R-CNN structure. The main reason for its reduced processing time is the optimization of the number of convolution layers and the width of each layer. Another approach to achieving human pose estimation on mobile devices was the modification of the original OpenPose architecture for mobile (Kim, 2018; Solano, 2018) and its combination with MobileNets (Howard et al., 2017). MobileNets, as the name suggests, is designed for mobile applications, making use of depthwise separable convolution layers. This modification reduces processing time, but also reduces pose estimation accuracy when compared to the original architecture. It is important to note that, although person detection with clothes overlapping is a current topic, applications are already available on the market, such as Pozus (GENTLEMINDS, 2018).
Pozus is available as a beta version that runs on the iOS operating system; it uses the mobile phone's camera as input for human pose estimation, applying texture segments over the human body. However, Pozus does not fit textures (clothes) to the person's shape. In this thesis, the OpenPose model was used to determine the body joints, and different approaches were used for clothes overlapping while a person moves in real environments. The first approach uses the GrabCut algorithm (Rother et al., 2004) for person segmentation, allowing clothes segments to be fitted. A second approach uses a bi-dimensional (2D) skeletal animation tool to allow deformations of 2D textures according to the estimated poses. The third approach is similar to the previous one, but uses 3D models (volumes) to achieve a more realistic simulation of the clothes superimposition process. Results consistent with a proof of concept are shown. The tests revealed that, as future work, optimizations to improve the accuracy of the pose estimation model and its execution time are still needed for mobile devices. The final method used to overlay clothes on the body showed positive results, as it enabled a more realistic simulation of the clothes superimposition process.

    When it comes to visitors at museums and heritage places, objects speak for themselves. Nevertheless, it is important to give visitors the best experience possible, as this will increase the number of visits and enhance the perception and value of the organization. With the aim of enhancing a traditional museum visit, a mobile Augmented Reality (AR) framework is being developed as part of the Mobile Five Senses Augmented Reality (M5SAR) project.
This thesis presents an initial approach to human shape detection and AR content superimposition in a mobile environment, achieved by combining information about human body joints with clothes overlapping methods. Existing systems related to clothes overlapping that allow the user to move freely are based mainly on three-dimensional (3D) sensors (e.g., the Kinect sensor (Erra et al., 2018)), making them far from portable. The contribution of this thesis is a portable system that allows the user to move freely and performs full-body clothes overlapping. The OpenPose model (Kim, 2018; Solano, 2018) was used to compute the body joints, and different approaches were used for clothes overlapping while a person is moving in real environments. The first approach uses the GrabCut algorithm (Rother et al., 2004) for person segmentation, allowing clothes segments to be fitted. A second approach uses a bi-dimensional (2D) skeletal animation tool to allow deformations of 2D textures according to the estimated poses. The third approach is similar to the previous one, but uses 3D clothes models (volumes) to achieve a more realistic simulation of the clothes superimposition process. Results and a proof of concept are shown.
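The skeletal-animation approach above amounts to warping a clothes texture so that its anchor points land on the joints estimated by the pose model. A minimal sketch of that step, for one pair of anchors (the function names and all coordinates are hypothetical, not taken from the thesis):

```python
def similarity_transform(src_a, src_b, dst_a, dst_b):
    """Return the 2D similarity (complex multiplier m, offset t) that maps
    segment src_a->src_b onto dst_a->dst_b."""
    s = complex(*src_b) - complex(*src_a)
    d = complex(*dst_b) - complex(*dst_a)
    m = d / s                          # encodes scale and rotation together
    t = complex(*dst_a) - m * complex(*src_a)
    return m, t

def warp_point(p, m, t):
    """Apply the similarity to a 2D point given as an (x, y) tuple."""
    q = m * complex(*p) + t
    return (q.real, q.imag)

# Anchor points on the clothes texture (e.g. the shoulder pins of a shirt
# sprite) and shoulder joints from the pose estimator (illustrative values):
src_left, src_right = (10.0, 20.0), (90.0, 20.0)
dst_left, dst_right = (210.0, 120.0), (370.0, 140.0)

m, t = similarity_transform(src_left, src_right, dst_left, dst_right)
print(warp_point(src_left, m, t))   # → (210.0, 120.0)
```

Every other texture point is carried along by the same transform, so the sprite scales and rotates with the shoulders; the thesis's 2D/3D animation tools additionally deform the texture between joints.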

    A smart home environment to support safety and risk monitoring for the elderly living independently

    The elderly prefer to live independently despite vulnerability to age-related challenges. Constant monitoring is required in cases where the elderly are living alone. The home environment can be dangerous for the elderly living independently due to adverse events that can occur at any time. The potential risks for the elderly living independently can be categorised as injury in the home, home environmental risks, and inactivity due to unconsciousness. The main research objective was to develop a Smart Home Environment (SHE) that can support risk and safety monitoring for the elderly living independently. An unobtrusive and low-cost SHE solution that uses a Raspberry Pi 3 model B, a Microsoft Kinect Sensor and an Aeotec 4-in-1 Multisensor was implemented. The Aeotec Multisensor was used to measure temperature, motion, lighting, and humidity in the home. Data from the multisensor was collected using OpenHAB as the Smart Home Operating System. The information was processed on the Raspberry Pi 3 and push notifications were sent when risk situations were detected. An experimental evaluation was conducted to determine the accuracy with which the prototype SHE detected abnormal events. Each evaluation script was run five times. The results show that the prototype has an average accuracy, sensitivity and specificity of 94%, 96.92% and 88.93% respectively. The sensitivity shows that the chance of the prototype missing a risk situation is 3.08%, and the specificity shows that the chance of incorrectly classifying a non-risk situation is 11.07%. The prototype does not require any interaction on the part of the elderly. Relatives and caregivers can remotely monitor the elderly person living independently via the mobile application or a web portal. The total cost of the equipment used was below R3000.
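The reported sensitivity and specificity figures follow directly from the confusion counts of the detector: sensitivity is the fraction of risk events detected, specificity the fraction of non-risk events correctly passed. A short sketch with purely illustrative counts (not the study's data):

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion counts: tp/fn over risk events,
    tn/fp over non-risk events."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of risk situations detected
    specificity = tn / (tn + fp)   # fraction of non-risk situations passed
    return accuracy, sensitivity, specificity

# Hypothetical counts from an evaluation run:
acc, sens, spec = classification_metrics(tp=96, fn=4, tn=89, fp=11)
miss_rate = 1 - sens          # chance of missing a risk situation
false_alarm_rate = 1 - spec   # chance of flagging a non-risk situation
```

With these counts, sensitivity is 0.96 and specificity 0.89, so the miss rate and false-alarm rate quoted in an abstract are simply the complements of those two figures.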

    Real time physics-based augmented fitting room using time-of-flight cameras

    Ankara: The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013. Thesis (Master's), Bilkent University, 2013. Includes bibliographical references, leaves 63-72. This thesis proposes a framework for a real-time, physically-based augmented cloth fitting environment. The required 3D meshes for the human avatar and apparel are modeled with specific constraints. The models are then animated in real time using input from a user tracked by a depth sensor. A set of motion filters is introduced in order to improve the quality of the simulation. Physical effects such as inertia, external forces and collision are imposed on the apparel meshes. The avatar and the apparel can be customized according to the user. The system runs in real time on a high-end consumer PC with realistic rendering results. Gültepe, Umut. M.S.
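Inertia and external forces of the kind this abstract mentions are commonly imposed on cloth particles via Verlet integration, where a particle's velocity is implicit in the difference between its current and previous positions. A minimal single-particle sketch under that assumption (the thesis's actual solver is not reproduced here):

```python
def verlet_step(pos, prev_pos, accel, dt):
    """One Verlet integration step per coordinate. Inertia comes from the
    (pos - prev_pos) term; accel carries external forces such as gravity."""
    new = [p + (p - q) + a * dt * dt for p, q, a in zip(pos, prev_pos, accel)]
    return new, pos  # new positions; current positions become the previous ones

gravity = [0.0, -9.81]                   # external force per unit mass
pos, prev = [0.0, 1.0], [0.0, 1.0]       # one cloth particle, initially at rest
dt = 1.0 / 60.0                          # a typical real-time step

for _ in range(3):
    pos, prev = verlet_step(pos, prev, gravity, dt)
```

A full cloth simulator would run this over a mesh of particles and then enforce distance constraints between neighbours (and collision constraints against the avatar), but the integration core is this small.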

    Computational Abstraction of Films for Quantitative Analysis of Cinematography

    Currently, film viewers' options for getting objective information about films before watching them are limited. Comparisons are even harder to find and often require extensive film knowledge from both the author and the reader. Such comparisons are inherently subjective, and therefore limit the possibilities for scalable and effective statistical analyses. Apart from trailers, information about films cannot reach viewers audibly or visibly, which seems absurd considering the very nature of film. The thesis examines repeatable quantification methods for computationally abstracting films in order to extract informative data for visualizations and further statistical analyses. The theoretical background, empowered by a multidisciplinary approach, and the design processes are described. Visualizations of the analyses are provided and evaluated for their accuracy and efficiency. Throughout the thesis, foundations for a future automated quantification player/plugin are described, aiming to facilitate further developments. The theoretical structure of a website which may act as a gateway that collects and provides data for statistical cinematic research is also discussed.
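One repeatable quantification of the kind such a thesis relies on is shot-boundary detection, typically done by thresholding the distance between colour histograms of consecutive frames. A minimal sketch over synthetic grayscale "frames" (the bin count and threshold are illustrative choices, not the thesis's parameters):

```python
def histogram(frame, bins=8, max_val=256):
    """Count pixel intensities of a flat grayscale frame into coarse bins."""
    h = [0] * bins
    for v in frame:
        h[v * bins // max_val] += 1
    return h

def hist_distance(h1, h2):
    """L1 distance between two histograms, normalised by frame size."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / sum(h1)

def find_cuts(frames, threshold=0.5):
    """Report frame indices where the histogram jumps past the threshold."""
    cuts = []
    for i in range(1, len(frames)):
        if hist_distance(histogram(frames[i - 1]), histogram(frames[i])) > threshold:
            cuts.append(i)
    return cuts

# Synthetic footage: a dark "shot" followed by a bright one
dark = [20] * 100
bright = [230] * 100
cuts = find_cuts([dark, dark, bright, bright])  # → [2]
```

From the detected cuts, shot count and average shot length follow directly, which is exactly the kind of objective, comparable statistic the abstract argues for.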

    Study on recognition of facial expressions of affect

    Facial expression recognition is a particularly interesting field of computer vision, since it brings innumerable benefits to society, benefits that translate into a large number of applications in subjects such as neuroscience, psychology and computer science. The relevance of the topic is reflected in the vast literature already produced, which describes notable progress. However, the development and advancement of new approaches still faces multiple challenges, including head-pose variations, illumination variations, identity bias, occlusions, and registration errors. One focus in this field is to achieve similar results when moving from a controlled environment to a more naturalistic scenario. Although facial expression recognition has been addressed in many different projects, it is worth drawing attention to the design of an interface that addresses patient engagement in healthcare, since a rising tendency to engage patients in their own healthcare has been noticed. Some open questions still need to be answered for this work to make a significant impact on healthcare.

    Automated Tracking of Hand Hygiene Stages

    The European Centre for Disease Prevention and Control (ECDC) estimates that 2.5 million cases of Hospital Acquired Infections (HAIs) occur each year in the European Union. Hand hygiene is regarded as one of the most important preventive measures for HAIs. If it is implemented properly, hand hygiene can reduce the risk of cross-transmission of an infection in the healthcare environment. Good hand hygiene is not only important for healthcare settings. The recent ongoing coronavirus pandemic has highlighted the importance of hand hygiene practices in our daily lives, with governments and health authorities around the world promoting good hand hygiene practices. The WHO has published guidelines on hand hygiene stages to promote good hand washing practices. A significant amount of existing research has focused on the problem of tracking hands to enable hand gesture recognition. In this work, gesture tracking devices and image processing are explored in the context of the hand washing environment. Hand washing videos of professional healthcare workers were carefully observed and analyzed in order to recognize hand features associated with hand hygiene stages that could be extracted automatically. Selected hand features such as palm shape (flat or curved), palm orientation (palms facing or not) and hand trajectory (linear or circular movement) were then extracted and tracked with the help of a 3D gesture tracking device, the Leap Motion Controller. These features were further coupled together to detect the execution of a required WHO hand hygiene stage, "Rub hands palm to palm", with the help of the Leap sensor in real time. In certain conditions, the Leap Motion Controller enables a clear distinction to be made between the left and right hands.
    However, whenever the two hands came into contact with each other, sensor data from the Leap, such as palm position and palm orientation, was lost for one of the two hands. Hand occlusion was found to be a major drawback in the application of the device to this use case. Therefore, RGB digital cameras were selected for further processing and tracking of the hands. An image processing technique, using a skin detection algorithm, was applied to extract instantaneous hand positions for further processing, to enable various hand hygiene poses to be detected. Contour and centroid detection algorithms were further applied to track the hand trajectory in hand hygiene video recordings. In addition, feature detection algorithms were applied to a hand hygiene pose to extract useful hand features. The video recordings did not suffer from occlusion, as is the case for the Leap sensor, but the segmentation of one hand from another was identified as a major challenge with images, because contour detection resulted in a continuous mass when the two hands were in contact. For future work, the data from gesture trackers, such as the Leap Motion Controller, and cameras (with image processing) could be combined to make a robust hand hygiene gesture classification system.
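The skin-detection-plus-centroid pipeline described above can be sketched in a few lines. The RGB rule below is one widely used heuristic for skin pixels, not necessarily the thesis's exact thresholds, and the frame is synthetic:

```python
def is_skin(r, g, b):
    """A common RGB skin heuristic (illustrative; real systems tune this)."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_centroid(image):
    """image: rows of (r, g, b) pixels; returns the (x, y) centroid of the
    skin mask, or None when no skin pixel is found."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if is_skin(r, g, b):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# A tiny synthetic frame: a skin-toned block on a grey background
skin, grey = (200, 140, 110), (80, 80, 80)
frame = [[grey] * 4 for _ in range(4)]
frame[1][1] = frame[1][2] = frame[2][1] = frame[2][2] = skin
print(skin_centroid(frame))  # → (1.5, 1.5)
```

Tracking the centroid across frames gives the hand trajectory; the single-blob limitation the abstract notes appears here too, since touching hands merge into one mask with one centroid.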

    ADMarker: A Multi-Modal Federated Learning System for Monitoring Digital Biomarkers of Alzheimer's Disease

    Alzheimer's Disease (AD) and related dementias are a growing global health challenge due to the aging population. In this paper, we present ADMarker, the first end-to-end system that integrates multi-modal sensors and new federated learning algorithms for detecting multidimensional AD digital biomarkers in natural living environments. ADMarker features a novel three-stage multi-modal federated learning architecture that can accurately detect digital biomarkers in a privacy-preserving manner. Our approach collectively addresses several major real-world challenges, such as limited data labels, data heterogeneity, and limited computing resources. We built a compact multi-modality hardware system and deployed it in a four-week clinical trial involving 91 elderly participants. The results indicate that ADMarker can accurately detect a comprehensive set of digital biomarkers with up to 93.8% accuracy and identify early AD with an average of 88.9% accuracy. ADMarker offers a new platform that can allow AD clinicians to characterize and track the complex correlation between multidimensional interpretable digital biomarkers, demographic factors of patients, and AD diagnosis in a longitudinal manner.
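At the core of federated learning setups like the one described is an aggregation step in which only model parameters, never raw sensor data, leave the clients. A minimal FedAvg-style sketch (the paper's three-stage architecture is more elaborate; the clients and weights here are hypothetical):

```python
def fed_avg(client_weights, client_sizes):
    """Federated averaging: each client's flattened model parameters are
    weighted by its local dataset size, then averaged on the server."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two hypothetical clients (e.g. two deployed home devices) with
# flattened model parameters after local training:
w1, w2 = [0.0, 2.0], [4.0, 6.0]
global_w = fed_avg([w1, w2], client_sizes=[30, 10])  # → [1.0, 3.0]
```

The server broadcasts `global_w` back to the clients for the next round; privacy comes from the fact that only these aggregated parameters are ever transmitted.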

    Designing a Contactless, AI System to Measure the Human Body using a Single Camera for the Clothing and Fashion Industry

    Using a single RGB camera to obtain accurate body dimensions, rather than measuring these manually or via more complex multi-camera setups or more expensive 3D scanners, has high application potential for the apparel industry. In this thesis, a system is presented that estimates upper human body measurements using a set of computer vision and machine learning techniques. The main steps involve: (1) using a portable camera; (2) improving image quality; (3) isolating the human body from the surrounding environment; (4) performing a calibration step; (5) extracting body features from the image; (6) indicating markers on the image; (7) producing refined final results. In this research, a single geometric shape, the ellipse, is favored to approximate the main cross sections of the human body. We focus on the upper-body horizontal slices (i.e., from head to hips) which, we show, can be well represented by varying an ellipse's eccentricity for each individual. Evaluating each fitted ellipse's perimeter then allows us to obtain better results than the current state-of-the-art for use in the fashion and online retail industry. In this study, a set of two equations was selected, out of many other possible choices, to best estimate upper human body horizontal cross sections via the perimeters of fitted ellipses, and the system was evaluated on a diverse sample of 78 participants. The results for the upper human body measurements, compared against the traditional manual method of tape measurements as a reference, show ±1 cm average differences, sufficient for many applications, including online retail.
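The perimeter of an ellipse has no closed form, which is why such a system must pick approximation equations. Ramanujan's first approximation is one standard choice, shown here as an illustration (the thesis's own pair of equations is not reproduced):

```python
import math

def ellipse_perimeter(a, b):
    """Ramanujan's first approximation to an ellipse's perimeter for
    semi-axes a and b; exact when a == b (a circle)."""
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))

# Sanity check: a circle of radius 10 gives 2*pi*10
print(ellipse_perimeter(10, 10))  # ≈ 62.83
# A chest-like cross section: wide and shallow (hypothetical semi-axes, cm)
print(round(ellipse_perimeter(16, 11), 1))
```

Given semi-axes recovered from image width plus a calibrated depth-to-width eccentricity per individual, the perimeter stands in for the tape measurement of that body slice.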