
    Prediction of Visual Behaviour in Immersive Contents

    In the world of broadcasting and streaming, multi-view video provides the ability to present multiple perspectives of the same video sequence, giving the viewer a sense of immersion in the real-world scene. It is comparable to VR and 360° video, but there are significant differences, notably in the way images are acquired: instead of placing the user at the center and presenting the scene around the user in a 360° circle, it uses multiple cameras placed in a 360° circle around the real-world scene of interest, capturing all possible perspectives of that scene. Additionally, unlike VR, it uses natural video sequences and displays. One issue that plagues content streaming of all kinds is the bandwidth requirement which, particularly in VR and multi-view applications, translates into an increase in the required data transmission rate. A possible solution to lower the required bandwidth is to limit the number of views streamed in full, focusing on those surrounding the area at which the user is looking. This is the approach of SmoothMV, a multi-view system that uses a non-intrusive head-tracking approach to enhance navigation and the viewer's Quality of Experience (QoE). The system relies on a novel "Hot&Cold" matrix concept to translate head-positioning data into viewing-angle selections. The main goal of this dissertation is the transformation and storage of the data acquired with SmoothMV into datasets. These will be used as training data for a proposed neural network, fully integrated within SmoothMV, whose purpose is to predict users' points of interest on the screen during the playback of multi-view content. The aim of this effort is to predict the user's likely viewing interests in the near future and to optimize bandwidth usage by buffering adjacent views that the user may request. After the development of this dataset is concluded, work in this dissertation focuses on formulating a solution to present generated heatmaps of the most viewed areas per video, previously captured using SmoothMV.
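
    The view-selection and buffering idea described above can be illustrated with a small sketch. Assuming the views are captured by cameras spaced evenly on a 360° circle, the code below maps a head yaw angle to the nearest view index and lists the adjacent views that could be buffered; the function names, the number of cameras, and the buffering radius are illustrative assumptions, not part of SmoothMV itself.

        # Illustrative sketch (not the SmoothMV implementation): map a head yaw
        # angle to one of N evenly spaced camera views and pick the neighbouring
        # views to prefetch, so they are already buffered if the user turns.

        def select_view(yaw_deg: float, num_views: int = 12) -> int:
            """Map a yaw angle in degrees to the index of the nearest camera view."""
            step = 360.0 / num_views
            return int(round((yaw_deg % 360.0) / step)) % num_views

        def views_to_buffer(current_view: int, num_views: int = 12, radius: int = 1) -> list[int]:
            """Return the current view plus its neighbours within `radius` positions."""
            return [(current_view + offset) % num_views for offset in range(-radius, radius + 1)]

        # Example: a yaw of 95 degrees with 12 cameras selects view 3 and buffers views 2-4.
        view = select_view(95.0)
        print(view, views_to_buffer(view))  # 3 [2, 3, 4]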

    Real-time human detection from depth images with heuristic approach

    The first industrial robot was built in the mid-20th century. The idea of industrial robots was to replace humans on assembly lines, where the tasks were repetitive and easy to do. The benefit of these robots is that they can work around the clock and need only electricity as compensation. Over the years, robots capable of only doing repetitive tasks have evolved to operate fully autonomously in challenging environments; self-driving cars and service robots that work as customer servants are examples. This is mainly accomplished through advancements in artificial intelligence, machine vision, and depth camera technologies. With machine vision and depth perception, robots are able to construct a fully structured model of the environment around them, which allows them to react properly to sudden changes in their surroundings. In this project, a naive detection algorithm was implemented to separate humans from depth images. The algorithm works by removing the ground plane, after which the floating objects can be separated more easily. The floating objects are further processed, and the human detection step is then performed using a heuristic approach. The proposed algorithm works in real time and reliably detects people standing in a relatively open environment. However, because of the naive approach, human-sized items are wrongly detected as humans in some scenarios.
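
    A minimal sketch of the pipeline described above, assuming the depth image has already been converted into a per-pixel height map (height above the floor in metres); the threshold values and the use of NumPy/SciPy connected-component labelling are assumptions for illustration, not the thesis implementation.

        # Illustrative sketch (assumed, not the thesis code): remove the ground
        # plane, label the remaining "floating" blobs, and keep those whose size
        # is plausible for a standing person.
        import numpy as np
        from scipy import ndimage

        def detect_humans(height_map: np.ndarray, ground_tol: float = 0.10,
                          min_height: float = 1.2, max_height: float = 2.2,
                          min_pixels: int = 500):
            """height_map: per-pixel height above the floor in metres (floor ~ 0)."""
            # 1. Remove the ground plane: keep only pixels clearly above the floor.
            above_ground = height_map > ground_tol
            # 2. Separate the remaining objects with connected-component labelling.
            labels, _ = ndimage.label(above_ground)
            candidates = []
            for idx, region in enumerate(ndimage.find_objects(labels), start=1):
                mask = labels[region] == idx
                top = height_map[region][mask].max()
                # 3. Heuristic: the blob's top must lie within a human height range
                #    and the blob must be large enough not to be sensor noise.
                if min_height <= top <= max_height and mask.sum() >= min_pixels:
                    candidates.append(region)
            return candidates  # bounding-box slices of the detected candidates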

    Navigation assistant with a laser pointer for driving robotized wheelchairs (Assistente de navegação com apontador laser para conduzir cadeiras de rodas robotizadas)

    Advisor: Eric Rohmer. Master's dissertation, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Assistive robotics solutions help people recover mobility and autonomy lost in their daily lives. This document presents a low-cost navigation assistant designed to let people paralyzed from the neck down drive a robotized wheelchair using a combination of head posture and facial expressions (smile and eyebrows up) to send commands to the chair. The assistant provides two navigation modes: manual and semi-autonomous. In manual navigation, a regular webcam with the OpenFace algorithm detects the user's head orientation and facial expressions (smile, eyebrows up) to compose commands and act directly on the wheelchair movements (stop, go forward, turn right, turn left). In the semi-autonomous mode, the user controls a pan-tilt laser with his/her head to point at the desired destination on the ground and validates it with the eyebrows-up command, which makes the robotized wheelchair perform a rotation followed by a linear displacement to the chosen target. Although the assistant needs improvement, the results have shown that this solution may be a promising technology for people paralyzed from the neck down to control a robotized wheelchair.
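
    The semi-autonomous step described above (a rotation followed by a linear displacement to the laser-pointed target) can be sketched as simple planar geometry. The function name and the minimal kinematic model below are assumptions for illustration, not the dissertation's code.

        # Illustrative sketch (assumed): given the wheelchair pose and a target
        # point on the ground, compute the in-place rotation and the straight-line
        # distance needed to reach the target.
        import math

        def rotate_then_translate(x: float, y: float, heading_rad: float,
                                  target_x: float, target_y: float):
            """Return (rotation in radians, forward distance in metres)."""
            dx, dy = target_x - x, target_y - y
            distance = math.hypot(dx, dy)
            # Rotation needed, wrapped to (-pi, pi] so the chair turns the short way.
            rotation = math.atan2(dy, dx) - heading_rad
            rotation = math.atan2(math.sin(rotation), math.cos(rotation))
            return rotation, distance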

    Requirement analysis and sensor specifications – First version

    In this first version of the deliverable, we make the following contributions: to design the WEKIT capturing platform and the associated experience capturing API, we use a methodology for system engineering that is relevant for different domains (aviation, space, and medicine) and different professions (technicians, astronauts, and medical staff). Furthermore, within the methodology we explore the system engineering process and how it can be used in the project to support the different work packages and, more importantly, the different deliverables that will follow the current one. Next, we provide a mapping of high-level functions or tasks (associated with experience transfer from expert to trainee) to low-level functions such as gaze, voice, video, body posture, hand gestures, bio-signals, fatigue levels, and the location of the user in the environment. In addition, we link the low-level functions to their associated sensors. Moreover, we provide a brief overview of the state-of-the-art sensors in terms of their technical specifications, possible limitations, standards, and platforms. We outline a set of recommendations for the sensors that are most relevant for the WEKIT project, taking into consideration the environmental, technical, and human factors described in other deliverables. We recommend the Microsoft HoloLens (for augmented reality glasses), the MyndBand with NeuroSky chipset (for EEG), the Microsoft Kinect and Lumo Lift (for body posture tracking), and the Leap Motion, Intel RealSense, and Myo armband (for hand gesture tracking). For eye tracking, an existing eye-tracking system can be customised to complement the augmented reality glasses, and the built-in microphone of the augmented reality glasses can capture the expert's voice. We propose a modular approach for the design of the WEKIT experience capturing system and recommend that the capturing system have sufficient storage or transmission capabilities. Finally, we highlight common issues associated with the use of different sensors. We consider that this set of recommendations can be useful for the design and integration of the WEKIT capturing platform and the WEKIT experience capturing API, expediting the selection of the combination of sensors that will be used in the first prototype.
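
    As a compact illustration of the function-to-sensor mapping recommended above, the lookup table below lists the recommended sensors per low-level capture function; the data structure and key names are assumptions, while the sensor choices come from the recommendations in the deliverable.

        # Illustrative sketch: the deliverable's sensor recommendations expressed
        # as a lookup table keyed by low-level capture function.
        RECOMMENDED_SENSORS = {
            "ar_display":    ["Microsoft HoloLens"],
            "eeg":           ["MyndBand (NeuroSky chipset)"],
            "body_posture":  ["Microsoft Kinect", "Lumo Lift"],
            "hand_gestures": ["Leap Motion", "Intel RealSense", "Myo armband"],
            "gaze":          ["eye tracker customised for the AR glasses"],
            "voice":         ["built-in microphone of the AR glasses"],
        }

        def sensors_for(function: str) -> list[str]:
            """Look up the recommended sensors for a low-level capture function."""
            return RECOMMENDED_SENSORS.get(function, [])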

    MoPeDT: A Modular Head-Mounted Display Toolkit to Conduct Peripheral Vision Research

    Peripheral vision plays a significant role in human perception and orientation. However, its relevance for human-computer interaction, especially with head-mounted displays, has not been fully explored yet. In the past, a few specialized appliances were developed to display visual cues in the periphery, each designed for a single specific use case only. A multi-purpose headset to exclusively augment peripheral vision did not yet exist. We introduce MoPeDT: Modular Peripheral Display Toolkit, a freely available, flexible, reconfigurable, and extendable headset for conducting peripheral vision research. MoPeDT can be built with a 3D printer and off-the-shelf components. It features multiple spatially configurable near-eye display modules and full 3D tracking inside and outside the lab. With our system, researchers and designers may easily develop and prototype novel peripheral vision interaction and visualization techniques. We demonstrate the versatility of our headset with several possible applications for spatial awareness, balance, interaction, feedback, and notifications. We conducted a small study to evaluate the usability of the system. We found that participants were largely not irritated by the peripheral cues, but the headset's comfort could be further improved. We also evaluated our system against established heuristics for human-computer interaction toolkits to show how MoPeDT adapts to changing requirements, lowers the entry barrier for peripheral vision research, and facilitates expressive power through the combination of modular building blocks. Comment: Accepted IEEE VR 2023 conference paper.

    User emotional interaction processor: a tool to support the development of GUIs through physiological user monitoring

    Ever since computers entered humans' daily lives, activity between the human and digital ecosystems has increased. This increase encourages the development of smarter and more user-friendly human-computer interfaces. However, the means of testing these interfaces have been limited, for the most part restricted to the conventional "manual" interface, in which physical input is required: participants testing the interfaces use a keyboard, mouse, or touch screen, and communication between participants and designers is needed. Another method, applied in this dissertation, requires no physical input from participants: Affective Computing. This dissertation presents the development of a tool to support the development of graphical interfaces, based on monitoring psychological and physiological aspects of the user (emotions and attention), with the aim of improving the end user's experience and the ultimate goal of improving the interface design. The development of this tool is described. The results, provided by designers from an IT company, suggest that the tool is useful but that the optimized interface it generates still has some flaws. These flaws are mainly related to the lack of consideration of a general context in the interface generation process.
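
    One way such monitoring data could feed back into interface design is sketched below: per-element attention and emotion-valence samples are aggregated to flag GUI elements that attract attention but provoke negative reactions. The data format, thresholds, and function name are assumptions for illustration, not the dissertation's tool.

        # Illustrative sketch (assumed): flag GUI elements with high attention but
        # negative average emotional valence as candidates for redesign.
        from collections import defaultdict
        from statistics import mean

        def flag_problem_elements(samples, attention_threshold=0.3, valence_threshold=0.0):
            """samples: iterable of (element_id, attention in [0, 1], valence in [-1, 1])."""
            per_element = defaultdict(lambda: {"attention": [], "valence": []})
            for element_id, attention, valence in samples:
                per_element[element_id]["attention"].append(attention)
                per_element[element_id]["valence"].append(valence)

            flagged = []
            for element_id, values in per_element.items():
                if (mean(values["attention"]) > attention_threshold
                        and mean(values["valence"]) < valence_threshold):
                    flagged.append(element_id)
            return flagged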