
    Prediction of Visual Behaviour in Immersive Contents

    In the world of broadcasting and streaming, multi-view video provides the ability to present multiple perspectives of the same video sequence, thereby giving the viewer a sense of immersion in the real-world scene. It can be compared to VR and 360° video, but there are significant differences, notably in the way images are acquired: instead of placing the user at the center and presenting the scene around the user in a 360° circle, it uses multiple cameras placed in a 360° circle around the real-world scene of interest, capturing all possible perspectives of that scene. Additionally, in contrast to VR, it uses natural video sequences and displays. One issue that plagues content streaming of all kinds is the bandwidth requirement, which, particularly in VR and multi-view applications, translates into a higher required data transmission rate. A possible solution for lowering the required bandwidth is to limit the number of views streamed in full, focusing on those surrounding the area at which the user is looking. This is proposed by SmoothMV, a multi-view system that uses a non-intrusive head-tracking approach to enhance navigation and the Quality of Experience (QoE) of the viewer. The system relies on a novel "Hot&Cold" matrix concept to translate head-positioning data into viewing-angle selections. The main goal of this dissertation is the transformation and storage of the data acquired with SmoothMV into datasets. These are used as training data for a proposed neural network, fully integrated within SmoothMV, whose purpose is to predict the users' points of interest on the screen during playback of multi-view content. The goal behind this effort is to predict the user's likely viewing interests in the near future and to optimize bandwidth usage by buffering adjacent views that the user could request. After the datasets are built, the dissertation focuses on formulating a solution to present generated heatmaps of the most viewed areas per video, previously captured using SmoothMV.
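
    The "Hot&Cold" matrix and the prediction network are specific to SmoothMV and are not detailed above, but the bandwidth-saving idea itself can be illustrated with a small sketch. The following Python snippet is hypothetical and not part of SmoothMV; the camera count and prefetch radius are assumed values. It quantizes a head-yaw reading to the nearest view on a 360° camera circle and lists the adjacent views a client might buffer.

    NUM_VIEWS = 16            # assumed: cameras evenly spaced on a 360-degree circle
    PREFETCH_RADIUS = 1       # assumed: buffer one extra view on each side

    def yaw_to_view(yaw_degrees: float, num_views: int = NUM_VIEWS) -> int:
        """Quantize a head yaw angle (degrees) to the nearest camera index."""
        step = 360.0 / num_views
        return int(round((yaw_degrees % 360.0) / step)) % num_views

    def views_to_buffer(active_view: int, radius: int = PREFETCH_RADIUS,
                        num_views: int = NUM_VIEWS) -> list[int]:
        """Return the active view plus its neighbours, wrapping around the circle."""
        return [(active_view + offset) % num_views
                for offset in range(-radius, radius + 1)]

    yaw = 95.0                                   # example head-tracking sample
    view = yaw_to_view(yaw)                      # -> 4 with 16 views (22.5-degree steps)
    print("active view:", view)
    print("views to buffer:", views_to_buffer(view))   # [3, 4, 5]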

    Real Time Facial Expression Recognition Using Webcam and SDK Affectiva

    Facial expression is an essential part of communication. For this reason, the evaluation of human emotions by computer is a very interesting topic, which has gained more and more attention in recent years. This is mainly due to the possibility of applying facial expression recognition in many fields such as HCI, video games, virtual reality, and analysing customer satisfaction. Emotion determination (the recognition process) is usually performed in three basic phases: face detection, facial feature extraction, and, in the last stage, expression classification. Most often one encounters the so-called Ekman classification of 6 emotional expressions (or 7, with the neutral expression), as well as other types of classification such as the Russell circular model, which contains up to 24 emotions, or Plutchik's Wheel of Emotions. The methods used in the three phases of the recognition process have not only improved over the last 60 years; new methods and algorithms, such as the Viola-Jones detector, have also emerged, offering greater accuracy and lower computational demands. As a result, various solutions are currently available in the form of a Software Development Kit (SDK). In this publication, we describe the design and creation of our system for real-time emotion classification. Our intention was to create a system that covers all three phases of the recognition process and works quickly and stably in real time. That is why we decided to take advantage of the existing Affectiva SDK. Using a standard webcam, we detect facial landmarks in the image automatically with the Affectiva SDK. A geometric feature-based approach is used for feature extraction: the distances between landmarks serve as features, and a brute-force method is used to select an optimal feature set. The proposed system uses a neural network for classification and recognizes 6 (respectively 7) facial expressions, namely anger, disgust, fear, happiness, sadness, surprise, and neutral. We do not want to report only the success rate of our solution; we also want to describe how we obtained these measurements, the results we achieved, and how these results have significantly influenced our future research direction.
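
    As a rough illustration of the feature pipeline described above (landmark distances as features, a neural network as classifier), the Python sketch below uses randomly generated stand-in data. The landmark count, data shapes and classifier settings are assumptions, the Affectiva SDK call that would supply real landmarks is omitted, and the brute-force feature-subset search is left out for brevity.

    from itertools import combinations
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def pairwise_distances(landmarks: np.ndarray) -> np.ndarray:
        """landmarks: (n_points, 2) array of (x, y); returns the distance of every pair."""
        return np.array([np.linalg.norm(landmarks[i] - landmarks[j])
                         for i, j in combinations(range(len(landmarks)), 2)])

    rng = np.random.default_rng(0)
    faces = rng.random((200, 34, 2))          # stand-in for landmarks of 200 face images
    labels = rng.integers(0, 7, size=200)     # 7 classes, including neutral

    X = np.stack([pairwise_distances(face) for face in faces])
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    clf.fit(X, labels)
    print(clf.predict(X[:5]))                 # predicted expression labels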

    MIRACLE Handbook: Guidelines for Mixed Reality Applications for Culture and Learning Experiences

    Transferred from Doria

    PLANR: Planar Learning Autonomous Navigation Robot

    PLANR is a self-contained robot capable of mapping a space and generating 2D floor plans of a building while identifying objects of interest. It runs the Robot Operating System (ROS) and houses four main hardware components. An Arduino Mega board handles navigation, while an NVIDIA Jetson TX2 holds most of the processing power and runs ROS. An Orbbec Astra Pro stereoscopic camera is used for recognition of doors, windows, and outlets, and the RPLiDAR A3 laser scanner provides depth for wall detection and dimension measurements. The robot is intended to operate autonomously, without constant human monitoring or intervention. The user is responsible for booting up the robot and extracting the map via SSH before shutting it down.
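
    As an aside on the dimension-measurement idea, the toy Python sketch below (not PLANR's code; the scan layout and room size are invented) shows how two opposing beams of a 2D laser scan can give a rough wall-to-wall distance. On the robot itself this would be done inside ROS with proper filtering and mapping.

    import math

    def room_dimension(ranges, angle_min, angle_increment, bearing_deg):
        """Sum of the ranges measured at bearing_deg and at bearing_deg + 180 degrees."""
        def range_at(deg):
            idx = round((math.radians(deg) - angle_min) / angle_increment)
            return ranges[idx % len(ranges)]
        return range_at(bearing_deg) + range_at(bearing_deg + 180.0)

    # Toy scan: 360 beams, 1 degree apart; ranges simplified so that beams near 90/270
    # degrees see walls 3 m away and the remaining beams see walls 2 m away.
    toy_scan = [3.0 if 45 <= d < 135 or 225 <= d < 315 else 2.0 for d in range(360)]
    print(room_dimension(toy_scan, angle_min=0.0,
                         angle_increment=math.radians(1.0),
                         bearing_deg=90.0))    # ~6.0 m between the opposing walls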

    Automatic 3D human modeling: an initial stage towards 2-way inside interaction in mixed reality

    3D human models play an important role in computer graphics applications from a wide range of domains, including education, entertainment, medical care simulation and military training. In many situations, we want the 3D model to have a visual appearance that matches that of a specific living person and to be able to be controlled by that person in a natural manner. Among other uses, this approach supports the notion of human surrogacy, where the virtual counterpart provides a remote presence for the human who controls the virtual character's behavior. In this dissertation, a human modeling pipeline is proposed for the problem of creating a 3D digital model of a real person. Our solution involves reshaping a 3D human template with a 2D contour of the participant and then mapping the captured texture of that person to the generated mesh. Our method produces an initial contour of a participant by extracting the user image from a natural background. One particularly novel contribution in our approach is the manner in which we improve the initial vertex estimate. We do so through a variant of the ShortStraw corner-finding algorithm commonly used in sketch-based systems. Here, we develop improvements to ShortStraw, presenting an algorithm called IStraw, and then introduce adaptations of this improved version to create a corner-based contour segmentation algorithm. This algorithm provides significant improvements in contour matching over previously developed systems, and does so with low computational complexity. The system presented here advances the state of the art in the following aspects. First, the human modeling process is triggered automatically by matching the participant's pose with an initial pose through a tracking device and software. In our case, the pose capture and skeletal model are provided by the Microsoft Kinect and its associated SDK. Second, color image, depth data, and human tracking information from the Kinect and its SDK are used to automatically extract the contour of the participant and then generate a 3D human model with a skeleton. Third, using the pose and the skeletal model, we segment the contour into eight parts and then match the contour points on each segment to a corresponding anchor set associated with a 3D human template. Finally, we map the color image of the person to the 3D model as its corresponding texture map. The whole modeling process takes only a few seconds, and the resulting human model looks like the real person. The geometry of the 3D model matches the contour of the real person, and the model has a photorealistic texture. Furthermore, the mesh of the human model is attached to the skeleton provided in the template, so the model can support programmed animations or be controlled by real people. This human control is commonly done through a literal mapping (motion capture) or a gesture-based puppetry system. Our ultimate goal is to create a mixed reality (MR) system in which the participants can manipulate virtual objects, and in which these virtual objects can affect the participant, e.g., by restricting their mobility. This MR system prototype design motivated the work of this dissertation, since a realistic 3D human model of the participant is an essential part of implementing this vision.
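
    The IStraw refinements are specific to the dissertation, but the underlying ShortStraw idea they build on can be sketched briefly. In the hypothetical Python snippet below, each interior point of a resampled contour gets a "straw" (the chord length between the points a fixed window before and after it), and points whose straw is a local minimum below a threshold are reported as corner candidates; the window size and threshold factor are the commonly cited ShortStraw defaults, assumed here.

    import math
    from statistics import median

    def straws(points, w=3):
        """Chord length between the points w positions before and after each interior point."""
        return {i: math.dist(points[i - w], points[i + w])
                for i in range(w, len(points) - w)}

    def corner_candidates(points, w=3, factor=0.95):
        """Indices whose straw is a local minimum below factor * median straw."""
        s = straws(points, w)
        threshold = factor * median(s.values())
        return [i for i in s
                if s[i] < threshold
                and s.get(i - 1, float("inf")) >= s[i] <= s.get(i + 1, float("inf"))]

    # Toy resampled contour: an L-shaped polyline whose bend sits at index 10.
    pts = [(float(x), 0.0) for x in range(11)] + [(10.0, float(y)) for y in range(1, 11)]
    print(corner_candidates(pts))    # -> [10], the corner of the L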

    Measuring attention using Microsoft Kinect

    The transfer of knowledge between individuals has increasingly been achieved with the aid of interfaces or computerized training applications. However, computer-based training currently lacks the ability to monitor human behavioral changes and respond to them accordingly. This study examines the ability to predict user attention using features of body posture and head pose. Predictive ability is assessed by analysing the relationship between the measured posture features and common objective measures of attention, such as reaction time and reaction time variance. Subjects were asked to participate in a series of sustained attention tasks while aspects of body movement and positioning were recorded using a Microsoft Kinect. Results showed support for identifiable patterns of behavior associated with attention, while also suggesting a complex inter-relationship among the measured features and a susceptibility of these features to environmental conditions.
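
    The study's actual posture features and analysis are not reproduced here, but the kind of relationship it examines can be sketched with invented numbers: the hypothetical Python snippet below correlates a per-block posture summary with mean reaction time and reaction-time variance, the objective attention measures named above.

    import numpy as np

    rng = np.random.default_rng(1)

    # Invented per-block summaries for 20 sustained-attention task blocks.
    head_pitch_sd = rng.uniform(1.0, 8.0, size=20)             # posture variability (degrees)
    reaction_times = [rng.normal(0.45, 0.02 + 0.01 * sd, 60)   # 60 simulated trials (seconds)
                      for sd in head_pitch_sd]

    rt_mean = np.array([rt.mean() for rt in reaction_times])
    rt_var = np.array([rt.var(ddof=1) for rt in reaction_times])

    print("corr(posture, mean RT):    ", np.corrcoef(head_pitch_sd, rt_mean)[0, 1])
    print("corr(posture, RT variance):", np.corrcoef(head_pitch_sd, rt_var)[0, 1])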

    An Interactive Augmented Reality Alphabet 3-Dimensional Pop-up Book for Learning and Recognizing the English Alphabet

    This document describes the process of developing an Augmented Reality (AR) alphabet book mobile application. Using only an Android phone camera, a child can view superimposed virtual 3-dimensional alphabet objects in a fun and interactive manner, using the marker-less physical alphabet book as the interaction tool. The reason behind choosing alphabet teaching as the topic of the book is that alphabet knowledge is the core knowledge of any language. It is a jump-start for children to begin reading and recognizing words and sentences, so learning the alphabet is extremely important; many researchers emphasize how early childhood education shapes a child's successful future. Although there are a great many technology-based alphabet books, parents still prefer buying old-style physical books, or some might use a purely virtual, technology-based book application. The problem is that although the physical book possesses many benefits that our generation and the generations before us have experienced, children of the current generation may in fact find it dull and boring. It is commonly recognized that today's children are surrounded by technology and gadgets, which can leave them bored and easily distracted; they may refuse to willingly use a plain, non-technological book to learn, and if they use a purely virtual application, they lose the benefits offered by a physical book. The use of Augmented Reality should solve this problem, for AR is considered the best of both worlds: real and virtual objects are combined in the real environment, which allows the use of both a technology-based application and a traditional physical book, combining the benefits of both and meeting the child and the parent midway. Although AR technology is not new, its potential in education is only beginning to be investigated. The main aim of this research is to develop an interactive 3-dimensional alphabet pop-up book that uses digital storytelling to help teach children to learn and recognize the alphabet. The objectives of the study are to enhance the interactions of the alphabet book by creating an Android application that contains animated, interactive 3-dimensional models, interactive sounds, songs and music, and, furthermore, to investigate the use of digital storytelling (music, sounds), interactions and animation effects on learning engagement through the use of augmented reality technology. The scope of this project and research is very wide: it includes 3D modeling, texturing, rigging and animation, book design and content-decision research, and, furthermore, the Augmented Reality and Android application.