
    The Evolution of First Person Vision Methods: A Survey

    The emergence of new wearable technologies such as action cameras and smart glasses has increased the interest of computer vision scientists in the first-person perspective. The field is now attracting the attention and investment of companies aiming to develop commercial devices with First Person Vision recording capabilities. Due to this interest, an increasing demand for methods to process these videos, possibly in real time, is expected. Current approaches combine different image features and quantitative methods to accomplish specific objectives such as object detection, activity recognition, and user-machine interaction. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among other things, the most commonly used features, methods, challenges, and opportunities within the field.
    Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-machine Interaction

    Crowdsourcing design guidance for contextual adaptation of text content in augmented reality

    Augmented Reality (AR) can deliver engaging user experiences that seamlessly meld virtual content with the physical environment. However, building such experiences is challenging due to the developer's inability to assess how uncontrolled deployment contexts may influence the user experience. To address this issue, we demonstrate a method for rapidly conducting AR experiments and real-world data collection in the user's own physical environment using a privacy-conscious mobile web application. The approach leverages the large number of distinct user contexts accessible through crowdsourcing to efficiently source diverse context and perceptual preference data. The insights gathered through this method complement emerging design guidance and sample-limited lab-based studies. The utility of the method is illustrated by reexamining the design challenge of adapting AR text content to the user's environment. Finally, we demonstrate how the gathered design insight can be operationalized to provide adaptive text content functionality in an AR headset.
    Funding Information: This work was supported by EPSRC (grants EP/R004471/1 and EP/S027432/1). Supporting data for this publication is available at https://doi.org/10.17863/CAM.62931
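    The abstract does not spell out the adaptation logic, so the following is only a sketch of what environment-adaptive text rendering could look like, not the paper's method: it picks black or white AR text by comparing standard WCAG contrast ratios against a sampled background color. The function names and the black-or-white policy are illustrative assumptions.

        def relative_luminance(rgb):
            """WCAG relative luminance of an sRGB color (components in 0..1)."""
            def lin(c):
                return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
            r, g, b = (lin(c) for c in rgb)
            return 0.2126 * r + 0.7152 * g + 0.0722 * b

        def pick_text_color(background_rgb):
            """Choose black or white text, whichever has the higher WCAG
            contrast ratio against the sampled background patch."""
            l_bg = relative_luminance(background_rgb)
            contrast_white = (1.0 + 0.05) / (l_bg + 0.05)  # white text vs background
            contrast_black = (l_bg + 0.05) / 0.05          # black text vs background
            return (1.0, 1.0, 1.0) if contrast_white >= contrast_black else (0.0, 0.0, 0.0)

        print(pick_text_color((0.2, 0.3, 0.2)))  # dark foliage backdrop -> white text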

    De-augmentation of visual augmented reality

    We anticipate a future in which people frequently have virtual content displayed in their field of view to augment reality. Situations where this virtual content interferes with users' perception of the physical world will thus be more frequent, with consequences ranging from mere annoyance to serious injuries. We argue for the need to give users agency over virtual augmentations, discussing the concept of de-augmenting augmented reality by selectively removing virtual content from the field of view. De-augmenting lets users target what actually interferes with their perception of the environment while keeping what is of interest. We contribute a framework that captures the different facets of de-augmentation. We discuss what it entails in terms of technical realization and interaction design, and end with three scenarios to illustrate what the user experience could be in a sample of domestic and professional situations.
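    The framework itself is not detailed in the abstract; purely as a sketch of one technical facet of selective removal, the snippet below models virtual content with assumed per-augmentation flags, dropping what interferes while keeping what the user has explicitly pinned. All names and the two flags are hypothetical.

        from dataclasses import dataclass

        @dataclass
        class Augmentation:
            """A piece of virtual content rendered in the user's field of view."""
            name: str
            interferes: bool   # does it occlude something the user needs to see?
            pinned: bool       # has the user explicitly chosen to keep it?

        def de_augment(scene: list[Augmentation]) -> list[Augmentation]:
            """Return the subset of virtual content to keep rendering.

            Interfering content is removed unless the user pinned it, so
            removal stays targeted rather than all-or-nothing.
            """
            return [a for a in scene if a.pinned or not a.interferes]

        scene = [
            Augmentation("navigation arrow", interferes=False, pinned=False),
            Augmentation("ad banner", interferes=True, pinned=False),
            Augmentation("recipe overlay", interferes=True, pinned=True),
        ]
        print([a.name for a in de_augment(scene)])
        # ['navigation arrow', 'recipe overlay']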

    Smart Museum for Educational Media Based on the Internet of Things

    A museum is a place where historic objects are kept and a center of historical education that introduces the nation's culture. Using technology to move museums toward the smart city is a challenge. The Internet of Things (IoT) is an advance in information and communication technology (ICT) that can be applied in museums. Current ICT development concerns not only transmission media; Augmented Reality technology is also being developed. Today, Augmented Reality places virtual objects into the real world using markers or images. In this study, the researchers instead used radio signals, via the IEEE 802.15.4 protocol, to make virtual objects appear in the real world in place of Augmented Reality markers: RSSI and triangulation provide the microlocation that triggers the AR objects. The results show that a Wireless Sensor Network can be used for data transmission inside the museum: line-of-sight (LOS) tests at a distance of 15 meters with a 1000 ms delay gave a 1.4% error rate, and non-line-of-sight (NLOS) tests a 2.3% error rate. It can therefore be concluded that IoT technology using Wireless Sensor Network signals as a replacement for Augmented Reality markers can be applied in museums. According to the user assessment, however, the system was not yet successful, because the delay was not below 0.1 seconds, leaving users bored.
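    The abstract does not give the localization math. A common baseline for RSSI microlocation, sketched below under stated assumptions, is the log-distance path-loss model to convert RSSI readings to ranges, followed by least-squares trilateration against fixed sensor nodes; the calibration constants (tx_power_dbm, path-loss exponent n) and the anchor layout are illustrative, not values from the paper.

        import numpy as np

        def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, n=2.0):
            """Log-distance path-loss model: range in meters from an RSSI reading.

            tx_power_dbm is the RSSI measured at 1 m; n is the path-loss
            exponent (~2 in free space, higher indoors). Both are assumed,
            calibrated values.
            """
            return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * n))

        def trilaterate(anchors, distances):
            """Least-squares position from >= 3 anchor positions and ranges.

            Linearizes the circle equations by subtracting the last anchor's
            equation from the others, then solves the linear system.
            """
            anchors = np.asarray(anchors, dtype=float)
            d = np.asarray(distances, dtype=float)
            ref, d_ref = anchors[-1], d[-1]
            A = 2 * (anchors[:-1] - ref)
            b = (d_ref**2 - d[:-1]**2
                 + np.sum(anchors[:-1]**2, axis=1) - np.sum(ref**2))
            pos, *_ = np.linalg.lstsq(A, b, rcond=None)
            return pos

        # Three sensor nodes at known positions (meters) and their RSSI readings.
        anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
        rssi = [-69.0, -75.0, -75.0]
        pos = trilaterate(anchors, [rssi_to_distance(r) for r in rssi])
        print(pos)  # estimated (x, y) of the visitor, used to trigger an AR object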

    Semi-dense filter-based visual odometry for automotive augmented reality applications

    In order to integrate virtual objects convincingly into a real scene, Augmented Reality (AR) systems typically need to solve two problems. First, the movement and position of the AR system within the environment must be known, so that the AR system's motion can be compensated for and virtual objects can be placed correctly and kept stable relative to the real world. Second, an AR system needs a notion of the geometry of the real environment in order to integrate virtual objects properly into the real scene, for example by determining the occlusion relation between real and virtual objects or by positioning virtual content in a context-aware manner.

    For the second problem, two approaches have emerged. A simple solution is to create a map of the real scene a priori, by whatever means, and to use this map during real-time operation of the AR system. A more challenging, but also more flexible, solution is to create a map of the environment dynamically from real-time sensor data of the AR system. Our target applications are Augmented Reality in-car infotainment systems in which the video of a forward-facing camera is augmented. Using map data to determine the geometry of the vehicle's environment is limited by the fact that currently available digital maps only provide a rather coarse and abstract picture of the world; furthermore, map coverage and level of detail vary greatly by region and between different maps. Hence, the objective of this thesis is to obtain the geometry of the environment in real time from vehicle sensors. More specifically, the aim is to obtain the scene geometry by triangulating it from camera images taken at different camera positions (i.e., stereo computation) while the vehicle moves. The problem of estimating geometry from camera images whose positions are not (exactly) known is investigated in the overlapping fields of visual odometry (VO) and structure from motion (SfM).

    Since Augmented Reality applications have tight latency requirements, an estimate of the current scene geometry must be obtained for each frame of the video stream without delay. Furthermore, Augmented Reality applications need detailed information about the scene geometry, which means dense (or semi-dense) depth estimation: one depth estimate per pixel. The capability of low-latency geometry estimation is currently only found in filter-based VO methods, which model the depth estimates of the pixels as the state vector of a probabilistic filter (e.g., a Kalman filter). However, such filters maintain a covariance matrix for the uncertainty of the pixel depth estimates whose complexity is quadratic in the number of estimated pixel depths, which makes dense depth estimation infeasible. To resolve this conflict, the full covariance matrix is replaced by a matrix requiring only linear complexity in processing and storage. In this way, filter-based VO methods can be combined with dense estimation techniques and efficiently scaled up to arbitrarily large image sizes while allowing easy parallelization.

    Two methods are introduced and discussed for treating the covariance matrix of the filter state. Both are implemented as modifications to the existing VO method LSD-SLAM, yielding the "continuous" variant C-LSD-SLAM. The first method uses a diagonal matrix as the covariance matrix; in particular, the correlation between different scene point estimates is neglected. To stabilize the resulting VO method under forward motion, a reweighting scheme is introduced based on how far scene point estimates move when reprojected from one frame to the next, which prevents erroneous scene point estimates from causing the VO method to diverge. The second method models the correlation of the scene point estimates caused by camera pose uncertainty by approximating the combined influence of all camera pose estimates in a small subspace of the scene point estimates. This subspace has fixed dimension 15, which keeps the complexity of the covariance matrix replacement linear in the number of scene point estimates.
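    As a rough illustration of the first method: with a diagonal covariance matrix, the filter decomposes into one independent scalar Kalman update per pixel, so the depth-fusion step is elementwise (linear in the number of pixels) rather than quadratic. This is a sketch under that assumption, not code from the thesis; the reweighting scheme and the 15-dimensional subspace method are omitted.

        import numpy as np

        def fuse_depth(mu, var, z, var_z):
            """Per-pixel scalar Kalman update for depth estimate maps.

            With a diagonal covariance, every pixel is an independent 1-D
            filter, so the update is elementwise: O(pixels) instead of
            O(pixels^2) for a full covariance matrix.

            mu, var  : current depth mean and variance maps (H x W arrays)
            z, var_z : new per-pixel stereo observations and their variances
            """
            k = var / (var + var_z)      # Kalman gain per pixel
            mu_new = mu + k * (z - mu)   # blend prediction and observation
            var_new = (1.0 - k) * var    # posterior variance shrinks
            return mu_new, var_new

        # Toy example: a 2x2 depth map fused with a noisy stereo observation.
        mu = np.full((2, 2), 2.0)
        var = np.full((2, 2), 0.5)
        z = np.array([[2.2, 1.9], [2.1, 2.0]])
        mu, var = fuse_depth(mu, var, z, var_z=np.full((2, 2), 0.5))
        print(mu)   # means move halfway toward the observation (equal variances)
        print(var)  # variances halve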

    Situated Analytics for Data Scientists

    Much of Mark Weiser's vision of "ubiquitous computing" has come to fruition: we live in a world of interfaces that connect us with systems, devices, and people wherever we are. However, those of us in jobs that involve analyzing data and developing software find ourselves tied to environments that limit when and where we may conduct our work; it is ungainly and awkward to pull out a laptop during a stroll through a park, for example, but difficult to write a program on one's phone. In this dissertation, I discuss the current state of data visualization in data science and analysis workflows, the emerging domains of immersive and situated analytics, and how immersive and situated implementations and visualization techniques can be used to support data science. I then describe the results of several years of my own empirical work with data scientists and other analytical professionals, particularly (though not exclusively) those employed by the U.S. Department of Commerce. These results, which concern visualization and visual analytics design and draw on user task performance, observations by the researcher and participants, and evaluation of observational data collected during user sessions, represent the first thread of research in this dissertation; I demonstrate how they can act as the guiding basis for my implementation of immersive and situated analytics systems and techniques. As a data scientist and economist myself, I am naturally inclined to use high-frequency observational data to realize a research goal; indeed, a large part of my research contributions, and a second "thread" of research presented in this dissertation, concern interpreting user behavior using real-time data collected during user sessions. I argue that the relationship between immersive analytics and data science can and should be reciprocal: while immersive implementations can support data science work, methods borrowed from data science are particularly well suited to supporting the evaluation of the embodied interactions common in immersive and situated environments. I make this argument based on both the ease and the importance of collecting spatial data from user sessions through the sensors that immersive systems require to function, which I have experienced during my own empirical work with data scientists. As part of this thread of research, the dissertation introduces a framework for interpreting user session data, which I evaluate with user experience researchers working in the tech industry. Finally, the dissertation presents a synthesis of these two threads: I combine the design guidelines derived from my empirical work with machine learning and signal processing techniques to interpret user behavior in real time in Wizualization, a mid-air gesture and speech-based augmented reality visual analytics system.
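    The abstract names machine learning and signal processing over session sensor streams without giving specifics. As a hedged sketch of the kind of real-time interpretation involved, the snippet below smooths a head-position trace and flags low-speed spans as candidate dwell (attention) events; the boxcar filter, thresholds, and dwell heuristic are assumptions for illustration, not the dissertation's pipeline.

        import numpy as np

        def moving_average(x, window=15):
            """Smooth a 1-D sensor trace with a simple boxcar filter."""
            kernel = np.ones(window) / window
            return np.convolve(x, kernel, mode="same")

        def dwell_events(positions, speed_thresh=0.05, min_frames=30):
            """Flag spans where smoothed head speed stays low: candidate dwells.

            positions: (T, 3) array of head positions sampled once per frame.
            Returns (start, end) frame-index pairs for each detected dwell.
            """
            speeds = np.linalg.norm(np.diff(positions, axis=0), axis=1)
            slow = moving_average(speeds) < speed_thresh
            events, start = [], None
            for i, s in enumerate(slow):
                if s and start is None:
                    start = i                      # dwell begins
                elif not s and start is not None:
                    if i - start >= min_frames:    # long enough to count
                        events.append((start, i))
                    start = None
            if start is not None and len(slow) - start >= min_frames:
                events.append((start, len(slow)))  # dwell runs to end of session
            return events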