
    VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation

    Outdoor Vision-and-Language Navigation (VLN) requires an agent to navigate realistic 3D outdoor environments based on natural language instructions. The performance of existing VLN methods is limited by insufficient diversity in navigation environments and by limited training data. To address these issues, we propose VLN-Video, which exploits the diverse outdoor environments present in driving videos from multiple U.S. cities, augmented with automatically generated navigation instructions and actions, to improve outdoor VLN performance. VLN-Video combines the best of intuitive classical approaches and modern deep learning: template infilling generates grounded navigation instructions, and an image-rotation-similarity-based navigation action predictor converts driving videos into VLN-style data for pre-training deep learning VLN models. We pre-train the model on the Touchdown dataset and our video-augmented dataset created from driving videos with three proxy tasks (Masked Language Modeling, Instruction and Trajectory Matching, and Next Action Prediction) so as to learn temporally aware and visually aligned instruction representations. The learned instruction representation is adapted to the state-of-the-art navigator when fine-tuning on the Touchdown dataset. Empirical results demonstrate that VLN-Video significantly outperforms previous state-of-the-art models by 2.1% in task completion rate, achieving a new state of the art on the Touchdown dataset.
    Comment: AAAI 202
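    The rotation-similarity action predictor described above can be illustrated with a toy sketch: rotate the current panorama by candidate headings and keep the action whose rotated view best matches the next video frame. This is not the paper's implementation; the 1-D "panorama", the shift sizes, and the similarity measure are all illustrative assumptions.

```python
def rotate(panorama, shift):
    """Horizontally rotate a panorama, represented here as a list of columns."""
    shift %= len(panorama)
    return panorama[shift:] + panorama[:shift]

def similarity(a, b):
    """Negative sum of absolute column differences (higher = more similar)."""
    return -sum(abs(x - y) for x, y in zip(a, b))

def predict_action(current, nxt, turn_shift=2):
    """Pick the action whose rotated view best matches the next frame."""
    candidates = {
        "forward": rotate(current, 0),
        "left": rotate(current, -turn_shift),
        "right": rotate(current, turn_shift),
    }
    return max(candidates, key=lambda a: similarity(candidates[a], nxt))

frame = [0, 1, 2, 3, 4, 5, 6, 7]  # toy 1-D panorama
assert predict_action(frame, rotate(frame, 2)) == "right"
assert predict_action(frame, frame) == "forward"
```

    In the real setting the "columns" would be image features and the similarity a learned or perceptual metric, but the selection rule is the same argmax over candidate rotations.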

    Smartphone Augmented Reality Applications for Tourism

    Invisible, attentive and adaptive technologies that provide tourists with relevant services and information anytime and anywhere may no longer be a vision of the future. The new display paradigm, stemming from the synergy of new mobile devices, context awareness and AR, has the potential to enhance tourists' experiences and make them exceptional. However, effective and usable design is still in its infancy. In this publication we present an overview of current smartphone AR applications, outlining tourism-related, domain-specific design challenges. This study is part of an ongoing research project aiming to develop a better understanding of the design space for smartphone context-aware AR applications for tourists.

    Performance Evaluation of Mobile U-Navigation based on GPS/WLAN Hybridization

    This paper presents our mobile u-navigation system. The approach hybridises a wireless local area network (WLAN) interface with the internal Global Positioning System (GPS) sensor: the device measures received signal strength from WLAN access points while simultaneously retrieving GPS satellite signals. The positioning source is switched according to the type of environment in order to ensure ubiquitous positioning. Finally, we present results illustrating the performance of the localisation system in an indoor/outdoor set-up.
    Comment: Journal of Convergence Information Technology (JCIT
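    The environment-based switching described above can be sketched as a simple selection rule: prefer GPS when enough satellites are visible, otherwise fall back to the WLAN estimate. The threshold, field names, and fix format are assumptions for illustration, not the paper's design.

```python
def choose_position(gps_fix, wlan_fix, min_satellites=4):
    """Return (source, position): GPS when enough satellites are visible,
    otherwise the WLAN signal-strength estimate, otherwise no fix."""
    if gps_fix is not None and gps_fix["satellites"] >= min_satellites:
        return "gps", gps_fix["position"]
    if wlan_fix is not None:
        return "wlan", wlan_fix["position"]
    return "none", None

outdoor_gps = {"satellites": 7, "position": (101.6869, 3.1390)}
indoor_wlan = {"position": (101.6871, 3.1392)}

assert choose_position(outdoor_gps, indoor_wlan)[0] == "gps"   # outdoors
assert choose_position(None, indoor_wlan)[0] == "wlan"         # indoors
```

    A production system would also smooth across the handover (e.g. by filtering both estimates) rather than switching hard, but the core decision is this kind of availability test.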

    Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns

    We introduce Deep Thermal Imaging, a new approach for close-range automatic recognition of materials, enhancing the understanding that people and ubiquitous technologies have of their proximal environment. Our approach uses a low-cost mobile thermal camera integrated into a smartphone to capture thermal textures. A deep neural network classifies these textures into material types. This approach works effectively without the need for ambient light sources or direct contact with materials. Furthermore, the use of a deep learning network removes the need to handcraft the set of features for different materials. We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments. Our approach produced recognition accuracies above 98% on 14,860 images of 15 indoor materials and above 89% on 26,584 images of 17 outdoor materials. We conclude by discussing its potential for real-time use in HCI applications and future directions.
    Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing System
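    The paper classifies thermal textures with a deep network; as a much simpler stand-in, the sketch below shows the underlying idea that the *pattern* of surface temperatures, not their absolute values, identifies a material: each patch is normalised to remove ambient temperature, then matched by nearest neighbour. All data, labels, and the distance measure are toy assumptions.

```python
def normalise(patch):
    """Zero-to-one normalisation of a flat list of temperatures,
    removing the absolute (ambient) temperature level."""
    lo, hi = min(patch), max(patch)
    span = (hi - lo) or 1.0
    return [(t - lo) / span for t in patch]

def classify(patch, references):
    """Return the label of the reference texture closest in L1 distance
    after normalisation."""
    q = normalise(patch)
    def dist(item):
        return sum(abs(a - b) for a, b in zip(q, normalise(item[1])))
    return min(references, key=dist)[0]

references = [
    ("wood",  [22.0, 22.4, 22.1, 22.5]),  # low-contrast pattern
    ("metal", [20.0, 21.5, 20.2, 21.8]),  # high-contrast pattern
]
# The same wood texture shifted to a warmer ambient is still recognised.
assert classify([25.0, 25.4, 25.1, 25.5], references) == "wood"
```

    The deep network in the paper learns such invariances (and far richer spatial features) from data instead of this hand-coded normalisation.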

    Spatially augmented audio delivery: applications of spatial sound awareness in sensor-equipped indoor environments

    Current mainstream audio playback paradigms take no account of a user's physical location or orientation when delivering audio through headphones or speakers. Audio is thus presented as a static percept, even though sound is naturally a dynamic 3D phenomenon, and playback fails to take advantage of the innate psycho-acoustical perception we have of sound source locations around us. Described in this paper is an operational platform we have built to augment the sound from a generic set of wireless headphones. We do this in a way that overcomes the spatial-awareness limitation of audio playback in indoor 3D environments which are both location-aware and sensor-equipped. This platform provides access to an audio-spatial presentation modality which by its nature lends itself to numerous cross-disciplinary applications. In the paper we present the platform and two demonstration applications.
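    The core computation such a platform needs is sketched below: given the tracked position and heading of the listener and the position of a virtual source, derive the source's azimuth relative to the head so a spatial renderer can pan it. The coordinate convention (0 degrees = north, positive = clockwise) and all names are illustrative assumptions, not the platform's API.

```python
import math

def relative_azimuth(user_xy, heading_deg, source_xy):
    """Angle of the source relative to the user's facing direction,
    in degrees, normalised to (-180, 180]; positive = to the right."""
    dx = source_xy[0] - user_xy[0]
    dy = source_xy[1] - user_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))   # 0 deg = +y axis ("north")
    rel = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return 180.0 if rel == -180.0 else rel

# Facing north, a source due east sits 90 degrees to the right.
assert relative_azimuth((0, 0), 0.0, (1, 0)) == 90.0
# Turn to face east and the same source is dead ahead.
assert relative_azimuth((0, 0), 90.0, (1, 0)) == 0.0
```

    Re-evaluating this angle as the sensors report movement, and feeding it to a binaural (e.g. HRTF-based) renderer, is what makes the sources appear fixed in the room while the listener moves.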