458 research outputs found

    Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality

    Real-time stereo matching is a cornerstone algorithm for many Extended Reality (XR) applications, such as indoor 3D understanding, video pass-through, and mixed-reality games. Despite significant advancements in deep stereo methods, achieving real-time depth inference with high accuracy on a low-power device remains a major challenge. One of the major difficulties is the lack of high-quality indoor video stereo training datasets captured by head-mounted VR/AR glasses. To address this issue, we introduce a novel synthetic video stereo dataset that comprises photorealistic renderings of various indoor scenes and realistic camera motion captured by a 6-DoF moving VR/AR head-mounted display (HMD). This facilitates the evaluation of existing approaches and promotes further research on indoor augmented reality scenarios. Building on this dataset, we develop a novel framework for continuous video-rate stereo matching: a video-based approach tailored for XR applications that achieves real-time inference at 134 fps on a standard desktop computer, or 30 fps on a battery-powered HMD. Our key insight is that disparity and contextual information are highly correlated and redundant between consecutive stereo frames. By unrolling an iterative cost aggregation in time (i.e. in the temporal dimension), we can distribute and reuse the aggregated features over time. This approach leads to a substantial reduction in computation without sacrificing accuracy. We conducted extensive evaluations and comparisons, demonstrating that our method outperforms the current state of the art, making it a strong contender for real-time stereo matching in VR/AR applications.
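    As a rough sketch of the temporal-reuse idea described in this abstract (our own minimal illustration, not the authors' code), the loop below initialises each frame's disparity from the previous frame's result, so only a couple of refinement iterations are unrolled per frame; refine_once is a hypothetical stand-in for one cost-aggregation step.

        # Minimal sketch: reuse the previous frame's disparity so later frames
        # need far fewer refinement iterations than the first one.
        import numpy as np

        def refine_once(disparity, left, right):
            # Placeholder for one iterative cost-aggregation / refinement step.
            # A real method would warp `right` by `disparity` and compare it to
            # `left`; here we only damp toward a crude photometric residual.
            return disparity + 0.1 * (left - right)

        def video_stereo(frames, iters_first=8, iters_rest=2):
            """frames: list of (left, right) grayscale arrays of equal shape."""
            disparity = np.zeros_like(frames[0][0], dtype=np.float32)
            outputs = []
            for i, (left, right) in enumerate(frames):
                # The first frame pays the full iteration cost; later frames
                # start from the previous disparity and iterate far less.
                for _ in range(iters_first if i == 0 else iters_rest):
                    disparity = refine_once(disparity, left, right)
                outputs.append(disparity.copy())
            return outputs

        # Toy usage: two synthetic 4x4 stereo frames.
        frames = [(np.ones((4, 4), np.float32), np.zeros((4, 4), np.float32))] * 2
        print([float(d.mean()) for d in video_stereo(frames)])

    The saving comes from iters_rest being much smaller than iters_first, mirroring the abstract's claim that disparity and context are largely redundant between consecutive frames.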

    Motion parallax for 360° RGBD video

    We present a method for adding parallax and real-time playback of 360° videos in Virtual Reality headsets. In current video players, the playback does not respond to translational head movement, which reduces the feeling of immersion and causes motion sickness for some viewers. Given a 360° video and its corresponding depth (provided by current stereo 360° stitching algorithms), a naive image-based rendering approach would use the depth to generate a 3D mesh around the viewer, then translate it appropriately as the viewer moves their head. However, this approach breaks down at depth discontinuities, showing visible distortions, whereas cutting the mesh at such discontinuities leads to ragged silhouettes and holes at disocclusions. We address these issues by improving the given initial depth map to yield cleaner, more natural silhouettes. We rely on a three-layer scene representation, made up of a foreground layer and two static background layers, to handle disocclusions by propagating information from multiple frames for the first background layer and then inpainting the second one. Our system works with input from many of today's most popular 360° stereo capture devices (e.g., Yi Halo or GoPro Odyssey), and works well even if the original video does not provide depth information. Our user studies confirm that our method provides a more compelling viewing experience than playback without parallax, increasing immersion while reducing discomfort and nausea.
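    The naive image-based rendering step that the abstract starts from can be sketched as follows (an illustrative reconstruction under common equirectangular conventions, not the paper's implementation): lift each equirectangular depth sample to a 3D point around the viewer, translate the scene by the head offset, and reproject to spherical coordinates.

        # Sketch: equirectangular depth -> 3D points -> reprojection after a
        # translational head movement. Angle conventions are an assumption.
        import numpy as np

        def equirect_to_points(depth):
            """depth: (H, W) metric depth per pixel -> (H, W, 3) 3D points."""
            h, w = depth.shape
            lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi      # [-pi, pi)
            lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi      # [pi/2, -pi/2]
            lon, lat = np.meshgrid(lon, lat)
            dirs = np.stack([np.cos(lat) * np.sin(lon),
                             np.sin(lat),
                             np.cos(lat) * np.cos(lon)], axis=-1)
            return depth[..., None] * dirs

        def reproject(points, head_offset):
            """Shift the scene by -head_offset; return (lon, lat, depth) per point."""
            p = points - np.asarray(head_offset, dtype=np.float32)
            d = np.linalg.norm(p, axis=-1)
            lon = np.arctan2(p[..., 0], p[..., 2])
            lat = np.arcsin(np.clip(p[..., 1] / np.maximum(d, 1e-6), -1.0, 1.0))
            return lon, lat, d

        depth = np.full((4, 8), 2.0, np.float32)                    # toy 2 m panorama
        lon, lat, d = reproject(equirect_to_points(depth), head_offset=[0.1, 0.0, 0.0])
        print(float(d.min()), float(d.max()))   # distances change once the viewer moves

    It is exactly this per-pixel reprojection that tears at depth discontinuities, which is why the paper moves to a cleaned depth map and the three-layer representation.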

    WinDB: HMD-free and Distortion-free Panoptic Video Fixation Learning

    To date, the widely adopted way to collect fixations in panoptic video relies on a head-mounted display (HMD): participants' fixations are recorded while they wear an HMD and freely explore the given panoptic scene. However, this widely used data collection method is insufficient for training deep models to accurately predict which regions of a given panoptic scene are most important when it contains intermittent salient events. The main reason is that there always exist "blind zooms" when using an HMD to collect fixations, since participants cannot keep turning their heads to explore the entire panoptic scene all the time. Consequently, the collected fixations tend to be trapped in some local views, leaving the remaining areas as "blind zooms". Therefore, fixation data collected with HMD-based methods that accumulate local views cannot accurately represent the overall global importance of complex panoramic scenes. This paper introduces the auxiliary Window with a Dynamic Blurring (WinDB) fixation collection approach for panoptic video, which does not require an HMD and is blind-zoom-free; the collected fixations therefore reflect region-wise importance well. Using our WinDB approach, we have released a new PanopticVideo-300 dataset containing 300 panoptic clips covering over 225 categories. In addition, we present a simple baseline designed to take full advantage of PanopticVideo-300 and to handle the fixation shifting problem induced by the blind-zoom-free attribute.
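    The rendering idea behind a "window with dynamic blurring" can be sketched roughly as follows (a speculative illustration of the general principle, not the authors' implementation; window size and blur strength are made-up parameters): keep the currently attended window of the panorama sharp and blur the rest, so fixations can be collected on an ordinary screen without an HMD.

        # Sketch: blur everything outside a sharp window of the panorama.
        import numpy as np
        from scipy.ndimage import gaussian_filter

        def windb_frame(pano, center_col, win_width, sigma=5.0):
            """pano: (H, W) grayscale equirectangular frame; returns a blurred
            copy with a sharp window of ~win_width columns at center_col."""
            blurred = gaussian_filter(pano.astype(np.float32), sigma=sigma)
            cols = (np.arange(pano.shape[1]) - center_col) % pano.shape[1]
            in_window = (cols < win_width // 2) | (cols > pano.shape[1] - win_width // 2)
            out = blurred.copy()
            out[:, in_window] = pano[:, in_window]   # keep the window sharp
            return out

        pano = np.random.rand(32, 64).astype(np.float32)
        frame = windb_frame(pano, center_col=10, win_width=16)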

    Saliency prediction in 360° architectural scenes: Performance and impact of daylight variations

    Saliency models are image-based prediction models that estimate human visual attention. Such models, when applied to architectural spaces, could pave the way for design decisions in which visual attention is taken into account. In this study, we tested the performance of eleven commonly used saliency models, combining traditional and deep learning methods, on 126 rendered interior scenes with associated head-tracking data. The data were extracted from three experiments conducted in virtual reality between 2016 and 2018. Two of these datasets pertain to the perceptual effects of daylight and include variations of daylighting conditions for a limited set of interior spaces, thereby allowing us to test the influence of light conditions on human head movement. Ground-truth maps were extracted from the collected head-tracking logs, and the prediction accuracy of the models was assessed via the correlation coefficient between ground-truth and prediction maps. To address the possible inflation of results due to the equator bias, we conducted complementary analyses by restricting the area of investigation to the equatorial image regions. Although limited to immersive virtual environments, the promising performance of some traditional models such as GBVS360eq and BMS360eq for colored and textured architectural rendered spaces offers the prospect of their integration into design tools. We also observed a strong correlation in head movements for the same space lit by different types of sky, a finding whose generalization requires further investigation based on datasets developed specifically to address this question.
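    The evaluation protocol reported here is straightforward to reproduce in outline (a small sketch under our own assumptions; the 60° band width is illustrative, not taken from the paper): compute the Pearson correlation coefficient (CC) between ground-truth and predicted maps, optionally restricted to an equatorial band to counter the equator bias.

        # Sketch: CC between saliency maps, with an optional equatorial mask.
        import numpy as np

        def cc(gt, pred, mask=None):
            """Pearson correlation coefficient between two saliency maps."""
            if mask is not None:
                gt, pred = gt[mask], pred[mask]
            gt = (gt - gt.mean()) / (gt.std() + 1e-8)
            pred = (pred - pred.mean()) / (pred.std() + 1e-8)
            return float((gt * pred).mean())

        def equatorial_mask(h, w, band_deg=60):
            """Boolean mask selecting latitudes within +/- band_deg/2 of the equator."""
            lat = np.abs(np.linspace(90, -90, h))
            return np.repeat((lat <= band_deg / 2)[:, None], w, axis=1)

        gt = np.random.rand(90, 180)
        pred = gt + 0.1 * np.random.rand(90, 180)
        print(cc(gt, pred), cc(gt, pred, equatorial_mask(90, 180)))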

    Analysis and Development of Augmented Reality Applications for the Dissemination of Cultural Heritage

    Thesis by compendium of publications. AR consists of superimposing virtual elements on the real environment, so that the user perceives these elements as if they were part of the reality they are looking at. AR applications on mobile devices allow the virtual content to be visualised through the device's camera. AR is a very powerful dissemination tool, as it allows any type of information to be added to reality, from a simple informative text to an interactive 3D model. It can be used as a guide in a museum, it can show the recreation of a destroyed monument, or, as in the case study presented here, it can help in the interpretation of cave paintings. This thesis is based on the idea that AR can greatly enhance the interpretation of rock art without altering or damaging the paintings. It can be used to attract a wider audience and to introduce the history of the rock art paintings, while at the same time providing the visitor with a much more enriching experience. Throughout the thesis, AR visualisation on mobile devices has been studied in depth. The different programming libraries have been analysed by means of case studies in real environments, as well as the factors that can affect the recognition of the paintings. An AR application for a real case of rock art paintings has been developed and subsequently evaluated by a group of people. Finally, the effect of sunlight and its changes throughout the day on image recognition in outdoor environments has been studied. This work provides a starting point for the development of AR applications for the dissemination of cultural heritage, with a particular focus on rock art, a setting that suffers from additional difficulties due to its location, the difficulty of recognising characteristic points in the paintings, and changes in sunlight, problems that the study has tried to resolve. The main outcomes have been very favourable: starting from freely available programming libraries, it has been possible to develop a set of AR applications in different places. The evaluations have been very positive, with users who tested the applications confirming that the interpretation of the paintings is easier for them and that they can better understand the purpose of the paintings. The main drawback found is the lack of knowledge about this technique and the loss of realism in some cases due to occlusion, i.e. the virtual objects are not positioned behind the real objects. The good news is that this technology is evolving very fast, and during the development of the thesis there have been great advances, among them new programming libraries developed by Google and Apple, which provide the tools needed to create very powerful and immersive applications in which the user will feel part of the created environments. Blanco Pons, S. (2021). Analysis and Development of Augmented Reality Applications for the Dissemination of Cultural Heritage [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/178895

    Methods for Augmented Reality E-commerce

    This dissertation presents a new type of e-commerce system and related techniques that allow customers to visually bring a product into their physical environment and interact with it. The development and a user study of this e-commerce system are provided. A new modeling method, which recovers a 3D model directly from 2D photos without knowing the camera parameters, is also presented to reduce the modeling cost of this new type of e-commerce. In addition, an immersive AR environment with GPU-based occlusion is presented to improve the rendering and usability of AR applications. Experimental results and data show the validity of these new technologies.
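    The GPU-based occlusion mentioned above boils down to a per-pixel depth test between the real scene and the virtual content; a CPU sketch of that test (our own illustration, not the dissertation's shader code) looks like this.

        # Sketch: hide virtual pixels wherever the real scene is closer to the
        # camera; use a very large depth for pixels the virtual object misses.
        import numpy as np

        def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
            """real_rgb/virt_rgb: (H, W, 3); real_depth/virt_depth: (H, W)."""
            virtual_visible = virt_depth < real_depth   # virtual object is in front
            out = real_rgb.copy()
            out[virtual_visible] = virt_rgb[virtual_visible]
            return out

        h, w = 4, 4
        real_rgb = np.zeros((h, w, 3)); real_depth = np.full((h, w), 2.0)
        virt_rgb = np.ones((h, w, 3));  virt_depth = np.full((h, w), 3.0)
        virt_depth[0, 0] = 1.0                           # only this pixel is in front
        print(composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth)[:, :, 0])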