1,862 research outputs found

    Adaptive User Perspective Rendering for Handheld Augmented Reality

    Full text link
    Handheld Augmented Reality commonly implements some variant of magic lens rendering, which turns only a fraction of the user's real environment into AR while the rest remains unaffected. Since handheld AR devices typically use video see-through, magic lens applications often suffer from spatial distortions, because the AR environment is presented from the perspective of the device camera rather than the user's eyes. Recent approaches counteract this distortion by estimating the user's head position and rendering the scene from the user's perspective. To this end, they usually apply face-tracking algorithms to the front camera of the mobile device. However, this demands high computational resources and therefore commonly degrades application performance beyond the already high computational load of AR. In this paper, we present a method that reduces the computational demands of user perspective rendering by applying lightweight optical flow tracking and estimating the user's motion before head tracking is started. We demonstrate the suitability of our approach for computationally limited mobile devices and compare it to device perspective rendering, head-tracked user perspective rendering, and fixed point-of-view user perspective rendering.
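
    The abstract does not detail the paper's optical-flow pipeline; as one plausible illustration of a lightweight motion estimate from the front camera, the following sketch uses OpenCV's pyramidal Lucas-Kanade tracker and takes the median sparse flow as a cheap proxy for head motion. Feature counts and thresholds are assumptions.

    ```python
    # Minimal sketch: lightweight optical-flow motion estimation (assumed
    # pipeline, not the paper's actual implementation).
    import cv2
    import numpy as np

    lk_params = dict(winSize=(21, 21), maxLevel=2,
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

    def estimate_user_motion(prev_gray, curr_gray, prev_pts):
        """Track sparse features between two front-camera frames and return
        the median 2D displacement plus the surviving feature points."""
        curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, curr_gray, prev_pts, None, **lk_params)
        ok = status.ravel() == 1
        good_old, good_new = prev_pts[ok], curr_pts[ok]
        if len(good_new) == 0:
            return np.zeros(2), good_new
        # Median flow is robust to a few outlier tracks.
        return np.median(good_new - good_old, axis=0).ravel(), good_new

    # Typical per-frame usage:
    # pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
    #                               qualityLevel=0.01, minDistance=8)
    # motion, pts = estimate_user_motion(prev_gray, curr_gray, pts)
    ```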

    Real-time single image depth perception in the wild with handheld devices

    Full text link
    Depth perception is paramount for tackling real-world problems, ranging from autonomous driving to consumer applications. For the latter, depth estimation from a single image represents the most versatile solution, since a standard camera is available on almost any handheld device. Nonetheless, two main issues limit its practical deployment: i) low reliability when deployed in the wild and ii) demanding resource requirements for real-time performance, often not compatible with such devices. Therefore, in this paper, we investigate these issues in depth, showing how both are addressable by adopting appropriate network design and training strategies, and outlining how to map the resulting networks onto handheld devices to achieve real-time performance. Our thorough evaluation highlights the ability of such fast networks to generalize well to new environments, a crucial feature required to tackle the extremely varied contexts faced in real applications. To further support this evidence, we report experimental results concerning real-time depth-aware augmented reality and image blurring with smartphones in the wild. Comment: 11 pages, 9 figures.
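
    The paper's own networks are not described at an API level here; as an illustrative stand-in for the task, this sketch runs a small off-the-shelf monocular depth model (MiDaS small, loaded via torch.hub) on a single image, the same single-image depth problem the paper optimizes for mobile.

    ```python
    # Illustrative only: an off-the-shelf lightweight monocular depth model,
    # not the networks proposed in the paper.
    import cv2
    import torch

    model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
    model.eval()
    transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

    img = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        prediction = model(transform(img))          # (1, H', W') relative inverse depth
        depth = torch.nn.functional.interpolate(    # resize back to input resolution
            prediction.unsqueeze(1), size=img.shape[:2],
            mode="bicubic", align_corners=False).squeeze()
    print(depth.shape, float(depth.min()), float(depth.max()))
    ```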

    Magnetosensitive e-skins for interactive electronics

    Get PDF
    The rapid progress of electronics and computer science in recent years has brought humans and machines closer than ever before. Current trends like the Internet of Things and artificial intelligence are closing the gap even further by providing ubiquitous data processing and sensing. As this ongoing revolution advances, novel forms of human-machine interaction are required in an ever more connected world. A crucial component for enabling these interactions is the field of flexible electronics, which aims to establish a seamless link between living and artificial entities using electronic skins (e-skins). E-skins combine the functionality of commercial electronics with the soft, stretchable, and biocompatible characteristics of human skin or tissue. Until recently, the focus has been on replicating the standard functions associated with human skin, such as temperature, pressure, and chemical detection. Yet recent developments have also introduced non-standard sensing capabilities like magnetic field detection, creating the field of magnetosensitive e-skins. The addition of a supplementary information channel, an electronic sixth sense, has sparked a wide range of applications in cognitive psychology and human-machine interaction. In this thesis, we expand the concept of magnetosensitive e-skins to include the notion of directionality, which exploits the full interaction potential of the magnetic field vector. We also introduce the use of flexible magnetoelectronics in virtual/augmented reality and human-computer interfaces. Three main results are attained in the course of this work: (i) we first demonstrate how magnetosensitive e-skins can be used as human-machine interfaces driven by permanent magnet sources in the range of 5 mT; (ii) building upon this milestone, we realize the first magnetosensitive e-skins driven by the earth's magnetic field of 50 μT; (iii) we fabricate magnetosensitive e-skins that push the detection limit below 1 μT. The magnetosensitive e-skins in this work open exciting possibilities for sensory substitution experiments and sensory processing disorder therapies. Furthermore, for human-machine interaction, they provide a new interactive platform for touchless and gestural control in virtual and augmented reality scenarios beyond the limitations of optics-based systems.
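
    The directionality concept exploits the full field vector rather than just its magnitude. A generic sketch of that idea, not the thesis's sensor readout, is to recover the in-plane field angle and magnitude from a 3-axis magnetometer sample; the field strengths named above (5 mT, 50 μT, sub-μT) set the relevant ranges.

    ```python
    # Generic vector math for directional magnetic sensing; values in microtesla.
    import math

    def field_direction(bx, by, bz):
        """Return (magnitude in uT, in-plane angle in degrees) of the field vector."""
        magnitude = math.sqrt(bx**2 + by**2 + bz**2)
        angle = math.degrees(math.atan2(by, bx)) % 360.0
        return magnitude, angle

    mag, ang = field_direction(30.0, -40.0, 10.0)   # roughly Earth-field scale
    print(f"|B| = {mag:.1f} uT, in-plane direction = {ang:.1f} deg")
    ```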

    Efficient 3D Reconstruction, Streaming and Visualization of Static and Dynamic Scene Parts for Multi-client Live-telepresence in Large-scale Environments

    Full text link
    Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, expanding their capabilities to larger dynamic environments beyond a fixed size of a few square meters remains challenging. In this paper, we aim at sharing 3D live-telepresence experiences in large-scale environments beyond room scale, with both static and dynamic scene entities, at practical bandwidth requirements, based only on lightweight scene capture with a single moving consumer-grade RGB-D camera. To this end, we present a system built upon a novel hybrid volumetric scene representation: a voxel-based representation for static content, which stores the reconstructed surface geometry along with object semantics and their accumulated dynamic movement over time, combined with a point-cloud-based representation for dynamic scene parts, where the separation from static parts is achieved using semantic and instance information extracted from the input frames. Static and dynamic content are streamed independently yet simultaneously; potentially moving but currently static scene entities are seamlessly integrated into the static model until they become dynamic again, and static and dynamic data are fused at the remote client. As a result, our system achieves VR-based live-telepresence at close to real-time rates. Our evaluation demonstrates the potential of our approach in terms of visual quality and performance, including ablation studies of the involved design choices.
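
    A simplified sketch of the static/dynamic separation described above: route points whose semantic class is potentially dynamic to the point-cloud stream and everything else to the voxel-fused static model. The class set and array layout are assumptions, not the paper's code.

    ```python
    # Split an RGB-D frame's points into static and dynamic sets by semantics.
    import numpy as np

    DYNAMIC_CLASSES = {1, 2}  # e.g. person, pet; dataset-specific assumption

    def split_frame(points, labels):
        """points: (N, 3) 3D points from the frame; labels: (N,) class ids."""
        dynamic_mask = np.isin(labels, list(DYNAMIC_CLASSES))
        return points[~dynamic_mask], points[dynamic_mask]  # (static, dynamic)

    pts = np.random.rand(5, 3).astype(np.float32)
    lbl = np.array([0, 1, 0, 2, 0])
    static_pts, dynamic_pts = split_frame(pts, lbl)
    print(len(static_pts), "static,", len(dynamic_pts), "dynamic")
    ```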

    Data-driven depth and 3D architectural layout estimation of an interior environment from monocular panoramic input

    Get PDF
    Recent years have seen significant interest in the automatic 3D reconstruction of indoor scenes, leading to a distinct and very active sub-field within 3D reconstruction. The main objective is to convert rapidly measured data representing real-world indoor environments into models encompassing geometric, structural, and visual abstractions. This thesis focuses on extracting geometric information from single panoramic images, using either visual data alone or sparse registered depth information. The appeal of this setup lies in the efficiency and cost-effectiveness of data acquisition using 360° images. The challenge, however, is that creating a comprehensive model from mostly visual input is extremely difficult due to noise, missing data, and clutter. My research has concentrated on leveraging prior information, in the form of architectural and data-driven priors derived from large annotated datasets, to develop end-to-end deep learning solutions for specific tasks in the structured reconstruction pipeline. My first contribution is a deep neural network architecture for estimating a depth map from a single monocular indoor panorama, operating directly on the equirectangular projection. Leveraging the characteristics of indoor 360° images and recognizing the impact of gravity on indoor scene design, the network efficiently encodes the scene into vertical spherical slices. By exploiting long- and short-term relationships among these slices, it recovers an equirectangular depth map directly from the corresponding RGB image. My second contribution generalizes the approach to handle multimodal input, also covering the situation in which the equirectangular input image is paired with a sparse depth map, as provided by common capture setups. Depth is inferred using an efficient single-branch network with a dynamic gating system, processing both dense visual data and sparse geometric data. Additionally, a new augmentation strategy enhances the model's robustness to various types of sparsity, including those from structured light sensors and LiDAR setups. While the first two contributions focus on per-pixel geometric information, my third contribution addresses the recovery of the 3D shape of permanent room surfaces from a single panoramic image. Unlike previous methods, this approach tackles the problem in 3D, expanding the reconstruction space. It employs a graph convolutional network to directly infer the room structure as a 3D mesh, deforming a graph-encoded tessellated sphere mapped to the spherical panorama. Gravity-aligned features are actively incorporated using a projection layer with multi-head self-attention, and specialized losses guide plausible solutions in the presence of clutter and occlusions. Benchmarks on publicly available data show that all three methods provide significant improvements over the state of the art.
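
    As a minimal illustration of the "vertical spherical slices" encoding mentioned for the first contribution, the following sketch splits an equirectangular panorama into fixed-width vertical strips along the longitude axis; the slice width is an assumption.

    ```python
    # Cut an equirectangular RGB panorama into vertical strips (assumed width).
    import numpy as np

    def vertical_slices(equirect, slice_width=32):
        """equirect: (H, W, 3) image with W spanning 360 degrees of longitude.
        Returns a list of (H, slice_width, 3) strips covering the panorama."""
        h, w, _ = equirect.shape
        assert w % slice_width == 0, "width must be divisible by slice width"
        return [equirect[:, x:x + slice_width] for x in range(0, w, slice_width)]

    pano = np.zeros((512, 1024, 3), dtype=np.uint8)
    slices = vertical_slices(pano)
    print(len(slices), "slices of shape", slices[0].shape)  # 32 slices of (512, 32, 3)
    ```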

    User-Plane Software Containerization in a Cloud Radio Access Network (Käyttäjätason ohjelmistokontittaminen pilviradioliityntäverkossa)

    Get PDF
    The number of devices connected through mobile networks has been growing rapidly. This growth will create a demand for network capacity that cannot be met with traditional methods. The problem could be solved by implementing a cloud radio access network (cloud RAN), a new concept that adapts cloud computing technologies established in the software industry, such as software containers, for use in radio access networks (RANs). This adaptation also creates a need to modify working practices to better fit these cloud computing technologies. While cloud RAN has recently received much research attention, actual software implementations have not been widely discussed in the literature. Therefore, this thesis evaluates the feasibility of using software containers in the user-plane applications of cloud RAN in terms of networking and inter-container communications (ICC). This is accomplished by identifying potential approaches for ICC and container networking and by measuring their performance. Two approaches are proposed for ICC and container networking and evaluated in terms of throughput and latency. Both were found suitable for use in cloud RAN user-plane applications. However, since the measurements were performed in a simplified environment, implementing the approaches in a cloud RAN component will require further work.
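
    The thesis measures throughput and latency between containers; one common way to obtain such latency figures, shown here as a hedged sketch rather than the thesis's actual test harness, is a timed TCP echo round trip between two containers. The address, port, and payload size are assumptions.

    ```python
    # Round-trip latency probe between containers over TCP. Run a plain echo
    # server in one container and this client in another; values are assumed.
    import socket
    import time

    HOST, PORT, PAYLOAD = "172.17.0.2", 5000, b"x" * 1024  # assumed container address

    def round_trip_ms(sock, payload):
        start = time.perf_counter()
        sock.sendall(payload)
        received = 0
        while received < len(payload):      # echo server returns the payload
            received += len(sock.recv(4096))
        return (time.perf_counter() - start) * 1000.0

    with socket.create_connection((HOST, PORT)) as s:
        samples = [round_trip_ms(s, PAYLOAD) for _ in range(100)]
        print(f"median RTT: {sorted(samples)[len(samples) // 2]:.3f} ms")
    ```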

    Interactive natural user interfaces

    Get PDF
    For many years, science fiction entertainment has showcased holographic technology and futuristic user interfaces that have stimulated the world's imagination. Movies such as Star Wars and Minority Report portray characters interacting with free-floating 3D displays and manipulating virtual objects as though they were tangible. While these futuristic concepts are intriguing, it's difficult to locate a commercial, interactive holographic video solution in an everyday electronics store. As used in this work, the term holography refers to artificially created, free-floating objects, whereas the traditional term refers to the recording and reconstruction of 3D image data from 2D mediums. This research addresses the need for a feasible technological solution that allows users to work with projected, interactive, and touch-sensitive 3D virtual environments. It aims to construct an interactive holographic user interface system by consolidating existing commodity hardware and interaction algorithms. In addition, this work studies best design practices for human-centric factors related to 3D user interfaces. The problem of 3D user interfaces has been well researched. When portrayed in science fiction, futuristic user interfaces usually consist of a holographic display, interaction controls, and feedback mechanisms. In reality, holographic displays are usually realized with volumetric or multi-parallax technology. In this work, a novel holographic display is presented which leverages a mini-projector to produce a free-floating image on a fog-like surface. The holographic user interface system consists of a display component, to project a free-floating image; a tracking component, to allow the user to interact with the 3D display via gestures; and a software component, which drives the complete hardware system. After examining this research, readers will be well informed on how to build an intuitive, eye-catching holographic user interface system for various application arenas.
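
    A toy sketch of the coupling between the tracking and display components described above: map a fingertip position reported by a gesture sensor into pixel coordinates on the projected fog display. The calibration bounds are made-up placeholders; a real system would calibrate sensor space to projector space.

    ```python
    # Linear mapping from sensor-space metres to projected-image pixels
    # (placeholder calibration, not the system's actual values).
    SENSOR_BOUNDS = ((-0.3, 0.3), (-0.2, 0.2))   # metres: (x range, y range)
    DISPLAY_RES = (1280, 720)                     # projector pixels

    def to_display(x_m, y_m):
        """Map a tracked fingertip position to projector pixel coordinates."""
        (x0, x1), (y0, y1) = SENSOR_BOUNDS
        u = (x_m - x0) / (x1 - x0) * DISPLAY_RES[0]
        v = (1.0 - (y_m - y0) / (y1 - y0)) * DISPLAY_RES[1]  # flip y for screen coords
        return int(u), int(v)

    print(to_display(0.0, 0.0))  # centre of the projection -> (640, 360)
    ```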

    Enabling Artificial Intelligence Analytics on The Edge

    Get PDF
    This thesis introduces a novel distributed model for handling edge-based video analytics in real time. The novelty of the model lies in decoupling and distributing the services into several decomposed functions, creating virtual function chains (the VFC model). The model considers both computational and communication constraints. Theoretical, simulation, and experimental results have shown that the VFC model can enable the support of heavy-load services in an edge environment while improving the footprint of the service compared to state-of-the-art frameworks. In detail, results on the VFC model have shown that it can reduce the total edge cost compared with a monolithic model and a simple frame-distribution model. For experimentation on a real-world scenario, a testbed edge environment has been developed, where the aforementioned models, as well as a general distribution framework (Apache Spark), have been deployed. A cloud service has also been considered. Experiments have shown that VFC can outperform all alternative approaches by reducing operational cost and improving QoS. Finally, a migration model, a caching model, and a QoS monitoring service based on Long Short-Term Memory models are introduced.
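
    A schematic sketch of the VFC idea as summarized above: a video-analytics service decomposed into functions, each placed on an edge node, with total cost modeled as compute plus a communication charge for every hop that crosses node boundaries. The cost figures are placeholders, not values from the thesis.

    ```python
    # Toy virtual function chain (VFC) cost model with placeholder numbers.
    from dataclasses import dataclass

    @dataclass
    class Function:
        name: str
        node: str            # edge node the function is placed on
        compute_cost: float

    def chain_cost(chain, link_cost=1.0):
        """Sum compute costs plus a communication cost per node-crossing hop
        (co-located functions communicate for free in this toy model)."""
        total = sum(f.compute_cost for f in chain)
        for a, b in zip(chain, chain[1:]):
            if a.node != b.node:
                total += link_cost
        return total

    vfc = [Function("decode", "edge-1", 2.0),
           Function("detect", "edge-2", 5.0),
           Function("track", "edge-2", 1.5)]
    print(chain_cost(vfc))  # 2.0 + 5.0 + 1.5 + 1 crossing = 9.5
    ```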

    AdaptiX -- A Transitional XR Framework for Development and Evaluation of Shared Control Applications in Assistive Robotics

    Full text link
    With the ongoing efforts to empower people with mobility impairments and the increase in technological acceptance by the general public, assistive technologies such as collaborative robotic arms are gaining popularity. Yet their widespread success is limited by usability issues, specifically the disparity between user input and software control along the autonomy continuum. To address this, shared control concepts provide opportunities to combine a targeted increase of user autonomy with a certain level of computer assistance. This paper presents the free and open-source AdaptiX XR framework for developing and evaluating shared control applications in a high-resolution simulation environment. The initial framework consists of a simulated robotic arm with an example scenario in Virtual Reality (VR), multiple standard control interfaces, and a specialized recording/replay system. AdaptiX can easily be extended for specific research needs, allowing Human-Robot Interaction (HRI) researchers to rapidly design and test novel interaction methods, intervention strategies, and multi-modal feedback techniques without requiring an actual physical robotic arm during the early phases of ideation, prototyping, and evaluation. In addition, a Robot Operating System (ROS) integration enables control of a real robotic arm in a PhysicalTwin approach without any simulation-reality gap. Here, we review the capabilities and limitations of AdaptiX in detail and present three bodies of research based on the framework. AdaptiX can be accessed at https://adaptix.robot-research.de. Comment: Accepted submission at The 16th ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS'24).
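
    A generic sketch of the kind of ROS bridge the PhysicalTwin integration implies: publishing Cartesian velocity commands for a robotic arm at a fixed rate. The topic name, message type, and rate are assumptions, not AdaptiX's actual interface.

    ```python
    # Generic ROS 1 velocity-command publisher (assumed topic and message type).
    import rospy
    from geometry_msgs.msg import Twist

    rospy.init_node("shared_control_bridge")
    pub = rospy.Publisher("/arm/cartesian_velocity", Twist, queue_size=1)  # assumed topic
    rate = rospy.Rate(50)  # 50 Hz command stream (assumption)

    while not rospy.is_shutdown():
        cmd = Twist()
        cmd.linear.x = 0.05  # m/s; in AdaptiX this would come from the shared-control blend
        pub.publish(cmd)
        rate.sleep()
    ```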