776 research outputs found

    Task-oriented cross-system design for Metaverse in 6G era

    As an emerging concept, the Metaverse has the potential to revolutionize social interaction in the post-pandemic era by establishing a digital world for online education, remote healthcare, immersive business, intelligent transportation, and advanced manufacturing. The goal is ambitious, yet the methodologies and technologies to achieve the full vision of the Metaverse remain unclear. In this thesis, we first introduce the three pillars of infrastructure that lay the foundation of the Metaverse, i.e., Human-Computer Interfaces (HCIs), sensing and communication systems, and network architectures. Then, we depict the roadmap towards the Metaverse that consists of four stages with different applications. As one of the essential building blocks for the Metaverse, we also review the state-of-the-art Computer Vision for the Metaverse as well as the future scope. To support diverse applications in the Metaverse, we put forward a novel design methodology: task-oriented cross-system design, and further review the potential solutions and future challenges. Specifically, we establish a task-oriented cross-system design for a simple case, where sampling, communications, and prediction modules are jointly optimized for the synchronization of the real-world devices and digital model in the Metaverse. We use domain knowledge to design a deep reinforcement learning (DRL) algorithm to minimize the communication load subject to an average tracking error constraint. We validate our framework on a prototype composed of a real-world robotic arm and its digital model. The results show that our framework achieves a better trade-off between the average tracking error and the average communication load compared to a communication system without sampling and prediction. For example, the average communication load can be reduced to 87% when the average tracking error constraint is 0.002°.
In addition, our policy outperforms the benchmark with the static sampling rate and prediction horizon optimized by exhaustive search, in terms of the tail probability of the tracking error. Furthermore, with the assistance of expert knowledge, the proposed algorithm achieves a better convergence time, stability, communication load, and average tracking error. We then establish a task-oriented cross-system design framework for a general case, where the goal is to minimize the required packet rate for timely and accurate modeling of a real-world robotic arm in the Metaverse. Specifically, different modules including sensing, communications, prediction, control, and rendering are considered. To optimize a scheduling policy and prediction horizons, we design a Constraint Proximal Policy Optimization (CPPO) algorithm by integrating domain knowledge from relevant systems into the advanced reinforcement learning algorithm, Proximal Policy Optimization (PPO). Specifically, the Jacobian matrix for analyzing the motion of the robotic arm is included in the state of the CPPO algorithm, and the Conditional Value-at-Risk (CVaR) of the state-value function characterizing the long-term modeling error is adopted in the constraint. Besides, the policy is represented by a two-branch neural network determining the scheduling policy and the prediction horizons, respectively. To evaluate our algorithm, we build a prototype including a real-world robotic arm and its digital model in the Metaverse. The experimental results indicate that domain knowledge helps to reduce the convergence time and the required packet rate by up to 50%, and the cross-system design framework outperforms a baseline framework in terms of the required packet rate and the tail distribution of the modeling error.
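
The abstract above constrains the Conditional Value-at-Risk (CVaR) of the modeling error rather than its mean. As a rough illustration of what such a tail constraint measures, the sketch below computes an empirical CVaR over a list of sampled errors. It is not the thesis's CPPO algorithm: the function names, the sample-based estimator, and the threshold check are all assumptions made for illustration, and the thesis applies CVaR to a state-value function rather than to raw samples.

```python
import math


def empirical_cvar(samples, alpha):
    """Mean of the worst (1 - alpha) fraction of samples (larger = worse).

    Hypothetical helper for illustration; `alpha` is the confidence level.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(samples)
    # Size of the tail, rounded up, with a small guard against float fuzz.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-9))
    tail = ordered[-k:]
    return sum(tail) / len(tail)


def satisfies_tail_constraint(errors, alpha, threshold):
    """True when the CVaR of the errors stays at or below the threshold."""
    return empirical_cvar(errors, alpha) <= threshold
```

For the samples `[1.0, 2.0, 3.0, 10.0]` and `alpha = 0.5`, the tail is the two largest values, so the CVaR is 6.5 while the plain mean is 4.0. A mean-based constraint would therefore understate how bad the worst cases are.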

    Parallel Algorithms for Rendering Large 3D Models on a Graphics Cluster

    We address the problem of distributing rendering computations for real-time display of very complex three-dimensional (3D) scenes using a graphics cluster. The rendering of 3D scenes is increasingly being carried out using at least two different programs on the graphics processing unit (GPU): a vertex shader program for vertex (geometry) processing, and a fragment shader program for pixel (colour) processing. With fragment shader programs becoming more and more time-consuming, distributing load solely based on geometry, as is done in most contemporary systems, can cause significant load imbalance and redundant work. In this thesis we propose a number of parallel rendering algorithms which divide the traditional cluster rendering pipeline into two phases: one which is primarily concerned with vertex operations to generate depth information, and a second which is primarily concerned with fragment operations. By performing communication between these two phases, each node can perform fewer fragment operations with little overhead over traditional cluster rendering algorithms. We also propose a number of load-balancing algorithms which utilize the information gained earlier in the pipeline to improve the management of GPU resources. The techniques are implemented on a graphics cluster and experimental results demonstrate significant improvements in rendering performance.
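
One way to picture the two-phase idea described above is as a depth exchange followed by selective shading. The sketch below is an assumption about how that could look, not the thesis's algorithm: each node is modeled as a flat list of per-pixel depths, the "communication" step is a per-pixel minimum across nodes, and a node then shades only the pixels where its own surface is nearest. All names are hypothetical.

```python
INF = float("inf")  # marks pixels where a node rendered no geometry


def merge_depth(depth_buffers):
    """Per-pixel minimum depth across all nodes (phase-one exchange)."""
    return [min(column) for column in zip(*depth_buffers)]


def visible_pixels(local_depth, global_depth):
    """Indices of pixels where this node owns the nearest surface.

    Only these pixels need the expensive fragment pass in phase two.
    On an exact depth tie, every tied node shades the pixel.
    """
    return [
        i
        for i, (d, g) in enumerate(zip(local_depth, global_depth))
        if d != INF and d == g
    ]
```

With `node_a = [1.0, INF, 3.0, 2.0]` and `node_b = [2.0, 5.0, 1.0, INF]`, the merged depth is `[1.0, 5.0, 1.0, 2.0]`, so node A shades pixels 0 and 3 and node B shades pixels 1 and 2. Without the exchange, each node would shade every pixel it covers (three each), so the total drops from six fragment evaluations to four.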

    One-sided transparency: a revolution in visualization

    Colorectal cancer is one of the leading causes of death in the world. Colonoscopy, the traditional procedure for detecting colorectal cancer, is very effective. It does have downsides, however: it is invasive, uncomfortable for the patient, and not available to some patients with certain conditions. Virtual colonoscopy has been developed in order to address these issues. A virtual colonoscopy (VC) is a non-invasive method for performing a colonoscopy by using medical imaging data to create a virtual representation of the colon. Previous virtual colonoscopy methods include fly-through, fly-over, flattening, and the unfolded cube method. Fly-through moves the camera through the inside of the colon, following a centerline along the length of the colon. Fly-over splits the colon into halves longitudinally, and flies a camera over each half. Flattening reduces the 3D colon model to a 2D image. The unfolded cube method flies a set of cameras along the centerline as in fly-through, but where fly-through has one camera looking along the centerline, the unfolded cube method presents views from six cameras. The six camera views are positioned in the pattern of an unfolded cube, which gives rise to the method’s name. This thesis will present a new method called one-sided transparency (OST). This is a method for visualizing virtual objects so that the interior surfaces can be viewed from the outside. OST has numerous improvements over existing VC methods, particularly when combined with fly-over methods. However, this thesis will also demonstrate that OST is not limited to fly-over nor even to VC, as it has applications in multiple fields. For quantitative evaluation, this thesis focused on comparing specific scenarios that OST excels in visualizing. Fly-through navigation has difficulties with polyps between haustral folds, and prior fly-over work had visual artifacts that degraded the quality of the final visualization.
These and other specific cases are visualized using OST in order to highlight the power of this new technique. Additionally, the previous fly-over (FO) method had some significant drawbacks that are solved by the application of OST. These problems and their origins will be addressed, along with the way that OST solves them. This thesis will also explore potential applications for OST outside of VC. This will include a more general visualization of tubular objects. It will be shown that OST has the ability to highlight structural issues and deformities such as cracks and bumps. This has potential applications in medical fields outside of VC as well as in structural engineering. This will demonstrate OST’s usefulness as a general technique, even outside the context of VC. Finally, this thesis will present results regarding OST for VC. It will show that OST presents several advantages over previous VC methods. OST allows easier viewing of polyps in difficult locations and offers a more complete view of the colon. OST has a number of advantages over the existing fly-over method, including faster time-to-viewing, less sensitivity to centerline error, and improved accuracy in the separation of halves.

    Task-Oriented Cross-System Design for Timely and Accurate Modeling in the Metaverse

    In this paper, we establish a task-oriented cross-system design framework to minimize the required packet rate for timely and accurate modeling of a real-world robotic arm in the Metaverse, where sensing, communication, prediction, control, and rendering are considered. To optimize a scheduling policy and prediction horizons, we design a Constraint Proximal Policy Optimization (C-PPO) algorithm by integrating domain knowledge from relevant systems into the advanced reinforcement learning algorithm, Proximal Policy Optimization (PPO). Specifically, the Jacobian matrix for analyzing the motion of the robotic arm is included in the state of the C-PPO algorithm, and the Conditional Value-at-Risk (CVaR) of the state-value function characterizing the long-term modeling error is adopted in the constraint. Besides, the policy is represented by a two-branch neural network determining the scheduling policy and the prediction horizons, respectively. To evaluate our algorithm, we build a prototype including a real-world robotic arm and its digital model in the Metaverse. The experimental results indicate that domain knowledge helps to reduce the convergence time and the required packet rate by up to 50%, and the cross-system design framework outperforms a baseline framework in terms of the required packet rate and the tail distribution of the modeling error. Comment: This paper is accepted by IEEE Journal on Selected Areas in Communications, JSAC-SI-HCM 202

    TULIP 4

    Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Based on more than 15 years of research and development, Tulip is built on a suite of tools and techniques that can be used to address a large variety of domain-specific problems. With Tulip, we aim to provide Python and/or C++ developers with a complete library, supporting the design of interactive information visualization applications for relational data, that can be customized to address a wide range of visualization problems. In its current iteration, Tulip enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. This development pipeline makes the framework efficient for creating research prototypes as well as developing end-user applications. The recent addition of a complete Python programming layer makes Tulip an ideal tool for fast prototyping and treatment automation, allowing users to focus on problem solving, and a great system for teaching purposes at all education levels.

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners who want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing adequate synchronization between the media elements that constitute these experiences.

    Extensions to the SMIL multimedia language

    The goal of this work has been to extend the Synchronized Multimedia Integration Language (SMIL) to study the capabilities and possibilities of declarative multimedia languages for the World Wide Web (Web). The work has involved design and implementation of several extensions to SMIL. A novel approach to include 3D audio in SMIL was designed and implemented. This involved extending the SMIL 2D spatial model with an extra dimension to support a 3D space. New audio elements and a listening point were positioned in the 3D space. The extension was designed to be modular so that it was possible to use it in conjunction with other XML languages, such as XHTML and Scalable Vector Graphics (SVG) language. Web forms are one of the key features in the Web, as they offer a way to send user data to a server. A similar feature is therefore desirable in SMIL, which currently lacks forms. The XForms language, due to its modular approach, was used to add this feature to SMIL. An evaluation of this integration was carried out as part of this work. Furthermore, the SMIL player was designed to play out dynamic SMIL documents, which can be modified at run-time and the result is immediately reflected in the presentation. Dynamic SMIL enables execution of scripts to modify the presentation. XML Events and ECMAScript were chosen to provide the scripting functionality. In addition, generic methods to extend SMIL were studied based on the previous extensions. These methods include ways to attach new input and output capabilities to SMIL. To experiment with the extensions, a Synchronized Multimedia Integration Language (SMIL) player was developed. The current final version can play out SMIL 2.0 Basic profile documents with a few additional SMIL modules, such as event timing, basic animations, and brush media modules. The player includes all above-mentioned extensions. The SMIL player has been designed to work within an XML browser called X-Smiles. 
X-Smiles is intended for various embedded devices, such as mobile phones, Personal Digital Assistants (PDAs), and digital television set-top boxes. Currently, the browser supports XHTML, SMIL, and XForms, which are developed by the current research group. The browser also supports other XML languages developed by third-party open-source projects. The SMIL player can also be run as a standalone player without the browser. The standalone player is portable and has been run on a desktop PC, PDA, and digital television set-top box. The core of the SMIL player is platform-independent; only the media renderers require platform-dependent implementation.

    Highly Interactive Web-Based Courseware

    The grand vision of educational software is that of a networked system enabling the learner to explore, discover, and construct subject matters and communicate problems and ideas with other community members. Educational material is transformed into reusable learning objects, created collaboratively by developers, educators, and designers, preserved in a digital library, and utilized, adapted, and evolved by educators and learners. Recent advances in learning technology specified reusability and interoperability in Web-based courseware. However, a high degree of interactivity is not yet considered. Each interactive learning object represents an autonomous hypermedia entity, laborious to create, impossible to interlink and to adapt in a graduated manner, and hard to specify. Dynamic attributes, the look and feel, and functionality are predefined. This work designs and realizes learning technology for Web-based courseware with special regard to highly interactive learning objects. The innovative aspect initially lies in the multi-level, component-based technology providing a graduated structuring.
Components range from complex learning objects to toolkits to primitive components and scripts. Secondly, the proposed methodologies extend community support in Web-based courseware, namely collaboration and personalization, to the software layer. Components become linkable hypermedia objects and part of the courseware repository, rated, annotated, and modified by all community members. In addition to a detailed description of technology and design patterns for interactive learning objects and matching Web-based courseware, the thesis clarifies the meaning of interactivity in educational software by formulating combined levels of technological and symbolic interactivity, and derives a visual scripting metaphor for transporting functionality. Further, it reviews the evolution of hypermedia and educational software to extract substantial techniques for interactive Web-based courseware. The proposed framework supports multilingual, alternative content, provides link consistency and easy maintenance, and includes state-driven online wizards, including for interactive content. The impact of a high degree of interactivity in educational software is illustrated with courseware in the Computer Graphics domain.

    Real-Time Volumetric Lighting using SVOs

    This thesis experiments with the sparse voxel octree (SVO) data structure to see whether it can improve the performance of empty-space ray marching in volumes. While ray marching is a somewhat new technique, it is increasingly used for traversing volumes. It can be used for realistic volumetric effects in computer games, or in the medical field when examining and visualizing MRI scans. Despite its many uses, it is very computationally heavy. Its usage in real-time applications is therefore limited, as the hardware must be able to maintain an acceptable frame rate. Normally one would sample the volume with a fixed step size in order to extract the information in the volume, even where there is just empty space. The idea of a sparse octree is that it allows the ray to take larger steps past this empty space and thus sample only the actual data. This thesis explains how to implement and use an octree when rendering smoke in a volume, and showcases the many challenges that come with this. The result is compared with a three-dimensional texture of the same smoke.
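
The empty-space skipping described above can be sketched in a few lines. This is a minimal CPU-side illustration under stated assumptions, not the thesis's implementation: the volume is a set of occupied integer voxel coordinates, empty octree regions are left unsubdivided, and the ray jumps to the exit face of any empty node it lands in. The names `Node`, `build`, `find_leaf`, `exit_distance`, and `march` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    origin: tuple   # minimum corner in voxel units
    size: int       # edge length, a power of two
    occupied: bool
    children: list = field(default_factory=list)  # empty for leaves


def build(voxels, origin=(0, 0, 0), size=8):
    """Build a sparse octree; empty regions are never subdivided."""
    ox, oy, oz = origin
    occupied = any(
        ox <= x < ox + size and oy <= y < oy + size and oz <= z < oz + size
        for x, y, z in voxels
    )
    if not occupied or size == 1:
        return Node(origin, size, occupied)
    half = size // 2
    kids = [
        build(voxels, (ox + dx * half, oy + dy * half, oz + dz * half), half)
        for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
    ]
    return Node(origin, size, True, kids)


def find_leaf(node, p):
    """Descend to the leaf containing point p (p must lie inside node)."""
    while node.children:
        half = node.size // 2
        idx = 0
        for axis in range(3):
            idx = idx * 2 + (1 if p[axis] >= node.origin[axis] + half else 0)
        node = node.children[idx]
    return node


def exit_distance(node, ray_o, ray_d):
    """Ray parameter t at which the ray leaves the node's bounding box."""
    t_exit = float("inf")
    for axis in range(3):
        if ray_d[axis] > 0:
            far = node.origin[axis] + node.size
            t_exit = min(t_exit, (far - ray_o[axis]) / ray_d[axis])
        elif ray_d[axis] < 0:
            t_exit = min(t_exit, (node.origin[axis] - ray_o[axis]) / ray_d[axis])
    return t_exit


def march(root, ray_o, ray_d, step=0.25, eps=1e-4):
    """Return (data samples taken, octree lookups) along the ray."""
    t, samples, lookups = 0.0, 0, 0
    while True:
        p = tuple(o + t * d for o, d in zip(ray_o, ray_d))
        if not all(root.origin[a] <= p[a] < root.origin[a] + root.size
                   for a in range(3)):
            break  # ray has left the volume
        lookups += 1
        leaf = find_leaf(root, p)
        if leaf.occupied:
            samples += 1   # real data: sample with the fixed step
            t += step
        else:
            # Empty node: jump just past its exit face in one step.
            t = max(t, exit_distance(leaf, ray_o, ray_d)) + eps
    return samples, lookups
```

For an 8×8×8 volume with occupied voxels at (6, 3, 3) and (7, 3, 3), a ray from (0, 3.5, 3.5) along +x needs two skips to reach the data and then takes eight samples, for ten lookups in total. A fixed-step march with the same step of 0.25 would take 32 samples across the same ray, 24 of them in empty space.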