1,233 research outputs found

    Machine Learning for Multimedia Communications

    Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive training, learning-based methods have delivered impressive efficiency and accuracy improvements all along the transmission pipeline. For example, the high model capacity of learning-based architectures enables us to accurately model image and video behavior, such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy, and even user perception modeling have widely benefited from recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though only a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.

    Information Olfactation: Theory, Design, and Evaluation

    Olfactory feedback for analytical tasks is a virtually unexplored area in spite of the advantages it offers for information recall, feature identification, and location detection. Here we introduce the concept of ‘Information Olfactation’ as the fragrant sibling of information visualization, and discuss how scent can be used to convey data. Building on a review of the human olfactory system and mirroring common visualization practice, we propose olfactory marks, the substrate in which they exist, and the olfactory channels available to designers. To exemplify this idea, we present ‘viScent(1.0)’: a six-scent stereo olfactory display capable of conveying olfactory glyphs of varying temperature and direction, as well as a corresponding software system that integrates the display with a traditional visualization display. We also conduct a comprehensive perceptual experiment on Information Olfactation: the use of olfactory marks and channels to convey data. More specifically, following the example of graphical perception studies, we design an experiment that studies the perceptual accuracy of four "olfactory channels" (scent type, scent intensity, airflow, and temperature) for conveying three different types of data (nominal, ordinal, and quantitative). We also present details of an advanced 24-scent olfactory display, ‘viScent(2.0)’, and the software framework that we designed in order to run this experiment. Our results yield a ranking of olfactory channels for each data type that follows principles similar to those of rankings for visual channels, such as those derived by Mackinlay, Cleveland & McGill, and Bertin.

    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.

    Visual Techniques for Geological Fieldwork Using Mobile Devices

    Visual techniques in general and 3D visualisation in particular have seen considerable adoption within the last 30 years in the geosciences and geology. Techniques such as volume visualisation, for analysing subsurface processes, and photo-coloured LiDAR point-based rendering, to digitally explore rock exposures at the earth’s surface, were applied within geology as one of the first adopting branches of science. A large amount of digital geological surface and volume data is nowadays available to desktop-based workflows for geological applications such as hydrocarbon reservoir exploration, groundwater modelling, CO2 sequestration and, in the future, geothermal energy planning. On the other hand, analysis and data collection during fieldwork have yet to embrace this “digital revolution”: sedimentary logs, geological maps and stratigraphic sketches are still captured in each geologist’s individual fieldbook, and physical rock samples are still transported to the lab for subsequent analysis. Is this still necessary, or are there extended digital means of data collection and exploration in the field? Are modern digital interpretation techniques accurate and intuitive enough to relevantly support fieldwork in geology and other geoscience disciplines? This dissertation aims to address these questions and, by doing so, close the technological gap between geological fieldwork and office workflows in geology. The emergence of mobile devices and their vast array of physical sensors, combined with touch-based user interfaces, high-resolution screens and digital cameras, provides a possible digital platform that can be used by field geologists. Their ubiquitous availability increases the chances of adopting digital workflows in the field without additional, expensive equipment. The use of 3D data on mobile devices in the field is furthered by the availability of 3D digital outcrop models and the increasing ease of their acquisition. 
This dissertation assesses the prospects of adopting 3D visual techniques and mobile devices within field geology. The research uses previously acquired and processed digital outcrop models in the form of textured surfaces from optical remote sensing and photogrammetry. The scientific papers in this thesis present visual techniques and algorithms to map outcrop photographs in the field directly onto the surface models. Automatic mapping allows the projection of photo interpretations of stratigraphy and sedimentary facies onto the 3D textured surface while providing the domain expert with simple-to-use, intuitive tools for the photo interpretation itself. The developed visual approach, combining insight from all across the computer sciences dealing with visual information, merges into the mobile device Geological Registration and Interpretation Toolset (GRIT) app, which is assessed on an outcrop analogue study of the Saltwick Formation exposed at Whitby, North Yorkshire, UK. Although applicable to a diversity of study scenarios within petroleum geology and the geosciences, the particular target application of the visual techniques is to easily provide field-based outcrop interpretations for subsequent construction of training images for multiple point statistics reservoir modelling, as envisaged within the VOM2MPS project. Despite the success and applicability of the visual approach, numerous drawbacks and probable future extensions are discussed in the thesis based on the conducted studies. Apart from elaborating on more obvious limitations originating from the use of mobile devices and their limited computing capabilities and sensor accuracies, a major contribution of this thesis is the careful analysis of conceptual drawbacks of established procedures in modelling, representing, constructing and disseminating the available surface geometry. 
A more mathematically accurate geometric description of the underlying algebraic surfaces yields improvements and future applications unaddressed within the literature of geology and the computational geosciences to date. Also, future extensions to the visual techniques proposed in this thesis allow for expanded analysis, 3D exploration and improved geological subsurface modelling in general.

    Value Creation with Extended Reality Technologies - A Methodological Approach for Holistic Deployments

    With the increasing computing capacity and transmission performance of information technologies, the number of possible application scenarios for Extended Reality (XR) technologies in companies is growing. XR technologies are hardware systems, software tools, and content-creation methods used to produce Virtual Reality, Augmented Reality, and Mixed Reality. By conveying content to users in an immersive, interactive, and intelligent way, XR technologies can increase productivity in companies and open up growth opportunities. Although XR applications in industry have been studied scientifically for more than 25 years, they are still considered immature. The main reasons are the underlying complexity, the focus of research on specific application scenarios, the insufficient economic viability of deployment scenarios, and the lack of suitable implementation models for XR technologies. Fundamentally, the added value of technologies is released by integrating them into the value-creation architecture of business models. This thesis therefore presents a methodology for deploying XR technologies in value creation. The main goal of the methodology is to enable the identification of suitable deployment scenarios and to manage the complexity of implementation through a structured procedure. To enable holistic applicability, the methodology is based on a value-creation reference model that is independent of industry and business process. In addition, it draws on a holistic morphology of XR technologies and follows an iterative deployment sequence. The value-creation model is represented by an existing potential, a value chain, a value network, physical and digital resources, and the added value realized through the use of XR technologies. 
XR technologies are represented by a morphological structure with application characteristics and required technological resources. Implementation follows an iterative sequence that describes agile software development methods applicable to the underlying context and takes relevant stakeholders into account. The focus of the methodology is on a systematic approach that is universally applicable and considers the end user and the ecosystem of the value creation under consideration. To validate the methodology, XR technologies are deployed in two industrial use cases under real economic conditions. The use cases come from different industries, with different XR technology characteristics and different forms of value chains, in order to demonstrate the universal applicability of the methodology and to highlight relevant challenges in carrying out an XR technology deployment. With the presented methodology, companies can deploy XR technologies in their value creation in a targeted manner. It enables detailed planning of the implementation, a well-founded selection of application scenarios, the assessment of possible challenges and obstacles, and the targeted involvement of relevant stakeholders. As a result, value creation is optimized with economic added value through XR technologies.

    Perception-driven approaches to real-time remote immersive visualization

    In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user’s sense of presence in a remote scene through 3D reconstruction. This is particularly valuable when there is a need to visualize, explore, and perform tasks in environments that are inaccessible, too hazardous, or too distant. However, a remote visualization system requires that the entire pipeline, from 3D data acquisition to VR rendering, satisfy demands for speed, throughput, and high visual realism. Especially when point clouds are used, there is a fundamental quality difference between the acquired data of the physical world and the displayed data because of network latency and throughput limitations, which negatively impact the sense of presence and provoke cybersickness. This thesis presents state-of-the-art research to address these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. The human visual system does not have uniform vision across the field of view; it has the sharpest visual acuity at the center of the field of view, and the acuity falls off towards the periphery. Peripheral vision provides lower resolution to guide eye movements so that central vision visits all the interesting, crucial parts. As a first contribution, the thesis developed remote visualization strategies that utilize the acuity fall-off to facilitate the processing, transmission, buffering, and rendering in VR of 3D reconstructed scenes while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis looked into attentional mechanisms to select and draw user engagement to specific information from the dynamic spatio-temporal environment. 
It proposed a strategy to analyze the remote scene with respect to the 3D structure of the scene, its layout, and the spatial, functional, and semantic relationships between objects in the scene. The strategy primarily focuses on analyzing the scene with models that human visual perception uses. It allocates a larger proportion of computational resources to objects of interest and creates a more realistic visualization. As a supplementary contribution, a new volumetric point-cloud density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, a comparative examination of the proposed point cloud metric, user studies, and experiments demonstrated that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput.

    Policymaking prior to decision-making in the Digital Age

    This thesis will examine the application of recent information and communication technology (ICT) innovations in the policymaking process, focusing on the policy stages prior to the decision-making stage. Recent developments in technology innovation have led to a re-examination of citizen involvement in government processes and the expansion of opportunities for citizens to engage in the policymaking process.