170 research outputs found

    Multiview Video Coding for Virtual Reality

    Get PDF
    Virtual reality (VR) is one of the emerging technologies in recent years. It brings a sense of real world experience in simulated environments, hence, it is being used in many applications for example in live sporting events, music recordings and in many other interactive multimedia applications. VR makes use of multimedia content, and videos are a major part of it. VR videos are captured from multiple directions to cover the entire 360 field-of-view. It usually employs, multiple cameras of wide field-of-view such as fisheye lenses and the camera arrangement can also vary from linear to spherical set-ups. Videos in VR system are also subjected to constraints such as, variations in network bandwidth, heterogeneous mobile devices with limited decoding capacity, adaptivity for view switching in the display. The uncompressed videos from multiview cameras are redundant and impractical for storage and transmission. The existing video coding standards compresses the multiview videos effi ciently. However, VR systems place certain limitations on the video and camera arrangements, such as, it assumes rectilinear properties for video, translational motion model for prediction and the camera set-up to be linearly arranged. The aim of the thesis is to propose coding schemes which are compliant to the current video coding standards of H.264/AVC and its successor H.265/HEVC, the current state-of-the-art and multiview/scalable extensions. This thesis presents methods that compress the multiview videos which are captured from eight cameras that are arranged spherically, pointing radially outwards. The cameras produce circular fi sheye videos of 195 degree field-of-view. The final goal is to present methods, which optimize the bitrate in both storage and transmission of videos for the VR system. The presented methods can be categorized into two groups: optimizing storage bitrate and optimizing streaming bitrate of multiview videos. In the storage bitrate category, six methods were experimented. The presented methods competed against simulcast coding of individual views. The coding schemes were experimented with two data sets of 8 views each. The method of scalable coding with inter-layer prediction in all frames outperformed simulcast coding with approximately 7.9%. In the case of optimizing streaming birates, five methods were experimented. The method of scalable plus multiview skip-coding outperformed the simulcast method of coding by 36% on average. Future work will focus on pre-processing the fi sheye videos to rectilinear videos, in-order to fit them to the current translational model of the video coding standards. Moreover, the methods will be tested in comprehensive applications and system requirements

    Appearance Modelling and Reconstruction for Navigation in Minimally Invasive Surgery

    Get PDF
    Minimally invasive surgery is playing an increasingly important role for patient care. Whilst its direct patient benefit in terms of reduced trauma, improved recovery and shortened hospitalisation has been well established, there is a sustained need for improved training of the existing procedures and the development of new smart instruments to tackle the issue of visualisation, ergonomic control, haptic and tactile feedback. For endoscopic intervention, the small field of view in the presence of a complex anatomy can easily introduce disorientation to the operator as the tortuous access pathway is not always easy to predict and control with standard endoscopes. Effective training through simulation devices, based on either virtual reality or mixed-reality simulators, can help to improve the spatial awareness, consistency and safety of these procedures. This thesis examines the use of endoscopic videos for both simulation and navigation purposes. More specifically, it addresses the challenging problem of how to build high-fidelity subject-specific simulation environments for improved training and skills assessment. Issues related to mesh parameterisation and texture blending are investigated. With the maturity of computer vision in terms of both 3D shape reconstruction and localisation and mapping, vision-based techniques have enjoyed significant interest in recent years for surgical navigation. The thesis also tackles the problem of how to use vision-based techniques for providing a detailed 3D map and dynamically expanded field of view to improve spatial awareness and avoid operator disorientation. The key advantage of this approach is that it does not require additional hardware, and thus introduces minimal interference to the existing surgical workflow. The derived 3D map can be effectively integrated with pre-operative data, allowing both global and local 3D navigation by taking into account tissue structural and appearance changes. Both simulation and laboratory-based experiments are conducted throughout this research to assess the practical value of the method proposed

    A fixed-point simd array processor and its applications to video compression coding

    Get PDF
    A review of image compressing algorithms and their processor architectures -- MPEG standard -- Motion estimation algorithm -- Processor architecture review -- SIMD architecture of the pulse chip -- Implementing a convolution on pulse -- The convolution algorithm versus pulse architectural features -- Structure of the convolution software -- Motion estimation algorithms and implementations -- Motion estimation algorithm -- Gradual ful search method and full search algorithm with the pulse chip -- DCT & IDCT algorithms and implementations -- Image processing with pulse chips and a C40 processor

    Proceedings of the 7th Sound and Music Computing Conference

    Get PDF
    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Schémas de tatouage d'images, schémas de tatouage conjoint à la compression, et schémas de dissimulation de données

    Get PDF
    In this manuscript we address data-hiding in images and videos. Specifically we address robust watermarking for images, robust watermarking jointly with compression, and finally non robust data-hiding.The first part of the manuscript deals with high-rate robust watermarking. After having briefly recalled the concept of informed watermarking, we study the two major watermarking families : trellis-based watermarking and quantized-based watermarking. We propose, firstly to reduce the computational complexity of the trellis-based watermarking, with a rotation based embedding, and secondly to introduce a trellis-based quantization in a watermarking system based on quantization.The second part of the manuscript addresses the problem of watermarking jointly with a JPEG2000 compression step or an H.264 compression step. The quantization step and the watermarking step are achieved simultaneously, so that these two steps do not fight against each other. Watermarking in JPEG2000 is achieved by using the trellis quantization from the part 2 of the standard. Watermarking in H.264 is performed on the fly, after the quantization stage, choosing the best prediction through the process of rate-distortion optimization. We also propose to integrate a Tardos code to build an application for traitors tracing.The last part of the manuscript describes the different mechanisms of color hiding in a grayscale image. We propose two approaches based on hiding a color palette in its index image. The first approach relies on the optimization of an energetic function to get a decomposition of the color image allowing an easy embedding. The second approach consists in quickly obtaining a color palette of larger size and then in embedding it in a reversible way.Dans ce manuscrit nous abordons l’insertion de données dans les images et les vidéos. Plus particulièrement nous traitons du tatouage robuste dans les images, du tatouage robuste conjointement à la compression et enfin de l’insertion de données (non robuste).La première partie du manuscrit traite du tatouage robuste à haute capacité. Après avoir brièvement rappelé le concept de tatouage informé, nous étudions les deux principales familles de tatouage : le tatouage basé treillis et le tatouage basé quantification. Nous proposons d’une part de réduire la complexité calculatoire du tatouage basé treillis par une approche d’insertion par rotation, ainsi que d’autre part d’introduire une approche par quantification basée treillis au seind’un système de tatouage basé quantification.La deuxième partie du manuscrit aborde la problématique de tatouage conjointement à la phase de compression par JPEG2000 ou par H.264. L’idée consiste à faire en même temps l’étape de quantification et l’étape de tatouage, de sorte que ces deux étapes ne « luttent pas » l’une contre l’autre. Le tatouage au sein de JPEG2000 est effectué en détournant l’utilisation de la quantification basée treillis de la partie 2 du standard. Le tatouage au sein de H.264 est effectué à la volée, après la phase de quantification, en choisissant la meilleure prédiction via le processus d’optimisation débit-distorsion. Nous proposons également d’intégrer un code de Tardos pour construire une application de traçage de traîtres.La dernière partie du manuscrit décrit les différents mécanismes de dissimulation d’une information couleur au sein d’une image en niveaux de gris. Nous proposons deux approches reposant sur la dissimulation d’une palette couleur dans son image d’index. La première approche consiste à modéliser le problème puis à l’optimiser afin d’avoir une bonne décomposition de l’image couleur ainsi qu’une insertion aisée. La seconde approche consiste à obtenir, de manière rapide et sûre, une palette de plus grande dimension puis à l’insérer de manière réversible

    Activity Report: Automatic Control 2011

    Get PDF

    Application-driven visual computing towards industry 4.0 2018

    Get PDF
    245 p.La Tesis recoge contribuciones en tres campos: 1. Agentes Virtuales Interactivos: autónomos, modulares, escalables, ubicuos y atractivos para el usuario. Estos IVA pueden interactuar con los usuarios de manera natural.2. Entornos de RV/RA Inmersivos: RV en la planificación de la producción, el diseño de producto, la simulación de procesos, pruebas y verificación. El Operario Virtual muestra cómo la RV y los Co-bots pueden trabajar en un entorno seguro. En el Operario Aumentado la RA muestra información relevante al trabajador de una manera no intrusiva. 3. Gestión Interactiva de Modelos 3D: gestión online y visualización de modelos CAD multimedia, mediante conversión automática de modelos CAD a la Web. La tecnología Web3D permite la visualización e interacción de estos modelos en dispositivos móviles de baja potencia.Además, estas contribuciones han permitido analizar los desafíos presentados por Industry 4.0. La tesis ha contribuido a proporcionar una prueba de concepto para algunos de esos desafíos: en factores humanos, simulación, visualización e integración de modelos

    Basics of man-machine communication for the design of educational systems : NATO Advanced Study Institute, August 16-26, 1993, Eindhoven, The Netherlands

    Get PDF

    Basics of man-machine communication for the design of educational systems : NATO Advanced Study Institute, August 16-26, 1993, Eindhoven, The Netherlands

    Get PDF
    • …
    corecore