1,362 research outputs found

    Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion

    The file attached to this record is the author's final peer-reviewed version. Mismatches between the precisions used to represent disparity, depth value and rendering position in 3D video systems cause redundancies in depth map representations. In this paper, we propose a highly efficient multiview depth coding scheme based on Depth Histogram Projection (DHP) and Allowable Depth Distortion (ADD) in view synthesis. First, DHP exploits the sparse representation of depth maps generated from stereo matching to reduce the residual error from INTER and INTRA predictions in depth coding. We provide a mathematical foundation for DHP-based lossless depth coding by theoretically analyzing its rate-distortion cost. Then, because of the mismatch between depth value and rendering position, there is a many-to-one mapping between them in view synthesis, which induces the ADD model. Based on this ADD model and DHP, depth coding with lossless view synthesis quality is proposed to further improve the compression performance of depth coding while maintaining the same synthesized video quality. Experimental results show that the proposed DHP-based depth coding achieves an average bit rate saving of 20.66% to 19.52% for lossless coding on Multiview High Efficiency Video Coding (MV-HEVC) with different groups of pictures. In addition, our depth coding based on DHP and ADD achieves an average depth bit rate reduction of 46.69%, 34.12% and 28.68% for lossless view synthesis quality when the rendering precision is integer, half and quarter pixel, respectively. We obtain similar gains for lossless depth coding on the 3D-HEVC, HEVC Intra coding and JPEG2000 platforms.
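    The many-to-one mapping behind the ADD model can be illustrated with a minimal sketch. It assumes the commonly used conversion from an 8-bit depth value to disparity for a parallel camera setup, and rounds the disparity to the rendering precision; the function names and the camera parameters (`f_times_b`, `z_near`, `z_far`) are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math


def disparity(v, f_times_b, z_near, z_far):
    """Disparity in pixels for an 8-bit depth value v (assumed conversion)."""
    return f_times_b * (v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)


def rendering_position(v, f_times_b, z_near, z_far, precision):
    """Disparity rounded to 1/precision pixel (1 = integer, 2 = half, 4 = quarter)."""
    d = disparity(v, f_times_b, z_near, z_far)
    return math.floor(d * precision + 0.5)


def allowable_depth_range(v, f_times_b, z_near, z_far, precision):
    """Smallest and largest depth values that render to the same position as v."""
    target = rendering_position(v, f_times_b, z_near, z_far, precision)
    lo = v
    while lo > 0 and rendering_position(lo - 1, f_times_b, z_near, z_far, precision) == target:
        lo -= 1
    hi = v
    while hi < 255 and rendering_position(hi + 1, f_times_b, z_near, z_far, precision) == target:
        hi += 1
    return lo, hi


if __name__ == "__main__":
    # Hypothetical camera parameters, chosen only for illustration.
    params = dict(f_times_b=100.0, z_near=10.0, z_far=100.0)
    for precision in (1, 2, 4):
        print(precision, allowable_depth_range(128, precision=precision, **params))
```

    Any depth value inside the returned interval produces the same rendered position, so an encoder could pick whichever value is cheapest to code. The interval narrows as the rendering precision gets finer, which is consistent with the smaller savings the abstract reports for half and quarter pixel precision.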

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. Coordinating these components is important for improving end-to-end video quality, especially under communication resource constraints. Cross-layer control proves to be an efficient means of optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in a wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission processes. The cross-layer controller determines the optimal coding and transmission parameters according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed that exploit the special properties of PTU camera motion. In the second project, a binocular PTU camera is adopted for video object tracking. The work studies a fast disparity estimation algorithm and 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project applies to multi-camera motion capture in remote healthcare monitoring. The challenge is resource allocation for multiple video sequences. The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, and adaptive video coding and transmission, to ensure optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to integrate the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C

    End to end Multi-Objective Optimisation of H.264 and HEVC Codecs

    All multimedia devices now incorporate video CODECs that comply with international video coding standards such as H.264 / MPEG4-AVC and the new High Efficiency Video Coding Standard (HEVC), otherwise known as H.265. Although the standard CODECs have been designed to include algorithms with optimal efficiency, a large number of coding parameters can be used to fine-tune their operation within known constraints such as available computational power, bandwidth and consumer QoS requirements. With so many parameters involved, determining which of them play a significant role in providing optimal quality of service within given constraints is a challenge that needs to be met. How to select the values of the significant parameters so that the CODEC performs optimally under the given constraints is a further important question. This thesis proposes a framework that uses machine learning algorithms to model the performance of a video CODEC based on the significant coding parameters. A means of modelling both Encoder and Decoder performance is proposed. We define objective functions that can be used to model the performance-related properties of a CODEC, i.e., video quality, bit-rate and CPU time. We show that these objective functions can be practically utilised in video Encoder/Decoder designs, in particular in their performance optimisation within given operational and practical constraints. A Multi-objective Optimisation framework based on Genetic Algorithms is thus proposed to optimise the performance of a video CODEC. The framework is designed to jointly minimise the CPU time and bit-rate and to maximise the quality of the compressed video stream. The thesis presents the use of this framework in the performance modelling and multi-objective optimisation of the most widely used video coding standard in practice at present, H.264, and the latest video coding standard, H.265/HEVC. When a communication network is used to transmit video, performance-related parameters of the communication channel will affect the end-to-end performance of the video CODEC. Network delays and packet loss will affect the quality of the video received at the decoder via the communication channel, i.e., even if a video CODEC is optimally configured, network conditions will make the experience sub-optimal. Given the above, the thesis proposes the design, integration and testing of a novel approach to simulating a wired network and the use of the UDP protocol for the transmission of video data. This network is subsequently used to simulate the impact of packet loss and network delays on optimally coded video, based on the framework previously proposed for the modelling and optimisation of video CODECs. The quality of received video under different levels of packet loss and network delay is simulated, and conclusions are drawn on the impact on transmitted video based on its content and features.
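    The three objectives named in the abstract (CPU time, bit-rate and quality) generally cannot all be optimised at once, so a multi-objective optimiser looks for non-dominated trade-offs. The sketch below shows only the Pareto dominance test that such an optimiser relies on; it is not the thesis's Genetic Algorithm framework, and the `Result` type and the measurement values are hypothetical.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    cpu_time: float  # seconds, lower is better
    bitrate: float   # kbit/s, lower is better
    psnr: float      # dB, higher is better


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = a.cpu_time <= b.cpu_time and a.bitrate <= b.bitrate and a.psnr >= b.psnr
    better = a.cpu_time < b.cpu_time or a.bitrate < b.bitrate or a.psnr > b.psnr
    return no_worse and better


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep only the configurations that no other configuration dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


if __name__ == "__main__":
    # Hypothetical measurements for four encoder configurations.
    candidates = [
        Result("fast", 1.0, 900.0, 36.0),
        Result("balanced", 2.0, 700.0, 37.5),
        Result("slow", 5.0, 600.0, 38.0),
        Result("wasteful", 5.5, 950.0, 35.0),  # worse than "fast" on every objective
    ]
    for r in pareto_front(candidates):
        print(r.name)
```

    A genetic algorithm would repeatedly generate new parameter sets and keep those that survive this kind of filtering; the final choice among the surviving configurations depends on the operational constraints.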

    Processamento de mapas de profundidade para codificação e síntese de vídeo (Depth map processing for video coding and synthesis)

    Dissertation (Master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2017. Multiview systems are widely used to create 3D video and free-viewpoint video applications. The multiple views, consisting of texture (colour) and depth videos, must be efficiently compressed for transmission to the client, where they may be used to synthesize virtual views. In this context, this work proposes a preprocessing step based on the Allowable Depth Distortion (ADD) model that acts on the depth maps before they are coded. The work explores the ADD model and, additionally, proposes choosing and replacing depth values to increase their compression in accordance with the distribution of blocks (coding units) employed by standardized coders. The preprocessing aims to reduce the transmission load without causing quality losses in view synthesis. The histograms of the depth maps are modified by the preprocessing, since the change in depth values depends on the location of the blocks. Experimental results show that compression gains of up to 13.9% can be achieved using the Minimum Variance in Block-ADD (ADD-MVB) method, without introducing distortion losses and while preserving the quality of the synthesized images.

    Real Time Structured Light and Applications

    Система визначення глибини зображення (Image depth determination system)

    This work is published in accordance with the order of the Rector of NAU dated 27.05.2021, No. 311/од, "On the placement of qualification works of higher education applicants in the university repository". Supervisor: Candidate of Technical Sciences, senior lecturer of the Department of Aviation Computer-Integrated Complexes, Vasylenko Mykola Pavlovych. In today's world, the question often arises of how to create a model that solves a given problem properly without incurring large costs. This is what almost every project developer wants at the production stage. The work therefore consists of improving the accuracy of an image depth detection system. To this end, the manufactured model was modified and improved: its main design was changed and the image quality was improved by means of various image filtering methods. Unlike the previous model, this project investigates the effect and quality of 3D scene construction from a still image, not streaming video, under different weather conditions and at different observation points. This makes it possible to examine in more detail the impact of various phenomena on the model during operation, and to improve accuracy by considering a single pair of images rather than a large stream of images at a fixed frequency. The design consists of two cameras, selected on a price-quality basis, and a box that holds and protects the model from the environment under various conditions of use. The design is connected to a computer that runs the software part, which consists of creating a stereo pair (artificial adjustment of the cameras) and analysing the image at the initial stage and after filtering. As a result, the difference in the accuracy of the constructed 3D image can be seen. The 3D image can be used for various purposes, for example to find the size of, or distance to, a target object.
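    How a stereo pair yields the distance to, or size of, a target object can be sketched with the standard pinhole relation for a rectified two-camera rig, Z = f * B / d. This is a minimal sketch of that textbook relation, not the thesis's software; the function names and the numeric values are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance to a point in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def object_width(focal_px: float, depth_m: float, width_px: float) -> float:
    """Real-world width in metres of an object that spans width_px pixels at depth_m."""
    return width_px * depth_m / focal_px


if __name__ == "__main__":
    # Hypothetical rig: 700 px focal length, cameras 10 cm apart.
    z = depth_from_disparity(focal_px=700.0, baseline_m=0.1, disparity_px=35.0)
    w = object_width(focal_px=700.0, depth_m=z, width_px=140.0)
    print(f"distance: {z:.2f} m, width: {w:.2f} m")
```

    Because depth is inversely proportional to disparity, a small disparity error on a distant object produces a large depth error, which is one reason image filtering before matching can affect the accuracy of the reconstructed scene.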

    Content based image pose manipulation

    This thesis proposes the application of space-frequency transformations to the domain of pose estimation in images. This idea is explored using the Wavelet Transform with illustrative applications in pose estimation for face images and images of planar scenes. The approach is based on examining the spatial frequency components in an image, to allow the inherent scene symmetry balance to be recovered. For face images with restricted pose variation (looking left or right), an algorithm is proposed to maximise this symmetry in order to transform the image into a fronto-parallel pose. This scheme is further employed to identify the optimal frontal facial pose from a video sequence to automate facial capture processes. These features are an important prerequisite in facial recognition and expression classification systems. The underlying principles of this spatial-frequency approach are examined with respect to images of planar scenes. Using the Continuous Wavelet Transform, full perspective planar transformations are estimated within a featureless framework. Restoring central symmetry to the wavelet-transformed images in an iterative optimisation scheme removes this perspective pose. This advances upon existing spatial approaches that require segmentation and feature matching, and frequency-only techniques that are limited to affine transformation recovery. To evaluate the proposed techniques, the pose of a database of subjects portraying varying yaw orientations is estimated and the accuracy is measured against the captured ground truth information. Additionally, full perspective homographies for synthesised and imaged textured planes are estimated. Experimental results are presented for both situations that compare favourably with existing techniques in the literature.