31 research outputs found

    Selected topics in video coding and computer vision

    Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment. In the state-of-the-art video coding standard H.264/AVC, intra prediction is defined on a hierarchical quad-tree-based block partitioning structure, which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that the intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry-adaptive algorithms. In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all-weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principal Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both the OSU Infrared Image Database and the WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms. Multi-view video alignment is a process that facilitates the fusion of non-synchronized multi-view video sequences for various applications, including automatic video-based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to the 3-D case. We also present a novel method to obtain the ground truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences.
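
    The phase-correlation idea mentioned in the abstract above can be illustrated in one dimension. The following is only a minimal sketch under stated assumptions: it recovers an integer circular delay between two equal-length signals, whereas the thesis describes a 3-D generalization with sub-frame accuracy that is not reproduced here. The function names `dft` and `phase_correlation_shift` are hypothetical and chosen for illustration.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short signals."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, x in enumerate(signal))
        out.append(acc / n if inverse else acc)
    return out


def phase_correlation_shift(a, b):
    """Return the circular delay d (in samples) such that b[t] ~ a[t - d].

    Illustrative 1-D, integer-shift version only (an assumption of this
    sketch, not the thesis's 3-D sub-frame algorithm).
    """
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    fa, fb = dft(a), dft(b)
    # Cross-power spectrum: conj(A) * B carries the phase ramp of the delay.
    cross = [y * x.conjugate() for x, y in zip(fa, fb)]
    # Discard magnitudes so that only phase information remains.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    # The inverse transform of a pure phase ramp peaks at the delay.
    corr = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(len(corr)), key=corr.__getitem__)
    n = len(a)
    # Map peaks in the upper half of the range to negative delays.
    return peak if peak <= n // 2 else peak - n
```

    Normalizing the cross-power spectrum to unit magnitude is what distinguishes phase correlation from plain cross-correlation: the peak location then depends only on the relative phase of the two signals, not on their energy distribution.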

    Side-information generation for temporally and spatially scalable Wyner-Ziv codecs

    The distributed video coding paradigm enables video codecs to operate with reversed complexity, in which the complexity is shifted from the encoder toward the decoder. Its performance is heavily dependent on the quality of the side information generated by motion estimation at the decoder. We compare the rate-distortion performance of different side-information estimators for both temporally and spatially scalable Wyner-Ziv codecs. For the temporally scalable codec, we compare an established method with a new algorithm that uses a linear-motion model to produce side information. As a continuation of previous work, we propose using a super-resolution method to upsample the non-key frames for the spatially scalable codec, with the key frames as reference. We verify the performance of the spatially scalable WZ coding using the state-of-the-art video coding standard H.264/AVC.

    A reduced reference video quality assessment method for provision as a service over SDN/NFV-enabled networks

    139 p. The proliferation of multimedia applications and services has generated a noteworthy upsurge in video-related network traffic and has created the need for trustworthy service quality assessment methods. Currently, Network Function Virtualization (NFV), Software Defined Networking (SDN), and 5G mobile networks equipped with small cells hold a predominant position among the technological trends in telecommunication networks. Additionally, Video Quality Assessment (VQA) methods are a very useful tool for both content providers and network operators to understand how users perceive quality, and thus to study the feasibility of potential services and adapt the available network resources to satisfy user requirements.

    Algoritmos de compressão sem perdas para imagens de microarrays e alinhamento de genomas completos (Lossless compression algorithms for microarray images and whole genome alignments)

    Doctorate in Informatics. Nowadays, in the 21st century, the never-ending expansion of information is a major global concern. The pace at which storage and communication resources are evolving is not fast enough to compensate for this tendency. To overcome this issue, sophisticated and efficient compression tools are required. The goal of compression is to represent information with as few bits as possible. There are two kinds of compression, lossy and lossless. In lossless compression, information loss is not tolerated, so the decoded information is exactly the same as the encoded one. In lossy compression, on the other hand, some loss is acceptable. In this work we focused on lossless methods. The goal of this thesis was to create lossless compression tools for two types of data. The first type is known in the literature as microarray images. These images have 16 bits per pixel and a high spatial resolution. The other data type is commonly called Whole Genome Alignments (WGA), in particular as stored in MAF files. Regarding the microarray images, we improved existing microarray-specific methods by using some pre-processing techniques (segmentation and bitplane reduction). We also developed a compression method based on pixel value estimates and a mixture of finite-context models. An approach based on binary-tree decomposition was considered as well. Two compression tools were developed to compress MAF files. The first is based on a mixture of finite-context models and arithmetic coding, where only the DNA bases and alignment gaps were considered. The second tool, designated MAFCO, is a complete compression tool that can handle all the information found in MAF files. MAFCO relies on several finite-context models and allows parallel compression/decompression of MAF files.
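
    To make the notion of a finite-context model concrete, the sketch below estimates how many bits an ideal arithmetic coder would spend on a DNA string when driven by a single adaptive order-k model. This is an assumption-laden illustration, not the thesis's method: MAFCO and the microarray tools are described as using mixtures of several models, and the function name `fcm_code_length`, the smoothing parameter `alpha`, and the four-letter alphabet are all hypothetical choices made here.

```python
import math
from collections import defaultdict


def fcm_code_length(seq, order=2, alphabet="ACGT", alpha=1.0):
    """Estimate the bits needed to code `seq` with one adaptive
    finite-context model of the given order.

    Illustrative only: a single model with additive smoothing `alpha`
    (an assumption of this sketch), not a mixture of models.
    """
    # counts[context][symbol] = how often `symbol` followed `context` so far
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, sym in enumerate(seq):
        # The context is the previous `order` symbols (shorter at the start).
        ctx = seq[max(0, i - order):i]
        ctx_counts = counts[ctx]
        seen = sum(ctx_counts.values())
        # Smoothed probability estimate from the counts seen so far.
        p = (ctx_counts[sym] + alpha) / (seen + alpha * len(alphabet))
        # An ideal arithmetic coder spends -log2(p) bits on this symbol.
        total_bits += -math.log2(p)
        # Update the model only after coding, so a decoder can mirror it.
        ctx_counts[sym] += 1
    return total_bits
```

    Because the counts are updated only after each symbol is coded, a decoder that starts from the same empty model can reproduce every probability estimate, which is what keeps such a scheme lossless.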

    SVM based approach for complexity control of HEVC intra coding

    The High Efficiency Video Coding (HEVC) standard has been adopted by various video applications in recent years. Because of its high computational demand, controlling the complexity of HEVC is of paramount importance to meet the varying requirements of many applications, including power-constrained video coding, video streaming, and cloud gaming. Most of the existing complexity control methods are only capable of considering a subset of the decision space, which leads to low coding efficiency. While machine learning methods such as Support Vector Machines (SVM) can be employed for higher-precision decision making, the current SVM-based techniques for HEVC provide a fixed decision boundary, which results in different coding complexities for different video content. Although this might be suitable for complexity reduction, it is not acceptable for complexity control. This paper proposes an adjustable classification approach for Coding Unit (CU) partitioning, which addresses the mentioned problems of complexity control. First, a novel set of features for fast CU partitioning is designed using image processing techniques. Then, a flexible classification method based on SVM is proposed to model the CU partitioning problem. This approach allows adjusting the performance-complexity trade-off, even after the training phase. Using this model and a novel adaptive thresholding technique, an algorithm is presented to deliver video encoding within the target coding complexity while maximizing the coding efficiency. Experimental results demonstrate the superiority of this method over the state-of-the-art methods, with target complexities ranging from 20% to 100%.
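
    One way to picture an adjustable decision boundary is to keep a trained linear classifier fixed and move only its threshold. The sketch below is a hypothetical illustration of that general idea, not the paper's adaptive thresholding technique, whose details the abstract does not give: `decision_score`, `threshold_for_budget`, and the notion of a per-CU "budget" fraction are assumptions introduced here.

```python
def decision_score(weights, bias, features):
    """Signed score of a linear classifier (for example, a linear SVM)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def threshold_for_budget(scores, budget):
    """Pick a threshold t so that roughly `budget` (a fraction in [0, 1])
    of the scores satisfy score >= t.

    Hypothetical helper: items at or above t would receive the expensive
    full partition search, the rest would take a fast path.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must lie in [0, 1]")
    ranked = sorted(scores, reverse=True)
    keep = min(round(budget * len(ranked)), len(ranked))
    if keep == 0:
        # No item passes an infinite threshold.
        return float("inf")
    # Tied scores may push slightly more than `keep` items over t.
    return ranked[keep - 1]
```

    Under this assumption the classifier is trained once, and changing the target complexity only changes `budget`, which matches the abstract's point that the trade-off can be adjusted after the training phase.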

    Video compression algorithms for HEVC and beyond

    PhD. Due to the increasing number of new services and devices that allow the creation, distribution and consumption of video content, the amount of video information being transmitted all over the world is constantly growing. Video compression technology is essential to cope with the ever-increasing volume of digital video data being distributed in today's networks, as more efficient video compression techniques allow support for higher volumes of video data under the same memory/bandwidth constraints. This is especially relevant with the introduction of new and more immersive video formats associated with significantly higher amounts of data. In this thesis, novel techniques for improving the efficiency of current and future video coding technologies are investigated. Several aspects that influence the way conventional video coding methods work are considered. In particular, the properties and limitations of the Human Visual System are exploited to tune the performance of video encoders towards better subjective quality. Additionally, it is shown how the visibility of specific types of visual artefacts can be prevented during the video encoding process, in order to avoid subjective quality degradations in the compressed content. Techniques for higher video compression efficiency are also explored, aiming to improve the compression capabilities of state-of-the-art video coding standards. Finally, the application of video coding technologies to practical use-cases is considered. Accurate estimation models are devised to control the encoding time and bit rate associated with compressed video signals, in order to meet specific encoding time and transmission time restrictions.