135 research outputs found

    HEVC encoding assisted with noise reduction

    Optimizing the encoding process is an important research problem in video compression, especially for modern, sophisticated compression technologies. In this paper, we consider HEVC and propose a novel method for selecting the encoding modes, by which we mean, e.g., the coding block structure, prediction types, and motion vectors. The selection is based on a noise-reduced version of the input sequence, while the information about the video itself, e.g., the transform coefficients, is coded from the unaltered input. The method involves encoding two versions of the input sequence; nevertheless, we show an implementation whose complexity is only negligibly higher than that of a single encoding. The proposal has been implemented in the HEVC reference software from MPEG and tested experimentally. The results show that it provides up to 1.5% bitrate reduction while preserving the same quality of the decoded video.
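The split described in this abstract (choose modes on a noise-reduced signal, code the residual from the unaltered signal) can be illustrated with a toy one-dimensional sketch. This is not the paper's method or the HEVC reference software: the 3-tap averaging denoiser, the SAD cost, and the names `box_denoise`, `encode_block`, and `candidates` are all assumptions made for illustration.

```python
# Toy illustration only: real HEVC mode decision uses a rate-distortion cost
# over block partitions, prediction types and motion vectors.

def sad(a, b):
    """Sum of absolute differences between two equal-length sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def box_denoise(block):
    """Hypothetical stand-in denoiser: 3-tap moving average, edges clamped."""
    n = len(block)
    out = []
    for i in range(n):
        left = block[max(i - 1, 0)]
        right = block[min(i + 1, n - 1)]
        out.append((left + block[i] + right) / 3)
    return out

def encode_block(original, candidates):
    """Pick the mode on the denoised block, take the residual from the original.

    `candidates` maps a mode name to its predicted samples (assumed given).
    """
    denoised = box_denoise(original)
    # Mode selection sees only the noise-reduced signal.
    mode = min(candidates, key=lambda m: sad(denoised, candidates[m]))
    # The residual that would be transformed and coded uses the unaltered input.
    residual = [o - p for o, p in zip(original, candidates[mode])]
    return mode, residual

original = [10, 30, 10, 30, 10, 30, 10, 30]  # noisy samples around 20
candidates = {
    "flat": [20] * 8,
    "noisy_copy": [10, 32, 10, 32, 10, 32, 10, 32],
}
mode, residual = encode_block(original, candidates)
# Mode that a decision made directly on the noisy input would pick instead.
mode_on_original = min(candidates, key=lambda m: sad(original, candidates[m]))
```

In this toy case the decision made on the denoised block picks the smooth predictor, whereas a decision made on the noisy input would chase the noise. Whether that saves bits in a real encoder depends on the rate-distortion behaviour the paper measures, which this sketch does not model.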

    New interaction models for 360º video

    Today, rapid technological evolution and a significant increase in demand for multimedia content have boosted the development of the transmission mechanisms used for this purpose. This development has had repercussions in several areas, including immersive experiences such as 360° content. Whether through live streaming or on-demand services, quality of service and quality of experience have become two aspects of great importance. Capturing and reproducing 360° content makes it possible to convey an immersive view of reality at a given moment. With this approach, the industry intends to provide a product with better audiovisual quality that is more comfortable for the user and allows better interaction with the content. An example is choosing the view that most appeals to us at a given event (for example, a football match or a concert). The main objective of this dissertation is to incorporate a buffering mechanism into a multimedia system able to offer adaptive multi-view experiences. The system uses the MPEG-DASH protocol for efficient use of network resources and a conventional camera to detect the movements of the user's head, selecting in real time the points of view the user wishes to see. The system also incorporates an automatic quality adaptation mechanism that adjusts to network conditions. The buffering mechanism is intended to improve quality of experience and quality of service by minimizing the delay in view transitions. It consists of a proxy capable of sending three views simultaneously: two in low quality, while the main view is sent and presented to the user in high quality. Whenever the user makes a new request, the mechanism switches between the views already sent until it receives the response from the server. On this basis, the dissertation aims to identify the challenges in making 360° content available and transmitting it efficiently, as well as the trade-offs required in the quality of user experience. This last point is particularly significant given the network requirements and the volume of data involved in transmitting this type of content.
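The view-switching behaviour described in this abstract can be sketched as a small state holder. This is a minimal sketch under stated assumptions, not the dissertation's proxy: the class `ViewBuffer`, its method names, and the circular arrangement of views (so that every view has two neighbours) are hypothetical, and at least three views are assumed.

```python
class ViewBuffer:
    """One high-quality main view plus low-quality copies of its two neighbours."""

    def __init__(self, num_views, main):
        self.num_views = num_views  # assumed >= 3 and arranged in a circle
        self.main = main
        self.displayed = (main, "high")

    def buffered(self):
        """Views assumed to be sent at once: the main view and both neighbours."""
        left = (self.main - 1) % self.num_views
        right = (self.main + 1) % self.num_views
        return {left: "low", self.main: "high", right: "low"}

    def request(self, view):
        """User asks for another view; show the buffered copy if there is one."""
        quality = self.buffered().get(view)
        if quality is not None:
            self.displayed = (view, quality)
        # False means nothing is buffered and the player must wait for the server.
        return quality is not None

    def server_response(self, view):
        """Server now delivers `view` in high quality; it becomes the main view."""
        self.main = view
        self.displayed = (view, "high")


buf = ViewBuffer(num_views=8, main=3)
switched_at_once = buf.request(4)   # neighbour is already buffered in low quality
shown_while_waiting = buf.displayed
buf.server_response(4)              # high-quality stream for view 4 arrives
shown_after_response = buf.displayed
```

The point of the design is that a request for an adjacent view is answered from the buffer immediately, at reduced quality, instead of stalling until the server responds.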

    MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

    The rapid advancement of artificial intelligence (AI) technology has made standardizing the processing, coding, and transmission of video using neural networks a priority. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to reduce the number of bits required to represent high-fidelity video data by using data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also compares and reports on some primary efforts, such as the coding efficiency of the reference model. Additionally, it discusses emerging activities, such as learned Unmanned Aerial Vehicle (UAV) video coding, which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further references. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of a perceptual evaluation metric.

    Prediction of Visual Behaviour in Immersive Contents

    In the world of broadcasting and streaming, multi-view video makes it possible to present multiple perspectives of the same video sequence, giving the viewer a sense of immersion in the real-world scene. It can be compared to VR and 360° video, but there are significant differences, notably in the way images are acquired: instead of placing the user at the center and presenting the scene around the user in a 360° circle, it uses multiple cameras placed in a 360° circle around the real-world scene of interest, capturing all possible perspectives of that scene. Additionally, unlike VR, it uses natural video sequences and displays. One issue that plagues content streaming of all kinds is the bandwidth requirement, which, particularly in VR and multi-view applications, translates into an increase in the required data transmission rate. A possible way to lower the required bandwidth is to limit the number of views that are streamed fully, focusing on those surrounding the area the user is looking at. This is proposed by SmoothMV, a multi-view system that uses a non-intrusive head-tracking approach to enhance navigation and the viewer's Quality of Experience (QoE). The system relies on a novel "Hot&Cold" matrix concept to translate head-positioning data into viewing-angle selections. The main goal of this dissertation is the transformation and storage of the data acquired using SmoothMV into datasets. These will be used as training data for a proposed neural network, fully integrated within SmoothMV, whose purpose is to predict the users' points of interest on the screen during the playback of multi-view content. The goal behind this effort is to predict what the user is likely to want to view in the near future and to optimize bandwidth usage by buffering adjacent views that the user might request. After the dataset has been developed, work in this dissertation will focus on formulating a solution to present generated heatmaps of the most viewed areas per video, previously captured using SmoothMV.
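A heatmap of the most viewed areas, as mentioned at the end of this abstract, can be approximated by counting gaze samples per screen cell. This is only a generic sketch: the sample format (pixel coordinates), the grid resolution, and the function names `gaze_heatmap` and `hottest_cell` are assumptions, and nothing here reproduces SmoothMV's "Hot&Cold" matrix, whose details the abstract does not give.

```python
def gaze_heatmap(points, width, height, cols, rows):
    """Count gaze samples per grid cell; `points` are (x, y) pixel coordinates."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore samples that fall outside the screen.
        if 0 <= x < width and 0 <= y < height:
            col = int(x * cols / width)
            row = int(y * rows / height)
            grid[row][col] += 1
    return grid

def hottest_cell(grid):
    """Return (row, col) of the cell with the most samples (first one on ties)."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Hypothetical gaze samples on a 1920x1080 screen, binned into a 4x2 grid.
samples = [(100, 50), (120, 60), (900, 500), (110, 55), (5000, 10)]
heat = gaze_heatmap(samples, width=1920, height=1080, cols=4, rows=2)
peak = hottest_cell(heat)
```

A per-video heatmap would accumulate such grids over all recorded sessions of that video before rendering them as colours.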

    Performance analysis and application development of hybrid WiMAX-WiFi IP video surveillance systems

    Traditional Closed Circuit Television (CCTV) analogue cameras installed in buildings and other areas of security interest necessitate the use of cable lines. However, analogue systems are limited by distance, and storing analogue data requires huge space or bandwidth. Wired systems are also prone to vandalism, and they cannot be installed in hostile terrain or in heritage sites, where cabling would distort the original design. Currently, there is a paradigm shift towards wireless solutions (WiMAX, Wi-Fi, 3G, 4G) to complement and in some cases replace the wired system. This thesis proposes a wireless solution for the Fourth-Generation Surveillance System (4GSS): a hybrid WiMAX-WiFi video surveillance system. The performance of the hybrid WiMAX-WiFi system is compared with that of conventional WiMAX surveillance models. Video surveillance models and an algorithm that exploit the advantages of both WiMAX and Wi-Fi for scenarios with fixed and mobile wireless cameras have been proposed, simulated, and compared with mathematical/analytical models. The hybrid WiMAX-WiFi video surveillance model has been extended to include a wireless mesh configuration on the Wi-Fi part to improve scalability and reliability. For the case of mobile cameras, a performance analysis of the hybrid WiMAX-WiFi system with an appropriate mobility model has been carried out. A security software application for smartphones that sends surveillance images to either local or remote servers has been developed. The software has been tested, evaluated, and deployed in low-bandwidth Wi-Fi network environments. WiMAX is a wireless metropolitan access network technology that provides broadband services to connected customers. Its major modules and units include the Customer Provided Equipment (CPE), the Access Service Network (ASN), which consists of one or more Base Stations (BS), and the Connectivity Service Network (CSN). Various interfaces exist between the units and modules. WiMAX is based on the IEEE 802.16 family of standards. Wi-Fi, on the other hand, is a wireless access network technology operating in the local area network, and it is based on the IEEE 802.11 standards.