
    An Analysis of VP8, a new video codec for the web

    Video is an increasingly ubiquitous part of our lives. Fast and efficient video codecs are necessary to satisfy the increasing demand for video on the web and mobile devices. However, open standards and patent grants are paramount to the adoption of video codecs across different platforms and browsers. Google's On2 released VP8 in May 2010 to compete with H.264, the current standard among video codecs, complete with source code, a specification and a perpetual patent grant. As the amount of video created every day grows rapidly, the choice of codec is paramount; if a low-quality or restrictively licensed codec is used, the recorded video might be of little to no use. We sought to study VP8's quality and resource consumption relative to H.264 -- the most popular current video codec -- so that readers may make an informed decision, for themselves or their organizations, about whether to use H.264, VP8, or something else entirely. We examined VP8 in detail, compared its theoretical complexity to H.264's and measured the efficiency of its current implementation. VP8 shares many facets of its design with H.264 and other Discrete Cosine Transform (DCT) based video codecs. However, VP8 is both simpler and less feature-rich than H.264, which may allow for rapid hardware and software implementations. As it was designed for the Internet and newer mobile devices, it contains fewer legacy features, such as interlacing, than H.264 supports. To perform quality measurements, the open source VP8 implementation libvpx -- the reference implementation -- was used. For H.264, the open source encoder x264 was used; it has very high performance and is often rated at the top of its field in efficiency. The JM reference encoder was used to establish a baseline quality for H.264. Our findings indicate that VP8 performs very well at low bitrates, at resolutions at and below CIF.
VP8 may be able to successfully displace H.264 Baseline in the mobile streaming video domain. It offers higher quality at a lower bitrate for low-resolution images due to its high-performing entropy coder and non-contiguous macroblock segmentation. At higher resolutions, VP8 still outperforms H.264 Baseline, but H.264 High profile leads. At HD resolution (720p and above), H.264 is significantly better than VP8 due to its superior motion estimation and adaptive coding. There is little significant difference between the intra-coding performance of H.264 and VP8. VP8's in-loop deblocking filter outperforms H.264's version. H.264's inter-coding, with full support for B frames and weighted prediction, outperforms VP8's alternate reference scheme, although this may improve in the future. On average, VP8's feature set is less complex than H.264's equivalents, which, along with its open source implementation, may spur development in the future. These findings indicate that VP8 has strong fundamentals when compared with H.264, but that it lacks optimization and maturity. It will likely improve as engineers optimize VP8's reference implementation, or when a competing implementation is developed. We recommend several areas that the VP8 developers should focus on in the future.
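Quality-versus-bitrate comparisons of this kind typically rest on an objective metric such as PSNR: at a fixed bitrate, the encoder producing the higher PSNR is the more efficient one. As an illustrative sketch (not the thesis's actual measurement harness), PSNR can be computed per frame as follows, assuming frames are flattened into sequences of 8-bit sample values:

```python
import math

def psnr(reference, decoded, max_value=255.0):
    """Peak Signal-to-Noise Ratio (dB) between a reference frame and a
    decoded frame, each given as a flat sequence of sample values."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)

# A frame uniformly off by one gray level has MSE = 1 -> ~48.13 dB
ref = [128] * 256
dec = [129] * 256
print(round(psnr(ref, dec), 2))  # 48.13
```

Higher values indicate a decoded frame closer to the reference; comparing PSNR curves over bitrate is the standard way to rank encoders objectively.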

    Acting rehearsal in collaborative multimodal mixed reality environments

    This paper presents the use of our multimodal mixed reality telecommunication system to support remote acting rehearsal. The rehearsals involved two actors, located in London and Barcelona, and a director in another location in London. This triadic audiovisual telecommunication was performed in a spatial and multimodal collaborative mixed reality environment based on the 'destination-visitor' paradigm, which we define and put into use. We detail our heterogeneous system architecture, which spans the three distributed and technologically asymmetric sites, and features a range of capture, display, and transmission technologies. The actors' and director's experiences of rehearsing a scene via the system are then discussed, exploring successes and failures of this heterogeneous form of telecollaboration. Overall, the common spatial frame of reference presented by the system to all parties was highly conducive to theatrical acting and directing, allowing blocking, gross gesture, and unambiguous instruction to be issued. The relative inexpressivity of the actors' embodiments was identified as the central limitation of the telecommunication, meaning that moments relying on performing and reacting to consequential facial expression and subtle gesture were less successful.

    A Methodology for Characterizing Real-Time Multimedia Quality of Service in Limited Bandwidth Network

    This paper presents how to characterize the quality of multimedia, consisting of audio and video transmitted in real-time communication over the Internet with limited bandwidth. We developed a methodology for characterizing multimedia Quality-of-Service (QoS) by measuring network parameters (i.e., bandwidth capacity, packet loss rate (PLR), and end-to-end delay) of a testbed network and simulating audio-video delivery according to the measured network parameters. The analysis of network parameters was aimed at describing the network characteristics. Multimedia QoS was characterized by conducting a simulation using data collected from the preceding network characterization. A simulation network model was built using OMNeT++, representing real-time delivery of audio-video while background traffic was generated to represent real network conditions. Applying the methodology to a network testbed in Indonesia's rural area, the simulation results showed that audio-video could be delivered at an acceptable level of user satisfaction.
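The network parameters feeding such a simulation are straightforward to derive from packet traces. A minimal sketch (the helper names are illustrative, not the authors' tooling):

```python
def packet_loss_rate(sent, received):
    """Fraction of packets lost in transit (PLR)."""
    return (sent - received) / sent

def mean_end_to_end_delay(send_times, recv_times):
    """Average one-way delay over the packets that did arrive,
    assuming synchronized clocks at sender and receiver."""
    return sum(r - s for s, r in zip(send_times, recv_times)) / len(recv_times)

# 1000 packets sent, 950 received
print(packet_loss_rate(1000, 950))  # 0.05
print(round(mean_end_to_end_delay([0.0, 1.0], [0.12, 1.08]), 3))  # 0.1
```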

    Videos in Context for Telecommunication and Spatial Browsing

    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium for presence and remote collaboration. However, capturing visual representations of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed through specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between: they offer the accessibility of 2D images while preserving the surrounding-environment representation of 3D VEs. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras, with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information of a remote place and the dynamics within, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type affects reasoning about events within videos in panoramic context.
These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos-in-context interface with fully-panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events, exploring three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic context to video collections makes spatio-temporal tasks easier. To this end, videos in context are a suitable alternative to more complex, and often expensive, solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.

    Comparison of compression efficiency between HEVC/H.265 and VP9 based on subjective assessments

    The increasing effort of broadcast providers to transmit UHD (Ultra High Definition) content is likely to increase demand for ultra high definition televisions (UHDTVs). To compress UHDTV content, several alternative encoding mechanisms exist. In addition to internationally recognized standards, open-access proprietary options, such as the VP9 video encoding scheme, have recently appeared and are gaining popularity. One of the main goals of these encoders is to efficiently compress video sequences beyond HDTV resolution for various scenarios, such as broadcasting or internet streaming. In this paper, a broadcast-scenario rate-distortion performance analysis and mutual comparison of one of the latest video coding standards, H.265/HEVC, with the recently released proprietary video coding scheme VP9 is presented. Currently one of the most popular and most widely deployed encoders, H.264/AVC, has also been included in the evaluation to serve as a comparison baseline. The comparison is performed by means of subjective evaluations showing actual differences between encoding algorithms in terms of perceived quality. The results indicate that the HEVC-based encoding algorithm dominates the alternatives across a wide range of bit-rates, from very low bit-rates corresponding to low quality up to high bit-rates corresponding to quality transparent with respect to the original uncompressed video. In addition, VP9 shows competitive results for synthetic content at bit-rates that correspond to operating points for transparent or near-transparent video quality.
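Subjective evaluations of this kind are conventionally summarized as a Mean Opinion Score (MOS) with a confidence interval per encoder and bit-rate. A minimal sketch of that aggregation (illustrative, not the paper's analysis pipeline):

```python
import math

def mos_with_ci(scores, z=1.96):
    """Mean Opinion Score over a list of 1-5 ratings, plus the ~95%
    confidence-interval half-width under a normal approximation."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean, z * math.sqrt(var / n)

mean, ci = mos_with_ci([4, 5, 4, 3, 4])
print(mean, round(ci, 2))  # 4.0 0.62
```

Non-overlapping confidence intervals between two encoders at the same bit-rate are what justify calling one perceptually better.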

    Video CODEC with adaptive frame rate control for intelligent transportation system applications

    Video cameras are one of the important types of devices in Intelligent Transportation Systems (ITS). The camera images are practical, widely deployable and beneficial for traffic management and congestion control. The advent of image processing has established several applications based on ITS camera images, including vehicle detection, weather monitoring, smart work zones, etc. Unlike digital video entertainment applications, the camera images in ITS applications require high video image quality but usually not a high video frame rate. Traditional block-based video compression standards, which were developed primarily with the video entertainment industry in mind, depend on adaptive rate control algorithms to control video quality and video frame rate. Modern rate control algorithms range from simple frame skipping to complicated adaptive algorithms based on optimal rate-distortion theory. In this dissertation, I present an innovative video frame rate control scheme based on adaptive frame dropping. Video transmission schemes are also discussed, and a new strategy to reduce video traffic on the network is presented. Experimental results in a variety of network scenarios showed that the proposed technique could improve video quality in both the temporal and spatial dimensions, as quantified by standard video metrics (improvements of up to 6 percent in PSNR, 5 percent in SSIM, and 10 percent in VQM compared to the original video). Another benefit of the proposed technique is that video traffic and network congestion are generally reduced. Both FPGA and embedded Linux implementations are considered for video encoder development.
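The general shape of a frame-dropping rate control decision can be sketched as follows; the buffer model and threshold here are hypothetical stand-ins, not the dissertation's algorithm:

```python
def should_drop_frame(buffer_bits, buffer_capacity, frame_bits, threshold=0.8):
    """Drop the incoming frame when admitting it would push the encoder
    buffer past a fullness threshold -- trading frame rate for the
    spatial quality of retained frames, as ITS imagery prefers."""
    return buffer_bits + frame_bits > threshold * buffer_capacity

# 150 kbit frame against a 1 Mbit buffer that is already 70% full
print(should_drop_frame(700_000, 1_000_000, 150_000))  # True
# Same frame against a buffer only 50% full
print(should_drop_frame(500_000, 1_000_000, 150_000))  # False
```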

    Re-encoding Resistance: Towards Robust Covert Channels over WebRTC Video Streaming

    Internet censorship is an ongoing phenomenon in which state-level agents attempt to control free access to information on the internet for purposes such as dissent suppression and control. In response, research has been dedicated to proposing and implementing censorship circumvention solutions. One approach to circumvention involves steganography, the process of embedding a hidden message into a cover medium (e.g., an image, video, or audio file) such that sensitive or restricted information can be exchanged without a censoring agent being able to detect the exchange. Stegozoa, one such steganography tool, uses WebRTC video conferencing as the embedding channel, allowing a party within a restricted area to freely receive information from a party located outside that area, circumventing censorship. Stegozoa is itself an extension of an earlier implementation, and it assumes a stronger threat model in which WebRTC connections are not peer-to-peer but are instead mediated by a gateway server that may be controlled, or influenced, by the censoring agent. In this threat model, it is argued that an attacker (or censor) may inspect the transmitted data directly but has no incentive to change the video data. With our work, we challenge this last assumption, since many applications using this WebRTC architecture can and will in fact modify the video, likely for non-malicious purposes. By implementing our own test WebRTC application, we have shown that re-encoding the transmitted video (that is, decoding a VP8 video to raw frames and encoding it again) is enough to render an implementation like Stegozoa inoperable. We argue that re-encoding is commonly a non-malicious operation, which may be justified by the application setup (for example, to perform video filtering, integrity checks, or other computer vision operations), and that it does not affect a regular non-Stegozoa user.
For this reason, we propose that re-encoding robustness is a necessary feature for steganographic systems. To this end, we first performed characterization experiments on a popular WebRTC video codec (VP8) to understand the effects of re-encoding. We likewise tested the effects of this operation when a hidden message is embedded in a fashion similar to Stegozoa's. We showed that the Discrete Cosine Transform (DCT) coefficients commonly used as targets for message embedding change enough during re-encoding to destroy message integrity when no error correction is used. Our experiments showed that higher-frequency DCT coefficients are more likely to remain stable for message embedding after re-encoding. We also showed that a dynamically calculated embedding space (that is, the set of coefficients that may actually be used for embedding), akin to Stegozoa's implementation, is very likely to differ after re-encoding, which creates a mismatch between sender and receiver. With these observations, we then sought to test a more robust embedding implementation. To do so, we combined error correction (in the form of Reed-Solomon codes) with a static embedding space. We showed that message re-transmission (that is, embedding the message in multiple frames) and error correction are enough to send a message that will be received correctly. Our experiments showed that this can serve as a low-bandwidth, non-time-sensitive channel for covert communications. Finally, we combined our results to provide a set of guidelines that we believe are needed to implement a WebRTC-based, VP8-encoded censorship circumvention system.
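The re-transmission idea can be sketched as a per-bit majority vote over copies of the message embedded in successive frames; this toy version omits the Reed-Solomon coding, and the names are illustrative rather than taken from the implementation:

```python
def majority_vote(copies):
    """Recover a message from several noisy copies (one embedded per
    video frame) by taking a per-bit majority vote."""
    n = len(copies)
    recovered = []
    for i in range(len(copies[0])):
        ones = sum(c[i] for c in copies)
        recovered.append(1 if ones * 2 > n else 0)
    return recovered

sent = [1, 0, 1, 1, 0, 0, 1, 0]
# Three received copies, each corrupted in a different bit by re-encoding
copies = [
    [1, 0, 1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 1, 0],
]
print(majority_vote(copies) == sent)  # True
```

Because each frame flips different bits, redundancy across frames recovers the original message even when no single copy survives intact.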

    Improving Adaptive Real-Time Video Communication Via Cross-layer Optimization

    An effective Adaptive BitRate (ABR) algorithm, or policy, is of paramount importance for Real-Time Video Communication (RTVC) amid this pandemic in pursuing uncompromised quality of experience (QoE). Existing ABR methods largely separate network bandwidth estimation from video encoder control and fine-tune the video bitrate towards the estimated bandwidth, assuming that maximizing bandwidth utilization yields the optimal QoE. However, the QoE of an RTVC system is jointly determined by the quality of the compressed video, the fluency of video playback, and the interaction delay. Solely maximizing bandwidth utilization, without comprehensively considering the compound impacts incurred by both the network and video application layers, does not assure satisfactory QoE. The decoupling of the network and video layers further degrades the user experience through network-codec incoordination. This work therefore proposes Palette, a reinforcement learning based ABR scheme that unifies the processing of the network and video application layers to directly maximize the QoE, formulated as a weighted function of video quality, stalling rate and delay. To this end, a cross-layer optimization is proposed to derive a fine-grained compression factor for the upcoming frame(s) using cross-layer observations such as network conditions, video encoding parameters, and video content complexity. As a result, Palette resolves the network-codec incoordination and closely tracks network fluctuations. Compared with state-of-the-art schemes in real-world tests, Palette reduces the stalling rate by 3.1%-46.3% and the delay by 20.2%-50.8%, while improving video quality by 0.2%-7.2% with comparable bandwidth consumption, under a variety of application scenarios.
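The QoE objective described above has the general form of a weighted combination of the three terms. A minimal sketch of such a reward function; the weights are illustrative placeholders, not the paper's values:

```python
def qoe_reward(video_quality, stalling_rate, delay,
               w_q=1.0, w_s=2.0, w_d=1.5):
    """Weighted QoE: reward video quality, penalize stalling and delay.
    This shows the general form of the objective a learning agent would
    maximize; the weights here are made up for illustration."""
    return w_q * video_quality - w_s * stalling_rate - w_d * delay

# Normalized quality 0.8, 5% stalling, 100 ms delay
print(round(qoe_reward(0.8, 0.05, 0.1), 2))  # 0.55
```

A reinforcement learning agent trained against such a reward trades bitrate against stalling and delay jointly, instead of chasing bandwidth utilization alone.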

    Energy-aware adaptive solutions for multimedia delivery to wireless devices

    The functionality of smart mobile devices is improving rapidly, but these devices are limited in practical use by battery life. This situation cannot be remedied by simply installing higher-capacity batteries in the devices: there are strict physical-space limitations in smartphone design that prohibit this "quick fix". The solution instead lies in an intelligent, dynamic mechanism for utilizing the hardware components on a device in an energy-efficient manner, while also maintaining the Quality of Service (QoS) requirements of the applications running on the device. This thesis proposes the following Energy-aware Adaptive Solutions (EASE):

    1. BaSe-AMy: the Battery and Stream-aware Adaptive Multimedia Delivery (BaSe-AMy) algorithm assesses battery life, network characteristics, video-stream properties and device hardware information in order to dynamically reduce the power consumption of the device while streaming video. The algorithm dynamically computes the most efficient strategy for altering the characteristics of the stream, the playback of the video, and the hardware utilization of the device, while meeting the application's QoS requirements.

    2. PowerHop: an algorithm which assesses network conditions, device power consumption, neighboring node devices and QoS requirements to decide whether to adapt the transmission power or the number of hops that a device uses for communication. PowerHop's ability to dynamically reduce the transmission power of the device's Wireless Network Interface Card (WNIC) provides scope for reducing the power consumption of the device; shorter transmission distances with multiple hops can be used to maintain network range.

    3. A comprehensive survey of adaptive energy optimizations in multimedia-centric wireless devices.

    Additional contributions:

    1. A custom video comparison tool, developed to facilitate objective assessment of streamed videos.

    2. A new solution for high-accuracy mobile power logging, designed and implemented.
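PowerHop's core trade-off rests on the fact that radio transmit energy grows superlinearly with distance, so several short hops can cost less than one long one. A sketch under a simplified radio model (the path-loss exponent and constant are hypothetical, and per-hop receive/forwarding costs are ignored):

```python
def transmit_energy(distance_m, path_loss_exponent=2.7, k=1e-9):
    """Per-packet radio energy modeled as k * d**alpha (simplified
    path-loss model; both constants are illustrative)."""
    return k * distance_m ** path_loss_exponent

def prefer_multi_hop(distance_m, hops):
    """PowerHop-style choice (sketch): relay through intermediate nodes
    when the summed short-hop energy beats one direct transmission."""
    direct = transmit_energy(distance_m)
    relayed = hops * transmit_energy(distance_m / hops)
    return relayed < direct

print(prefer_multi_hop(100.0, 2))  # True
print(prefer_multi_hop(100.0, 1))  # False
```

With an exponent above 1, splitting a 100 m link into two 50 m hops always reduces the modeled radio energy; a real policy must also weigh relay availability and the fixed per-hop costs this sketch omits.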

    MECHANISM AND STAGES OF PACKAGING OF VP8, THE MAJOR TEGUMENT PROTEIN OF BOVINE HERPESVIRUS-1

    VP8 (pUL47), the major tegument protein of bovine herpesvirus-1 (BoHV-1), is crucial for viral replication and induction of host immune responses. VP8 (pUL47) translocation from the nucleus to the cytoplasm, and subsequently to the Golgi, results from its phosphorylation within the nucleus by pUS3. The VP8 (pUL47) phosphorylation mutant virus contains a significantly lower amount of VP8 (pUL47) (~30%) than wild-type virus. Outside the context of infection, VP8 (pUL47) is translocated to the cytoplasm if co-transfected with a pUS3-encoding plasmid, but remains cytoplasmic and is not translocated to the Golgi. Based on these previous studies, we hypothesized that VP8 (pUL47) is partially packaged in the perinuclear region, and that localisation of VP8 at the Golgi for final packaging involves another viral factor, presumably a glycoprotein. Mass spectrometry studies indicated the presence of VP8 (pUL47), and of another tegument protein, VP22 (pUL49), in perinuclear and mature virus particles. Co-immunoprecipitation and confocal microscopy confirmed an interaction between VP8 (pUL47) and VP22 (pUL49) and their co-localisation in the perinuclear region, respectively. In cells infected with virus lacking the VP22 (pUL49)-encoding gene, VP8 (pUL47) was absent from the perinuclear space, and the amount of VP8 (pUL47) in the purified mature virus was reduced by approximately 33%. To identify the viral factor(s) responsible for the localisation of cytoplasmic VP8 (pUL47) at the Golgi, a screening of co-precipitating glycoproteins was performed, and glycoprotein M (gM) was observed to be an interaction partner of VP8 (pUL47) during infection, as well as outside the context of infection. VP8 (pUL47) and gM (pUL10) co-localised at the Golgi in infected cells, and gM (pUL10) was sufficient for localisation of VP8 (pUL47) at the Golgi outside the context of infection. In a recombinant virus lacking the gene encoding gM (ΔgM-BoHV-1), localisation of VP8 (pUL47) at the Golgi was impeded, and was restored upon restoration of gM (pUL10). Analysis of purified mature virus from ΔgM-BoHV-1 infected cells indicated a reduction of approximately 65% in the amount of VP8 (pUL47). The results of this research add to the knowledge of the stages and proteins involved in the assembly of the tegument layer of BoHV-1, with focus on the major tegument protein, VP8 (pUL47).