Algorithms and methods for video transcoding.
Video transcoding is the process of dynamic video adaptation. Dynamic video adaptation can be defined as the process of converting video from one format to another, or changing the bit rate, frame rate or resolution of the encoded video, mainly to meet end-user requirements. H.264 has been the predominant video compression standard for the last 15 years. HEVC (High Efficiency Video Coding) is the latest video compression standard, finalised in 2013, and is an improvement over the H.264 standard. HEVC performs significantly better than H.264 in terms of Rate-Distortion performance. As H.264 has been widely used in the last decade, a large amount of video content exists in H.264 format. There is a need to convert H.264 video content to HEVC format to achieve better Rate-Distortion performance and to support legacy video formats on newer devices. However, the computational complexity of the HEVC encoder is 2-10 times higher than that of the H.264 encoder. This makes it necessary to develop low-complexity video transcoding algorithms to transcode from H.264 to HEVC format. This research work proposes low-complexity algorithms for H.264 to HEVC video transcoding. The proposed algorithms reduce the computational complexity of H.264 to HEVC video transcoding significantly, with negligible loss in Rate-Distortion performance. This work proposes three different video transcoding algorithms. The MV-based mode merge algorithm uses the block mode and MV variances to estimate the split/non-split decision as part of the HEVC block prediction process. The conditional probability-based mode mapping algorithm models HEVC blocks of sizes 16×16 and lower as a function of the H.264 block modes and the H.264 and HEVC Quantisation Parameters (QP). The motion-compensated MB residual-based mode mapping algorithm makes the split/non-split decision based on content-adaptive classification models.
With a combination of the proposed set of algorithms, the computational complexity of the HEVC encoder is reduced by around 60%, with negligible loss in Rate-Distortion performance, outperforming existing state-of-the-art algorithms by 20-25% in terms of computational complexity. The proposed algorithms can be used in computation-constrained video transcoding applications, to support video format conversion in smart devices, migration of large-scale H.264 video content from host servers to HEVC, cloud computing-based transcoding applications, and also to support high-quality video over bandwidth-constrained networks.
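The idea of using MV variance for a split/non-split decision can be illustrated with a minimal sketch. This is not the algorithm from the thesis: the function names, the use of the summed x/y variance, and the threshold value are all assumptions made for illustration, and the block-mode input mentioned in the abstract is left out.

```python
from statistics import pvariance


def mv_variance(mvs):
    """Sum of the population variances of the x and y motion-vector components."""
    xs = [mv[0] for mv in mvs]
    ys = [mv[1] for mv in mvs]
    return pvariance(xs) + pvariance(ys)


def should_split(mvs, threshold=1.0):
    """Hypothetical rule: split the HEVC block when the H.264 MVs it covers disagree.

    `mvs` holds the (x, y) motion vectors of the H.264 partitions covered by
    the HEVC block; `threshold` is an illustrative value, not a tuned one.
    """
    if len(mvs) < 2:
        # A single motion vector carries no evidence of diverging motion.
        return False
    return mv_variance(mvs) > threshold


uniform = [(2, 1), (2, 1), (2, 1), (2, 1)]
diverging = [(0, 0), (8, 0), (0, 8), (8, 8)]
print(should_split(uniform), should_split(diverging))
```

Uniform motion across the covered partitions suggests the block can be coded whole, so the encoder can skip evaluating smaller partitions; that skipped search is where the complexity saving would come from.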
Adaptive intra refresh for robust wireless multi-view video
This thesis was submitted for the award of PhD and was awarded by Brunel University London. Mobile wireless communication technology is a fast-developing field, and new mobile communication techniques and means become available every day. In this thesis, multi-view video (MVV) is also referred to as 3D video. The transmission of 3D video signals over wireless channels is shaping both the telecommunication industry and academia. However, wireless channels are prone to high levels of bit and burst errors that largely deteriorate the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or make a compressed bitstream lose vital information. The error caused by noise progressively spreads to subsequent frames and among multiple views due to prediction. This error may compel the receiver to pause momentarily and wait for the subsequent INTRA picture to continue decoding. Pausing the video stream affects the user's Quality of Experience (QoE). Thus, an error resilience strategy is needed to protect the compressed bitstream against transmission errors. This thesis focuses on the Adaptive Intra Refresh (AIR) error resilience technique. The AIR method is developed to make the compressed 3D video more robust to channel errors. The process involves periodic injection of Intra-coded macroblocks in a cyclic pattern using the H.264/AVC standard. The algorithm takes into account individual features in each macroblock and the feedback information sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packet arrival and identifies the motion activity in each macroblock. Based on the level of motion activity contained in each macroblock, the MVV-AIR map classifies frames into high- or low-motion macroblocks. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map.
The MVV-AIR transcoding algorithm uses a spatial and view downscaling scheme to convert from MVV to single view. Various experimental results indicate that the proposed error-resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of the MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that the MVV-AIR algorithm performs better in an error-prone channel. Simulation results revealed significant improvements in both objective and subjective quality. No additional computational complexity emanates from the scheme, while the QoS and QoE requirements are still fully met. Tertiary Institution Trust Fund (TETFund) of Nigeri
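A rough sketch of the two ideas named in this abstract, cyclic injection of Intra-coded macroblocks and a motion-driven refresh choice, might look as follows. The function names, the per-frame refresh budget, and the "pick the highest-motion macroblocks" rule are assumptions for illustration only; the thesis's MVV-AIR map also uses decoder feedback about the channel, which is not modelled here.

```python
def cyclic_intra_refresh(num_mbs, mbs_per_frame, frame_index):
    """Plain cyclic refresh: return the macroblock indices forced to Intra.

    The refresh window advances by `mbs_per_frame` macroblocks each frame
    and wraps around, so every macroblock is eventually refreshed.
    """
    start = (frame_index * mbs_per_frame) % num_mbs
    return [(start + i) % num_mbs for i in range(mbs_per_frame)]


def adaptive_refresh(motion, budget):
    """Hypothetical adaptive variant: refresh the `budget` macroblocks
    with the highest motion activity (`motion[i]` is an activity score)."""
    ranked = sorted(range(len(motion)), key=lambda i: motion[i], reverse=True)
    return sorted(ranked[:budget])


# Six macroblocks, two refreshed per frame: the pattern repeats every 3 frames.
for frame in range(4):
    print(frame, cyclic_intra_refresh(6, 2, frame))

print(adaptive_refresh([0.1, 5.0, 0.3, 2.0], budget=2))
```

Prioritising high-motion macroblocks reflects the usual reasoning that errors in moving regions propagate most visibly through inter prediction.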
Efficient algorithms for scalable video coding
A scalable video bitstream specifically designed for the needs of various client terminals,
network conditions, and user demands is much desired in current and future video transmission
and storage systems. The scalable extension of the H.264/AVC standard (SVC) has
been developed to satisfy the new challenges posed by heterogeneous environments, as
it permits a single video stream to be decoded fully or partially with variable quality, resolution,
and frame rate in order to adapt to a specific application. This thesis presents
novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding
mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection
algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different
prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute
Difference (MAD) prediction model.
The proposed fast inter-frame and inter-layer mode selection algorithm is based on the
empirical observation that a macroblock (MB) with slow movement is more likely to be
best matched by one in the same resolution layer. However, for a macroblock with fast
movement, motion estimation between layers is required. Simulation results show that
the algorithm can reduce the encoding time by up to 40%, with negligible degradation in
RD performance.
The proposed hierarchical fast mode selection scheme comprises four levels and makes
full use of inter-layer, temporal and spatial correlation as well as the texture information of
each macroblock. Overall, the new technique demonstrates the same coding performance
in terms of picture quality and compression ratio as that of the SVC standard, yet produces
a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode
selection algorithms, the proposed algorithm achieves a superior computational time reduction
under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of
the prediction modes, because it is influenced by the use of inter-layer prediction. A separate
RD model for inter-layer prediction coding in the enhancement layer(s) is therefore
introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34dB
or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy
is maintained to within 0.07% on average.
As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction
model for the spatial enhancement layers is proposed that considers the MAD from
previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction.
Simulation results indicate that the proposed MAD prediction model
reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.
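To make the MAD terminology concrete, the sketch below computes the Mean Absolute Difference of a block and then blends a temporal and a spatial MAD into one prediction. The blending rule and its `weight` parameter are assumptions chosen for illustration; the abstract does not state the form of the optimised model.

```python
def mad(original, predicted):
    """Mean Absolute Difference between two equally sized blocks,
    each given as a flat sequence of sample values."""
    if len(original) != len(predicted) or not original:
        raise ValueError("blocks must be non-empty and of equal size")
    return sum(abs(o - p) for o, p in zip(original, predicted)) / len(original)


def predict_mad(temporal_mad, spatial_mad, weight=0.5):
    """Hypothetical predictor: weighted blend of the MAD from the previous
    temporal frame and the MAD from the previous spatial (lower) layer."""
    return weight * temporal_mad + (1.0 - weight) * spatial_mad


block = [10, 12, 8, 10]
prediction = [9, 12, 10, 10]
print(mad(block, prediction))
print(predict_mad(temporal_mad=4.0, spatial_mad=2.0))
```

A rate controller needs a MAD estimate before the frame is encoded, which is why it is predicted from earlier frames or layers rather than measured directly.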
JTIT
kwartalni
Can Expert-level Cognition be Rapidly Acquired? The Effect of a Human Factors-based Virtual Reality Trainer on Non-Technical Skills in the Operating Theatre
Background
Restrictions on real-life experience in surgical training can hinder skill acquisition. Factors such as large student-to-teacher ratios, equipment limitations, or pandemics can reduce access to the expert cognition and pedagogical guidance that novices require. Additionally, high-quality teaching from workshops, lectures, and boot camps is not accessible enough and cannot be attended during pandemic restrictions. Therefore, more simulation-based training content is needed for the Non-Technical Skills (NTS) of Operating Theatre (OT) teams. Patient safety and the prevention of undesired events can be improved by a scenario-driven approach that is built upon practice and feedback to scaffold cognitive skills for OT trainees. An NTS virtual reality training tool was created and compared to existing theory-based content.
Method
Eighty-two undergraduate surgical students were presented with different scenarios; their responses showed that decision-making did not differ by course year, which generally concurs with previous findings. A Task Analysis of a surgical procedure and the OT environment was carried out, and three surgical experts were interviewed, with the data analysed thematically. The virtual reality instrument was then designed and built using 360-degree OT videos. Next, a two-group comparison involving 14 third-year operating theatre practitioners was run over a one-hour session, with measures taken before, during, and after the intervention. Verbal Protocol Analysis (VPA) of the trainees’ sessions was paired with Situation Awareness Global Assessment Technique (SAGAT) scores and rankings for a written decision-making scenario. Post-session reflections were analysed using Interpretative Phenomenological Analysis (IPA) to understand how participants experienced the materials and to identify common occurrences between them.
Results
Thematic analysis of expert interviews revealed rich mental models, tacit knowledge, and purposeful augmentation of NTS as a countermeasure when teaching. This gave insight into which non-technical elements were feasible to incorporate into a headset. The main VPA findings from the 14 OT trainees suggested a significant increase in verbalization around Teamwork and Communication (p=0.028). Within this NTS category, significantly more verbalizations about shared mental models occurred in the experimental condition (p=0.018). Additionally, a significant increase in the transformation of cue meaning to improve understanding of the environment occurred, compared to the control condition (p=0.02). However, SAGAT scores showed no significant differences between the conditions across 23 questions. This may reflect a limitation in how both conditions were presented, as items in the videos are difficult to identify.
Conclusions
Significant results in some areas but not all highlight the complexities of NTS training, but this work is a step towards improved support for OT staff to improve awareness and safety during surgery. Despite supposedly homogeneous technical skills, large variations in participants’ decision-making strategies and perceptions of cues may have confounded the intervention effects. During the intervention, participants in the control condition used past experiences to contextually interpret theory and strengthen their schemata in more concrete rather than abstract forms. Real-life scenarios in the experimental condition reduced this need, so participants applied their feedback to the actual events shown, which may increase transfer of skill to real life. More sessions over a longer period could reveal stronger improvements in the same directions as the current results. Overall, the intervention performed as well as or better than the control condition, which supports further research over a longer timeframe and with a larger audience.
XXIII Congreso Argentino de Ciencias de la Computación - CACIC 2017 : Libro de actas
Papers presented at the XXIII Congreso Argentino de Ciencias de la Computación (CACIC), held in the city of La Plata from 9 to 13 October 2017 and organised by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Informática of the Universidad Nacional de La Plata (UNLP).