26 research outputs found

    An improved algorithm for deinterlacing video streams

    Full text link
    The MPEG-4 standard for computerized video incorporates the concept of a video object plane. While in the simplest case this can be the full rectangular frame, the standard supports a hierarchical set of arbitrarily shaped planes, one for each content-sensitive video object. Herein is proposed a method for extracting arbitrary planes from video that does not already contain video object plane information. Deinterlacing is the process of taking two video fields, each at half the height of the finalized image frame, and combining them into that finalized frame. As the fields are not captured simultaneously, temporal artifacts may result. Herein is proposed a method that uses the above-mentioned video object planes to calculate the intra-field motion of objects in the video stream and correct for such motion, leading to a higher-quality deinterlaced output. (This dissertation is a compound document, containing both a paper copy and a CD.)
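    As a point of reference for the problem this dissertation addresses, the sketch below shows the two most basic deinterlacing strategies in plain NumPy: "weave", which simply interleaves the two fields, and "bob", which upscales a single field by line averaging. This only illustrates the field-to-frame step, not the object-plane, motion-compensated method proposed in the dissertation; the field order and array shapes are assumptions.

```python
# Minimal deinterlacing sketch (not the dissertation's object-plane method).
# "weave" interleaves the two fields directly; "bob" linearly interpolates
# the missing lines of a single field. Top-field-first ordering is assumed.
import numpy as np

def weave(top_field: np.ndarray, bottom_field: np.ndarray) -> np.ndarray:
    """Interleave two half-height fields into one full-height frame."""
    h, w = top_field.shape
    frame = np.empty((2 * h, w), dtype=top_field.dtype)
    frame[0::2] = top_field      # even lines come from the top field
    frame[1::2] = bottom_field   # odd lines come from the bottom field
    return frame

def bob(field: np.ndarray) -> np.ndarray:
    """Upscale a single field to full height by averaging neighbouring lines."""
    h, w = field.shape
    frame = np.empty((2 * h, w), dtype=np.float32)
    frame[0::2] = field
    shifted = np.vstack([field[1:], field[-1:]])   # repeat the last line at the border
    frame[1::2] = (field.astype(np.float32) + shifted) / 2.0
    return frame.astype(field.dtype)
```

    Weave preserves full vertical resolution for static content but produces combing artifacts wherever objects move between the two field capture times, which is exactly the temporal artifact the proposed object-plane motion correction targets.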

    HDTV transmission format conversion and migration path

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 77-79). By Lon E. Sunshine. Ph.D.

    Fuzzy logic-based embedded system for video de-interlacing

    Get PDF
    Video de-interlacing algorithms perform a crucial task in video processing. Although these algorithms are usually developed as software implementations, hardware implementations are required to achieve real-time operation. This paper describes the development of an embedded system for video de-interlacing. The de-interlacing algorithm uses three fuzzy logic-based systems to tackle three relevant features in video sequences: motion, edges, and picture repetition. The proposed strategy implements the algorithm as a hardware IP core on an FPGA-based embedded system. The paper details the proposed architecture and the design methodology used to develop it. The resulting embedded system is verified on an FPGA development board and is able to de-interlace in real time.
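    The following is a minimal sketch of the general idea behind motion-adaptive, fuzzy-weighted de-interlacing: a soft motion measure blends a temporal estimate with a spatial one for each missing pixel. The membership function, its thresholds, and the per-pixel formulation are assumptions for illustration only, not the paper's actual rule base or IP core.

```python
# Illustrative fuzzy-style blend for de-interlacing one missing pixel:
# a soft motion measure weights a temporal estimate (previous frame)
# against a spatial estimate (vertical average). The piecewise-linear
# membership function and its thresholds are assumptions.
import numpy as np

def motion_membership(diff: float, low: float = 4.0, high: float = 24.0) -> float:
    """Map an absolute inter-frame difference to a motion degree in [0, 1]."""
    return float(np.clip((diff - low) / (high - low), 0.0, 1.0))

def deinterlace_pixel(above: float, below: float, prev_same_pos: float) -> float:
    """Blend spatial and temporal estimates for a pixel on a missing line."""
    spatial = (above + below) / 2.0          # vertical (intra-field) average
    temporal = prev_same_pos                 # value from the previous frame
    motion = motion_membership(abs(spatial - temporal))
    return (1.0 - motion) * temporal + motion * spatial
```

    With little motion the temporal estimate dominates, preserving detail; with strong motion the spatial estimate takes over, avoiding combing.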

    Caveats on the first-generation da Vinci Research Kit: latent technical constraints and essential calibrations

    Full text link
    Telesurgical robotic systems provide a well-established form of assistance in the operating theater, with evidence of growing uptake in recent years. Until now, the da Vinci surgical system (Intuitive Surgical Inc., Sunnyvale, California) has been the most widely adopted robot of this kind, with more than 6,700 systems in current clinical use worldwide [1]. To accelerate research on robotic-assisted surgery, retired first-generation da Vinci robots have been redeployed for research use as "da Vinci Research Kits" (dVRKs), which have been distributed to research institutions around the world to support both training and research in the sector. In the past ten years, a great amount of research on the dVRK has been carried out across a vast range of research topics. During this extensive and distributed process, common technical issues have been identified that are buried deep within the dVRK research and development architecture and that recur in dVRK user feedback, regardless of the breadth and disparity of the research directions pursued. This paper gathers and analyzes the most significant of these, with a focus on the technical constraints of the first-generation dVRK, which both existing and prospective users should be aware of before embarking on dVRK-related research. The hope is that this review will aid users in identifying and addressing common limitations of the systems promptly, thus helping to accelerate progress in the field. Comment: 15 pages, 7 figures

    Adaptive format conversion information as enhancement data for scalable video coding

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 143-145). Scalable coding techniques can be used to efficiently provide multicast video service and involve transmitting a single independently coded base layer and one or more dependently coded enhancement layers. Clients can decode the base layer bitstream and none, some or all of the enhancement layer bitstreams to obtain video quality commensurate with their available resources. In many scalable coding algorithms, residual coding information is the only type of data that is coded in the enhancement layers. However, since the transmitter has access to the original sequence, it can adaptively select different format conversion methods for different regions in an intelligent manner. This adaptive format conversion information can then be transmitted as enhancement data to assist processing at the decoder. The use of adaptive format conversion has not been studied in detail and this thesis examines when and how it can be used for scalable video compression. A new scalable codec is developed in this thesis that can utilize adaptive format conversion information and/or residual coding information as enhancement data. This codec was used in various simulations to investigate different aspects of adaptive format conversion such as the effect of the base layer, a comparison of adaptive format conversion and residual coding, and the use of both adaptive format conversion and residual coding. The experimental results show adaptive format conversion can provide video scalability at low enhancement bitrates not possible with residual coding and also assist residual coding at higher enhancement layer bitrates. This thesis also discusses the application of adaptive format conversion to the migration path for digital television. Adaptive format conversion is well-suited to the unique problems of the migration path and can provide initial video scalability as well as assist a future migration path. By Wade K. Wan. Ph.D.
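    As an illustration of how adaptive format conversion information might be consumed at the decoder, the sketch below selects a per-region upsampling method from an enhancement-layer flag before any residual is added. The two candidate filters, the block size, and the signalling format are assumptions; this is not the thesis's codec.

```python
# Illustrative decoder-side use of adaptive format conversion information:
# for each region, an enhancement-layer flag selects which upsampling
# (format conversion) method to apply to the decoded base layer.
import numpy as np

def upsample_replicate(block: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling."""
    return np.repeat(np.repeat(block, 2, axis=0), 2, axis=1)

def upsample_smoothed(block: np.ndarray) -> np.ndarray:
    """2x upsampling followed by a small box filter (rough bilinear stand-in)."""
    up = upsample_replicate(block).astype(np.float32)
    up = (up + np.roll(up, 1, axis=0) + np.roll(up, 1, axis=1)
          + np.roll(np.roll(up, 1, axis=0), 1, axis=1)) / 4.0
    return up

CONVERSION_METHODS = [upsample_replicate, upsample_smoothed]

def reconstruct(base_layer: np.ndarray, method_flags: np.ndarray, block: int = 16) -> np.ndarray:
    """Apply the per-region conversion method signalled in the enhancement layer."""
    h, w = base_layer.shape
    out = np.zeros((2 * h, 2 * w), dtype=np.float32)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            method = CONVERSION_METHODS[method_flags[by // block, bx // block]]
            out[2 * by:2 * (by + block), 2 * bx:2 * (bx + block)] = \
                method(base_layer[by:by + block, bx:bx + block])
    return out
```

    A residual-coding enhancement layer, where present, would then be decoded and added on top of this adaptively converted prediction.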

    Development of Fast Motion Estimation Algorithms for Video Compression

    Get PDF
    With the increasing popularity of technologies such as Internet streaming video and video conferencing, video compression has become an essential component of broadcast and entertainment media. Motion estimation (ME) and compensation techniques, which can effectively eliminate temporal redundancy between adjacent frames, have been widely applied in popular video compression coding standards such as MPEG-2 and MPEG-4. Traditional fast block matching algorithms are easily trapped in local minima, resulting in some degradation of video quality after decoding. Since evolutionary computing techniques are well suited to finding globally optimal solutions, they are applied to the motion estimation procedure in this thesis. Zero-motion prejudgement is also included, which aims at identifying static macroblocks (MBs) for which the remaining search can be skipped, thus reducing the computational cost. Simulation results show that the proposed Clonal Particle Swarm Optimization algorithm gives a very good improvement in reducing the computational overhead and achieves very good Peak Signal-to-Noise Ratio (PSNR) values, which makes the technique more efficient than conventional search algorithms. To reduce the motion vector overhead in bidirectional frame prediction, a novel bidirectional motion estimation algorithm based on PSO is also proposed; results show that the proposed method significantly reduces the computational complexity involved in bidirectional frame prediction and also gives the lowest prediction error across all video sequences.
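    The zero-motion prejudgement step lends itself to a short sketch: if the co-located block in the reference frame is already a close match, the macroblock is declared static and the evolutionary search is skipped entirely. The SAD threshold and macroblock size below are illustrative assumptions, not values from the thesis.

```python
# Sketch of zero-motion prejudgement ahead of a block-matching search:
# if the SAD against the co-located reference block is below a threshold,
# the block is declared static and the (e.g. PSO-based) search is skipped.
import numpy as np

MB = 16  # assumed macroblock size

def sad(block_a: np.ndarray, block_b: np.ndarray) -> float:
    """Sum of absolute differences between two blocks."""
    return float(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def zero_motion_prejudge(current: np.ndarray, reference: np.ndarray,
                         y: int, x: int, threshold: float = 512.0):
    """Return motion vector (0, 0) for a static macroblock, else None."""
    cur = current[y:y + MB, x:x + MB]
    ref = reference[y:y + MB, x:x + MB]
    if sad(cur, ref) < threshold:
        return (0, 0)   # static macroblock: no further search needed
    return None         # fall through to the full motion search
```

    Only the macroblocks that fail this test incur the cost of the full evolutionary search, which is where the reported savings come from.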

    Novel source coding methods for optimising real time video codecs.

    Get PDF
    The quality of decoded video is affected by errors occurring in the various layers of the protocol stack. In this thesis, disjoint errors occurring in different layers of the protocol stack are investigated with the primary objective of demonstrating the flexibility of the source coding layer. In the first part of the thesis, errors occurring in the editing layer, due to the coexistence of different video standards in the broadcast market, are addressed. The problems investigated are ‘Field Reversal’ and ‘Mixed Pulldown’. Field Reversal is caused when interlaced video fields are not shown in the same order as they were captured. This results in a shaky video display, as the fields are not displayed in chronological order. Mixed Pulldown occurs when the video frame rate is up-sampled and down-sampled as digitised film material is standardised to suit standard televisions. Novel image processing algorithms are proposed to solve these problems from the source coding layer. In the second part of the thesis, errors occurring in the transmission layer due to data corruption are addressed. The use of block-level source error-resilient methods over bit-level channel coding methods is investigated and improvements are suggested. The secondary objective of the thesis is to optimise the proposed algorithms’ architecture for real-time implementation, since the problems are of a commercial nature. The Field Reversal and Mixed Pulldown algorithms were tested in real time at MTV (Music Television) and are made available commercially through ‘Cerify’, a Linux-based media testing box manufactured by Tektronix Plc. The channel error-resilient algorithms were tested in a laboratory environment using Matlab and performance improvements were obtained.
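    A rough illustration of how field reversal might be detected is given below: the inter-field differences implied by the top-field-first and bottom-field-first hypotheses are accumulated, and the order giving the smoother temporal progression is chosen. This is a simplified heuristic for exposition, not the algorithm shipped in Cerify; the field extraction and difference metric are assumptions.

```python
# Heuristic sketch for guessing field order from a short interlaced clip.
# Under the correct order, temporally adjacent fields differ less on average.
import numpy as np

def field_difference(field_a: np.ndarray, field_b: np.ndarray) -> float:
    """Mean absolute difference between two fields."""
    return float(np.mean(np.abs(field_a.astype(np.float32) - field_b.astype(np.float32))))

def detect_field_order(frames: list) -> str:
    """Guess 'TFF' or 'BFF' for a list of interlaced frames (2D luma arrays)."""
    tff_cost = bff_cost = 0.0
    for prev, cur in zip(frames, frames[1:]):
        prev_top, prev_bot = prev[0::2], prev[1::2]
        cur_top, cur_bot = cur[0::2], cur[1::2]
        # TFF ordering: ... prev_top, prev_bot, cur_top, cur_bot ...
        tff_cost += field_difference(prev_bot, cur_top)
        # BFF ordering: ... prev_bot, prev_top, cur_bot, cur_top ...
        bff_cost += field_difference(prev_top, cur_bot)
    return "TFF" if tff_cost <= bff_cost else "BFF"
```

    A reversed clip shows up as a mismatch between the detected order and the order signalled in the stream metadata.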

    Video post processing architectures

    Get PDF

    Beyond the pixels: learning and utilising video compression features for localisation of digital tampering.

    Get PDF
    Video compression is pervasive in digital society. With the rising use of deep convolutional neural networks (CNNs) in the fields of computer vision, video analysis and video tampering detection, it is important to investigate how patterns invisible to human eyes may be influencing modern computer vision techniques and how they can be used advantageously. This work thoroughly explores how video compression influences the accuracy of CNNs and shows that optimal performance is achieved when compression levels in the training set closely match those of the test set. A novel method is then developed, using CNNs, to derive compression features directly from the pixels of video frames. It is then shown that these features can be readily used to detect inauthentic video content with good accuracy across multiple different video tampering techniques. Moreover, the ability to explain these features allows predictions to be made about their effectiveness against future tampering methods. The problem is motivated by a novel investigation into recent video manipulation methods, which shows that there is a consistent drive to produce convincing, photorealistic, manipulated or synthetic video. Humans, blind to the presence of video tampering, are also blind to the type of tampering. New detection techniques are required and, in order to compensate for human limitations, they should be broadly applicable to multiple tampering types. This thesis details the steps necessary to develop and evaluate such techniques.
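    To make the idea of learned compression features concrete, the sketch below defines a small patch-level CNN that could be trained to predict a compression level (for example, a quantisation-parameter class) from decoded luma patches, with its pooled activations serving as per-patch compression descriptors. The architecture, patch size, and number of classes are assumptions for illustration; this is not the network developed in the thesis.

```python
# Illustrative patch-level CNN for learning compression-related features.
# Trained to classify a frame patch's compression level, its pooled
# activations can double as "compression features" for downstream
# tampering localisation.
import torch
import torch.nn as nn

class CompressionFeatureNet(nn.Module):
    def __init__(self, num_levels: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),        # pool to a 32-d descriptor per patch
        )
        self.classifier = nn.Linear(32, num_levels)

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        feats = self.features(patches).flatten(1)   # (N, 32) compression descriptors
        return self.classifier(feats)               # per-patch compression-level logits

# Example: score a batch of eight 64x64 single-channel (luma) patches.
logits = CompressionFeatureNet()(torch.rand(8, 1, 64, 64))
```

    Consistent with the thesis's finding, such a model would be expected to perform best when the compression levels seen in training match those of the material it is applied to.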