405 research outputs found

    Image sequence restoration by median filtering

    Get PDF
    Median filters are non-linear filters that fit in the generic category of order-statistic filters. Median filters are widely used for reducing random defects, commonly characterized by impulse or salt and pepper noise in a single image. Motion estimation is the process of estimating the displacement vector between like pixels in the current frame and the reference frame. When dealing with a motion sequence, the motion vectors are the key for operating on corresponding pixels in several frames. This work explores the use of various motion estimation algorithms in combination with various median filter algorithms to provide noise suppression. The results are compared using two sets of metrics: performance-based and objective image quality-based. These results are used to determine the best motion estimation / median filter combination for image sequence restoration. The primary goals of this work are to implement a motion estimation and median filter algorithm in hardware and develop and benchmark a flexible software alternative restoration process. There are two unique median filter algorithms to this work. The first filter is a modification to a single frame adaptive median filter. The modification applied motion compensation and temporal concepts. The other is an adaptive extension to the multi-level (ML3D) filter, called adaptive multi-level (AML3D) filter. The extension provides adaptable filter window sizes to the multiple filter sets that comprise the ML3D filter. The adaptive median filter is capable of filtering an image in 26.88 seconds per frame and results in a PSNR improvement of 5.452dB. The AML3D is capable of filtering an image in 14.73 seconds per frame and results in a PSNR improvement of 6.273dB. The AML3D is a suitable alternative to the other median filters

    Low bit-rate image sequence coding

    Get PDF

    Object Tracking in Distributed Video Networks Using Multi-Dimentional Signatures

    Get PDF
    From being an expensive toy in the hands of governmental agencies, computers have evolved a long way from the huge vacuum tube-based machines to today\u27s small but more than thousand times powerful personal computers. Computers have long been investigated as the foundation for an artificial vision system. The computer vision discipline has seen a rapid development over the past few decades from rudimentary motion detection systems to complex modekbased object motion analyzing algorithms. Our work is one such improvement over previous algorithms developed for the purpose of object motion analysis in video feeds. Our work is based on the principle of multi-dimensional object signatures. Object signatures are constructed from individual attributes extracted through video processing. While past work has proceeded on similar lines, the lack of a comprehensive object definition model severely restricts the application of such algorithms to controlled situations. In conditions with varying external factors, such algorithms perform less efficiently due to inherent assumptions of constancy of attribute values. Our approach assumes a variable environment where the attribute values recorded of an object are deemed prone to variability. The variations in the accuracy in object attribute values has been addressed by incorporating weights for each attribute that vary according to local conditions at a sensor location. This ensures that attribute values with higher accuracy can be accorded more credibility in the object matching process. Variations in attribute values (such as surface color of the object) were also addressed by means of applying error corrections such as shadow elimination from the detected object profile. Experiments were conducted to verify our hypothesis. The results established the validity of our approach as higher matching accuracy was obtained with our multi-dimensional approach than with a single-attribute based comparison

    Complexity adaptation in video encoders for power limited platforms

    Get PDF
    With the emergence of video services on power limited platforms, it is necessary to consider both performance-centric and constraint-centric signal processing techniques. Traditionally, video applications have a bandwidth or computational resources constraint or both. The recent H.264/AVC video compression standard offers significantly improved efficiency and flexibility compared to previous standards, which leads to less emphasis on bandwidth. However, its high computational complexity is a problem for codecs running on power limited plat- forms. Therefore, a technique that integrates both complexity and bandwidth issues in a single framework should be considered. In this thesis we investigate complexity adaptation of a video coder which focuses on managing computational complexity and provides significant complexity savings when applied to recent standards. It consists of three sub functions specially designed for reducing complexity and a framework for using these sub functions; Variable Block Size (VBS) partitioning, fast motion estimation, skip macroblock detection, and complexity adaptation framework. Firstly, the VBS partitioning algorithm based on the Walsh Hadamard Transform (WHT) is presented. The key idea is to segment regions of an image as edges or flat regions based on the fact that prediction errors are mainly affected by edges. Secondly, a fast motion estimation algorithm called Fast Walsh Boundary Search (FWBS) is presented on the VBS partitioned images. Its results outperform other commonly used fast algorithms. Thirdly, a skip macroblock detection algorithm is proposed for use prior to motion estimation by estimating the Discrete Cosine Transform (DCT) coefficients after quantisation. A new orthogonal transform called the S-transform is presented for predicting Integer DCT coefficients from Walsh Hadamard Transform coefficients. Complexity saving is achieved by deciding which macroblocks need to be processed and which can be skipped without processing. Simulation results show that the proposed algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. Finally, a complexity adaptation framework which combines all three techniques mentioned above is proposed for maximizing the perceptual quality of coded video on a complexity constrained platform

    Classification-Based Adaptive Search Algorithm for Video Motion Estimation

    Get PDF
    A video sequence consists of a series of frames. In order to compress the video for efficient storage and transmission, the temporal redundancy among adjacent frames must be exploited. A frame is selected as reference frame and subsequent frames are predicted from the reference frame using a technique known as motion estimation. Real videos contain a mixture of motions with slow and fast contents. Among block matching motion estimation algorithms, the full search algorithm is known for its superiority in the performance over other matching techniques. However, this method is computationally very extensive. Several fast block matching algorithms (FBMAs) have been proposed in the literature with the aim to reduce computational costs while maintaining desired quality performance, but all these methods are considered to be sub-optimal. No fixed fast block matching algorithm can effi- ciently remove temporal redundancy of video sequences with wide motion contents. Adaptive fast block matching algorithm, called classification based adaptive search (CBAS) has been proposed. A Bayes classifier is applied to classify the motions into slow and fast categories. Accordingly, appropriate search strategy is applied for each class. The algorithm switches between different search patterns according to the content of motions within video frames. The proposed technique outperforms conventional stand-alone fast block matching methods in terms of both peak signal to noise ratio (PSNR) and computational complexity. In addition, a new hierarchical method for detecting and classifying shot boundaries in video sequences is proposed which is based on information theoretic classification (ITC). ITC relies on likelihood of class label transmission of a data point to the data points in its vicinity. ITC focuses on maximizing the global transmission of true class labels and classify the frames into classes of cuts and non-cuts. Applying the same rule, the non-cut frames are also classified into two categories of arbitrary shot frames and gradual transition frames. CBAS is applied on the proposed shot detection method to handle camera or object motions. Experimental evidence demonstrates that our method can detect shot breaks with high accuracy

    Model based estimation of image depth and displacement

    Get PDF
    Passive depth and displacement map determinations have become an important part of computer vision processing. Applications that make use of this type of information include autonomous navigation, robotic assembly, image sequence compression, structure identification, and 3-D motion estimation. With the reliance of such systems on visual image characteristics, a need to overcome image degradations, such as random image-capture noise, motion, and quantization effects, is clearly necessary. Many depth and displacement estimation algorithms also introduce additional distortions due to the gradient operations performed on the noisy intensity images. These degradations can limit the accuracy and reliability of the displacement or depth information extracted from such sequences. Recognizing the previously stated conditions, a new method to model and estimate a restored depth or displacement field is presented. Once a model has been established, the field can be filtered using currently established multidimensional algorithms. In particular, the reduced order model Kalman filter (ROMKF), which has been shown to be an effective tool in the reduction of image intensity distortions, was applied to the computed displacement fields. Results of the application of this model show significant improvements on the restored field. Previous attempts at restoring the depth or displacement fields assumed homogeneous characteristics which resulted in the smoothing of discontinuities. In these situations, edges were lost. An adaptive model parameter selection method is provided that maintains sharp edge boundaries in the restored field. This has been successfully applied to images representative of robotic scenarios. In order to accommodate image sequences, the standard 2-D ROMKF model is extended into 3-D by the incorporation of a deterministic component based on previously restored fields. The inclusion of past depth and displacement fields allows a means of incorporating the temporal information into the restoration process. A summary on the conditions that indicate which type of filtering should be applied to a field is provided

    Visual SLAM from image sequences acquired by unmanned aerial vehicles

    Get PDF
    This thesis shows that Kalman filter based approaches are sufficient for the task of simultaneous localization and mapping from image sequences acquired by unmanned aerial vehicles. Using solely direction measurements to solve the problem of simultaneous localization and mapping (SLAM) is an important part of autonomous systems. Because the need for real-time capable systems, recursive estimation techniques, Kalman filter based approaches are the main focus of interest. Unfortunately, the non-linearity of the triangulation using the direction measurements cause decrease of accuracy and consistency of the results. The first contribution of this work is a general derivation of the recursive update of the Kalman filter. This derivation is based on implicit measurement equations, having the classical iterative non-linear as well as the non-iterative and linear Kalman filter as specializations of our general derivation. Second, a new formulation of linear-motion models for the single camera state model and the sliding window camera state model are given, that make it possible to compute the prediction in a fully linear manner. The third major contribution is a novel method for the initialization of new object points in the Kalman filter. Empirical studies using synthetic and real data of an image sequence of a photogrammetric strip are made, that demonstrate and compare the influences of the initialization methods of new object points in the Kalman filter. Forth, the accuracy potential of monoscopic image sequences from unmanned aerial vehicles for autonomous localization and mapping is theoretically analyzed, which can be used for planning purposes.Visuelle gleichzeitige Lokalisierung und Kartierung aus Bildfolgen von unbemannten Flugkörpern Diese Arbeit zeigt, dass die Kalmanfilter basierte Lösung der Triangulation zur Lokalisierung und Kartierung aus Bildfolgen von unbemannten Flugkörpern realisierbar ist. Aufgrund von Echtzeitanforderungen autonomer Systeme erreichen rekursive Schätz-verfahren, insbesondere Kalmanfilter basierte Ansätze, große Beliebheit. Bedauerlicherweise treten dabei durch die Nichtlinearität der Triangulation einige Effekte auf, welche die Konsistenz und Genauigkeit der Lösung hinsichtlich der geschätzten Parameter maßgeblich beeinflussen. Der erste Beitrag dieser Arbeit besteht in der Herleitung eines generellen Verfahrens zum rekursiven Verbessern im Kalmanfilter mit impliziten Beobachtungsgleichungen. Wir zeigen, dass die klassischen Verfahren im Kalmanfilter eine Spezialisierung unseres Ansatzes darstellen. Im zweiten Beitrag erweitern wir die klassische Modellierung für ein Einkameramodell zu einem Mehrkameramodell im Kalmanfilter. Diese Erweiterung erlaubt es uns, die Prädiktion für eine lineares Bewegungsmodell vollkommen linear zu berechnen. In einem dritten Hauptbeitrag stellen wir ein neues Verfahren zur Initialisierung von Neupunkten im Kalmanfilter vor. Anhand von empirischen Untersuchungen unter Verwendung simulierter und realer Daten einer Bildfolge eines photogrammetrischen Streifens zeigen und vergleichen wir, welchen Einfluß die Initialisierungsmethoden für Neupunkte im Kalmanfilter haben und welche Genauigkeiten für diese Szenarien erreichbar sind. Am Beispiel von Bildfolgen eines unbemannten Flugkörpern zeigen wir in dieser Arbeit als vierten Beitrag, welche Genauigkeit zur Lokalisierung und Kartierung durch Triangulation möglich ist. Diese theoretische Analyse kann wiederum zu Planungszwecken verwendet werden

    State of the art in 2D content representation and compression

    Get PDF
    Livrable D1.3 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D3.1 du projet
    corecore