    Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions

    The research presented in this thesis aims to extend the scalability range of wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since temporal redundancy regularly comprises the main portion of the global video sequence redundancy, the techniques that can be generally termed motion decorrelation techniques play a central role in the overall compression performance. For this reason, scalable motion modelling and coding are of utmost importance, and this thesis identifies and analyses possible solutions. The main contributions of the presented research are grouped into two interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised estimation technique is introduced. The proposed motion model is based on tree structures and allows the high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding at different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Secondly, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and on the creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal subbands, and a motion search in temporal subbands that finds the optimal motion vectors for the layered motion structure. Exhaustive tests of the rate-distortion performance in demanding scalable video coding scenarios show the benefits of applying both the developed flexible motion model and the proposed solutions for scalable motion coding.
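
    The precision-limiting idea lends itself to a compact illustration. Below is a minimal sketch, assuming non-negative integer motion-vector magnitudes and NumPy; the function names are illustrative and not from the thesis. Each transmitted bit-plane layer refines the decoded motion vectors, so truncating the stream after any layer simply limits their precision.

```python
# Hedged sketch of layered bit-plane coding of motion-vector values.
# Names and the toy data are illustrative, not from the thesis.
import numpy as np

def encode_bitplanes(mv, n_planes=8):
    """Split non-negative integer MV components into bit-planes,
    most significant first, so a decoder can stop after any layer."""
    mv = np.asarray(mv, dtype=np.uint8)
    return [(mv >> p) & 1 for p in range(n_planes - 1, -1, -1)]

def decode_bitplanes(planes, n_planes=8):
    """Reconstruct from the first len(planes) layers; missing low-order
    planes simply limit the motion-vector precision."""
    mv = np.zeros_like(planes[0], dtype=np.uint8)
    for i, plane in enumerate(planes):
        mv |= plane.astype(np.uint8) << (n_planes - 1 - i)
    return mv

mv = np.array([[3, 12], [7, 25]], dtype=np.uint8)  # toy |MV| field
planes = encode_bitplanes(mv)
coarse = decode_bitplanes(planes[:6])              # keep 6 MSB layers only
print(coarse)  # [[ 0 12] [ 4 24]] -- low-order detail dropped
```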

    Efficient Hybrid Image Warping for High Frame-Rate Stereoscopic Rendering

    Modern virtual reality simulations require a consistently high frame rate from the rendering engine. They may also require very low latency and stereo images. Previous rendering engines for virtual reality applications have exploited spatial and temporal coherence by using image warping to re-use previous frames or to render a stereo pair at lower cost than running the full render pipeline twice. However, these previous approaches have shown artifacts or have not scaled well with image size. We present a new image-warping algorithm with several novel contributions: an adaptive grid generation algorithm for the proxy geometry used in image warping; a low-pass hole-filling algorithm to address disocclusion; and support for transparent surfaces by efficiently ray casting transparent fragments stored in the per-pixel linked lists of an A-Buffer. We evaluate our algorithm with a variety of challenging test cases. The results show that it achieves better quality image warping than state-of-the-art techniques and that it can support transparent surfaces effectively. Finally, we show that our algorithm can achieve image warping at rates suitable for practical use in a variety of applications on modern virtual reality equipment.
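
    To make the hole-filling step concrete, here is a minimal diffusion-style sketch in NumPy. It is an assumption-laden stand-in for the paper's low-pass hole filler, whose exact formulation is not reproduced here: invalid pixels are repeatedly replaced by the average of their known neighbours, propagating low-frequency content into disoccluded regions.

```python
# Hedged sketch of low-pass hole filling for disocclusion; all names are
# illustrative. np.roll wraps at image borders -- a simplification.
import numpy as np

def fill_holes_lowpass(img, valid, iters=100):
    """Fill pixels where valid==False by diffusing known neighbour
    values inward -- a crude low-pass interpolation into holes."""
    out = np.where(valid, img, 0.0)
    known = valid.astype(float)
    for _ in range(iters):
        ow = out * known  # only known pixels contribute to the average
        s = sum(np.roll(ow, d, a) for d in (1, -1) for a in (0, 1))
        k = sum(np.roll(known, d, a) for d in (1, -1) for a in (0, 1))
        fill = np.divide(s, k, out=np.zeros_like(s), where=k > 0)
        out = np.where(valid, img, np.where(k > 0, fill, out))
        known = np.maximum(known, (k > 0).astype(float))
    return out

img = np.random.default_rng(0).random((32, 32))
valid = np.ones((32, 32), bool)
valid[10:20, 12:22] = False           # a synthetic disocclusion hole
filled = fill_holes_lowpass(img, valid)
```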

    Reconstruction and analysis of dynamic shapes

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 122-141). Motion capture has revolutionized entertainment and influenced fields as diverse as the arts, sports, and medicine. This is despite the limitation that it tracks only a small set of surface points. On the other hand, 3D scanning techniques digitize complete surfaces of static objects, but are not applicable to moving shapes. I present methods that overcome both limitations: they obtain the moving geometry of dynamic shapes (such as people and clothes in motion) and analyze it in order to advance computer animation. Further understanding of dynamic shapes will enable various industries to enhance virtual characters, advance robot locomotion, improve sports performance, and aid in medical rehabilitation, thus directly affecting our daily lives. My methods efficiently recover much of the expressiveness of dynamic shapes from silhouettes alone. Furthermore, the reconstruction quality is greatly improved by including surface orientations (normals). In order to make reconstruction more practical, I strive to capture dynamic shapes in their natural environment, which I do by using hybrid inertial and acoustic sensors. After capture, the reconstructed dynamic shapes are analyzed in order to enhance their utility. My algorithms then allow animators to generate novel motions, such as transferring facial performances from one actor onto another using multi-linear models. The presented research provides some of the first and most accurate reconstructions of complex moving surfaces, and is among the few approaches that establish a relationship between different dynamic shapes. By Daniel Vlasic. Ph.D.
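
    The multi-linear performance-transfer idea admits a compact toy illustration. The sketch below uses made-up dimensions and random data in place of registered face meshes; it builds a two-mode (identity x expression) factorisation and recombines factor rows, which is a simplification of the thesis's multi-linear models, not their implementation.

```python
# Hedged sketch of a bilinear (identity x expression) face model in the
# spirit of multilinear performance transfer. Dimensions and data are toy.
import numpy as np

rng = np.random.default_rng(0)
n_id, n_expr, n_verts = 5, 4, 300                     # toy sizes
faces = rng.normal(size=(n_id, n_expr, 3 * n_verts))  # mesh tensor

# Mode unfoldings followed by SVDs give identity and expression factor
# matrices (HOSVD-style, over two modes only).
U_id, _, _ = np.linalg.svd(faces.reshape(n_id, -1), full_matrices=False)
U_ex, _, _ = np.linalg.svd(faces.transpose(1, 0, 2).reshape(n_expr, -1),
                           full_matrices=False)

# Core tensor: project the data onto both factor spaces.
core = np.einsum('iev,ij,ek->jkv', faces, U_id, U_ex)

# Recombine: identity row 0 with expression row 2 reconstructs identity 0
# performing expression 2; in performance transfer, the expression
# coefficients would instead be estimated from another actor's footage.
new_face = np.einsum('jkv,j,k->v', core, U_id[0], U_ex[2])
print(new_face.shape)  # (900,) -> one synthesized mesh
```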

    Fast Motion Estimation Algorithms for Block-Based Video Coding Encoders

    The objective of my research is to reduce the complexity of video coding standards in real-time scalable and multi-view applications.
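
    As context for the family of algorithms this work targets, below is a minimal sketch of the classic three-step search, a well-known fast block-matching method. It illustrates the coarse-to-fine probing that fast motion estimators use to avoid exhaustive search; it is standard textbook material, not the thesis's own contribution.

```python
# Hedged sketch: three-step search block matching on a smooth toy image.
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return float(np.abs(a.astype(float) - b.astype(float)).sum())

def three_step_search(ref, cur, y, x, bs=16, step=4):
    """Motion vector for the bs x bs block of `cur` at (y, x), found by
    coarse-to-fine probing (the step halves each round: 4, 2, 1)."""
    block = cur[y:y+bs, x:x+bs]
    by, bx = y, x
    while step >= 1:
        best = (sad(ref[by:by+bs, bx:bx+bs], block), by, bx)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                ny, nx = by + dy, bx + dx
                if 0 <= ny <= ref.shape[0] - bs and 0 <= nx <= ref.shape[1] - bs:
                    best = min(best, (sad(ref[ny:ny+bs, nx:nx+bs], block), ny, nx))
        _, by, bx = best
        step //= 2
    return by - y, bx - x  # (dy, dx) pointing into the reference frame

# Toy check on a smooth image, where the SAD surface is well behaved:
yy, xx = np.mgrid[0:64, 0:64]
ref = np.exp(-((yy - 30.0)**2 + (xx - 34.0)**2) / 60.0)
cur = np.roll(ref, (2, -3), axis=(0, 1))    # shift the content by (2, -3)
print(three_step_search(ref, cur, 24, 24))  # -> (-2, 3)
```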

    Coded Signals for High Frequency Ultrasound Imaging

    Degeneration of articular cartilage is a serious and painful knee disease affecting people of all ages. It also marks the presence of osteoarthritis, a complex musculoskeletal disorder. A successful assessment of the degeneration status is of great importance for estimating osteoarthritis progression, and thereby beneficial for planning clinical treatments. Ultrasound has played a vital role in imaging articular cartilage since it is capable of providing distinct information about important cartilage structures. However, various types of noise in ultrasound signals (e.g. clutter noise) are known to limit the quality of ultrasound images, especially at high frequencies where wave attenuation becomes severe. The possibility of improving the signal-to-noise ratio (SNR) by using coded signals is therefore the motivation behind this thesis, the main objective of which is to investigate suitable codes and compression methods for cartilage imaging. The main focus of this thesis is on coded ultrasound signals and related signal processing methods. Transducers made from two different piezoelectric materials (PZT and PVDF) are used to image a thick cartilage sample. For each transducer, three different waveforms (a Ricker wavelet, a Gaussian chirp, and a 13-bit Barker code) are used to excite the ultrasonic transducers. Two different compression methods (matched filtering and Wiener filtering) are also explored to decode the signals received by the transducers. Before processing the received signals, a time calibration was used to compensate for sample tilting, yielding an improved precision in the phase/time delay. A maximum method and a center-of-mass method were used for calibration. The results from the experimental work show that both chirp-coded and Barker-coded signals work well in improving the SNR, and that both transducers are able to produce high quality images of the cartilage sample. When using coded excitation signals, however, the PZT transducer places high demands on the excitation repetition frequency because of its built-in delay line. The different time calibration methods have their own applicable conditions. The matched filter and the Wiener filter both perform well for decoding, but the "noise" parameter in the Wiener filter has to be adjusted carefully to produce reasonable results.
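
    The pulse-compression mechanism behind the SNR gain is easy to demonstrate. The following is a minimal sketch, assuming an idealised received trace in NumPy rather than real transducer data: a 13-bit Barker-coded echo buried in noise is compressed with a matched filter (correlation with the transmitted code), which concentrates the pulse energy into a sharp peak.

```python
# Hedged sketch of pulse compression with a 13-bit Barker code and a
# matched filter. The transducer and Wiener-filter details of the thesis
# are not modelled here.
import numpy as np

barker13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], float)

rng = np.random.default_rng(1)
n = 400
rx = np.zeros(n)
rx[150:150 + 13] += barker13        # echo from a reflector at sample 150
rx += 0.5 * rng.normal(size=n)      # additive noise

# Matched filter = correlation with the transmitted code.
compressed = np.correlate(rx, barker13, mode='same')
# Peak lands near the echo (offset by half the code length in 'same' mode).
print(int(np.argmax(np.abs(compressed))))  # ~156
```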

    Inverse tone mapping

    The introduction of High Dynamic Range Imaging in computer graphics has produced a change in imaging comparable to, or even greater than, the introduction of colour photography. Light can now be captured, stored, processed, and finally visualised without losing information. Moreover, new applications that can exploit the physical values of the light have been introduced, such as re-lighting of synthetic/real objects, or enhanced visualisation of scenes. However, these new processing and visualisation techniques cannot be applied to the movies and pictures that photography and cinematography have produced over more than one hundred years. This thesis introduces a general framework for expanding legacy content into High Dynamic Range content. The expansion is achieved while avoiding artefacts, producing images suitable for visualisation and for re-lighting of synthetic/real objects. Moreover, a methodology based on psychophysical experiments and computational metrics is presented to measure the performance of expansion algorithms. Finally, a compression scheme for High Dynamic Range textures, inspired by the framework, is proposed and evaluated.
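
    As a point of reference for what "expansion" means operationally, here is a minimal inverse-tone-mapping sketch: linearise an 8-bit frame by undoing the display gamma, then boost it toward a target peak luminance with a power curve. The parameter values are arbitrary assumptions, and the thesis's artefact-avoiding framework is considerably more sophisticated than this.

```python
# Hedged sketch of a minimal inverse tone mapping operator; the gamma,
# peak luminance, and exponent below are illustrative assumptions.
import numpy as np

def expand_ldr(ldr_u8, gamma=2.2, peak_nits=1000.0, exponent=1.5):
    """Map [0, 255] pixel codes to linear HDR values in [0, peak_nits]."""
    lin = (ldr_u8.astype(float) / 255.0) ** gamma  # undo display gamma
    return peak_nits * lin ** exponent             # boost highlights

frame = np.random.default_rng(2).integers(0, 256, size=(4, 4), dtype=np.uint8)
hdr = expand_ldr(frame)
print(hdr.max() <= 1000.0)  # True: values stay within the target peak
```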

    Modeling small objects under uncertainties: novel algorithms and applications

    Active Shape Models (ASM), Active Appearance Models (AAM) and Active Tensor Models (ATM) are common approaches to modeling elastic (deformable) objects. These models require an ensemble of shapes and textures, annotated by human experts, in order to identify the model order and parameters. A candidate object may be represented by a weighted sum of a basis generated by an optimization process. These methods have been very effective for modeling deformable objects in biomedical imaging, biometrics, computer vision and graphics. They have been tried mainly on objects with known features that are amenable to manual (expert) annotation. They have not been examined on objects whose severe ambiguities prevent them from being uniquely characterized by experts. This dissertation presents a unified approach for modeling, detecting, segmenting and categorizing small objects under uncertainty, with a focus on lung nodules that may appear in low dose CT (LDCT) scans of the human chest. The AAM, ASM and ATM approaches are used for the first time on this application. A new formulation of object detection by template matching, as an energy optimization, is introduced. Nine similarity measures for matching have been quantitatively evaluated for detecting nodules less than 1 cm in diameter. Statistical methods that combine intensity, shape and spatial interaction are examined for segmentation of small objects. Extensions of the intensity model using the linear combination of Gaussians (LCG) approach are introduced, in order to estimate the number of modes in the LCG equation. The classical maximum a posteriori (MAP) segmentation approach has been adapted to handle segmentation of small lung nodules that are randomly located in the lung tissue. A novel empirical approach has been devised to simultaneously detect and segment the lung nodules in LDCT scans. The level set approach was also applied for lung nodule segmentation, and a new formulation for the energy function controlling the level set propagation has been introduced, taking into account the specific properties of the nodules. Finally, a novel approach for classification of the segmented nodules into categories has been introduced. Geometric object descriptors such as SIFT, ASIFT, SURF and LBP have been used for feature extraction and matching of small lung nodules; the LBP has been found to be the most robust. Categorization implies classification of detected and segmented objects into classes or types. The object descriptors have been deployed in the detection step for false positive reduction, and in the categorization stage to assign a class and type to the nodules. The AAM/ASM/ATM models have been used for the categorization stage. The front-end processes of lung nodule modeling, detection, segmentation and classification/categorization are model-based and data-driven. This dissertation is the first attempt in the literature at creating an entirely model-based approach for lung nodule analysis.
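
    To ground the template-matching-as-energy-optimization step, below is a minimal sketch of normalised cross-correlation, a standard similarity measure of the kind the dissertation evaluates (which nine measures were compared is detailed there, not here). Maximising the NCC score over offsets is equivalent to minimising a matching energy.

```python
# Hedged sketch of template matching by normalised cross-correlation
# (NCC); a plain double loop for clarity, not an efficient implementation.
import numpy as np

def ncc_map(image, template):
    """Slide `template` over `image` and return the NCC score at each
    valid offset; the detection is the offset maximising the score."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    h, w = image.shape
    scores = np.full((h - th + 1, w - tw + 1), -1.0)
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            win = image[y:y+th, x:x+tw]
            wz = win - win.mean()
            d = np.sqrt((wz ** 2).sum()) * tn
            if d > 0:  # skip zero-variance windows
                scores[y, x] = (wz * t).sum() / d
    return scores

rng = np.random.default_rng(4)
img = rng.random((40, 40))
tpl = img[18:26, 22:30].copy()                   # plant an exact match
s = ncc_map(img, tpl)
print(np.unravel_index(np.argmax(s), s.shape))   # -> (18, 22)
```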

    Advanced Algebraic Concepts for Efficient Multi-Channel Signal Processing

    Modern society is undergoing a fundamental change in the way we interact with technology. More and more devices are becoming "smart" by gaining advanced computation capabilities and communication interfaces, from household appliances over transportation systems to large-scale networks like the power grid. Recording, processing, and exchanging digital information is thus becoming increasingly important. As a growing share of devices is nowadays mobile and hence battery-powered, a particular interest in efficient digital signal processing techniques emerges. This thesis contributes to this goal by demonstrating methods for finding efficient algebraic solutions to various applications of multi-channel digital signal processing. These may not always result in the best possible system performance; however, they often come close while being significantly simpler to describe and to implement. The simpler description facilitates a thorough analysis of their performance, which is crucial for designing robust and reliable systems. The fact that they rely on standard algebraic methods only allows their rapid implementation and testing under real-world conditions. We demonstrate this concept in three different application areas. First, we present a semi-algebraic framework to compute the Canonical Polyadic (CP) decomposition of multidimensional signals, a very fundamental tool in multilinear algebra with applications ranging from chemistry over communications to image compression. Compared to state-of-the-art iterative solutions, our framework offers flexible control of the complexity-accuracy trade-off and is less sensitive to badly conditioned data. The second application area is multidimensional subspace-based high-resolution parameter estimation, with applications in RADAR, wave propagation modeling, and biomedical imaging. We demonstrate that multidimensional signals can be represented by tensors, providing a convenient description and allowing the multidimensional structure to be exploited better than with matrices alone. Based on this idea, we introduce a tensor-based subspace estimate which can be applied to enhance existing matrix-based parameter estimation schemes significantly. We demonstrate the enhancements by choosing the family of ESPRIT-type algorithms as an example and introducing enhanced versions that exploit the multidimensional structure (Tensor-ESPRIT), non-circular source amplitudes (NC ESPRIT), and both jointly (NC Tensor-ESPRIT). To objectively judge the resulting estimation accuracy, we derive a framework for the analytical performance assessment of arbitrary ESPRIT-type algorithms by virtue of an asymptotic first-order perturbation expansion. Our results are more general than existing analytical results since we need no assumptions about the distribution of the desired signal and the noise, and we do not require the number of samples to be large. In the end, we obtain simplified expressions for the mean square estimation error that provide insights into the efficiency of the methods under various conditions. The third application area is bidirectional relay-assisted communications. Due to its particularly low complexity and its efficient use of the radio resources, we choose two-way relaying with a MIMO amplify-and-forward relay. We demonstrate that the required channel knowledge can be obtained by a simple algebraic tensor-based channel estimation scheme. We also discuss the design of the relay amplification matrix in such a setting. Existing approaches are either based on complicated numerical optimization procedures or on ad-hoc solutions that do not perform well in terms of the bit error rate or the sum-rate. Therefore, we propose algebraic solutions that are inspired by these performance metrics and therefore perform well while being easy to compute. For the MIMO case, we introduce the algebraic norm maximizing (ANOMAX) scheme, which achieves a very low bit error rate, and its extension, Rank-Restored ANOMAX (RR-ANOMAX), which achieves a sum-rate close to an upper bound. Moreover, for the special case of single-antenna terminals, we derive the semi-algebraic RAGES scheme, which finds the sum-rate-optimal relay amplification matrix based on generalized eigenvectors. Numerical simulations evaluate the resulting system performance in terms of bit error rate and system sum rate, demonstrating the effectiveness of the proposed algebraic solutions.
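
    For orientation, the sketch below implements the standard alternating-least-squares (ALS) iteration for a rank-R CP decomposition of a 3-way tensor, i.e. the kind of iterative baseline the semi-algebraic framework is compared against; it is not the thesis's framework. Names are generic and the unfolding conventions follow NumPy's C-order reshape.

```python
# Hedged sketch of CP-ALS for a 3-way tensor; a textbook baseline, not the
# semi-algebraic framework described above.
import numpy as np

def cp_als(T, R, iters=100, seed=0):
    """Return factors A, B, C with T[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r]."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.normal(size=(n, R)) for n in (I, J, K))
    T1 = T.reshape(I, J * K)                     # mode-1 unfolding
    T2 = T.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    T3 = T.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    # Column-wise Khatri-Rao product matching the C-order unfoldings above.
    khatri_rao = lambda X, Y: (X[:, None, :] * Y[None, :, :]).reshape(-1, R)
    for _ in range(iters):
        A = T1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = T2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = T3 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C

rng = np.random.default_rng(5)
A0, B0, C0 = (rng.normal(size=(n, 3)) for n in (6, 7, 8))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)   # exact rank-3 tensor
A, B, C = cp_als(T, R=3)
# Residual is typically tiny here, though ALS can occasionally be slow.
print(np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', A, B, C)))
```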

    Robust Methods for Visual Tracking and Model Alignment

    The ubiquitous presence of cameras and camera networks necessitates the development of robust visual analytics algorithms. As the building block of many visual surveillance tasks, a robust visual tracking algorithm plays an important role in achieving automatic and robust surveillance. In practice, it is critical to know when and where the tracking algorithm fails so that remedial measures can be taken to resume tracking. We propose a novel performance evaluation strategy for tracking systems using a time-reversed Markov chain. We also present a novel bidirectional tracker to achieve better robustness. Instead of looking only forward in the time domain, we incorporate both forward and backward processing of video frames using a time-reversibility constraint. When the objects of interest in surveillance applications have relatively stable structures, a parameterized shape model of the objects can usually be built or learned from sample images, which allows more accurate tracking. We present a machine learning method to learn a scoring function without local extrema to guide the gradient descent/ascent algorithm and find the optimal parameters of the shape model. These algorithms greatly improve the robustness of video analysis systems in practice.
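
    A simpler cousin of the time-reversed evaluation idea is the forward-backward consistency check sketched below: run a tracker forward, then backward from its final state, and flag failure when the round trip does not return to the start. This is a minimal illustration under that assumption, not the paper's time-reversed Markov chain formulation; `tracker` is a hypothetical callable, not an API from the paper.

```python
# Hedged sketch of a forward-backward tracking consistency check.
import numpy as np

def forward_backward_error(tracker, frames, init_box):
    """Track frames[0..n] forward, then frames[n..0] backward, and return
    the distance between the original and round-trip initial positions."""
    fwd = [init_box]
    for f in frames[1:]:
        fwd.append(tracker(f, fwd[-1]))
    bwd = [fwd[-1]]
    for f in frames[-2::-1]:
        bwd.append(tracker(f, bwd[-1]))
    return float(np.linalg.norm(np.asarray(init_box, float) -
                                np.asarray(bwd[-1], float)))

# Demo with a stub "tracker" that drifts one pixel per frame, showing how
# drift surfaces as a large round-trip error (a time-symmetric tracker
# would return close to the starting position):
drifting_tracker = lambda frame, box: (box[0] + 1, box[1])
frames = [None] * 10                  # frame content unused by the stub
print(forward_backward_error(drifting_tracker, frames, (5, 5)))  # 18.0
```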