73 research outputs found

    Network vector quantization

    Get PDF
    We present an algorithm for designing locally optimal vector quantizers for general networks. We discuss the algorithm's implementation and compare the performance of the resulting "network vector quantizers" to traditional vector quantizers (VQs) and to rate-distortion (R-D) bounds where available. While some special cases of network codes (e.g., multiresolution (MR) and multiple description (MD) codes) have been studied in the literature, we here present a unifying approach that both includes these existing solutions as special cases and provides solutions to previously unsolved examples

    Compression of DNA sequencing data

    Get PDF
    With the release of the latest generations of sequencing machines, the cost of sequencing a whole human genome has dropped to less than US$1,000. The potential applications in several fields lead to the forecast that the amount of DNA sequencing data will soon surpass the volume of other types of data, such as video data. In this dissertation, we present novel data compression technologies with the aim of enhancing storage, transmission, and processing of DNA sequencing data. The first contribution in this dissertation is a method for the compression of aligned reads, i.e., read-out sequence fragments that have been aligned to a reference sequence. The method improves compression by implicitly assembling local parts of the underlying sequences. Compared to the state of the art, our method achieves the best trade-off between memory usage and compressed size. Our second contribution is a method for the quantization and compression of quality scores, i.e., values that quantify the error probability of each read-out base. Specifically, we propose two Bayesian models that are used to precisely control the quantization. With our method it is possible to compress the data down to 0.15 bit per quality score. Notably, we can recommend a particular parametrization for one of our models which—by removing noise from the data as a side effect—does not lead to any degradation in the distortion metric. This parametrization achieves an average rate of 0.45 bit per quality score. The third contribution is the first implementation of an entropy codec compliant to MPEG-G. We show that, compared to the state of the art, our method achieves the best compression ranks on average, and that adding our method to CRAM would be beneficial both in terms of achievable compression and speed. Finally, we provide an overview of the standardization landscape, and in particular of MPEG-G, in which our contributions have been integrated.Mit der Einführung der neuesten Generationen von Sequenziermaschinen sind die Kosten für die Sequenzierung eines menschlichen Genoms auf weniger als 1.000 US-Dollar gesunken. Es wird prognostiziert, dass die Menge der Sequenzierungsdaten bald diejenige anderer Datentypen, wie z.B. Videodaten, übersteigen wird. Daher werden in dieser Arbeit neue Datenkompressionsverfahren zur Verbesserung der Speicherung, Übertragung und Verarbeitung von Sequenzierungsdaten vorgestellt. Der erste Beitrag in dieser Arbeit ist eine Methode zur Komprimierung von alignierten Reads, d.h. ausgelesenen Sequenzfragmenten, die an eine Referenzsequenz angeglichen wurden. Die Methode verbessert die Komprimierung, indem sie die Reads nutzt, um implizit lokale Teile der zugrunde liegenden Sequenzen zu schätzen. Im Vergleich zum Stand der Technik erzielt die Methode das beste Ergebnis in einer gemeinsamen Betrachtung von Speichernutzung und erzielter Komprimierung. Der zweite Beitrag ist eine Methode zur Quantisierung und Komprimierung von Qualitätswerten, welche die Fehlerwahrscheinlichkeit jeder ausgelesenen Base quantifizieren. Konkret werden zwei Bayes’sche Modelle vorgeschlagen, mit denen die Quantisierung präzise gesteuert werden kann. Mit der vorgeschlagenen Methode können die Daten auf bis zu 0,15 Bit pro Qualitätswert komprimiert werden. Besonders hervorzuheben ist, dass eine bestimmte Parametrisierung für eines der Modelle empfohlen werden kann, die – durch die Entfernung von Rauschen aus den Daten als Nebeneffekt – zu keiner Verschlechterung der Verzerrungsmetrik führt. Mit dieser Parametrisierung wird eine durchschnittliche Rate von 0,45 Bit pro Qualitätswert erreicht. Der dritte Beitrag ist die erste Implementierung eines MPEG-G-konformen Entropie-Codecs. Es wird gezeigt, dass der vorgeschlagene Codec die durchschnittlich besten Kompressionswerte im Vergleich zum Stand der Technik erzielt und dass die Aufnahme des Codecs in CRAM sowohl hinsichtlich der erreichbaren Kompression als auch der Geschwindigkeit von Vorteil wäre. Abschließend wird ein Überblick über Standards zur Komprimierung von Sequenzierungsdaten gegeben. Insbesondere wird hier auf MPEG-G eingangen, da alle Beiträge dieser Arbeit in MPEG-G integriert wurden

    Side information aware source and channel coding in wireless networks

    Get PDF
    Signals in communication networks exhibit significant correlation, which can stem from the physical nature of the underlying sources, or can be created within the system. Current layered network architectures, in which, based on Shannon’s separation theorem, data is compressed and transmitted over independent bit-pipes, are in general not able to exploit such correlation efficiently. Moreover, this strictly layered architecture was developed for wired networks and ignore the broadcast and highly dynamic nature of the wireless medium, creating a bottleneck in the wireless network design. Technologies that exploit correlated information and go beyond the layered network architecture can become a key feature of future wireless networks, as information theory promises significant gains. In this thesis, we study from an information theoretic perspective, three distinct, yet fundamental, problems involving the availability of correlated information in wireless networks and develop novel communication techniques to exploit it efficiently. We first look at two joint source-channel coding problems involving the lossy transmission of Gaussian sources in a multi-terminal and a time-varying setting in which correlated side information is present in the network. In these two problems, the optimality of Shannon’s separation breaks down and separate source and channel coding is shown to perform poorly compared to the proposed joint source-channel coding designs, which are shown to achieve the optimal performance in some setups. Then, we characterize the capacity of a class of orthogonal relay channels in the presence of channel side information at the destination, and show that joint decoding and compression of the received signal at the relay is required to optimally exploit the available side information. Our results in these three different scenarios emphasize the benefits of exploiting correlated side information at the destination when designing a communication system, even though the nature of the side information and the performance measure in the three scenarios are quite different.Open Acces

    Channel Optimized Distributed Multiple Description Coding

    Full text link
    In this paper, channel optimized distributed multiple description vector quantization (CDMD) schemes are presented for distributed source coding in symmetric and asymmetric settings. The CDMD encoder is designed using a deterministic annealing approach over noisy channels with packet loss. A minimum mean squared error asymmetric CDMD decoder is proposed for effective reconstruction of a source, utilizing the side information (SI) and its corresponding received descriptions. The proposed iterative symmetric CDMD decoder jointly reconstructs the symbols of multiple correlated sources. Two types of symmetric CDMD decoders, namely the estimated-SI and the soft-SI decoders, are presented which respectively exploit the reconstructed symbols and a posteriori probabilities of other sources as SI in iterations. In a multiple source CDMD setting, for reconstruction of a source, three methods are proposed to select another source as its SI during the decoding. The methods operate based on minimum physical distance (in a wireless sensor network setting), maximum mutual information and minimum end-to-end distortion. The performance of the proposed systems and algorithms are evaluated and compared in detail.Comment: Submitted to IEEE Transaction on Signal Processin

    Digital watermark technology in security applications

    Get PDF
    With the rising emphasis on security and the number of fraud related crimes around the world, authorities are looking for new technologies to tighten security of identity. Among many modern electronic technologies, digital watermarking has unique advantages to enhance the document authenticity. At the current status of the development, digital watermarking technologies are not as matured as other competing technologies to support identity authentication systems. This work presents improvements in performance of two classes of digital watermarking techniques and investigates the issue of watermark synchronisation. Optimal performance can be obtained if the spreading sequences are designed to be orthogonal to the cover vector. In this thesis, two classes of orthogonalisation methods that generate binary sequences quasi-orthogonal to the cover vector are presented. One method, namely "Sorting and Cancelling" generates sequences that have a high level of orthogonality to the cover vector. The Hadamard Matrix based orthogonalisation method, namely "Hadamard Matrix Search" is able to realise overlapped embedding, thus the watermarking capacity and image fidelity can be improved compared to using short watermark sequences. The results are compared with traditional pseudo-randomly generated binary sequences. The advantages of both classes of orthogonalisation inethods are significant. Another watermarking method that is introduced in the thesis is based on writing-on-dirty-paper theory. The method is presented with biorthogonal codes that have the best robustness. The advantage and trade-offs of using biorthogonal codes with this watermark coding methods are analysed comprehensively. The comparisons between orthogonal and non-orthogonal codes that are used in this watermarking method are also made. It is found that fidelity and robustness are contradictory and it is not possible to optimise them simultaneously. Comparisons are also made between all proposed methods. The comparisons are focused on three major performance criteria, fidelity, capacity and robustness. aom two different viewpoints, conclusions are not the same. For fidelity-centric viewpoint, the dirty-paper coding methods using biorthogonal codes has very strong advantage to preserve image fidelity and the advantage of capacity performance is also significant. However, from the power ratio point of view, the orthogonalisation methods demonstrate significant advantage on capacity and robustness. The conclusions are contradictory but together, they summarise the performance generated by different design considerations. The synchronisation of watermark is firstly provided by high contrast frames around the watermarked image. The edge detection filters are used to detect the high contrast borders of the captured image. By scanning the pixels from the border to the centre, the locations of detected edges are stored. The optimal linear regression algorithm is used to estimate the watermarked image frames. Estimation of the regression function provides rotation angle as the slope of the rotated frames. The scaling is corrected by re-sampling the upright image to the original size. A theoretically studied method that is able to synchronise captured image to sub-pixel level accuracy is also presented. By using invariant transforms and the "symmetric phase only matched filter" the captured image can be corrected accurately to original geometric size. The method uses repeating watermarks to form an array in the spatial domain of the watermarked image and the the array that the locations of its elements can reveal information of rotation, translation and scaling with two filtering processes

    Dynamic information and constraints in source and channel coding

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 237-251).This thesis explore dynamics in source coding and channel coding. We begin by introducing the idea of distortion side information, which does not directly depend on the source but instead affects the distortion measure. Such distortion side information is not only useful at the encoder but under certain conditions knowing it at the encoder is optimal and knowing it at the decoder is useless. Thus distortion side information is a natural complement to Wyner-Ziv side information and may be useful in exploiting properties of the human perceptual system as well as in sensor or control applications. In addition to developing the theoretical limits of source coding with distortion side information, we also construct practical quantizers based on lattices and codes on graphs. Our use of codes on graphs is also of independent interest since it highlights some issues in translating the success of turbo and LDPC codes into the realm of source coding. Finally, to explore the dynamics of side information correlated with the source, we consider fixed lag side information at the decoder. We focus on the special case of perfect side information with unit lag corresponding to source coding with feedforward (the dual of channel coding with feedback).(cont.) Using duality, we develop a linear complexity algorithm which exploits the feedforward information to achieve the rate-distortion bound. The second part of the thesis focuses on channel dynamics in communication by introducing a new system model to study delay in streaming applications. We first consider an adversarial channel model where at any time the channel may suffer a burst of degraded performance (e.g., due to signal fading, interference, or congestion) and prove a coding theorem for the minimum decoding delay required to recover from such a burst. Our coding theorem illustrates the relationship between the structure of a code, the dynamics of the channel, and the resulting decoding delay. We also consider more general channel dynamics. Specifically, we prove a coding theorem establishing that, for certain collections of channel ensembles, delay-universal codes exist that simultaneously achieve the best delay for any channel in the collection. Practical constructions with low encoding and decoding complexity are described for both cases.(cont.) Finally, we also consider architectures consisting of both source and channel coding which deal with channel dynamics by spreading information over space, frequency, multiple antennas, or alternate transmission paths in a network to avoid coding delays. Specifically, we explore whether the inherent diversity in such parallel channels should be exploited at the application layer via multiple description source coding, at the physical layer via parallel channel coding, or through some combination of joint source-channel coding. For on-off channel models application layer diversity architectures achieve better performance while for channels with a continuous range of reception quality (e.g., additive Gaussian noise channels with Rayleigh fading), the reverse is true. Joint source-channel coding achieves the best of both by performing as well as application layer diversity for on-off channels and as well as physical layer diversity for continuous channels.by Emin Martinian.Ph.D

    Distributed signal processing using nested lattice codes

    No full text
    Multi-Terminal Source Coding (MTSC) addresses the problem of compressing correlated sources without communication links among them. In this thesis, the constructive approach of this problem is considered in an algebraic framework and a system design is provided that can be applicable in a variety of settings. Wyner-Ziv problem is first investigated: coding of an independent and identically distributed (i.i.d.) Gaussian source with side information available only at the decoder in the form of a noisy version of the source to be encoded. Theoretical models are first established and derived for calculating distortion-rate functions. Then a few novel practical code implementations are proposed by using the strategy of multi-dimensional nested lattice/trellis coding. By investigating various lattices in the dimensions considered, analysis is given on how lattice properties affect performance. Also proposed are methods on choosing good sublattices in multiple dimensions. By introducing scaling factors, the relationship between distortion and scaling factor is examined for various rates. The best high-dimensional lattice using our scale-rotate method can achieve a performance less than 1 dB at low rates from the Wyner-Ziv limit; and random nested ensembles can achieve a 1.87 dB gap with the limit. Moreover, the code design is extended to incorporate with distributed compressive sensing (DCS). Theoretical framework is proposed and practical design using nested lattice/trellis is presented for various scenarios. By using nested trellis, the simulation shows a 3.42 dB gap from our derived bound for the DCS plus Wyner-Ziv framework
    • …
    corecore