421 research outputs found

    Bandwidth-Efficient Byte Stuffing


    DVB-H link layer


    High throughput image compression and decompression on GPUs

    This work investigates possibilities to create a high-throughput, GPU-friendly, intra-only, wavelet-based video compression algorithm optimized for visually lossless applications. Addressing the key observation that JPEG 2000's entropy coder is a bottleneck and may be overly complex for high-bit-rate scenarios, various algorithmic alterations are proposed and evaluated. First, JPEG 2000's Selective Arithmetic Coding mode is realized on the GPU, but the resulting throughput gains are shown to be limited. Instead, two independent alterations that are not compliant with the standard are proposed: (1) giving up intra-bit-plane truncation points so that each bit plane is processed in a single pass (single-pass mode), and (2) introducing a true raw-coding mode that is parallelizable per sample and does not require any context modeling. Next, an alternative block coder from the literature, the Bitplane Coder with Parallel Coefficient Processing (BPC-PaCo), is evaluated. Since it trades signal adaptiveness for increased parallelism, it is shown how a stationary probability model averaged over a set of test sequences yields competitive compression efficiency. A combination of BPC-PaCo with the single-pass mode is proposed and shown to increase the speedup over the original JPEG 2000 entropy coder from 2.15x (BPC-PaCo with two passes) to 2.6x (BPC-PaCo with single-pass mode) at the marginal cost of increasing the PSNR penalty by 0.3 dB to at most 1.0 dB. Furthermore, a parallel algorithm is presented that determines the optimal code block bit stream truncation points for a given bit rate budget (post-compression rate control) and builds the entire code stream on the GPU, reducing the amount of data that has to be transferred back to host memory to a minimum. A theoretical runtime model is formulated that, based on benchmarking results on one GPU, predicts the runtime of a kernel on another GPU. Lastly, the first JPEG XS GPU decoder realization is presented and evaluated. JPEG XS was designed as a low-complexity codec and, for the first time, explicitly demanded GPU-friendliness already in the call for proposals. At bit rates above 1 bpp, the decoder is around 2x faster than the original JPEG 2000 and 1.5x faster than JPEG 2000 with the fastest evaluated entropy coder (BPC-PaCo with single-pass mode). With a GeForce GTX 1080, a decoding throughput of around 200 fps is achieved for a UHD 4:4:4 sequence.
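    The abstract does not spell out the runtime model itself; as a point of reference, a minimal roofline-style sketch conveys the idea of predicting a kernel's runtime on a second GPU from a benchmark-derived characterization obtained on a first one. The struct names, function, and kernel numbers below are illustrative assumptions; only the GTX 1080 peak figures are published values.

```cpp
#include <algorithm>
#include <cstdio>

// Illustrative GPU description: peak memory bandwidth and arithmetic throughput.
struct GpuSpec {
    double bandwidth_gbps;   // GB/s
    double throughput_gops;  // Gop/s
};

// Hypothetical kernel characterization obtained by benchmarking on a reference GPU:
// bytes moved and operations executed per frame.
struct KernelProfile {
    double bytes_per_frame;
    double ops_per_frame;
};

// Roofline-style estimate: the kernel is limited either by memory traffic or by compute.
double predict_runtime_ms(const KernelProfile& k, const GpuSpec& gpu) {
    double mem_ms  = k.bytes_per_frame / (gpu.bandwidth_gbps * 1e9) * 1e3;
    double comp_ms = k.ops_per_frame   / (gpu.throughput_gops * 1e9) * 1e3;
    return std::max(mem_ms, comp_ms);
}

int main() {
    KernelProfile entropy_decode{3.0e8, 5.0e8};  // assumed per-frame workload
    GpuSpec gtx1080{320.0, 8873.0};              // published peak figures for a GTX 1080
    std::printf("predicted: %.2f ms/frame\n", predict_runtime_ms(entropy_decode, gtx1080));
}
```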

    The Design and modeling of input and output modules for an ATM network switch

    The purpose of this thesis is to design, model, and simulate both an input and an output module for an ATM network switch. These devices are used to interface an ATM switch with the physical protocol that transports data along the actual transmission medium. The I/O modules have been designed specifically to interface with the Synchronous Optical Network (SONET) protocol. This thesis studies the ATM protocol and examines the issues involved in designing an ATM I/O module chipset. A model of the design was then implemented in both C++ and VHDL. These models were simulated in order to verify functionality and document performance. The intent of this work is to provide the background and models necessary to aid in the further study and development of entire ATM switch architectures. The input and output modules are only two functional pieces of a complete ATM switch. The software models implemented by this thesis can be integrated with the other necessary functional blocks to form a complete model of a working ATM switch. These functional blocks can then be rearranged and altered to assist in the study of how different switch architectures can affect overall network performance and efficiency. The input and output modules have been designed to be as flexible as possible in order to easily adapt to future modifications.
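    The thesis models themselves are not reproduced here, but since the I/O modules exchange fixed-size ATM cells with the SONET layer, a short sketch of the standard 5-byte UNI cell header gives a concrete picture of the data unit being parsed. The example header bytes are arbitrary.

```cpp
#include <cstdint>
#include <cstdio>

// UNI-format ATM cell header (5 bytes), followed in every cell by a fixed 48-byte payload.
struct AtmHeader {
    uint8_t  gfc;   // Generic Flow Control (4 bits)
    uint16_t vpi;   // Virtual Path Identifier (8 bits at the UNI)
    uint16_t vci;   // Virtual Channel Identifier (16 bits)
    uint8_t  pt;    // Payload Type (3 bits)
    uint8_t  clp;   // Cell Loss Priority (1 bit)
    uint8_t  hec;   // Header Error Control (CRC-8 over the first four header bytes)
};

// Unpack the bit fields of the 5-byte header as received from the physical layer.
AtmHeader parse_header(const uint8_t h[5]) {
    AtmHeader a;
    a.gfc = h[0] >> 4;
    a.vpi = static_cast<uint16_t>(((h[0] & 0x0F) << 4) | (h[1] >> 4));
    a.vci = static_cast<uint16_t>(((h[1] & 0x0F) << 12) | (h[2] << 4) | (h[3] >> 4));
    a.pt  = (h[3] >> 1) & 0x07;
    a.clp = h[3] & 0x01;
    a.hec = h[4];
    return a;
}

int main() {
    const uint8_t raw[5] = {0x00, 0x10, 0x00, 0x50, 0x55};  // arbitrary example header
    AtmHeader a = parse_header(raw);
    std::printf("VPI=%u VCI=%u PT=%u CLP=%u\n", a.vpi, a.vci, a.pt, a.clp);
}
```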

    Networking DEC and IBM computers

    Local Area Networking of DEC and IBM computers within the structure of the ISO-OSI Seven Layer Reference Model, at a raw signaling speed of 1 Mbps or greater, is discussed. After an introduction to the ISO-OSI Reference Model and the IEEE-802 Draft Standard for Local Area Networks (LANs), there follows a detailed discussion and comparison of the products available from a variety of manufacturers to perform this networking task. A summary of these products is presented in a table.

    Novel algorithms for fair bandwidth sharing on counter rotating rings

    Rings are often the preferred technology for networks, as they can efficiently create virtual fully connected mesh topologies and are easy to manage. However, providing fair service to all the stations on the ring is not always easy to achieve. In order to capitalize on the advantages of ring networks, new buffer insertion techniques, such as the Spatial Reuse Protocol (SRP), were introduced in the early 2000s. As a result, a new standard known as IEEE 802.17 Resilient Packet Ring was defined in 2004 by the IEEE Resilient Packet Ring (RPR) Working Group. Since then, two addenda have been introduced, namely IEEE 802.17a and IEEE 802.17b, in 2006 and 2010, respectively. During this standardization process, weighted fairness and queue management schemes were proposed for use in the standard. As shown in this dissertation, these schemes can be applied to solve the fairness issues widely noted in the research community, since radical changes are not practical to introduce within the context of a standard. In this dissertation, the weighted fairness aspects of IEEE 802.17 RPR (in the aggressive mode of operation) are studied; various properties are demonstrated and observed via network simulations, and additional improvements are suggested. These aspects have not been well studied until now, and can be used to alleviate some of the issues observed in the fairness algorithm under certain scenarios. This dissertation also focuses on the RPR Medium Access Control (MAC) Client implementation of the IEEE 802.17 RPR MAC in the aggressive mode of operation and introduces a new active queue management scheme for ring networks that achieves higher overall utilization of the ring bandwidth with a simpler and less expensive implementation than the generic implementation provided in the standard. The two schemes introduced in this dissertation provide performance comparable to the per-destination queuing implementation, which yields the best achievable performance at the expense of implementation cost. In addition, until now the requirements for sizing the secondary transit queue of IEEE 802.17 RPR stations (in the aggressive mode of operation) have not been properly investigated. The analysis and suggested improvements presented in this dissertation are supported by performance evaluation results and theoretical calculations. Last but not least, the impact of using links of different capacities on the same ring has not previously been investigated from the ring utilization and fairness points of view. This dissertation also investigates utilizing different-capacity links in RPR and proposes a mechanism to support them.
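    The dissertation's specific MAC client and queue management schemes are not detailed in the abstract; as background, the weighted-fairness objective they target can be illustrated with a small weighted max-min (water-filling) allocation over a single congested span. The station weights, demands, and link capacity below are made-up values, not figures from the dissertation.

```cpp
#include <cstdio>
#include <vector>

// One station competing for the congested span: its configured fairness weight
// and its offered (demanded) add rate in Mb/s.
struct Station {
    double weight;
    double demand_mbps;
};

// Weighted max-min allocation of a link's capacity: shares are proportional to
// weights, and capacity unused by stations demanding less than their share is
// redistributed among the remaining stations.
std::vector<double> weighted_fair_rates(const std::vector<Station>& s, double capacity_mbps) {
    std::vector<double> rate(s.size(), 0.0);
    std::vector<bool> done(s.size(), false);
    double remaining = capacity_mbps;
    bool progress = true;
    while (progress) {
        progress = false;
        double weight_sum = 0.0;
        for (size_t i = 0; i < s.size(); ++i)
            if (!done[i]) weight_sum += s[i].weight;
        if (weight_sum == 0.0) return rate;          // every station is satisfied
        double level = remaining / weight_sum;       // capacity per unit of weight this round
        for (size_t i = 0; i < s.size(); ++i) {
            if (!done[i] && s[i].demand_mbps <= level * s[i].weight) {
                rate[i] = s[i].demand_mbps;          // satisfied below its share: release the rest
                remaining -= rate[i];
                done[i] = true;
                progress = true;
            }
        }
    }
    // Unsatisfied stations split what is left in proportion to their weights.
    double weight_sum = 0.0;
    for (size_t i = 0; i < s.size(); ++i)
        if (!done[i]) weight_sum += s[i].weight;
    for (size_t i = 0; i < s.size(); ++i)
        if (!done[i]) rate[i] = remaining / weight_sum * s[i].weight;
    return rate;
}

int main() {
    std::vector<Station> stations = {{1.0, 100.0}, {2.0, 600.0}, {1.0, 600.0}};
    std::vector<double> r = weighted_fair_rates(stations, 1000.0);  // 1 Gb/s span
    for (size_t i = 0; i < r.size(); ++i)
        std::printf("station %zu: %.1f Mb/s\n", i, r[i]);
}
```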

    FDMA Point-to-Multi-Point Fibre Access System for Latency Sensitive Applications

    We present a demo of a multiple-uplink access system with real-time services. Several terminals transmit and are detected simultaneously through FDMA. The system can allow latency-sensitive and best-effort applications to share the network.