
    Lossless SIMD Compression of LiDAR Range and Attribute Scan Sequences

    As LiDAR sensors have become ubiquitous, the need for an efficient LiDAR data compression algorithm has increased. Modern LiDARs produce gigabytes of scan data per hour and are often used in applications with limited compute, bandwidth, and storage resources. We present a fast, lossless compression algorithm for LiDAR range and attribute scan sequences including multiple-return range, signal, reflectivity, and ambient infrared. Our algorithm -- dubbed "Jiffy" -- achieves substantial compression by exploiting spatiotemporal redundancy and sparsity. Speed is accomplished by maximizing use of single-instruction-multiple-data (SIMD) instructions. In autonomous driving, infrastructure monitoring, drone inspection, and handheld mapping benchmarks, the Jiffy algorithm consistently outcompresses competing lossless codecs while operating at speeds in excess of 65M points/sec on a single core. In a typical autonomous vehicle use case, single-threaded Jiffy achieves 6x compression of centimeter-precision range scans at 500+ scans per second. To ensure reproducibility and enable adoption, the software is freely available as an open-source library.
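
    The abstract attributes the compression to spatiotemporal redundancy and the speed to SIMD, but does not spell out the predictor. Below is a minimal sketch of the general idea, predicting each range sample from the same pixel in the previous scan and coding only the residual, with NumPy's vectorized operations standing in for SIMD; the function names are illustrative, not Jiffy's actual API.

```python
import numpy as np

def delta_encode(curr: np.ndarray, prev: np.ndarray) -> np.ndarray:
    """Temporal prediction: each range sample is predicted by the same
    pixel in the previous scan; only the residual is kept."""
    return curr.astype(np.int32) - prev.astype(np.int32)

def zigzag(res: np.ndarray) -> np.ndarray:
    """Map signed residuals to unsigned (0,-1,1,-2,2 -> 0,1,2,3,4) so
    small magnitudes get short codes in the entropy stage."""
    return ((res << 1) ^ (res >> 31)).astype(np.uint32)

def delta_decode(res: np.ndarray, prev: np.ndarray) -> np.ndarray:
    return (prev.astype(np.int32) + res).astype(np.uint16)

# Toy pair of consecutive range scans (millimetres): scans of a mostly
# static scene differ by little, so the residuals are tiny.
rng = np.random.default_rng(0)
prev = rng.integers(1_000, 50_000, size=(64, 1024), dtype=np.uint16)
curr = (prev.astype(np.int32)
        + rng.integers(-3, 4, size=prev.shape)).astype(np.uint16)

res = delta_encode(curr, prev)
assert np.array_equal(delta_decode(res, prev), curr)
print(zigzag(res).max())   # residual magnitudes stay small
```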

    Efficient Algorithms for Large-Scale Image Analysis

    This work develops highly efficient algorithms for analyzing large images. Applications include object-based change detection and screening. The algorithms are 10-100 times as fast as existing software, sometimes even outperforming FPGA/GPU hardware, because they are designed to suit the computer architecture. This thesis describes the implementation details and the underlying algorithm engineering methodology, so that both may also be applied to other applications.

    Lossless compression with latent variable models

    We develop a simple and elegant method for lossless compression using latent variable models, which we call 'bits back with asymmetric numeral systems' (BB-ANS). The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data. We demonstrate it firstly on the MNIST test set, showing that state-of-the-art lossless compression is possible using a small variational autoencoder (VAE) model. We then make use of a novel empirical insight, that fully convolutional generative models trained on small images are able to generalize to images of arbitrary size, and extend BB-ANS to hierarchical latent variable models, enabling state-of-the-art lossless compression of full-size colour images from the ImageNet dataset. We describe 'Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
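
    The accounting behind bits back is worth making concrete: the encoder first decodes a latent z from previously written bits using q(z|x), then encodes x with p(x|z) and z with p(z), so the expected net cost is the negative evidence lower bound, and it equals -log p(x) exactly when q is the true posterior. A toy calculation with a made-up two-latent model (not the paper's VAE):

```python
import numpy as np

# Toy discrete latent-variable model: latent z in {0,1}, data x in {0,1,2}.
p_z = np.array([0.5, 0.5])                    # prior p(z)
p_x_given_z = np.array([[0.7, 0.2, 0.1],      # likelihood p(x|z)
                        [0.1, 0.2, 0.7]])

def bits(p):
    return -np.log2(p)                        # ideal code length

def bb_ans_net_cost(x, q):
    """Expected net bits to send x with bits back: get bits(q(z|x))
    back by decoding z from the ANS stack, then pay
    bits(p(x|z)) + bits(p(z)) to encode x and z."""
    return sum(q[z] * (bits(p_x_given_z[z, x]) + bits(p_z[z]) - bits(q[z]))
               for z in range(len(q)))

x = 0
p_x = p_z @ p_x_given_z[:, x]                 # marginal p(x)
posterior = p_z * p_x_given_z[:, x] / p_x     # exact posterior p(z|x)

# With q equal to the true posterior, the net cost hits the optimum
# -log2 p(x); an approximate q (e.g. a VAE encoder) pays the KL gap extra.
print(bb_ans_net_cost(x, posterior), bits(p_x))
```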

    High-throughput variable-to-fixed entropy codec using selective, stochastic code forests

    Efficient high-throughput (HT) compression algorithms are paramount to meet the stringent constraints of present and upcoming data storage, processing, and transmission systems. In particular, latency, bandwidth and energy requirements are critical for those systems. Most HT codecs are designed to maximize compression speed, and secondarily to minimize compressed lengths. On the other hand, decompression speed is often equally or more critical than compression speed, especially in scenarios where decompression is performed multiple times and/or at critical parts of a system. In this work, an algorithm to design variable-to-fixed (VF) codes is proposed that prioritizes decompression speed. Stationary Markov analysis is employed to generate multiple, jointly optimized codes (denoted code forests). Their average compression efficiency is on par with the state of the art in VF codes, e.g., within 1% of Yamamoto et al.'s algorithm. The proposed code forest structure enables the implementation of highly efficient codecs, with decompression speeds 3.8 times faster than other state-of-the-art HT entropy codecs with equal or better compression ratios for natural data sources. Compared to these HT codecs, the proposed forests yield similar compression efficiency and compression speeds.
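
    The stochastic code forests themselves are not described in the abstract, but the variable-to-fixed principle they optimize is classical: parse the source into variable-length strings from a dictionary and emit a fixed-length index for each, so decoding is a pure table lookup, which is what makes VF decoding fast. A minimal Tunstall-style dictionary builder, given here only as the single-tree baseline:

```python
import heapq

def tunstall_dictionary(probs, num_codewords):
    """Grow a variable-to-fixed parse dictionary: repeatedly split the
    most probable leaf string into its one-symbol extensions until the
    fixed-length codeword budget is reached."""
    # max-heap of (-probability, source string), one entry per leaf
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) + len(probs) - 1 <= num_codewords:
        neg_p, word = heapq.heappop(heap)         # most probable leaf
        for s, p in probs.items():                # extend by one symbol
            heapq.heappush(heap, (neg_p * p, word + s))
    return sorted(word for _, word in heap)

probs = {'a': 0.7, 'b': 0.2, 'c': 0.1}            # memoryless source
dictionary = tunstall_dictionary(probs, num_codewords=16)
print(dictionary)
# Decoding a 4-bit codeword is just dictionary[index]: a single table
# lookup with no bit-level branching, hence the high decode throughput.
```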

    ECG compression for Holter monitoring

    Cardiologists can gain useful insight into a patient's condition when they are able to correlate the patient's symptoms with their activities. For this purpose, a Holter Monitor is often used: a portable electrocardiogram (ECG) recorder worn by the patient for a period of 24-72 hours. Preferably, the monitor is not cumbersome to the patient, and thus it should be designed to be as small and light as possible; however, the storage requirements for such a long signal are very large and can significantly increase the recorder's size and cost, so signal compression is often employed. At the same time, the decompressed signal must contain enough detail for the cardiologist to be able to identify irregularities. "Lossy" compressors may obscure such details, whereas a "lossless" compressor preserves the signal exactly as captured.

    The purpose of this thesis is to develop a platform upon which a Holter Monitor can be built, including a hardware-assisted lossless compression method that avoids the signal-quality penalties of a lossy algorithm. The objective is to develop and implement a low-complexity lossless ECG encoding algorithm capable of at least a 2:1 compression ratio in an embedded system for use in a Holter Monitor. Different lossless compression techniques were evaluated in terms of coding efficiency as well as suitability for ECG waveforms, random access within the signal, and complexity of the decoding operation. For the reduction of the physical circuit size, a System on a Programmable Chip (SOPC) design was utilized. A coder based on a library of linear predictors and Rice coding was chosen; it gave a compression ratio of at least 2:1, and as high as 3:1, on the real-world signals tested, while having low decoder complexity and fast random access to arbitrary parts of the signal. In the hardware-assisted implementation, encoding was four to five times faster than a software encoder running on the same CPU, while allowing the CPU to perform other tasks during the encoding process.
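
    The chosen design, a library of linear predictors followed by Rice coding of the residuals, is simple enough to sketch. The fragment below shows a single second-order predictor and an unpacked Rice coder; a real encoder would pick the best predictor per block and pack the bits, and all names here are illustrative rather than taken from the thesis:

```python
import numpy as np

def rice_encode(vals, k):
    """Rice code: quotient in unary, then k remainder bits.  Returns a
    bit string for clarity; a real encoder would pack bits."""
    out = []
    for v in vals:
        q, r = int(v) >> k, int(v) & ((1 << k) - 1)
        out.append('1' * q + '0' + format(r, f'0{k}b'))
    return ''.join(out)

def zigzag(e):
    """Interleave signs so residuals are nonnegative: 0,-1,1 -> 0,1,2."""
    return (e << 1) ^ (e >> 31)

def encode_ecg(samples, k=4):
    """One second-order linear predictor (a real coder would switch among
    a small library of predictors) followed by Rice-coded residuals."""
    x = samples.astype(np.int32)
    pred = np.zeros_like(x)
    pred[1] = x[0]                        # first-order fallback
    pred[2:] = 2 * x[1:-1] - x[:-2]       # linear extrapolation
    return rice_encode(zigzag(x - pred), k)

# Synthetic ECG-like signal: a slowly wandering 16-bit baseline.
ecg = np.cumsum(np.random.default_rng(1).integers(-5, 6, 500)) + 512
bits = encode_ecg(ecg)
print(len(bits) / (len(ecg) * 16))        # well under 0.5, i.e. > 2:1
```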

    A Parallel Implementation of the H.265 Video Coding Standard

    The objective of this study was to research the scalability of the parallel features in the new H.265 video compression standard, also known as High Efficiency Video Coding (HEVC). Compared to its predecessor, the H.264 standard, H.265 typically achieves around a 50% bitrate reduction for the same subjective video quality. Videos with higher resolutions (Full HD and beyond) achieve especially good compression ratios, and better utilization of parallel computing resources is provided. H.265 introduces two novel parallelization features: Tiles and Wavefront Parallel Processing (WPP). In Tiles, each video frame is divided into areas that can be decoded without referencing other areas in the same frame. In WPP, the relations between code blocks in a frame are encoded so that the decoding process can progress through the frame as a front using multiple threads. In this study, the reference implementation of the H.265 decoder was augmented to support both of these parallelization features. The performance of the parallel implementations was measured using three different setups. The measurements showed that adding CPU cores reduced the total decode time of the video frames up to a certain point. When using the Tiles feature, the encoding geometry, i.e. how each frame was divided into individually decodable areas, had a noticeable effect on decode times at certain thread counts. When using WPP, synchronization overhead sometimes had a negative effect on decode times at larger thread counts (4-12).
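
    The WPP dependency pattern explains the saturation observed in the measurements: a block can be decoded once its left neighbour and the block above and to the right are done, so each row trails the previous one by two blocks, and the usable parallelism is capped by the width of the wavefront. A small illustrative schedule computation (not the reference decoder's code):

```python
def wpp_schedule(rows, cols):
    """Earliest step at which each block can start under WPP: a block
    needs its left neighbour and the block above-right, so each row
    trails the row above by two blocks."""
    step = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(step[r][c - 1])                     # left
            if r > 0:
                deps.append(step[r - 1][min(c + 1, cols - 1)])  # above-right
            step[r][c] = 1 + max(deps, default=-1)
    return step

for row in wpp_schedule(4, 8):
    print(row)
# The frame finishes after cols + 2*(rows-1) steps no matter how many
# threads are available, so speedup saturates once the thread count
# exceeds the wavefront width; beyond that, synchronization only adds cost.
```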

    Challenges and solutions in H.265/HEVC for integrating consumer electronics in professional video systems


    GST: GPU-decodable supercompressed textures

    Modern GPUs supporting compressed textures allow interactive application developers to save scarce GPU resources such as VRAM and bandwidth. Compressed textures use fixed compression ratios whose lossy representations are significantly poorer quality than traditional image compression formats such as JPEG. We present a new method in the class of supercompressed textures that provides an additional layer of compression to already compressed textures. Our texture representation is designed for endpoint-compressed formats such as DXT and PVRTC, and for decoding on commodity GPUs. We apply our algorithm to commonly used formats by separating their representation into two parts that are processed independently and then entropy encoded. Our method preserves the CPU-GPU bandwidth during the decoding phase and exploits the parallelism of GPUs to provide up to 3x faster decoding than prior texture supercompression algorithms. Along with the gains in decoding speed, our method maintains both the compressed size and quality of current state-of-the-art texture representations.
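
    The abstract's key move, splitting each block's representation into two independently processed streams before entropy coding, can be illustrated on a DXT1-like layout. The sketch below uses zlib as a stand-in for the entropy stage; the paper's actual codec uses its own GPU-decodable entropy coder, so this only demonstrates why the separation helps:

```python
import zlib
import numpy as np

def split_dxt1_blocks(blocks: np.ndarray):
    """Each DXT1 block is 8 bytes: two 16-bit colour endpoints followed
    by a 32-bit field of 2-bit palette indices.  Splitting these into
    separate streams groups statistically similar data, which tends to
    compress better than the interleaved layout."""
    blocks = blocks.reshape(-1, 8)
    endpoints = blocks[:, :4].tobytes()   # endpoint stream
    indices = blocks[:, 4:].tobytes()     # index stream
    return endpoints, indices

# Toy texture: 1024 blocks with smoothly varying endpoints, noisy indices.
rng = np.random.default_rng(2)
endpoints = np.repeat(np.arange(0, 256, dtype=np.uint8), 16)
blocks = np.empty((1024, 8), dtype=np.uint8)
blocks[:, :4] = endpoints.reshape(1024, 4)
blocks[:, 4:] = rng.integers(0, 256, (1024, 4), dtype=np.uint8)

ep, ix = split_dxt1_blocks(blocks)
together = len(zlib.compress(blocks.tobytes()))
apart = len(zlib.compress(ep)) + len(zlib.compress(ix))
print(together, apart)   # the split streams typically compress smaller
```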

    Low Cost Video For Distance Education

    A distance education system has been designed for Nova Southeastern University (NSU). The design was based on emerging low-cost video technology. The report presented the design and summarized existing distance education efforts and technologies. The design supported multimedia electronic classrooms and enabled students to participate in multimedia classes using standard telephone networks. Results were presented in three areas: management, courseware, and systems. In the area of management, the report recommended that the University separately establish, fund, and staff the distance education project; supporting rationale was included. In the area of courseware, the importance of quality courseware was highlighted. It was found that the development of distance education courseware was difficult; nevertheless, quality courseware was the key to a successful distance education program. In the area of systems, component-level designs were presented for a student system, a university host, and a support system, and the networks connecting the systems were addressed. The student system was based on widely available multimedia systems. The host system supported up to sixteen participants in a single class. The support system was designed for the development of courseware and the support of future projects in distance education. The report included supporting proof-of-principle demonstrations, which showed that low-cost video systems had utility at speeds as low as 7.2 kbps and that high-quality student images were not crucial to the system. The report included three alternate implementation strategies. The initial capability could be operational in 1997, and a multi-session, 2000-user system was projected for early in the next century.