
    Supervised Learning Through the Lens of Compression


    Language Modeling Is Compression

    It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models. Since these large language models exhibit impressive predictive capabilities, they are well-positioned to be strong compressors. In this work, we advocate for viewing the prediction problem through the lens of compression and evaluate the compression capabilities of large (foundation) models. We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning. For example, Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively. Finally, we show that the prediction-compression equivalence allows us to use any compressor (like gzip) to build a conditional generative model.
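    The closing claim above, that the prediction-compression equivalence lets any compressor act as a conditional generative model, can be illustrated with a toy sketch. The snippet below is an assumption-laden illustration, not the paper's method: it uses Python's zlib (DEFLATE, the same family as gzip) and greedily picks the candidate continuation that adds the fewest extra compressed bytes to the context, whereas the paper works with exact code lengths and proper sampling; byte-level granularity makes the scores coarse.

```python
import zlib

def gzip_conditional_sample(context: str, candidates: list[str]) -> str:
    """Pick the continuation that adds the fewest extra compressed bytes
    to the context -- a toy use of a generic compressor as a conditional
    generative model (greedy argmin, not true sampling)."""
    base = len(zlib.compress(context.encode("utf-8")))
    extra_bytes = {
        cand: len(zlib.compress((context + cand).encode("utf-8"))) - base
        for cand in candidates
    }
    return min(extra_bytes, key=extra_bytes.get)

if __name__ == "__main__":
    context = "the quick brown fox jumps over the lazy dog. the quick brown "
    # The repeated prefix makes "fox" cheap to encode, so it is usually chosen.
    print(gzip_conditional_sample(context, ["fox", "cat", "dog", "zebra"]))
```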

    ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

    We present ARCHANGEL, a novel distributed ledger based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format-shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated. Comment: Accepted to CVPR Blockchain Workshop 2019.
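    The pipeline sketched in this abstract, hash the audio-visual content into a compact digest and anchor that digest on a ledger shared by the archives, can be pictured with a heavily simplified stand-in. In the sketch below, SHA-256 over pre-computed per-frame feature vectors replaces ARCHANGEL's learned, codec-invariant deep hashes, and a toy append-only chain replaces a real proof-of-authority blockchain; the function names and the example data are hypothetical.

```python
import hashlib
import json
import time

def temporal_content_hash(frame_features):
    """Fold per-frame feature vectors into one compact digest. A simplified
    stand-in for ARCHANGEL's learned temporal content hash (TCH); the
    codec-invariant per-frame features are assumed to be given."""
    h = hashlib.sha256()
    for features in frame_features:
        # Coarse quantisation so small numerical noise does not flip the hash.
        h.update(bytes(int(round(x * 16)) & 0xFF for x in features))
    return h.hexdigest()

def append_block(chain, video_id, tch, authority):
    """Append a TCH record to a toy append-only ledger: each block commits to
    the previous block's hash and names the signing authority (a placeholder
    for a real proof-of-authority blockchain)."""
    prev_hash = chain[-1]["block_hash"] if chain else "0" * 64
    block = {
        "video_id": video_id,
        "tch": tch,
        "authority": authority,
        "timestamp": time.time(),
        "prev_hash": prev_hash,
    }
    block["block_hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()
    chain.append(block)
    return block

if __name__ == "__main__":
    chain = []
    tch = temporal_content_hash([[0.1, 0.5, -0.3], [0.2, 0.4, -0.1]])
    append_block(chain, "video-001", tch, authority="example-archive")
```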

    UVSD: Software for Detection of Color Underwater Features

    Underwater Video Spot Detector (UVSD) is a software package designed to analyze underwater video for continuous spatial measurements (path traveled, distance to the bottom, roughness of the surface, etc.). Laser beams of known geometry are often used in underwater imagery to estimate the distance to the bottom. This estimation relies on manual detection of the laser spots, which is labor-intensive and time-consuming, so usually only a few frames can be processed this way. Manual detection allows spatial measurements on single frames (distance to the bottom, size of objects on the sea-bottom), but not over a whole video transect. We propose algorithms, and a software package implementing them, for the semi-automatic detection of laser spots throughout a video, which can significantly increase the effectiveness of spatial measurements. The spot-detection algorithm is based on the Support Vector Machine (SVM) approach to machine learning. The user only needs to mark, on a few frames, the points he or she considers to be laser dots (to train an SVM model); the program then uses this model to detect the laser dots in the rest of the video. As a result, a precise spatial scale (with precision limited only by the quality of the video) is established for every frame, which can be used to improve video mosaics of the sea-bottom. The temporal correlation between spot movements and changes in their shape provides information about sediment roughness: simultaneous spot movements indicate a changing distance to the bottom, while uncorrelated changes indicate small local bumps. UVSD can be applied to quickly identify and quantify seafloor habitat patches, help visualize habitats and benthic organisms within large-scale landscapes, and estimate transect length and area surveyed along video transects.
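    As a rough picture of how an SVM-based spot detector of this kind can work, the sketch below (not the UVSD code; scikit-learn and raw RGB patch features are assumptions) trains on colour patches around user-clicked laser dots plus random background patches, then scans new frames for pixels the model labels as spots. The real package additionally exploits temporal continuity across frames.

```python
import numpy as np
from sklearn.svm import SVC

def train_spot_model(frames, clicked_points, patch=2):
    """Train an SVM on small colour patches: positives are centred on
    user-clicked laser dots, negatives are random background patches.
    Assumes uint8 RGB frames and clicks at least `patch` pixels from the border."""
    rng = np.random.default_rng(0)
    X, y = [], []
    for frame, points in zip(frames, clicked_points):
        h, w, _ = frame.shape
        for (r, c) in points:                      # user-labelled laser dots
            X.append(frame[r - patch:r + patch + 1,
                           c - patch:c + patch + 1].ravel())
            y.append(1)
        for _ in range(10 * max(len(points), 1)):  # random background samples
            r = rng.integers(patch, h - patch)
            c = rng.integers(patch, w - patch)
            X.append(frame[r - patch:r + patch + 1,
                           c - patch:c + patch + 1].ravel())
            y.append(0)
    model = SVC(kernel="rbf", gamma="scale", class_weight="balanced")
    model.fit(np.asarray(X, dtype=np.float64) / 255.0, y)
    return model

def detect_spots(model, frame, patch=2, stride=4):
    """Slide over a frame and return pixel coordinates the SVM labels as laser spots."""
    h, w, _ = frame.shape
    hits = []
    for r in range(patch, h - patch, stride):
        for c in range(patch, w - patch, stride):
            x = frame[r - patch:r + patch + 1,
                      c - patch:c + patch + 1].ravel()[None, :] / 255.0
            if model.predict(x)[0] == 1:
                hits.append((r, c))
    return hits
```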