11,758 research outputs found

    Data Discovery and Anomaly Detection Using Atypicality: Theory

    A central question in the era of 'big data' is what to do with the enormous amount of information. One possibility is to characterize it through statistics, e.g., averages, or classify it using machine learning, in order to understand the general structure of the overall data. The perspective in this paper is the opposite, namely that in some applications most of the value in the information lies in the parts that deviate from the average, that are unusual, atypical. We define what we mean by 'atypical' in an axiomatic way as data that can be encoded with fewer bits in itself rather than using the code for the typical data. We show that this definition has good theoretical properties. We then develop an implementation based on universal source coding, and apply this to a number of real-world data sets. Comment: 40 pages
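The criterion in this abstract (data is atypical when it costs fewer bits to encode "in itself" than with the code for typical data) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the typical model is a fixed Bernoulli source, the "in itself" code length is approximated by the empirical entropy plus a `0.5 * log2(n)` parameter cost, and `overhead_bits` stands in for the cost of flagging a sequence as atypical. The function names are hypothetical.

```python
import math

def typical_bits(seq, p_one=0.5):
    """Bits needed with a fixed code matched to an assumed Bernoulli(p_one) typical model."""
    ones = sum(seq)
    zeros = len(seq) - ones
    return -(ones * math.log2(p_one) + zeros * math.log2(1 - p_one))

def self_bits(seq, overhead_bits=1.0):
    """Approximate bits to encode seq 'in itself' (assumed form, not the paper's exact code)."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must be non-empty")
    ones = sum(seq)
    bits = 0.0
    # Empirical-entropy term: n * H(ones / n), skipping symbols that never occur.
    for count in (ones, n - ones):
        if count:
            bits -= count * math.log2(count / n)
    # Assumed penalty: cost of describing the estimated parameter plus a flag overhead.
    return bits + 0.5 * math.log2(n) + overhead_bits

def is_atypical(seq, p_one=0.5):
    """True when the self-describing code is shorter than the typical code."""
    return self_bits(seq) < typical_bits(seq, p_one)

print(is_atypical([1] * 64))     # constant run: cheap to describe in itself
print(is_atypical([0, 1] * 32))  # balanced sequence: the typical code is already good
```

Under these assumptions a constant run of 64 ones costs about 4 bits in itself against 64 bits with the fair-coin code, so it is flagged, while a balanced sequence is not.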

    Statistical lossless compression of space imagery and general data in a reconfigurable architecture


    Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

    Real-time, high-quality video coding is attracting wide interest in the research and industrial communities for a range of applications. H.264/AVC, a recent standard for high-performance video coding, can be successfully exploited in several scenarios, including digital video broadcasting, high-definition TV and DVD-based systems, which must sustain bit rates of up to tens of Mbits/s. To that end, this paper proposes optimized architectures for the most critical H.264/AVC tasks: motion estimation and context-adaptive binary arithmetic coding (CABAC). Post-synthesis results on sub-micron CMOS standard-cell technologies show that the proposed architectures can process 720 × 480 video sequences in real time at 30 frames/s and sustain more than 50 Mbits/s. The achieved circuit complexity and power consumption budgets are suitable for integration in complex VLSI multimedia systems based either on an AHB bus-centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for MPSoC (Multi-Processor System on Chip).

    A Nonstochastic Information Theory for Communication and State Estimation

    In communications, unknown variables are usually modelled as random variables, and concepts such as independence, entropy and information are defined in terms of the underlying probability distributions. In contrast, control theory often treats uncertainties and disturbances as bounded unknowns having no statistical structure. The area of networked control combines both fields, raising the question of whether it is possible to construct meaningful analogues of stochastic concepts such as independence, Markovness, entropy and information without assuming a probability space. This paper introduces a framework for doing so, leading to the construction of a maximin information functional for nonstochastic variables. It is shown that the largest maximin information rate through a memoryless, error-prone channel in this framework coincides with the block-coding zero-error capacity of the channel. Maximin information is then used to derive tight conditions for uniformly estimating the state of a linear time-invariant system over such a channel, paralleling recent results of Matveev and Savkin.

    Arithmetic coding revisited

    Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multisymbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available.
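For readers unfamiliar with the technique this abstract refers to, the basic idea of arithmetic coding (narrowing an interval in proportion to each symbol's probability) can be sketched as follows. This is not the article's coder: it uses a static model and exact rational arithmetic via `fractions.Fraction` instead of the low-precision shift/add arithmetic the abstract describes, and the function names `build_intervals`, `encode`, and `decode` are hypothetical.

```python
from fractions import Fraction

def build_intervals(freqs):
    """Map each symbol to a sub-interval of [0, 1) proportional to its frequency."""
    total = sum(freqs.values())
    intervals, low = {}, Fraction(0)
    for sym in sorted(freqs):
        width = Fraction(freqs[sym], total)
        intervals[sym] = (low, low + width)
        low += width
    return intervals

def encode(message, intervals):
    """Narrow [low, low + width) once per symbol; any value inside identifies the message."""
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = intervals[sym]
        low += width * s_low           # shift into the symbol's slice (uses the old width)
        width *= s_high - s_low        # then shrink to the slice's size
    return low, width

def decode(value, length, intervals):
    """Recover `length` symbols by repeatedly locating value and rescaling it to [0, 1)."""
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)

intervals = build_intervals({"a": 2, "b": 1, "c": 1})
low, width = encode("abac", intervals)
print(low, width)                  # 19/64 1/64
print(decode(low, 4, intervals))   # abac
```

The final width equals the product of the symbol probabilities, so the message needs roughly `-log2(width)` bits, here about 6. A practical coder, such as the one the abstract describes, replaces the exact fractions with fixed-precision integers and incremental output.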