8 research outputs found

    Two-Dimensional Source Coding by Means of Subblock Enumeration

    Lossless compression via substring enumeration (CSE) attains compression ratios comparable to those of popular lossless compressors for one-dimensional (1D) sources. CSE encodes a source using a probabilistic model built from the circular string of the input. CSE is applicable to two-dimensional (2D) sources such as images by treating each line of pixels of the 2D source as a symbol of an extended alphabet. At the initial step of the CSE encoding process, the number of occurrences of every symbol of the extended alphabet must be output, so the time complexity increases exponentially as the source grows. To reduce the time complexity, we propose a new CSE which encodes a 2D source block by block instead of line by line. The proposed CSE uses the flat torus of the input 2D source as the probabilistic model instead of the circular string of the source. Moreover, we analyze the limit of the average codeword length of the proposed CSE for general sources. Comment: 5 pages, Submitted to ISIT201
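    To make the flat-torus model concrete, the following Python sketch (not code from the paper; the function name and the toy 4x4 source are illustrative assumptions) counts the occurrences of every k x l subblock of a 2D binary source with indices wrapping in both dimensions, which is the kind of statistic a subblock-enumeration coder emits:

        from collections import Counter

        def count_subblocks_on_torus(grid, k, l):
            """Count every k x l subblock of a 2D source read as a flat
            torus: row and column indices wrap around, so each of the
            rows * cols cells anchors exactly one subblock."""
            rows, cols = len(grid), len(grid[0])
            counts = Counter()
            for i in range(rows):
                for j in range(cols):
                    block = tuple(
                        tuple(grid[(i + di) % rows][(j + dj) % cols]
                              for dj in range(l))
                        for di in range(k))
                    counts[block] += 1
            return counts

        # Toy 4x4 binary source: the 2x2 subblock counts sum to 16 = 4 * 4.
        source = [[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]]
        for block, n in sorted(count_subblocks_on_torus(source, 2, 2).items()):
            print(block, n)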

    A Universal Two-Dimensional Source Coding by Means of Subblock Enumeration

    Lossless compression via substring enumeration (CSE) is a kind of enumerative code that encodes a one-dimensional (1D) source using a probabilistic model built from the circular string of the input. CSE is applicable to two-dimensional (2D) sources, such as images, by treating a line of pixels of the 2D source as a symbol of an extended alphabet. At the initial step of the CSE encoding process, the number of occurrences of every symbol of the extended alphabet must be output, so the time complexity increases exponentially as the source grows. To reduce the computational time, the pixels of a 2D source can be rearranged into a 1D string along a space-filling curve such as the Hilbert curve; however, information about adjacent cells in the 2D source may be lost in the conversion. To reduce the time complexity and compress a 2D source without converting it to a 1D string, we propose a new CSE which encodes a 2D source block by block instead of line by line. The proposed algorithm uses the flat torus of the input 2D source as the probabilistic model instead of the circular string of the source. Moreover, we prove the asymptotic optimality of the proposed algorithm for 2D general sources.
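    The space-filling-curve alternative mentioned above can be sketched with the textbook bit-manipulation Hilbert mapping (a generic illustration, not code from the paper; the function names and toy grid are assumptions). It linearizes a 2^n x 2^n source so that most cells adjacent in the 1D output were adjacent in 2D, although some 2D adjacencies are inevitably lost:

        def xy_to_hilbert(n, x, y):
            """Index of cell (x, y) along the Hilbert curve of an n x n
            grid (n a power of two); standard bit-twiddling formulation."""
            d = 0
            s = n // 2
            while s > 0:
                rx = 1 if x & s else 0
                ry = 1 if y & s else 0
                d += s * s * ((3 * rx) ^ ry)
                if ry == 0:                  # rotate/flip the quadrant
                    if rx == 1:
                        x, y = n - 1 - x, n - 1 - y
                    x, y = y, x
                s //= 2
            return d

        def linearize(grid):
            """Flatten a 2D source into a 1D sequence along the Hilbert
            curve, preserving most (but not all) 2D adjacencies."""
            n = len(grid)
            order = sorted((xy_to_hilbert(n, x, y), x, y)
                           for y in range(n) for x in range(n))
            return [grid[y][x] for _, x, y in order]

        grid = [[0, 1, 0, 1],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [1, 0, 1, 0]]
        print(linearize(grid))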

    On the Structure of Bispecial Sturmian Words

    A balanced word is one in which any two factors of the same length contain the same number of each letter of the alphabet, up to one. Finite binary balanced words are called Sturmian words. A Sturmian word is bispecial if it can be extended to the left and to the right with either letter while remaining a Sturmian word. There is a deep relation between bispecial Sturmian words and Christoffel words, which are the digital approximations of Euclidean segments in the plane. In 1997, J. Berstel and A. de Luca proved that \emph{palindromic} bispecial Sturmian words are precisely the maximal internal factors of \emph{primitive} Christoffel words. We extend this result by showing that bispecial Sturmian words are precisely the maximal internal factors of \emph{all} Christoffel words. Our characterization allows us to give an enumerative formula for bispecial Sturmian words. We also investigate the minimal forbidden words for the language of Sturmian words. Comment: arXiv admin note: substantial text overlap with arXiv:1204.167
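    As a small illustration of the objects involved (a sketch under our own naming, not the paper's code): the standard arithmetic construction of the lower Christoffel word of slope p/q, plus a brute-force balance check. Stripping the first and last letter yields the maximal internal factor, which by the paper's result is a bispecial Sturmian word:

        def christoffel(p, q):
            """Lower Christoffel word of slope p/q over {a, b}: the
            digital approximation from below of the segment from
            (0, 0) to (q, p), for coprime p and q."""
            n = p + q
            w = []
            for i in range(1, n + 1):
                # step right ('a') while i*p mod n increases; step up
                # ('b') exactly when it wraps around
                w.append('a' if (i * p) % n > ((i - 1) * p) % n else 'b')
            return ''.join(w)

        def is_balanced(w):
            """Any two factors of the same length may differ by at most
            one in their number of 'a's."""
            for m in range(1, len(w)):
                counts = {w[i:i + m].count('a')
                          for i in range(len(w) - m + 1)}
                if max(counts) - min(counts) > 1:
                    return False
            return True

        # The maximal internal factor w[1:-1] is a bispecial Sturmian
        # word (in particular, balanced), per the paper's result.
        for p, q in [(1, 2), (2, 3), (3, 5)]:
            w = christoffel(p, q)
            print(w, '->', w[1:-1], is_balanced(w[1:-1]))

    The examples here use coprime (p, q), i.e. primitive Christoffel words; the paper's characterization also covers non-primitive ones.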

    A Succinct Implementation of Data Compression via Substring Enumeration

    In lossless data compression, the given data is encoded into a compressed representation that takes as little space as possible relative to the original, and an identical copy of the original data must be recoverable from it. This thesis studies a lossless compression method that examines the data to be compressed, a string or text, as a whole rather than a small part at a time. The method transmits to the decompressor the occurrence counts of substrings of the text. The substrings are processed in an order known in advance, from shortest to longest, so both parties can associate each occurrence count with the correct substring. Some occurrence counts may be zero, indicating that the substring does not occur in the text. Compression is achieved by observing that the substrings transmitted earlier constrain what the longer strings can look like; some occurrence counts can then be omitted entirely, or transmitted using less space. The substrings whose occurrence counts must be transmitted are characterized via the notion of maximality. Finding the maximal substrings and counting substring occurrences in the plain text is slow, so the text must be stored in a data structure that supports the required operations and makes them fast. Such data structures take more space than the plain text. Because the compression method under study processes the whole text at once, memory efficiency is essential. The thesis implements the compression method using a succinct data structure called the bidirectional BWT index. Succinct data structures take only slightly more space than the data stored in them, yet they still support efficient operations on that data. Experiments on the implementation show that memory usage remains moderate, making it feasible to compress larger inputs as well.
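    To make the shortest-to-longest idea concrete, the following Python sketch (a toy with our own function name, not the thesis's bidirectional BWT index implementation) counts circular substring occurrences up to a given length; the final assertion shows the constraint that lets CSE omit or cheaply encode some counts: on a circular string, occ(w) = occ(wa) + occ(wb) for every substring w shorter than the text:

        from collections import Counter

        def circular_substring_counts(text, max_len):
            """Occurrence counts of all substrings up to max_len in the
            circular (wrap-around) reading of `text`."""
            n = len(text)
            doubled = text + text      # circular factors read linearly
            counts = Counter()
            for length in range(1, max_len + 1):
                for i in range(n):
                    counts[doubled[i:i + length]] += 1
            return counts

        text = "abaab"
        counts = circular_substring_counts(text, 3)
        # Counts are reported shortest first, the order in which the
        # encoder and decoder process them.
        for w in sorted(counts, key=lambda s: (len(s), s)):
            print(w, counts[w])
        # Counts at length k constrain counts at length k + 1:
        assert counts['a'] == counts['aa'] + counts['ab']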

    Querying and Efficiently Searching Large, Temporal Text Corpora


    Resource efficient on-node spike sorting

    Current implantable brain-machine interfaces record multi-neuron activity using multi-channel, multi-electrode micro-electrodes. The rapid increase in recording capability has brought more stringent constraints on implantable system power consumption and size, all the more so with the growing demand for wireless systems that monitor more channels while overcoming the communication bottleneck (in transmitting raw data) of transcutaneous bio-telemetries. For systems observing unit activity, real-time spike sorting within an implantable device offers a unique solution to this problem. However, achieving such data compression prior to transmission via an on-node spike sorting system poses several challenges. The inherent complexity of the spike sorting problem, arising from factors such as signal variability, local field potentials, and background and multi-unit activity, has required computationally intensive algorithms (e.g. PCA, wavelet transform, superparamagnetic clustering). Hence spike sorting systems have traditionally been implemented off-line, usually running on workstations. Owing to their complexity and poor scalability, these algorithms cannot simply be transformed into resource-efficient hardware. Conversely, although there have been several attempts at implantable hardware, an implementation matching off-line accuracy within the power and area budgets of future BMIs has yet to be proposed. Within this context, this research aims to fill the gaps in the design of a resource-efficient implantable real-time spike sorter that achieves performance comparable to off-line methods. The research covered in this thesis targets:

    1) Identifying and quantifying the trade-offs between subsequent signal-processing performance and hardware resource utilisation for the parameters of the analogue front-end. Following the development of a behavioural model of the analogue front-end and an optimisation tool, the sensitivity of the spike sorting accuracy to different front-end parameters is quantified.

    2) Identifying and quantifying the trade-offs associated with a two-stage hybrid solution to real-time on-node spike sorting. The first part of the work approaches this from the perspective of template matching only, while the second part considers these parameters at the level of the whole system, including detection, sorting, and off-line training (template building). A set of minimum requirements is established to ensure robust, accurate, and resource-efficient operation.

    3) Developing new feature extraction and spike sorting algorithms for highly scalable systems. Based on the waveform dynamics of the observed action potentials, a derivative-based feature extraction and a spike sorting algorithm are proposed. These are compared with the most commonly used spike sorting methods under varying noise levels on realistic datasets to confirm their merits. The latter is implemented and demonstrated in real-time on an MCU-based platform (a generic sketch of the derivative-based idea follows below).
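    The derivative-based feature idea in point 3 can be sketched generically. This is an illustrative stand-in, not the thesis's algorithm: the feature choice, the tiny k-means (standing in for off-line template building), and the synthetic waveforms are all assumptions:

        import numpy as np

        def derivative_features(spikes):
            """Reduce each spike waveform to two cheap derivative-based
            features: the extrema of its first difference. Low compute
            cost is what makes such features attractive on-node."""
            d = np.diff(spikes, axis=1)
            return np.stack([d.max(axis=1), d.min(axis=1)], axis=1)

        def kmeans(points, k, iters=50, seed=0):
            """Minimal k-means, standing in for off-line training."""
            rng = np.random.default_rng(seed)
            centers = points[rng.choice(len(points), k, replace=False)]
            for _ in range(iters):
                dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
                labels = np.argmin(dists, axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centers[j] = points[labels == j].mean(axis=0)
            return labels, centers

        # Two synthetic units (narrow vs. broad waveform) plus noise.
        t = np.linspace(-1, 1, 32)
        unit_a = -np.exp(-(t * 4) ** 2)
        unit_b = -0.6 * np.exp(-(t * 2) ** 2)
        rng = np.random.default_rng(1)
        spikes = np.vstack([unit_a + 0.05 * rng.standard_normal((100, 32)),
                            unit_b + 0.05 * rng.standard_normal((100, 32))])
        labels, _ = kmeans(derivative_features(spikes), k=2)
        print(np.bincount(labels[:100], minlength=2),
              np.bincount(labels[100:], minlength=2))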