6 research outputs found

    Minimal Forbidden Factors of Circular Words

    Full text link
    Minimal forbidden factors are a useful tool for investigating properties of words and languages. Two factorial languages are distinct if and only if they have different (antifactorial) sets of minimal forbidden factors. There exist algorithms for computing the minimal forbidden factors of a word, as well as of a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an algorithm that, given the trie recognizing a finite antifactorial language MM, computes a DFA recognizing the language whose set of minimal forbidden factors is MM. In the same paper, they showed that the obtained DFA is minimal if the input trie recognizes the minimal forbidden factors of a single word. We generalize this result to the case of a circular word. We discuss several combinatorial properties of the minimal forbidden factors of a circular word. As a byproduct, we obtain a formal definition of the factor automaton of a circular word. Finally, we investigate the case of minimal forbidden factors of the circular Fibonacci words.Comment: To appear in Theoretical Computer Scienc

    Constructing Antidictionaries of Long Texts in Output-Sensitive Space

    Get PDF
    A word x that is absent from a word y is called minimal if all its proper factors occur in y. Given a collection of k words y1, … , yk over an alphabet Σ, we are asked to compute the set M{y1,…,yk}ℓ of minimal absent words of length at most ℓ of the collection {y1, … , yk}. The set M{y1,…,yk}ℓ contains all the words x such that x is absent from all the words of the collection while there exist i,j, such that the maximal proper suffix of x is a factor of yi and the maximal proper prefix of x is a factor of yj. In data compression, this corresponds to computing the antidictionary of k documents. In bioinformatics, it corresponds to computing words that are absent from a genome of k chromosomes. Indeed, the set Myℓ of minimal absent words of a word y is equal to M{y1,…,yk}ℓ for any decomposition of y into a collection of words y1, … , yk such that there is an overlap of length at least ℓ − 1 between any two consecutive words in the collection. This computation generally requires Ω(n) space for n = |y| using any of the plenty available O(n) -time algorithms. This is because an Ω(n)-sized text index is constructed over y which can be impractical for large n. We do the identical computation incrementally using output-sensitive space. This goal is reasonable when ∥M{y1,…,yN}ℓ∥=o(n), for all N ∈ [1,k], where ∥S∥ denotes the sum of the lengths of words in set S. For instance, in the human genome, n ≈ 3 × 109 but ∥M{y1,…,yk}12∥≈106. We consider a constant-sized alphabet for stating our results. We show that allMy1ℓ,…,M{y1,…,yk}ℓ can be computed in O(kn+∑N=1k∥M{y1,…,yN}ℓ∥) total time using O(MaxIn+MaxOut) space, where MaxIn is the length of the longest word in {y1, … , yk} and MaxOut=max{∥M{y1,…,yN}ℓ∥:N∈[1,k]}. Proof-of-concept experimental results are also provided confirming our theoretical findings and justifying our contribution

    Internal Shortest Absent Word Queries in Constant Time and Linear Space

    Get PDF
    International audienceGiven a string T of length n over an alphabet Σ ⊂ {1, 2,. .. , n O(1) } of size σ, we are to preprocess T so that given a range [i, j], we can return a representation of a shortest string over Σ that is absent in the fragment T [i] • • • T [j] of T. We present an O(n)-space data structure that answers such queries in constant time and can be constructed in O(n log σ n) time

    Internal shortest absent word queries

    Get PDF
    Given a string T of length n over an alphabet Σ ⊂ {1, 2, . . . , nO(1)} of size σ, we are to preprocess T so that given a range [i, j], we can return a representation of a shortest string over Σ that is absent in the fragment T[i] · · · T[j] of T. For any positive integer k ∈ [1, log logσ n], we present an O((n/k) · log logσ n)-size data structure, which can be constructed in O(n logσ n) time, and answers queries in time O(log logσ k)

    Resource efficient on-node spike sorting

    Get PDF
    Current implantable brain-machine interfaces are recording multi-neuron activity by utilising multi-channel, multi-electrode micro-electrodes. With the rapid increase in recording capability has come more stringent constraints on implantable system power consumption and size. This is even more so with the increasing demand for wireless systems to increase the number of channels being monitored whilst overcoming the communication bottleneck (in transmitting raw data) via transcutaneous bio-telemetries. For systems observing unit activity, real-time spike sorting within an implantable device offers a unique solution to this problem. However, achieving such data compression prior to transmission via an on-node spike sorting system has several challenges. The inherent complexity of the spike sorting problem arising from various factors (such as signal variability, local field potentials, background and multi-unit activity) have required computationally intensive algorithms (e.g. PCA, wavelet transform, superparamagnetic clustering). Hence spike sorting systems have traditionally been implemented off-line, usually run on work-stations. Owing to their complexity and not-so-well scalability, these algorithms cannot be simply transformed into a resource efficient hardware. On the contrary, although there have been several attempts in implantable hardware, an implementation to match comparable accuracy to off-line within the required power and area requirements for future BMIs have yet to be proposed. Within this context, this research aims to fill in the gaps in the design towards a resource efficient implantable real-time spike sorter which achieves performance comparable to off-line methods. The research covered in this thesis target: 1) Identifying and quantifying the trade-offs on subsequent signal processing performance and hardware resource utilisation of the parameters associated with analogue-front-end. Following the development of a behavioural model of the analogue-front-end and an optimisation tool, the sensitivity of the spike sorting accuracy to different front-end parameters are quantified. 2) Identifying and quantifying the trade-offs associated with a two-stage hybrid solution to realising real-time on-node spike sorting. Initial part of the work focuses from the perspective of template matching only, while the second part of the work considers these parameters from the point of whole system including detection, sorting, and off-line training (template building). A set of minimum requirements are established which ensure robust, accurate and resource efficient operation. 3) Developing new feature extraction and spike sorting algorithms towards highly scalable systems. Based on waveform dynamics of the observed action potentials, a derivative based feature extraction and a spike sorting algorithm are proposed. These are compared with most commonly used methods of spike sorting under varying noise levels using realistic datasets to confirm their merits. The latter is implemented and demonstrated in real-time through an MCU based platform.Open Acces
    corecore