225 research outputs found
Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing
This paper presents a general technique for optimally transforming any
dynamic data structure that operates on atomic and indivisible keys by
constant-time comparisons, into a data structure that handles unbounded-length
keys whose comparison cost is not a constant. Examples of these keys are
strings, multi-dimensional points, multiple-precision numbers, multi-key data
(e.g.~records), XML paths, URL addresses, etc. The technique is more general
than what has been done in previous work as no particular exploitation of the
underlying structure of is required. The only requirement is that the insertion
of a key must identify its predecessor or its successor.
Using the proposed technique, online suffix tree can be constructed in worst
case time per input symbol (as opposed to amortized
time per symbol, achieved by previously known algorithms). To our knowledge,
our algorithm is the first that achieves worst case time per input
symbol. Searching for a pattern of length in the resulting suffix tree
takes time, where is the
number of occurrences of the pattern. The paper also describes more
applications and show how to obtain alternative methods for dealing with suffix
sorting, dynamic lowest common ancestors and order maintenance
Gbit/second lossless data compression hardware
This thesis investigates how to improve the performance of lossless data compression hardware
as a tool to reduce the cost per bit stored in a computer system or transmitted over a
communication network.
Lossless data compression allows the exact reconstruction of the original data after
decompression. Its deployment in some high-bandwidth applications has been hampered due to
performance limitations in the compressing hardware that needs to match the performance of the
original system to avoid becoming a bottleneck. Advancing the area of lossless data compression
hardware, hence, offers a valid motivation with the potential of doubling the performance of the
system that incorporates it with minimum investment.
This work starts by presenting an analysis of current compression methods with the objective of
identifying the factors that limit performance and also the factors that increase it. [Continues.
Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools
This dissertation focuses on two fundamental sorting problems: string sorting
and suffix sorting. The first part considers parallel string sorting on
shared-memory multi-core machines, the second part external memory suffix
sorting using the induced sorting principle, and the third part distributed
external memory suffix sorting with a new distributed algorithmic big data
framework named Thrill.Comment: 396 pages, dissertation, Karlsruher Instituts f\"ur Technologie
(2018). arXiv admin note: text overlap with arXiv:1101.3448 by other author
Communication in membrana Systems with symbol Objects.
Esta tesis está dedicada a los sistemas de membranas con objetos-símbolo como marco teórico de los sistemas paralelos y distribuidos de procesamiento de multiconjuntos.Una computación de parada puede aceptar, generar o procesar un número, un vector o una palabra; por tanto el sistema define globalmente (a través de los resultados de todas sus computaciones) un conjunto de números, de vectores, de palabras (es decir, un lenguaje), o bien una función. En esta tesis estudiamos la capacidad de estos sistemas para resolver problemas particulares, así como su potencia computacional. Por ejemplo, las familias de lenguajes definidas por diversas clases de estos sistemas se comparan con las familias clásicas, esto es, lenguajes regulares, independientes del contexto, generados por sistemas 0L tabulados extendidos, generados por gramáticas matriciales sin chequeo de apariciones, recursivamente enumerables, etc. Se prestará especial atención a la comunicación de objetos entre regiones y a las distintas formas de cooperación entre ellos.Se pretende (Sección 3.4) realizar una formalización los sistemas de membranas y construir una herramienta tipo software para la variante que usa cooperación no distribuida, el navegador de configuraciones, es decir, un simulador, en el cual el usuario selecciona la siguiente configuración entre todas las posibles, estando permitido volver hacia atrás. Se considerarán diversos modelos distribuidos. En el modelo de evolución y comunicación (Capítulo 4) separamos las reglas tipo-reescritura y las reglas de transporte (llamadas symport y antiport). Los sistemas de bombeo de protones (proton pumping, Secciones 4.8, 4.9) constituyen una variante de los sistemas de evolución y comunicación con un modo restrictivo de cooperación. Un modelo especial de computación con membranas es el modelo puramente comunicativo, en el cual los objetos traspasan juntos una membrana. Estudiamos la potencia computacional de las sistemas de membranas con symport/antiport de 2 o 3 objetos (Capítulo 5) y la potencia computacional de las sistemas de membranas con alfabeto limitado (Capítulo 6).El determinismo (Secciones 4.7, 5.5, etc.) es una característica especial (restrictiva) de los sistemas computacionales. Se pondrá especial énfasis en analizar si esta restricción reduce o no la potencia computacional de los mismos. Los resultados obtenidos para sistemas de bombeo del protones están transferidos (Sección 7.3) a sistemas con catalizadores bistabiles. Unos ejemplos de aplicación concreta de los sistemas de membranas (Secciones 7.1, 7.2) son la resolución de problemas NP-completos en tiempo polinomial y la resolución de problemas de ordenación.This thesis deals with membrane systems with symbol objects as a theoretical framework of distributed parallel multiset processing systems.A halting computation can accept, generate or process a number, a vector or a word, so the system globally defines (by the results of all its computations) a set of numbers or a set of vectors or a set of words, (i.e., a language), or a function. The ability of these systems to solve particular problems is investigated, as well as their computational power, e.g., the language families defined by different classes of these systems are compared to the classical ones, i.e., regular, context-free, languages generated by extended tabled 0L systems, languages generated by matrix grammars without appearance checking, recursively enumerable languages, etc. Special attention is paid to communication of objects between the regions and to the ways of cooperation between the objects.An attempt to formalize the membrane systems is made (Section 3.4), and a software tool is constructed for the non-distributed cooperative variant, the configuration browser, i.e., a simulator, where the user chooses the next configuration among the possible ones and can go back. Different distributed models are considered. In the evolution-communication model (Chapter 4) rewriting-like rules are separated from transport rules. Proton pumping systems (Sections 4.8, 4.9) are a variant of the evolution-communication systems with a restricted way of cooperation. A special membrane computing model is a purely communicative one: the objects are moved together through a membrane. We study the computational power of membrane systems with symport/antiport of 2 or 3 objects (Chapter 5) and the computational power of membrane systems with a limited alphabet (Chapter 6).Determinism (Sections 4.7, 5.5, etc.) is a special property of computational systems; the question of whether this restriction reduces the computational power is addressed. The results on proton pumping systems can be carried over (Section 7.3) to the systems with bi-stable catalysts. Some particular examples of membrane systems applications are solving NP-complete problems in polynomial time, and solving the sorting problem
A novel approach for the hardware implementation of a PPMC statistical data compressor
This thesis aims to understand how to design high-performance compression
algorithms suitable for hardware implementation and to provide hardware support for
an efficient compression algorithm.
Lossless data compression techniques have been developed to exploit the available
bandwidth of applications in data communications and computer systems by reducing
the amount of data they transmit or store. As the amount of data to handle is ever
increasing, traditional methods for compressing data become· insufficient. To
overcome this problem, more powerful methods have been developed. Among those
are the so-called statistical data compression methods that compress data based on
their statistics. However, their high complexity and space requirements have prevented
their hardware implementation and the full exploitation of their potential benefits.
This thesis looks into the feasibility of the hardware implementation of one of these
statistical data compression methods by exploring the potential for reorganising and
restructuring the method for hardware implementation and investigating ways of
achieving efficient and effective designs to achieve an efficient and cost-effective
algorithm. [Continues.
Two-Way Visibly Pushdown Automata and Transducers
Automata-logic connections are pillars of the theory of regular languages.
Such connections are harder to obtain for transducers, but important results
have been obtained recently for word-to-word transformations, showing that the
three following models are equivalent: deterministic two-way transducers,
monadic second-order (MSO) transducers, and deterministic one-way automata
equipped with a finite number of registers. Nested words are words with a
nesting structure, allowing to model unranked trees as their depth-first-search
linearisations. In this paper, we consider transformations from nested words to
words, allowing in particular to produce unranked trees if output words have a
nesting structure. The model of visibly pushdown transducers allows to describe
such transformations, and we propose a simple deterministic extension of this
model with two-way moves that has the following properties: i) it is a simple
computational model, that naturally has a good evaluation complexity; ii) it is
expressive: it subsumes nested word-to-word MSO transducers, and the exact
expressiveness of MSO transducers is recovered using a simple syntactic
restriction; iii) it has good algorithmic/closure properties: the model is
closed under composition with a unambiguous one-way letter-to-letter transducer
which gives closure under regular look-around, and has a decidable equivalence
problem
- …