
    Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets

    We consider the indexable dictionary problem, which consists of storing a set $S \subseteq \{0,\ldots,m-1\}$ for some integer $m$, while supporting the operations of Rank($x$), which returns the number of elements in $S$ that are less than $x$ if $x \in S$, and $-1$ otherwise; and Select($i$), which returns the $i$-th smallest element in $S$. We give a data structure that supports both operations in $O(1)$ time on the RAM model and requires $\mathcal{B}(n,m) + o(n) + O(\lg \lg m)$ bits to store a set of size $n$, where $\mathcal{B}(n,m) = \lceil \lg \binom{m}{n} \rceil$ is the minimum number of bits required to store any $n$-element subset from a universe of size $m$. Previous dictionaries taking this space only supported (yes/no) membership queries in $O(1)$ time. In the cell probe model we can remove the $O(\lg \lg m)$ additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: an information-theoretically optimal representation of a $k$-ary cardinal tree that supports standard operations in constant time; a representation of a multiset of size $n$ from $\{0,\ldots,m-1\}$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports (appropriate generalizations of) Rank and Select operations in constant time; and a representation of a sequence of $n$ non-negative integers summing up to $m$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports prefix sum queries in constant time. Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report 2002/1
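The Rank and Select semantics and the space bound in this abstract can be pinned down with a small reference sketch. It is not the paper's data structure: it stores a plain sorted list, so it is neither succinct nor constant-time. The class and function names are hypothetical, Select is assumed to be 1-based (the abstract does not say), and the prefix-sum encoding shown is one standard reduction to a set over a universe of size m + n, which may differ from the paper's construction.

```python
from bisect import bisect_left
from math import comb


class NaiveIndexableDictionary:
    """Reference semantics only: a sorted list, not a succinct structure."""

    def __init__(self, elements, m):
        self.m = m
        self.items = sorted(set(elements))
        if self.items and not (0 <= self.items[0] and self.items[-1] < m):
            raise ValueError("elements must lie in {0, ..., m-1}")

    def rank(self, x):
        """Number of elements smaller than x if x is in S, otherwise -1."""
        pos = bisect_left(self.items, x)
        if pos < len(self.items) and self.items[pos] == x:
            return pos
        return -1

    def select(self, i):
        """The i-th smallest element of S (assumption: i is 1-based)."""
        if not 1 <= i <= len(self.items):
            raise IndexError("i must be between 1 and |S|")
        return self.items[i - 1]


def information_theoretic_bits(n, m):
    """B(n, m) = ceil(lg C(m, n)), computed exactly with integers."""
    # For an integer c >= 1, ceil(lg c) equals (c - 1).bit_length().
    return (comb(m, n) - 1).bit_length()


def encode_prefix_sums(seq):
    """Map n non-negative integers summing to m to an n-subset of {0..m+n-1}.

    The i-th prefix sum is shifted by i - 1 so the positions are strictly
    increasing (assumed encoding, chosen for illustration).
    """
    total = 0
    positions = []
    for idx, value in enumerate(seq):  # idx is 0-based
        total += value
        positions.append(total + idx)
    return NaiveIndexableDictionary(positions, total + len(seq))


def prefix_sum(dictionary, i):
    """Sum of the first i values, recovered from a single select query."""
    return dictionary.select(i) - (i - 1)
```

With this encoding a prefix-sum query costs exactly one Select call, which is why a constant-time Select over a universe of size m + n would give constant-time prefix sums.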

    Query Flattening and the Nested Data Parallelism Paradigm

    This work is based on the observation that languages for two seemingly distant domains are closely related. Orthogonal query languages based on comprehension syntax admit various forms of query nesting to construct nested query results and express complex predicates. Languages for nested data parallelism allow parallel iterators to be nested and thereby admit the parallel evaluation of computations that are themselves parallel. Both kinds of languages center around the application of side-effect-free functions to each element of a collection. The motivation for this work is the seamless integration of relational database queries with programming languages. In frameworks for language-integrated database queries, a host language's native collection-programming API is used to express queries. To mediate between native collection programming and relational queries, we define an expressive, orthogonal query calculus that supports nesting and order. The challenge of query flattening is to translate this calculus to bundles of efficient relational queries restricted to flat, unordered multisets. Prior approaches to query flattening either support only query languages that lack expressiveness or employ a complex, monolithic translation that is hard to comprehend and generates inefficient code that is hard to optimize. To improve on those approaches, we draw on the similarity to nested data parallelism. Blelloch's flattening transformation is a static program transformation that translates nested data parallelism to flat data-parallel programs over flat arrays. Based on the flattening transformation, we describe a pipeline of small, comprehensible lowering steps that translates our nested query calculus to a bundle of relational queries. The pipeline is based on a number of well-defined intermediate languages. Our translation adopts the key concepts of the flattening transformation but is designed with the specifics of relational query processing in mind.
Based on this translation, we revisit all aspects of query flattening. Our translation is fully compositional and can translate any term of the input language. Like prior work, the translation by itself produces code that, because of its compositionality, is inefficient and not fit for execution without optimization. In contrast to prior work, we show that query optimization is orthogonal to flattening and can be performed before flattening. We employ well-known work on logical query optimization for nested query languages and demonstrate that this body of work integrates well with our approach. Furthermore, we describe an improved encoding of ordered and nested collections in terms of flat, unordered multisets. Our approach emits idiomatic relational queries in which the effort required to maintain the non-relational semantics of the source language (order and nesting) is minimized. A set of experiments provides evidence that our approach to query flattening can handle complex, list-based queries with nested results and nested intermediate data well. We apply our approach to a number of flat and nested benchmark queries and compare their runtime with hand-written SQL queries. In these experiments, our SQL code generated from a list-based nested query language usually performs as well as hand-written queries.
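The idea of encoding ordered, nested collections as flat, unordered multisets can be illustrated with a small sketch. It is not the encoding described in the thesis: the surrogate keys, the position columns and the names `shred` and `stitch` are assumptions chosen for illustration. A list of lists is split into two flat tables whose row order carries no meaning, and the nested value is rebuilt from the key and position columns alone.

```python
import random


def shred(nested):
    """Split a list of lists into two flat tables of tuples.

    outer rows: (key, outer_pos)
    inner rows: (key, inner_pos, value)
    """
    outer = []
    inner = []
    for outer_pos, sublist in enumerate(nested):
        key = outer_pos  # surrogate key; any unique value would do
        outer.append((key, outer_pos))
        for inner_pos, value in enumerate(sublist):
            inner.append((key, inner_pos, value))
    return outer, inner


def stitch(outer, inner):
    """Rebuild the nested list; the row order of both tables is ignored."""
    groups = {key: [] for key, _ in outer}
    for key, inner_pos, value in inner:
        groups[key].append((inner_pos, value))
    result = []
    for key, _ in sorted(outer, key=lambda row: row[1]):
        ordered = sorted(groups[key], key=lambda pair: pair[0])
        result.append([value for _, value in ordered])
    return result


if __name__ == "__main__":
    outer, inner = shred([[10, 20], [], [30]])
    random.shuffle(outer)  # simulate an unordered multiset
    random.shuffle(inner)
    print(stitch(outer, inner))
```

Keeping the outer table separate is what preserves empty inner lists, which would vanish if only the inner rows were stored.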

    Communication in Membrane Systems with Symbol Objects

    This thesis deals with membrane systems with symbol objects as a theoretical framework for distributed parallel multiset processing systems. A halting computation can accept, generate or process a number, a vector or a word, so the system globally defines (through the results of all its computations) a set of numbers, a set of vectors, a set of words (i.e., a language), or a function. The ability of these systems to solve particular problems is investigated, as well as their computational power; e.g., the language families defined by different classes of these systems are compared to the classical ones, i.e., regular languages, context-free languages, languages generated by extended tabled 0L systems, languages generated by matrix grammars without appearance checking, recursively enumerable languages, etc. Special attention is paid to the communication of objects between the regions and to the ways of cooperation between the objects. An attempt to formalize membrane systems is made (Section 3.4), and a software tool is constructed for the non-distributed cooperative variant: the configuration browser, i.e., a simulator in which the user chooses the next configuration among the possible ones and can go back.
Different distributed models are considered. In the evolution-communication model (Chapter 4), rewriting-like rules are separated from transport rules (called symport and antiport). Proton pumping systems (Sections 4.8, 4.9) are a variant of the evolution-communication systems with a restricted way of cooperation. A special membrane computing model is the purely communicative one, in which objects are moved together through a membrane. We study the computational power of membrane systems with symport/antiport of 2 or 3 objects (Chapter 5) and the computational power of membrane systems with a limited alphabet (Chapter 6). Determinism (Sections 4.7, 5.5, etc.) is a special, restrictive property of computational systems; the question of whether this restriction reduces the computational power is addressed. The results on proton pumping systems can be carried over (Section 7.3) to systems with bi-stable catalysts. Some particular examples of applications of membrane systems (Sections 7.1, 7.2) are solving NP-complete problems in polynomial time and solving the sorting problem.

    Connector algebras for C/E and P/T nets interactions

    A flourishing research thread in the recent literature on component-based systems is concerned with the algebraic properties of different classes of connectors. In a recent paper, an algebra of stateless connectors was presented that consists of five kinds of basic connectors, namely symmetry, synchronization, mutual exclusion, hiding and inaction, plus their duals, and it was shown how they can be freely composed in series and in parallel to model sophisticated "glues". In this paper we explore the expressiveness of stateful connectors obtained by adding one-place buffers or unbounded buffers to the stateless connectors. The main results are: i) we show how different classes of connectors exactly correspond to suitable classes of Petri nets equipped with compositional interfaces, called nets with boundaries; ii) we show that the difference between strong and weak semantics in stateful connectors is reflected in the semantics of nets with boundaries by moving from the classic step semantics (strong case) to a novel banking semantics (weak case), where a step can be executed by taking some "debit" tokens to be given back during the same step; iii) we show that the corresponding bisimilarities are congruences (w.r.t. composition of connectors in series and in parallel); iv) we show that suitable monoidality laws, like those arising when representing stateful connectors in the tile model, can nicely capture concurrency aspects; and v) as a side result, we provide a basic algebra, with a finite set of symbols, out of which we can compose all P/T nets, fulfilling a long-standing quest.

    Membrane systems with limited parallelism

    Membrane computing is an emerging research field that belongs to the more general area of molecular computing, which deals with computational models inspired by bio-molecular processes. Membrane computing aims at defining models, called membrane systems or P systems, which abstract the functioning and structure of the cell. A membrane system consists of a hierarchical arrangement of membranes delimiting regions, which represent the various compartments of a cell; each region contains bio-chemical elements of various types and has associated evolution rules, which represent bio-chemical processes taking place inside the cell. This work is a continuation of the investigations aiming to bridge membrane computing (where, in a compartmental cell-like structure, the chemicals to evolve are placed in compartments defined by membranes) and brane calculi (where one again considers a compartmental cell-like structure, but with the chemicals/proteins placed on the membranes themselves). We use objects both in compartments and on membranes (the latter are called proteins), with the objects in compartments evolving under the control of the proteins. Several possibilities are considered (objects only moved across membranes or also changed during this operation, with the proteins only assisting the move/change or also changing themselves). As might be expected, computational universality is obtained for several combinations of such possibilities. We also present a method for solving the NP-complete SAT problem using P systems with proteins on membranes. The SAT problem is solved in O(nm) time, where n is the number of Boolean variables and m is the number of clauses for an instance written in conjunctive normal form. Thus, we can say that the solution for each given instance is obtained in linear time. We succeeded in solving SAT by a uniform construction of a deterministic P system which uses rules involving objects in regions, proteins on membranes, and membrane division.
Then, we investigate the computational power of P systems with proteins on membranes in some particular cases: when only one protein is placed on a membrane, when the systems have a minimal number of rules, when the computation evolves in accepting or computing mode, etc. This dissertation also introduces another new variant of membrane systems that uses context-free rewriting rules for the evolution of objects placed inside compartments of a cell, and symport rules for communication between membranes. The strings circulate across membranes depending on their membership in regular languages given by means of regular expressions. We prove that these rewriting-symport P systems generate all recursively enumerable languages. We investigate the computational power of these newly introduced P systems for three particular forms of the regular expressions that are used by the symport rules. A characterization of ET0L languages is obtained in this context.

    New Algorithms and Lower Bounds for Sequential-Access Data Compression

    This thesis concerns sequential-access data compression, i.e., compression by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case time while producing an encoding whose length is worst-case optimal. In another chapter we consider one-pass compression with memory bounded in terms of the alphabet size and context length, and prove a nearly tight tradeoff between the amount of memory we can use and the quality of the compression we can achieve. In a third chapter we consider compression in the read/write streams model, which allows a number of passes and an amount of memory that are both polylogarithmic in the size of the input. We first show how to achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for achieving good grammar-based compression. Finally, we show that two streams are necessary and sufficient for achieving entropy-only bounds. Comment: draft of PhD thesis
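The adaptive prefix coding setting described above, in which each character's self-delimiting codeword is emitted before the next character is read, can be illustrated with a deliberately simple stand-in. The sketch below uses move-to-front with Elias gamma codes, which is not the thesis's algorithm and has neither its constant worst-case time per character nor its optimality guarantee. The function names and the assumption that encoder and decoder share a known alphabet are illustrative choices.

```python
def elias_gamma(k):
    """Self-delimiting codeword for an integer k >= 1, as a bit string."""
    if k < 1:
        raise ValueError("Elias gamma is defined for k >= 1")
    binary = bin(k)[2:]
    # len(binary) - 1 zeros announce how many bits follow the leading 1.
    return "0" * (len(binary) - 1) + binary


def encode(text, alphabet):
    """Emit one codeword per character, adapting the table as we go."""
    table = list(alphabet)
    out = []
    for ch in text:
        idx = table.index(ch)
        out.append(elias_gamma(idx + 1))   # codeword leaves before next read
        table.insert(0, table.pop(idx))    # move-to-front update
    return "".join(out)


def decode(bits, alphabet):
    """Invert encode by replaying the same table updates."""
    table = list(alphabet)
    out = []
    pos = 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":
            zeros += 1
            pos += 1
        k = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        ch = table.pop(k - 1)
        table.insert(0, ch)
        out.append(ch)
    return "".join(out)
```

Because the decoder updates its table exactly as the encoder does, no code table has to be transmitted, which is the defining property of the adaptive setting.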