Cell-Probe Lower Bounds from Online Communication Complexity
In this work, we introduce an online model for communication complexity.
Analogous to how online algorithms receive their input piece-by-piece, our
model presents one of the players, Bob, his input piece-by-piece, and has the
players Alice and Bob cooperate to compute a result each time before the next
piece is revealed to Bob. This model has a closer and more natural
correspondence to dynamic data structures than classic communication models do,
and hence presents a new perspective on data structures.
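To make the piece-by-piece interaction concrete, here is a minimal sketch of a naive protocol in such an online setting. It is an illustration under stated assumptions, not the paper's construction: the function name `naive_online_membership`, the membership-style task (is each of Bob's pieces in Alice's set?), and the bit accounting are all hypothetical choices made here.

```python
import math
from typing import Iterable, List, Set, Tuple


def naive_online_membership(
    alice_set: Set[int], bob_stream: Iterable[int], universe_size: int
) -> Tuple[List[bool], int]:
    """Answer each of Bob's pieces before the next one is revealed.

    Returns the per-piece answers and the total number of bits exchanged
    by this naive protocol (hypothetical accounting, for illustration only).
    """
    # Bits needed to name one element of the universe {0, ..., universe_size - 1}.
    bits_per_element = max(1, math.ceil(math.log2(universe_size)))
    answers: List[bool] = []
    total_bits = 0
    for piece in bob_stream:            # Bob sees one piece at a time
        total_bits += bits_per_element  # Bob -> Alice: the element itself
        answer = piece in alice_set     # Alice checks against her own input
        total_bits += 1                 # Alice -> Bob: a single answer bit
        answers.append(answer)          # the answer is fixed before the next piece
    return answers, total_bits
```

Because every answer must be committed before the next piece arrives, this loop cannot wait to collect several of Bob's pieces and handle them in one batch.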
We first present a tight lower bound for the online set intersection problem
in the online communication model, demonstrating a general approach for proving
online communication lower bounds. The online communication model prevents a
batching trick that classic communication complexity allows, and yields a
stronger lower bound. We then apply the online communication model to prove
data structure lower bounds for two dynamic data structure problems: the Group
Range problem and the Dynamic Connectivity problem for forests. Both of the
problems admit a worst case $O(\log n)$-time data structure. Using online
communication complexity, we prove a tight cell-probe lower bound for each:
spending $o(\log n)$ (even amortized) time per operation results in at best an
$\exp(-\delta^2 n)$ probability of correctly answering a
$(1/2+\delta)$-fraction of the queries.
Recommended from our members
Partitioned Blockmap Indexes for Multidimensional Data Access
Given recent increases in the size of main memory in modern machines, it is now common to store large data sets in RAM for faster processing. Multidimensional access methods aim to provide efficient access to large data sets when queries apply predicates to some of the data dimensions. We examine multidimensional access methods in the context of an in-memory column store tuned for on-line analytical processing or scientific data analysis. We propose a multidimensional data structure that contains a novel combination of a grid array and several bitmaps. The base data is clustered in an order matching that of the index structure. The bitmaps contain one bit per block of data, motivating the term "blockmap." The proposed data structures are compact, typically taking less than one bit of space per row of data. Partition boundaries can be chosen in a way that reflects both the query workload and the data distribution, and boundaries are not required to evenly divide the data if there is a bias in the query distribution. We examine the theoretical performance of the data structure and experimentally measure its performance on three modern CPUs and one GPU processor. We demonstrate that efficient multidimensional access can be achieved with minimal space overhead.
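The one-bit-per-block idea can be sketched as follows. This is a simplified illustration, not the authors' data structure: it omits the grid array, uses two dimensions only, and stores each bitmap as a Python integer. The class name `BlockmapSketch` and all of its parameters are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Row = Tuple[int, int]  # (x, y): two dimensions, chosen for illustration


class BlockmapSketch:
    """One bitmap per partition per dimension, with one bit per data block."""

    def __init__(self, rows: Sequence[Row], block_size: int,
                 x_bounds: Sequence[int], y_bounds: Sequence[int]) -> None:
        self.x_bounds = list(x_bounds)  # sorted partition boundaries on x
        self.y_bounds = list(y_bounds)  # sorted partition boundaries on y
        # Cluster the base data in an order matching the partitioning.
        ordered = sorted(rows, key=lambda r: (self._part(r[0], self.x_bounds),
                                              self._part(r[1], self.y_bounds)))
        self.blocks: List[List[Row]] = [
            ordered[i:i + block_size] for i in range(0, len(ordered), block_size)
        ]
        self.x_maps = [0] * (len(self.x_bounds) + 1)
        self.y_maps = [0] * (len(self.y_bounds) + 1)
        for b, block in enumerate(self.blocks):
            for x, y in block:
                # Bit b is set when block b holds at least one row in the partition.
                self.x_maps[self._part(x, self.x_bounds)] |= 1 << b
                self.y_maps[self._part(y, self.y_bounds)] |= 1 << b

    @staticmethod
    def _part(value: int, bounds: List[int]) -> int:
        # Index of the partition that contains value.
        return bisect_right(bounds, value)

    def _mask(self, lo: int, hi: int, bounds: List[int], maps: List[int]) -> int:
        # OR together the bitmaps of every partition overlapping [lo, hi].
        mask = 0
        for p in range(self._part(lo, bounds), self._part(hi, bounds) + 1):
            mask |= maps[p]
        return mask

    def candidate_blocks(self, x_lo: int, x_hi: int, y_lo: int, y_hi: int) -> List[int]:
        # A block can contain a match only if its bit is set in both dimensions.
        mask = (self._mask(x_lo, x_hi, self.x_bounds, self.x_maps)
                & self._mask(y_lo, y_hi, self.y_bounds, self.y_maps))
        return [b for b in range(len(self.blocks)) if (mask >> b) & 1]

    def query(self, x_lo: int, x_hi: int, y_lo: int, y_hi: int) -> List[Row]:
        # Scan only candidate blocks, then filter rows exactly (inclusive ranges).
        return [r for b in self.candidate_blocks(x_lo, x_hi, y_lo, y_hi)
                for r in self.blocks[b]
                if x_lo <= r[0] <= x_hi and y_lo <= r[1] <= y_hi]
```

In this sketch the index costs one bit per block for each partition, so it stays below one bit per row whenever the block size exceeds the total number of partitions.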
Heuristic Solutions for Loading in Flexible Manufacturing Systems
Production planning in flexible manufacturing systems deals with the efficient organization of production resources in order to meet a given production schedule. It is a complex problem and typically leads to several hierarchical subproblems that need to be solved sequentially or simultaneously. Loading is one of the planning subproblems that has to be addressed. It involves assigning the necessary operations and tools among the various machines in some optimal fashion to achieve the production of all selected part types. In this paper, we first formulate the loading problem as a 0-1 mixed integer program and then propose heuristic procedures based on Lagrangian relaxation and tabu search to solve the problem. Computational results are presented for all the algorithms, and conclusions drawn from the results are discussed.
Efficient Indexing for Structured and Unstructured Data
The collection of digital data is growing at an exponential rate. Data originates from a wide range of sources, such as text feeds, biological sequencers, internet traffic over routers, and sensors. To mine intelligent information from these sources, users have to query the data. Indexing techniques aim to reduce the query time by preprocessing the data. The diversity of data sources in the real world makes it imperative to develop application-specific indexing solutions based on the data to be queried. Data can be structured (e.g., relational tables) or unstructured (e.g., free text). Moreover, increasingly many applications need to seamlessly analyze both kinds of data, making data integration a central issue. Integrating text with structured data needs to account for missing values and errors in the data. Probabilistic models have been proposed recently for this purpose. These models are also useful for applications where uncertainty is inherent in the data, e.g., sensor networks. This dissertation aims to propose efficient indexing solutions for several problems that lie at the intersection of databases and information retrieval, such as joining ranked inputs and full-text document search. Other well-known problems of ranked retrieval and pattern matching are also studied under probabilistic settings. For each problem, the worst-case theoretical bounds of the proposed solutions are established and/or their practicality is demonstrated by thorough experimentation.