1,937 research outputs found
Average case analysis of marking algorithms
The Lindstrom marking algorithm uses bounded workspace. Its time complexity is O(n^2) in all cases, but it has been assumed that the average case time complexity is O(n lg n). It is proven that the average case time complexity is Θ(n^2). Similarly, the average size of the Wegbreit bit stack is shown to be Θ(n)
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
The most widely used machine learning frameworks require users to carefully
tune their memory usage so that the deep neural network (DNN) fits into the
DRAM capacity of a GPU. This restriction hampers a researcher's flexibility to
study different machine learning algorithms, forcing them to either use a less
desirable network architecture or parallelize the processing across multiple
GPUs. We propose a runtime memory manager that virtualizes the memory usage of
DNNs such that both GPU and CPU memory can simultaneously be utilized for
training larger DNNs. Our virtualized DNN (vDNN) reduces the average GPU memory
usage of AlexNet by up to 89%, OverFeat by 91%, and GoogLeNet by 95%, a
significant reduction in memory requirements of DNNs. Similar experiments on
VGG-16, one of the deepest and most memory-hungry DNNs to date, demonstrate the
memory-efficiency of our proposal. vDNN enables VGG-16 with batch size 256
(requiring 28 GB of memory) to be trained on a single NVIDIA Titan X GPU card
containing 12 GB of memory, with 18% performance loss compared to a
hypothetical, oracular GPU with enough memory to hold the entire DNN.
Comment: Published as a conference paper at the 49th IEEE/ACM International
Symposium on Microarchitecture (MICRO-49), 201
Compressed Data Structures for Dynamic Sequences
We consider the problem of storing a dynamic string $S$ over an alphabet
$\Sigma = \{1, \ldots, \sigma\}$ in compressed form. Our representation
supports insertions and deletions of symbols and answers three fundamental
queries: $\mathrm{access}(i, S)$ returns the $i$-th symbol in $S$,
$\mathrm{rank}_a(i, S)$ counts how many times a symbol $a$ occurs among the
first $i$ positions in $S$, and $\mathrm{select}_a(i, S)$ finds the position
where a symbol $a$ occurs for the $i$-th time. We present the first
fully-dynamic data structure for arbitrarily large alphabets that achieves
optimal query times for all three operations and supports updates with
worst-case time guarantees. Ours is also the first fully-dynamic data structure
that needs only $nH_k + o(n \log \sigma)$ bits, where $H_k$ is the $k$-th order
entropy and $n$ is the string length. Moreover, our representation supports
extraction of a substring in optimal time
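The three queries in this abstract are conventionally called access, rank, and select. As a minimal sketch of what they return, the following uncompressed, linear-time Python reference may help. It is not the data structure from the paper: the function names, argument order, and 1-indexed positions are assumptions chosen for illustration, and nothing here is compressed or dynamic.

```python
def access(s, i):
    """Return the i-th symbol of s (positions are 1-indexed by assumption)."""
    return s[i - 1]


def rank(s, a, i):
    """Count how many times symbol a occurs among the first i positions of s."""
    return s[:i].count(a)


def select(s, a, i):
    """Return the 1-indexed position of the i-th occurrence of a, or None."""
    seen = 0
    for pos, sym in enumerate(s, start=1):
        if sym == a:
            seen += 1
            if seen == i:
                return pos
    return None  # a occurs fewer than i times


if __name__ == "__main__":
    s = "abracadabra"
    print(access(s, 1), rank(s, "a", 5), select(s, "a", 3))
```

Each call scans the string, so this only pins down the semantics; the point of the abstract is answering the same queries in optimal time on a compressed string that also supports insertions and deletions.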
Garbage collection in distributed systems
PhD Thesis
The provision of system-wide heap storage has a number of advantages.
However, when the technique is applied to distributed systems,
automatically recovering inaccessible variables becomes a serious problem.
This thesis presents a survey of such garbage collection techniques but
finds that no existing algorithm is entirely suitable. A new, general
purpose algorithm is developed and presented which allows individual
systems to garbage collect largely independently. The effects of these
garbage collections are combined, using recursively structured control
mechanisms, to achieve garbage collection of the entire heap with the
minimum of overheads. Experimental results show that the new algorithm
recovers most inaccessible variables more quickly than a straightforward
garbage collection, giving an improved memory utilisation
Lightweight LCP Construction for Very Large Collections of Strings
The longest common prefix array is a very useful data structure that,
combined with the suffix array and the Burrows-Wheeler transform, makes it
possible to efficiently compute some combinatorial properties of a string useful in several
applications, especially in biological contexts. Nowadays, the input data for
many problems are big collections of strings, for instance the data coming from
"next-generation" DNA sequencing (NGS) technologies. In this paper we present
the first lightweight algorithm (called extLCP) for the simultaneous
computation of the longest common prefix array and the Burrows-Wheeler
transform of a very large collection of strings having any length. The
computation is realized by performing disk data accesses only via sequential
scans, and the total disk space usage never needs more than twice the output
size, excluding the disk space required for the input. Moreover, extLCP can
also compute the suffix array of the strings of the collection without
needing any further data structure. Finally, we test our algorithm on real
data and compare our results with another tool capable of working in external
memory on large collections of strings.
Comment: This manuscript version is made available under the CC-BY-NC-ND 4.0
license http://creativecommons.org/licenses/by-nc-nd/4.0/ The final version
of this manuscript is in press in Journal of Discrete Algorithm
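For readers unfamiliar with the three structures this abstract names, here is a naive in-memory Python sketch that builds the suffix array, the longest common prefix array, and the Burrows-Wheeler transform of a single short string. It is not extLCP, which works in external memory over collections of strings. The function names are illustrative assumptions, and so is the `$` end marker, assumed to be smaller than every other symbol.

```python
def suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """lcp[k] = length of the common prefix of suffixes sa[k-1] and sa[k]."""
    def common_prefix_len(i, j):
        n = 0
        while i + n < len(text) and j + n < len(text) and text[i + n] == text[j + n]:
            n += 1
        return n

    # The first suffix has no predecessor, so its entry is 0 by convention.
    return [0] + [common_prefix_len(sa[k - 1], sa[k]) for k in range(1, len(sa))]


def bwt(text, sa):
    """Symbol preceding each sorted suffix (wrapping to the end marker)."""
    return "".join(text[i - 1] for i in sa)


if __name__ == "__main__":
    text = "banana$"  # "$" is an assumed end marker, smaller than any letter
    sa = suffix_array(text)
    print(sa, lcp_array(text, sa), bwt(text, sa))
```

Sorting whole suffixes takes quadratic time and memory in the worst case, which is exactly what a lightweight, sequential-scan approach for very large collections is meant to avoid.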
Garbage Collection of Linked Data Structures: An Example in a Network Oriented Database Management System
A unified view of the numerous existing algorithms for performing garbage collection of linked data structures is presented. An implementation of a garbage collection tool in a network oriented database management system is described
Quantum query complexity of state conversion
State conversion generalizes query complexity to the problem of converting
between two input-dependent quantum states by making queries to the input. We
characterize the complexity of this problem by introducing a natural
information-theoretic norm that extends the Schur product operator norm. The
complexity of converting between two systems of states is given by the distance
between them, as measured by this norm.
In the special case of function evaluation, the norm is closely related to
the general adversary bound, a semi-definite program that lower-bounds the
number of input queries needed by a quantum algorithm to evaluate a function.
We thus obtain that the general adversary bound characterizes the quantum query
complexity of any function whatsoever. This generalizes and simplifies the
proof of the same result in the case of boolean input and output. Also in the
case of function evaluation, we show that our norm satisfies a remarkable
composition property, implying that the quantum query complexity of the
composition of two functions is at most the product of the query complexities
of the functions, up to a constant. Finally, our result implies that discrete
and continuous-time query models are equivalent in the bounded-error setting,
even for the general state-conversion problem.
Comment: 19 pages, 2 figures; heavily revised with new results and simpler
proof
Computational speedups using small quantum devices
Suppose we have a small quantum computer with only M qubits. Can such a
device genuinely speed up certain algorithms, even when the problem size is
much larger than M? Here we answer this question in the affirmative. We present
a hybrid quantum-classical algorithm to solve 3SAT problems involving n>>M
variables that significantly speeds up its fully classical counterpart. This
question may be relevant in view of the current quest to build small quantum
computers.
Comment: 5+12 page
Active Self-Assembly of Algorithmic Shapes and Patterns in Polylogarithmic Time
We describe a computational model for studying the complexity of
self-assembled structures with active molecular components. Our model captures
notions of growth and movement ubiquitous in biological systems. The model is
inspired by biology's fantastic ability to assemble biomolecules that form
systems with complicated structure and dynamics, from molecular motors that
walk on rigid tracks and proteins that dynamically alter the structure of the
cell during mitosis, to embryonic development where large-scale complicated
organisms efficiently grow from a single cell. Using this active self-assembly
model, we show how to efficiently self-assemble shapes and patterns from simple
monomers. For example, we show how to grow a line of monomers in time and
number of monomer states that is merely logarithmic in the length of the line.
Our main results show how to grow arbitrary connected two-dimensional
geometric shapes and patterns in expected time that is polylogarithmic in the
size of the shape, plus roughly the time required to run a Turing machine
deciding whether or not a given pixel is in the shape. We do this while keeping
the number of monomer types logarithmic in shape size, plus those monomers
required by the Kolmogorov complexity of the shape or pattern. This work thus
highlights the efficiency advantages of active self-assembly over passive
self-assembly and motivates experimental effort to construct general-purpose
active molecular self-assembly systems