Fast Monotone Summation over Disjoint Sets
We study the problem of computing an ensemble of multiple sums where the
summands in each sum are indexed by subsets of size p of an n-element
ground set. More precisely, the task is to compute, for each subset of size q
of the ground set, the sum over the values of all subsets of size p that are
disjoint from the subset of size q. We present an arithmetic circuit that,
without subtraction, solves the problem using O((n^p + n^q) log n) arithmetic
gates, all monotone; for constant p and q, this is within the factor O(log n)
of the optimal. The circuit design is based on viewing the summation as a "set
nucleation" task and using a tree-projection approach to implement the
nucleation. Applications include improved algorithms for counting heaviest
k-paths in a weighted graph, computing permanents of rectangular matrices,
and dynamic feature selection in machine learning.
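The summation task itself is easy to state in code. The following brute-force sketch (the function name and toy instance are illustrative, not from the paper) computes every sum by direct enumeration, which makes the problem concrete but costs far more than the paper's monotone circuit:

```python
from itertools import combinations

def disjoint_sums(values, n, p, q):
    """For each q-subset Q of {0,...,n-1}, sum values[P] over all
    p-subsets P that are disjoint from Q.  Brute force enumeration,
    in contrast to the paper's O((n^p + n^q) log n)-gate circuit."""
    out = {}
    for Q in combinations(range(n), q):
        out[Q] = sum(v for P, v in values.items() if not set(P) & set(Q))
    return out

# toy instance: n = 4, p = 1, q = 2; the value of singleton {i} is i + 1
vals = {(i,): i + 1 for i in range(4)}
res = disjoint_sums(vals, n=4, p=1, q=2)
```

For example, the entry for the pair {0, 1} sums the values of the singletons {2} and {3}.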
A constant-time algorithm for middle levels Gray codes
For any integer n >= 1, a middle levels Gray code is a cyclic listing of
all n-element and (n+1)-element subsets of {1, 2, ..., 2n+1} such that
any two consecutive subsets differ in adding or removing a single element. The
question whether such a Gray code exists for any n >= 1 has been the subject
of intensive research during the last 30 years, and has been answered
affirmatively only recently [T. M\"utze. Proof of the middle levels conjecture.
Proc. London Math. Soc., 112(4):677--713, 2016]. In a follow-up paper [T.
M\"utze and J. Nummenpalo. An efficient algorithm for computing a middle levels
Gray code. To appear in ACM Transactions on Algorithms, 2018] this existence
proof was turned into an algorithm that computes each new set in the Gray code
in time O(n) on average. In this work we present an algorithm for
computing a middle levels Gray code in optimal time and space: each new set is
generated in time O(1) on average, and the required space is O(n).
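For intuition about the combinatorial object (this is exhaustive backtracking, not the paper's algorithm, and the function name is illustrative), a middle levels Gray code for tiny n can be found as a Hamilton cycle in the middle levels graph:

```python
from itertools import combinations

def middle_levels_gray_code(n):
    """Backtracking search (exponential time, tiny n only) for a cyclic
    listing of all n- and (n+1)-element subsets of {1,...,2n+1} in which
    consecutive sets differ by adding or removing one element."""
    ground = range(1, 2 * n + 2)
    verts = [frozenset(c) for k in (n, n + 1)
             for c in combinations(ground, k)]
    start = verts[0]

    def extend(path, used):
        if len(path) == len(verts):
            # close the cycle: last set must differ from the first by one element
            return path if len(path[-1] ^ start) == 1 else None
        for v in verts:
            if v not in used and len(path[-1] ^ v) == 1:
                res = extend(path + [v], used | {v})
                if res:
                    return res
        return None

    return extend([start], {start})

cycle = middle_levels_gray_code(1)  # all 6 subsets of {1,2,3} of sizes 1 and 2
```

For n = 1 this finds a cycle through all three singletons and all three 2-element subsets of {1, 2, 3}.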
On sets of integers which contain no three terms in geometric progression
The problem of looking for subsets of the natural numbers which contain no
3-term arithmetic progressions has a rich history. Roth's theorem famously
shows that any such subset cannot have positive upper density. In contrast,
Rankin in 1960 suggested looking at subsets without three-term geometric
progressions, and constructed such a subset with density about 0.719. More
recently, several authors have found upper bounds for the upper density of such
sets. We significantly improve upon these bounds, and demonstrate a method of
constructing sets with a greater upper density than Rankin's set. This
construction is optimal in the sense that our method gives a way of effectively
computing the greatest possible upper density of a geometric-progression-free
set. We also show that geometric progressions in Z/nZ behave more like Roth's
theorem in that one cannot take any fixed positive proportion of the integers
modulo a sufficiently large value of n while avoiding geometric progressions.
Comment: 16 pages
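As a concrete illustration of the objects involved (a sketch only, not Rankin's construction or the authors' method), one can greedily build a geometric-progression-free set using the fact that integers a < b < c form a 3-term geometric progression with rational ratio exactly when b^2 = a*c:

```python
def gp_free_greedy(limit):
    """Greedily build a subset of {1,...,limit} with no 3-term geometric
    progression.  A triple a < b < c is a GP (even with rational ratio)
    iff b*b == a*c, so it suffices to test the new, largest element c
    against all pairs already chosen."""
    chosen = []
    for c in range(1, limit + 1):
        if all(b * b != a * c
               for i, b in enumerate(chosen) for a in chosen[:i]):
            chosen.append(c)
    return chosen

s = gp_free_greedy(10)  # rejects 4 (1,2,4) and 9 (1,3,9)
```

Note the b^2 = a*c test also catches non-integer ratios such as the progression 4, 6, 9.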
Computing optimal COCOMO effort multiplier values and optimal casebase subsets using Monte Carlo methods
There have been many studies performed and techniques applied to solve the problem of estimating man-month effort for software projects. Despite all the effort expended on solving this problem, the results achieved by the various techniques have not been embraced by the software community as very reliable or accurate. This thesis uses Monte Carlo methods to obtain optimal values for COCOMO effort multipliers which minimize the average of the absolute values of the relative errors (AARE) of the man-month estimates for two industry-supplied casebases. For example, when using three COCOMO cost drivers (complexity, language experience, application experience) and the COCOMO effort multiplier values, AARE values were 60% for casebase 1 and 53% for casebase 2; using Monte Carlo to obtain optimal effort multiplier values, AARE values were 34% for casebase 1 and 41% for casebase 2. By repeatedly removing the cases which contributed the greatest absolute relative error, the Monte Carlo method was also used to determine optimal casebase subsets with AARE values of less than 10%. This latter approach identifies casebase cases for which the cost drivers may have been rated incorrectly, or cases which are not rated consistently with respect to a subset of cases.
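A minimal sketch of the Monte Carlo idea follows; the nominal-effort constants, driver names, rating scale, and three-case "casebase" are made up for illustration and are not the thesis's data or the calibrated COCOMO tables:

```python
import random

def aare(multipliers, cases):
    """Average absolute relative error (AARE) of COCOMO-style estimates
    effort = a * size^b * (product of rated effort multipliers)."""
    errs = []
    for size, ratings, actual in cases:
        est = 2.94 * size ** 1.1  # illustrative nominal-effort constants
        for driver, rating in ratings:
            est *= multipliers[driver][rating]
        errs.append(abs(est - actual) / actual)
    return sum(errs) / len(errs)

def monte_carlo_search(cases, drivers, ratings, trials=2000, seed=0):
    """Sample random multiplier tables, keep the one minimizing AARE."""
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(trials):
        cand = {d: {r: rng.uniform(0.5, 2.0) for r in ratings}
                for d in drivers}
        err = aare(cand, cases)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err

# made-up casebase: (size in KLOC, driver ratings, actual man-months)
cases = [(10, [("cplx", "high")], 45.0),
         (20, [("cplx", "low"), ("lexp", "nom")], 50.0),
         (5, [("aexp", "high")], 20.0)]
best, best_err = monte_carlo_search(cases, ["cplx", "lexp", "aexp"],
                                    ["low", "nom", "high"])
```

The casebase-pruning step of the thesis would then repeatedly drop the case with the largest absolute relative error and re-run the search.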
Optimal Data-Dependent Hashing for Approximate Near Neighbors
We show an optimal data-dependent hashing scheme for the approximate near
neighbor problem. For an n-point data set in a d-dimensional space our data
structure achieves query time O(d * n^{rho + o(1)}) and space
O(n^{1 + rho + o(1)} + d*n), where rho = 1/(2c^2 - 1) for the Euclidean space
and approximation factor c > 1. For the Hamming space, we obtain an exponent
of rho = 1/(2c - 1).
Our result completes the direction set forth in [AINR14], who gave a
proof-of-concept that data-dependent hashing can outperform classical Locality
Sensitive Hashing (LSH). In contrast to [AINR14], the new bound is not only
optimal, but in fact improves over the best (optimal) LSH data structures
[IM98, AI06] for all approximation factors c > 1.
From the technical perspective, we proceed by decomposing an arbitrary
dataset into several subsets that are, in a certain sense, pseudo-random.
Comment: 36 pages, 5 figures, an extended abstract appeared in the proceedings
of the 47th ACM Symposium on Theory of Computing (STOC 2015)
Sparse Regression Codes for Multi-terminal Source and Channel Coding
We study a new class of codes for Gaussian multi-terminal source and channel
coding. These codes are designed using the statistical framework of
high-dimensional linear regression and are called Sparse Superposition or
Sparse Regression codes. Codewords are linear combinations of subsets of
columns of a design matrix. These codes were recently introduced by Barron and
Joseph and shown to achieve the channel capacity of AWGN channels with
computationally feasible decoding. They have also recently been shown to
achieve the optimal rate-distortion function for Gaussian sources. In this
paper, we demonstrate how to implement random binning and superposition coding
using sparse regression codes. In particular, with minimum-distance
encoding/decoding it is shown that sparse regression codes attain the optimal
information-theoretic limits for a variety of multi-terminal source and channel
coding problems.
Comment: 9 pages, appeared in the Proceedings of the 50th Annual Allerton
Conference on Communication, Control, and Computing, 2012
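The basic codeword structure of a sparse regression code is easy to sketch. In the simplification below the non-zero coefficients are all set to 1; actual sparse regression codes use a power allocation across sections, which is omitted here:

```python
import numpy as np

def sparc_encode(A, section_indices, L, M):
    """Sparse regression codeword: A has L*M columns split into L
    sections of M columns each; the message picks one column per section
    and the codeword is x = A @ beta, where beta has exactly one nonzero
    entry per section (unit coefficients for simplicity)."""
    beta = np.zeros(L * M)
    for sec, idx in enumerate(section_indices):
        beta[sec * M + idx] = 1.0
    return A @ beta

# toy example: block length n = 4, L = 2 sections of M = 3 columns,
# i.i.d. Gaussian design matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
x = sparc_encode(A, [2, 0], L=2, M=3)  # column 2 of section 0 + column 0 of section 1
```

The rate is (L log M) / n bits per channel use, since each section conveys log M bits.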