
    Fast Monotone Summation over Disjoint Sets

    We study the problem of computing an ensemble of multiple sums where the summands in each sum are indexed by subsets of size $p$ of an $n$-element ground set. More precisely, the task is to compute, for each subset of size $q$ of the ground set, the sum over the values of all subsets of size $p$ that are disjoint from the subset of size $q$. We present an arithmetic circuit that, without subtraction, solves the problem using $O((n^p+n^q)\log n)$ arithmetic gates, all monotone; for constant $p$, $q$ this is within the factor $\log n$ of the optimal. The circuit design is based on viewing the summation as a "set nucleation" task and using a tree-projection approach to implement the nucleation. Applications include improved algorithms for counting heaviest $k$-paths in a weighted graph, computing permanents of rectangular matrices, and dynamic feature selection in machine learning.
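
The problem statement above can be made concrete with a brute-force reference. This is only a naive baseline that enumerates every pair of subsets, not the monotone circuit from the abstract; the function name `disjoint_sums` and the sample value function are hypothetical.

```python
from itertools import combinations


def disjoint_sums(n, p, q, f):
    """For each q-subset Q of range(n), sum f(P) over all p-subsets P disjoint from Q.

    Naive enumeration: roughly C(n, p) * C(n, q) evaluations, far slower than
    the O((n^p + n^q) log n) gate count quoted in the abstract.
    """
    ground = range(n)
    p_subsets = [frozenset(c) for c in combinations(ground, p)]
    result = {}
    for c in combinations(ground, q):
        Q = frozenset(c)
        # Only additions are used, so the computation is monotone.
        result[Q] = sum(f(P) for P in p_subsets if P.isdisjoint(Q))
    return result


# Hypothetical value function: the value of a subset is the sum of its elements.
sums = disjoint_sums(4, 1, 1, lambda P: sum(P))
```

With `n = 4` and `p = q = 1`, each entry of `sums` is the total of the ground set minus the single excluded element.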

    A constant-time algorithm for middle levels Gray codes

    For any integer $n\geq 1$ a middle levels Gray code is a cyclic listing of all $n$-element and $(n+1)$-element subsets of $\{1,2,\ldots,2n+1\}$ such that any two consecutive subsets differ in adding or removing a single element. The question whether such a Gray code exists for any $n\geq 1$ has been the subject of intensive research during the last 30 years, and has been answered affirmatively only recently [T. Mütze. Proof of the middle levels conjecture. Proc. London Math. Soc., 112(4):677--713, 2016]. In a follow-up paper [T. Mütze and J. Nummenpalo. An efficient algorithm for computing a middle levels Gray code. To appear in ACM Transactions on Algorithms, 2018] this existence proof was turned into an algorithm that computes each new set in the Gray code in time $\mathcal{O}(n)$ on average. In this work we present an algorithm for computing a middle levels Gray code in optimal time and space: each new set is generated in time $\mathcal{O}(1)$ on average, and the required space is $\mathcal{O}(n)$.
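
The definition above can be checked mechanically. The following sketch only verifies whether a given listing is a middle levels Gray code; it does not reproduce the constant-time generation algorithm. The function name `is_middle_levels_gray_code` and the hand-built listing for $n=1$ are illustrative assumptions.

```python
from math import comb


def is_middle_levels_gray_code(n, listing):
    """Check the definition: a cyclic listing of all n- and (n+1)-element subsets
    of {1, ..., 2n+1} in which consecutive sets differ by exactly one element."""
    ground = set(range(1, 2 * n + 2))
    sets = [frozenset(s) for s in listing]
    # Both middle levels have C(2n+1, n) sets, and none may repeat.
    if len(sets) != 2 * comb(2 * n + 1, n) or len(set(sets)) != len(sets):
        return False
    if any(not s <= ground or len(s) not in (n, n + 1) for s in sets):
        return False
    # Cyclic adjacency: the symmetric difference must be a single element.
    return all(
        len(sets[i] ^ sets[(i + 1) % len(sets)]) == 1 for i in range(len(sets))
    )


# Hand-built example for n = 1 (subsets of {1, 2, 3} of size 1 and 2).
example = [{1}, {1, 2}, {2}, {2, 3}, {3}, {1, 3}]
valid = is_middle_levels_gray_code(1, example)
```

Swapping two entries of `example` breaks the single-element-change condition, so the checker rejects the reordered listing.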

    On sets of integers which contain no three terms in geometric progression

    The problem of looking for subsets of the natural numbers which contain no 3-term arithmetic progressions has a rich history. Roth's theorem famously shows that any such subset cannot have positive upper density. In contrast, Rankin in 1960 suggested looking at subsets without three-term geometric progressions, and constructed such a subset with density about 0.719. More recently, several authors have found upper bounds for the upper density of such sets. We significantly improve upon these bounds, and demonstrate a method of constructing sets with a greater upper density than Rankin's set. This construction is optimal in the sense that our method gives a way of effectively computing the greatest possible upper density of a geometric-progression-free set. We also show that geometric progressions in Z/nZ behave more like Roth's theorem in that one cannot take any fixed positive proportion of the integers modulo a sufficiently large value of n while avoiding geometric progressions. Comment: 16 pages
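
To illustrate what a geometric-progression-free set looks like, here is a small greedy sketch. It assumes a three-term geometric progression means integers $a < b < c$ with $b^2 = ac$ (so rational ratios count), and it is neither Rankin's construction nor the method of the paper; the function name `greedy_gp_free` is hypothetical.

```python
def greedy_gp_free(limit):
    """Scan 1..limit in order, keeping c unless it completes a progression
    a < b < c with b*b == a*c among the numbers already kept."""
    chosen = []
    members = set()
    for c in range(1, limit + 1):
        # c is the largest candidate term; for each kept b, the matching
        # smallest term would be a = b*b / c, which must be a kept integer.
        creates_gp = any(
            (b * b) % c == 0 and (b * b) // c in members for b in chosen
        )
        if not creates_gp:
            chosen.append(c)
            members.add(c)
    return chosen


first_terms = greedy_gp_free(10)
```

Up to 10 the greedy scan skips 4 (because of 1, 2, 4) and 9 (because of 1, 3, 9) and keeps everything else.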

    Computing optimal COCOMO effort multiplier values and optimal casebase subsets using Monte Carlo methods

    Many studies and techniques have addressed the problem of estimating man-month effort for software projects. Despite all the effort expended on solving this problem, the software community has not embraced the results of these techniques as very reliable or accurate. This thesis uses Monte Carlo methods to obtain optimal values for COCOMO effort multipliers which minimize the average of the absolute values of the relative errors (AARE) of man-month estimates for two industry-supplied casebases. For example, when using three COCOMO cost drivers (complexity, language experience, application experience) and the COCOMO effort multiplier values, AARE values were 60% for casebase 1 and 53% for casebase 2; using Monte Carlo to obtain optimal effort multiplier values, AARE values were 34% for casebase 1 and 41% for casebase 2. By repeatedly removing the cases which contributed the greatest Absolute Relative Error, the Monte Carlo method was also used to determine optimal casebase subsets with AARE values of less than 10%. This latter approach identifies casebase cases for which the cost drivers may have been rated incorrectly or cases which are not rated consistently with respect to a subset of cases.
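
The AARE metric named above can be sketched as follows. This assumes the relative error of a case is measured against the actual effort, $|actual - estimate| / actual$, which the abstract does not spell out; the function names and the sample figures are hypothetical and are not taken from the thesis.

```python
def aare(actual, estimated):
    """Average of the absolute relative errors, |actual - estimate| / actual."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def worst_case_index(actual, estimated):
    """Index of the case with the greatest absolute relative error, i.e. the
    case that the pruning step described in the abstract would remove next."""
    errors = [abs(a - e) / a for a, e in zip(actual, estimated)]
    return max(range(len(errors)), key=errors.__getitem__)


# Hypothetical man-month figures, purely for illustration.
actual_effort = [100.0, 200.0]
estimated_effort = [110.0, 150.0]
score = aare(actual_effort, estimated_effort)
worst = worst_case_index(actual_effort, estimated_effort)
```

Here the two cases have relative errors of 10% and 25%, so `score` is their mean and `worst` points at the second case.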

    Optimal Data-Dependent Hashing for Approximate Near Neighbors

    We show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an $n$-point data set in a $d$-dimensional space our data structure achieves query time $O(d n^{\rho+o(1)})$ and space $O(n^{1+\rho+o(1)} + dn)$, where $\rho=\tfrac{1}{2c^2-1}$ for the Euclidean space and approximation $c>1$. For the Hamming space, we obtain an exponent of $\rho=\tfrac{1}{2c-1}$. Our result completes the direction set forth in [AINR14] who gave a proof-of-concept that data-dependent hashing can outperform classical Locality Sensitive Hashing (LSH). In contrast to [AINR14], the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures [IM98,AI06] for all approximation factors $c>1$. From the technical perspective, we proceed by decomposing an arbitrary dataset into several subsets that are, in a certain sense, pseudo-random. Comment: 36 pages, 5 figures, an extended abstract appeared in the proceedings of the 47th ACM Symposium on Theory of Computing (STOC 2015)
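
A quick numeric comparison may help in reading the exponents above. The data-dependent formulas come from the abstract; the classical LSH exponents $1/c$ (Hamming) and $1/c^2$ (Euclidean) are the standard values associated with [IM98] and [AI06] and are supplied here as background, not quoted from this abstract. The function names are hypothetical, and the $o(1)$ terms are ignored.

```python
def rho_data_dependent_euclidean(c):
    """Exponent from the abstract for Euclidean space: 1 / (2c^2 - 1)."""
    return 1.0 / (2.0 * c * c - 1.0)


def rho_data_dependent_hamming(c):
    """Exponent from the abstract for Hamming space: 1 / (2c - 1)."""
    return 1.0 / (2.0 * c - 1.0)


def rho_classical_euclidean(c):
    """Classical LSH exponent for Euclidean space (background assumption): 1 / c^2."""
    return 1.0 / (c * c)


def rho_classical_hamming(c):
    """Classical LSH exponent for Hamming space (background assumption): 1 / c."""
    return 1.0 / c


c = 2.0
euclid_new, euclid_old = rho_data_dependent_euclidean(c), rho_classical_euclidean(c)
hamming_new, hamming_old = rho_data_dependent_hamming(c), rho_classical_hamming(c)
```

For approximation factor `c = 2` the Euclidean exponent drops from 1/4 to 1/7 and the Hamming exponent from 1/2 to 1/3, and the data-dependent exponent stays smaller for every `c > 1`.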

    Sparse Regression Codes for Multi-terminal Source and Channel Coding

    We study a new class of codes for Gaussian multi-terminal source and channel coding. These codes are designed using the statistical framework of high-dimensional linear regression and are called Sparse Superposition or Sparse Regression codes. Codewords are linear combinations of subsets of columns of a design matrix. These codes were recently introduced by Barron and Joseph and shown to achieve the channel capacity of AWGN channels with computationally feasible decoding. They have also recently been shown to achieve the optimal rate-distortion function for Gaussian sources. In this paper, we demonstrate how to implement random binning and superposition coding using sparse regression codes. In particular, with minimum-distance encoding/decoding it is shown that sparse regression codes attain the optimal information-theoretic limits for a variety of multi-terminal source and channel coding problems. Comment: 9 pages, appeared in the Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing - 201