57,720 research outputs found

    LASSO: Listing All Subset Sums Obediently for Evaluating Unbounded Subset Sums

    Get PDF
    In this study we present a novel algorithm, LASSO, for solving the unbounded and bounded subset sum problem. The LASSO algorithm was designed to solve the unbounded SSP quickly and to return all subsets summing to a target sum. As speed was the highest priority, we benchmarked the run time performance of LASSO against implementations of some common approaches to the bounded SSP, as well as the only comparable implementation for solving the unbounded SSP that we could find. In solving the bounded SSP, our algorithm had a significantly faster run time than the competing algorithms when the target sum returned at least one subset. When the target returned no subsets, LASSO had a poorer run time growth rate than the competing algorithms solving bounded subset sum. For solving the USSP LASSO was significantly faster than the only comparable algorithm for this problem, both in run time and run time growth rate

    Data Sketches for Disaggregated Subset Sum and Frequent Item Estimation

    Full text link
    We introduce and study a new data sketch for processing massive datasets. It addresses two common problems: 1) computing a sum given arbitrary filter conditions and 2) identifying the frequent items or heavy hitters in a data set. For the former, the sketch provides unbiased estimates with state of the art accuracy. It handles the challenging scenario when the data is disaggregated so that computing the per unit metric of interest requires an expensive aggregation. For example, the metric of interest may be total clicks per user while the raw data is a click stream with multiple rows per user. Thus the sketch is suitable for use in a wide range of applications including computing historical click through rates for ad prediction, reporting user metrics from event streams, and measuring network traffic for IP flows. We prove and empirically show the sketch has good properties for both the disaggregated subset sum estimation and frequent item problems. On i.i.d. data, it not only picks out the frequent items but gives strongly consistent estimates for the proportion of each frequent item. The resulting sketch asymptotically draws a probability proportional to size sample that is optimal for estimating sums over the data. For non i.i.d. data, we show that it typically does much better than random sampling for the frequent item problem and never does worse. For subset sum estimation, we show that even for pathological sequences, the variance is close to that of an optimal sampling design. Empirically, despite the disadvantage of operating on disaggregated data, our method matches or bests priority sampling, a state of the art method for pre-aggregated data and performs orders of magnitude better on skewed data compared to uniform sampling. We propose extensions to the sketch that allow it to be used in combining multiple data sets, in distributed systems, and for time decayed aggregation

    Exact Algorithms for 0-1 Integer Programs with Linear Equality Constraints

    Full text link
    In this paper, we show O(1.415n)O(1.415^n)-time and O(1.190n)O(1.190^n)-space exact algorithms for 0-1 integer programs where constraints are linear equalities and coefficients are arbitrary real numbers. Our algorithms are quadratically faster than exhaustive search and almost quadratically faster than an algorithm for an inequality version of the problem by Impagliazzo, Lovett, Paturi and Schneider (arXiv:1401.5512), which motivated our work. Rather than improving the time and space complexity, we advance to a simple direction as inclusion of many NP-hard problems in terms of exact exponential algorithms. Specifically, we extend our algorithms to linear optimization problems

    Anonymous Networking amidst Eavesdroppers

    Full text link
    The problem of security against timing based traffic analysis in wireless networks is considered in this work. An analytical measure of anonymity in eavesdropped networks is proposed using the information theoretic concept of equivocation. For a physical layer with orthogonal transmitter directed signaling, scheduling and relaying techniques are designed to maximize achievable network performance for any given level of anonymity. The network performance is measured by the achievable relay rates from the sources to destinations under latency and medium access constraints. In particular, analytical results are presented for two scenarios: For a two-hop network with maximum anonymity, achievable rate regions for a general m x 1 relay are characterized when nodes generate independent Poisson transmission schedules. The rate regions are presented for both strict and average delay constraints on traffic flow through the relay. For a multihop network with an arbitrary anonymity requirement, the problem of maximizing the sum-rate of flows (network throughput) is considered. A selective independent scheduling strategy is designed for this purpose, and using the analytical results for the two-hop network, the achievable throughput is characterized as a function of the anonymity level. The throughput-anonymity relation for the proposed strategy is shown to be equivalent to an information theoretic rate-distortion function
    • …
    corecore