    Learning from networked examples

    Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share common objects and hence the features of those shared objects. We show that the classic approach of ignoring this problem can harm the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples and discard the remaining information, which is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which lead to novel concentration inequalities.

    Algorithms to Approximate Column-Sparse Packing Problems

    Column-sparse packing problems arise in several contexts in both deterministic and stochastic discrete optimization. We present two unifying ideas, (non-uniform) attenuation and multiple-chance algorithms, to obtain improved approximation algorithms for some well-known families of such problems. As three main examples, we attain the integrality gap, up to lower-order terms, for known LP relaxations for k-column sparse packing integer programs (Bansal et al., Theory of Computing, 2012) and stochastic k-set packing (Bansal et al., Algorithmica, 2012), and go "half the remaining distance" to optimal for a major integrality-gap conjecture of Furedi, Kahn and Seymour on hypergraph matching (Combinatorica, 1993). Comment: Extended abstract appeared in SODA 2018. Full version in ACM Transactions of Algorithm

    Nonnegative k-sums, fractional covers, and probability of small deviations

    More than twenty years ago, Manickam, Miklós, and Singhi conjectured that for any integers $n, k$ satisfying $n \geq 4k$, every set of $n$ real numbers with nonnegative sum has at least $\binom{n-1}{k-1}$ $k$-element subsets whose sum is also nonnegative. In this paper we discuss the connection of this problem with matchings and fractional covers of hypergraphs, and with the question of estimating the probability that the sum of nonnegative independent random variables exceeds its expectation by a given amount. Using these connections together with some probabilistic techniques, we verify the conjecture for $n \geq 33k^2$. This substantially improves the best previously known exponential lower bound $n \geq e^{ck \log\log k}$. In addition we prove a tight stability result showing that for every $k$ and all sufficiently large $n$, every set of $n$ reals with a nonnegative sum that does not contain a member whose sum with any other $k-1$ members is nonnegative, contains at least $\binom{n-1}{k-1}+\binom{n-k-1}{k-1}-1$ subsets of cardinality $k$ with nonnegative sum. Comment: 15 pages, a section of Hilton-Milner type result added
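The conjectured bound can be checked by brute force on small inputs. The sketch below is illustrative only and is not taken from the paper: the function names `count_nonnegative_k_subsets` and `mms_bound` are hypothetical, and the example input (one large positive number plus copies of -1) is a standard extremal configuration chosen here as an assumption to show that the bound can be met with equality.

```python
from itertools import combinations
from math import comb


def count_nonnegative_k_subsets(values, k):
    """Count k-element subsets (chosen by position) whose sum is nonnegative."""
    return sum(1 for subset in combinations(values, k) if sum(subset) >= 0)


def mms_bound(n, k):
    """Conjectured lower bound binom(n-1, k-1) on the number of such subsets."""
    return comb(n - 1, k - 1)


# Hypothetical example: n = 8, k = 2 (so n >= 4k), total sum exactly 0.
n, k = 8, 2
values = [n - 1] + [-1] * (n - 1)
assert sum(values) >= 0

count = count_nonnegative_k_subsets(values, k)
print(count, mms_bound(n, k))  # prints "7 7": the bound is met with equality
```

Only the pairs containing the large element have nonnegative sum here, which is why the count equals $\binom{n-1}{k-1}$. Enumeration is exponential in general, so this is a sanity check for tiny cases, not a proof technique.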

    Asymmetric Lee Distance Codes for DNA-Based Storage

    We consider a new family of codes, termed asymmetric Lee distance codes, that arise in the design and implementation of DNA-based storage systems and systems with parallel string transmission protocols. The codewords are defined over a quaternary alphabet, although the results carry over to other alphabet sizes; furthermore, symbol confusability is dictated by their underlying binary representation. Our contributions are two-fold. First, we demonstrate that the new distance represents a linear combination of the Lee and Hamming distances and derive upper bounds on the size of the codes under this metric based on linear programming techniques. Second, we propose a number of code constructions which imply lower bounds.

    On complexity of optimized crossover for binary representations

    We consider the computational complexity of producing the best possible offspring in a crossover, given two parent solutions. The crossover operators are studied on the class of Boolean linear programming problems, where the Boolean vector of variables is used as the solution representation. By means of efficient reductions of the optimized gene transmitting crossover problems (OGTC) we show the polynomial solvability of the OGTC for the maximum weight set packing problem, the minimum weight set partition problem and for one of the versions of the simple plant location problem. We study a connection between the OGTC for the linear Boolean programming problem and the maximum weight independent set problem on 2-colorable hypergraphs and prove the NP-hardness of several special cases of the OGTC problem in Boolean linear programming. Comment: Dagstuhl Seminar 06061 "Theory of Evolutionary Algorithms", 200
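To make the OGTC problem statement concrete, here is a minimal brute-force sketch for maximum weight set packing. It assumes the usual reading of "gene transmitting": every offspring bit must equal the corresponding bit of at least one parent, so only positions where the parents differ are free. The helper names `is_packing` and `optimized_crossover` and the example data are hypothetical. This enumeration is exponential in the number of differing positions and is not the polynomial reduction described in the abstract.

```python
from itertools import product


def is_packing(sets, x):
    """Return True if the sets selected by the 0/1 vector x are pairwise disjoint."""
    seen = set()
    for s, bit in zip(sets, x):
        if bit:
            if seen & s:
                return False
            seen |= s
    return True


def optimized_crossover(sets, weights, p1, p2):
    """Best feasible child whose every bit is inherited from p1 or p2 (brute force)."""
    free = [j for j in range(len(p1)) if p1[j] != p2[j]]  # positions where parents differ
    best, best_value = None, float("-inf")
    for bits in product((0, 1), repeat=len(free)):
        child = list(p1)  # agreeing positions are copied unchanged
        for j, b in zip(free, bits):
            child[j] = b
        if is_packing(sets, child):
            value = sum(w for w, bit in zip(weights, child) if bit)
            if value > best_value:
                best, best_value = child, value
    return best, best_value


# Hypothetical instance: four candidate sets with weights, two feasible parents.
sets = [{1, 2}, {2, 3}, {3, 4}, {5}]
weights = [3, 4, 3, 1]
parent1 = [1, 0, 1, 0]  # weight 6
parent2 = [0, 1, 0, 1]  # weight 5
print(optimized_crossover(sets, weights, parent1, parent2))  # prints ([1, 0, 1, 1], 7)
```

In this instance the best child combines both sets of the first parent with the singleton from the second, and it outweighs either parent.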