
    Covering Points by Disjoint Boxes with Outliers

    For a set of n points in the plane, we consider the axis-aligned (p,k)-Box Covering problem: Find p axis-aligned, pairwise-disjoint boxes that together contain at least n-k points. In this paper, we consider the boxes to be either squares or rectangles, and we want to minimize the area of the largest box. For general p we show that the problem is NP-hard for both squares and rectangles. For a small, fixed number p, we give algorithms that find the solution in the following running times: For squares we have O(n + k log k) time for p=1, and O(n log n + k^p log^p k) time for p=2,3. For rectangles we get O(n + k^3) time for p=1 and O(n log n + k^{2+p} log^{p-1} k) time for p=2,3. In all cases, our algorithms use O(n) space.
    Comment: updated version: changed the problem from 'cover exactly n-k points' to 'cover at least n-k points' to avoid non-feasible solutions (results are unchanged); added a proof of Lemma 11 and clarified some sections; corrected typos and small errors; updated affiliations of two authors.
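    As a sanity check on the problem statement, the sketch below (an illustrative brute force, not one of the paper's algorithms) handles the simplest case, p=1 with rectangles: a minimum-area rectangle covering at least n-k points can be shrunk until every edge touches a point, so it suffices to enumerate candidate edge coordinates taken from the input. This runs in roughly O(n^5) time, far from the O(n + k^3) bound above, but it makes the objective concrete.

def min_area_rect_with_outliers(points, k):
    """Brute force for p = 1 with rectangles: smallest-area axis-aligned
    rectangle covering at least n - k of the given points.
    Candidate edges come from the points' own coordinates, since an
    optimal rectangle can be shrunk until every edge touches a covered
    point. Roughly O(n^5) time -- illustration only."""
    n = len(points)
    need = n - k
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    best = None  # (area, (x1, x2, y1, y2))
    for i, x1 in enumerate(xs):
        for x2 in xs[i:]:
            for j, y1 in enumerate(ys):
                for y2 in ys[j:]:
                    covered = sum(x1 <= x <= x2 and y1 <= y <= y2
                                  for x, y in points)
                    if covered >= need:
                        area = (x2 - x1) * (y2 - y1)
                        if best is None or area < best[0]:
                            best = (area, (x1, x2, y1, y2))
    return best

pts = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
print(min_area_rect_with_outliers(pts, 1))  # (1, (0, 1, 0, 1)): drops the outlier at (10, 10)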

    Covering many points with a small-area box

    Let P be a set of n points in the plane. We show how to find, for a given integer k > 0, the smallest-area axis-parallel rectangle that covers k points of P in O(nk^2 log n + n log^2 n) time. We also consider the problem of, given a value α > 0, covering as many points of P as possible with an axis-parallel rectangle of area at most α. For this problem we give a probabilistic (1-ε)-approximation that works in near-linear time: in O((n/ε^4) log^3 n log(1/ε)) time we find an axis-parallel rectangle of area at most α that, with high probability, covers at least (1-ε)κ* points, where κ* is the maximum possible number of points that could be covered.
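    The same canonical-rectangle observation used in the sketch above gives a brute-force baseline for the budgeted variant: an optimal rectangle of area at most α can be shrunk until each edge touches a covered point, so candidate rectangles again come from 4-tuples of input coordinates. The sketch below is an illustration of the objective only, not the paper's near-linear (1-ε)-approximation.

def max_coverage_rect(points, alpha):
    """Brute force: axis-parallel rectangle of area <= alpha covering
    as many points as possible. An optimal rectangle can be shrunk
    until each edge touches a covered point, so candidate edges come
    from the input coordinates. Roughly O(n^5) time -- illustration
    only, not the paper's near-linear randomized approximation."""
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    best = (0, None)  # (points covered, rectangle)
    for i, x1 in enumerate(xs):
        for x2 in xs[i:]:
            for j, y1 in enumerate(ys):
                for y2 in ys[j:]:
                    if (x2 - x1) * (y2 - y1) > alpha:
                        continue
                    covered = sum(x1 <= x <= x2 and y1 <= y <= y2
                                  for x, y in points)
                    if covered > best[0]:
                        best = (covered, (x1, x2, y1, y2))
    return best

pts = [(0, 0), (0.5, 0.4), (1, 1), (3, 3)]
print(max_coverage_rect(pts, 1.0))  # (3, (0, 1, 0, 1)): covers the three clustered points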

    Covering points by disjoint boxes with outliers

    For a set of n points in the plane, we consider the axis-aligned (p,k)-Box Covering problem: Find p axis-aligned, pairwise-disjoint boxes that together contain at least n-k points. In this paper, we consider the boxes to be either squares or rectangles, and we want to minimize the area of the largest box. For general p we show that the problem is NP-hard for both squares and rectangles. For a small, fixed number p, we give algorithms that find the solution in the following running times: For squares we have O(n + k log k) time for p=1, and O(n log n + k^p log^p k) time for p=2,3. For rectangles we get O(n + k^3) time for p=1 and O(n log n + k^{2+p} log^{p-1} k) time for p=2,3. In all cases, our algorithms use O(n) space.

    Learning From Almost No Data

    The tremendous recent growth in the fields of artificial intelligence and machine learning has largely been tied to the availability of big data and massive amounts of compute. The increasingly popular approach of training large neural networks on large datasets has provided great returns, but it leaves behind the multitude of researchers, companies, and practitioners who do not have access to sufficient funding, compute power, or volume of data. This thesis aims to rectify this growing imbalance by probing the limits of what machine learning and deep learning methods can achieve with small data. What knowledge does a dataset contain? At the highest level, a dataset is just a collection of samples: images, text, etc. Yet somehow, when we train models on these datasets, they are able to find patterns, make inferences, detect similarities, and otherwise generalize to samples that they have previously never seen. This suggests that datasets may contain some kind of intrinsic knowledge about the systems or distributions from which they are sampled. Moreover, it appears that this knowledge is somehow distributed and duplicated across the samples; we intuitively expect that removing an image from a large training set will have virtually no impact on the final model performance. We develop a framework to explain efficient generalization around three principles: information sharing, information repackaging, and information injection. We use this framework to propose 'less than one'-shot learning, an extreme form of few-shot learning where a learner must recognize N classes from M < N training examples. To achieve this extreme level of efficiency, we develop new framework-consistent methods and theory for lost data restoration, for dataset size reduction, and for few-shot learning with deep neural networks and other popular machine learning models.
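    The 'less than one'-shot claim, recognizing N classes from M < N examples, is easiest to see with soft labels. The toy below is a minimal construction in the spirit of soft-label prototypes; the names, numbers, and classifier are illustrative assumptions, not code from the thesis. Two points carrying probability distributions over three classes, classified with a distance-weighted vote, produce three distinct decision regions.

import numpy as np

# Two soft-labeled prototypes on a line, three classes (0, 1, 2).
prototypes = np.array([[0.0], [1.0]])
soft_labels = np.array([
    [0.6, 0.4, 0.0],   # prototype at x = 0 leans toward class 0
    [0.0, 0.4, 0.6],   # prototype at x = 1 leans toward class 2
])

def predict(query, eps=1e-9):
    """Distance-weighted soft-label vote over all prototypes."""
    d = np.abs(prototypes[:, 0] - query)
    w = 1.0 / (d + eps)
    combined = w @ soft_labels / w.sum()
    return int(np.argmax(combined))

# Three classes recovered from only two training points:
for q in (0.1, 0.5, 0.9):
    print(q, "->", predict(q))   # 0.1 -> 0, 0.5 -> 1, 0.9 -> 2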