
    Efficient Optimally Lazy Algorithms for Minimal-Interval Semantics

    Minimal-interval semantics associates with each query over a document a set of intervals, called witnesses, that are incomparable with respect to inclusion (i.e., they form an antichain): witnesses define the minimal regions of the document satisfying the query. Minimal-interval semantics makes it easy to define and compute several sophisticated proximity operators, provides snippets for user presentation, and can be used to rank documents. In this paper we provide algorithms for computing conjunction and disjunction that are linear in the number of intervals and logarithmic in the number of operands; for additional operators, such as ordered conjunction and Brouwerian difference, we provide linear algorithms. In all cases, space is linear in the number of operands. More importantly, we define a formal notion of optimal laziness, and either prove it, or prove its impossibility, for each algorithm. We cast our results in a general framework of antichains of intervals on total orders, making our algorithms directly applicable to other domains.
    Comment: 24 pages, 4 figures. A preliminary (now outdated) version was presented at SPIRE 200
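    As a rough illustration of the semantics (not the paper's optimally lazy algorithms), the sketch below computes, for an AND query over two or more terms, the antichain of minimal intervals containing at least one occurrence of every term. The function name, the sorted token-offset input, and the final inclusion filter are assumptions made for this example.

        def minimal_witnesses(position_lists):
            """Antichain of minimal intervals (l, r) containing at least one
            position from every sorted list (AND under minimal-interval semantics)."""
            if any(not lst for lst in position_lists):
                return []
            ptrs = [0] * len(position_lists)
            candidates = []
            while True:
                current = [lst[p] for lst, p in zip(position_lists, ptrs)]
                l, r = min(current), max(current)
                candidates.append((l, r))
                # Advance the pointer sitting on the leftmost position; once it
                # runs off its list, no candidate with a larger left end exists.
                i = current.index(l)
                ptrs[i] += 1
                if ptrs[i] == len(position_lists[i]):
                    break
            # Keep only intervals that do not properly contain another candidate.
            # Sorting by (l ascending, r descending) lets one right-to-left sweep do it.
            candidates = sorted(set(candidates), key=lambda iv: (iv[0], -iv[1]))
            witnesses, best_r = [], float("inf")
            for l, r in reversed(candidates):
                if r < best_r:
                    witnesses.append((l, r))
                    best_r = r
            return list(reversed(witnesses))

        # minimal_witnesses([[2, 4], [1, 9]]) -> [(1, 2), (4, 9)]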

    Bounded regret in stochastic multi-armed bandits

    We study the stochastic multi-armed bandit problem when one knows the value $\mu^{(\star)}$ of an optimal arm, as well as a positive lower bound on the smallest positive gap $\Delta$. We propose a new randomized policy that attains a regret {\em uniformly bounded over time} in this setting. We also prove several lower bounds, which show in particular that bounded regret is not possible if one only knows $\Delta$, and bounded regret of order $1/\Delta$ is not possible if one only knows $\mu^{(\star)}$.
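    For intuition only, the sketch below shows one simple way to exploit knowledge of the optimal mean: a round-robin index policy that retires an arm once its upper confidence bound falls below the known value. It is a generic UCB-style illustration under assumed bounded rewards, not the randomized policy analysed in the paper.

        import math
        import random

        def play(arms, mu_star, horizon, seed=0):
            """Round-robin over 'active' arms, retiring an arm once its upper
            confidence bound drops below the known optimal mean mu_star."""
            rng = random.Random(seed)
            k = len(arms)
            counts, means = [0] * k, [0.0] * k
            active = set(range(k))
            total = 0.0
            for t in range(1, horizon + 1):
                for i in list(active):
                    if counts[i] > 0 and len(active) > 1:
                        ucb = means[i] + math.sqrt(2.0 * math.log(t) / counts[i])
                        if ucb < mu_star:
                            active.discard(i)  # very likely suboptimal: stop paying regret on it
                i = min(active, key=lambda j: counts[j])  # least-sampled active arm
                reward = arms[i](rng)  # arms are reward-sampling callables (an assumption)
                counts[i] += 1
                means[i] += (reward - means[i]) / counts[i]
                total += reward
            return total

        # Two Bernoulli arms with means 0.7 (optimal, known) and 0.4:
        # play([lambda r: float(r.random() < 0.7), lambda r: float(r.random() < 0.4)], 0.7, 10_000)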

    Reducing statistical time-series problems to binary classification

    We show how binary classification methods developed to work on i.i.d. data can be used for solving statistical problems that are seemingly unrelated to classification and concern highly dependent time series. Specifically, the problems of time-series clustering, homogeneity testing and the three-sample problem are addressed. The algorithms that we construct for solving these problems are based on a new metric between time-series distributions, which can be evaluated using binary classification methods. Universal consistency of the proposed algorithms is proven under the most general assumptions. The theoretical results are illustrated with experiments on synthetic and real-world data.
    Comment: In proceedings of NIPS 2012, pp. 2069-207
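    To make the reduction concrete, the hypothetical snippet below scores how different two series are by training an off-the-shelf classifier to tell their length-m windows apart, using accuracy above chance as a dissimilarity. The windowing, the choice of logistic regression, and the single train/test split are assumptions for illustration, not the metric construction used in the paper.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def classifier_distance(x, y, block=5, seed=0):
            """Train a classifier to distinguish windows of x from windows of y;
            returns 0 when they are indistinguishable, approaching 1 when separable."""
            def windows(s):
                s = np.asarray(s, dtype=float)
                return np.array([s[i:i + block] for i in range(len(s) - block + 1)])
            xw, yw = windows(x), windows(y)
            data = np.vstack([xw, yw])
            labels = np.r_[np.zeros(len(xw)), np.ones(len(yw))]
            idx = np.random.default_rng(seed).permutation(len(data))
            split = len(data) // 2
            clf = LogisticRegression(max_iter=1000)
            clf.fit(data[idx[:split]], labels[idx[:split]])
            acc = clf.score(data[idx[split:]], labels[idx[split:]])
            return max(0.0, 2.0 * acc - 1.0)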

    Quantifying Homology Classes

    We develop a method for measuring homology classes. This involves three problems. First, we define the size of a homology class, using ideas from relative homology. Second, we define an optimal basis of a homology group to be the basis whose elements' sizes have the minimal sum. We provide a greedy algorithm to compute the optimal basis and measure classes in it. The algorithm runs in $O(\beta^4 n^3 \log^2 n)$ time, where $n$ is the size of the simplicial complex and $\beta$ is the Betti number of the homology group. Third, we discuss different ways of localizing homology classes and prove some hardness results.
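    As background for the quantities named in the abstract (not the paper's greedy algorithm), the sketch below computes Betti numbers of a tiny simplicial complex by ranking boundary matrices over Z/2; the hollow-triangle complex and the helper names are made up for the example.

        import numpy as np

        def gf2_rank(matrix):
            """Rank of a 0/1 matrix over the two-element field, by Gaussian elimination."""
            m = np.array(matrix, dtype=np.uint8) % 2
            rank = 0
            for col in range(m.shape[1]):
                pivots = np.nonzero(m[rank:, col])[0]
                if len(pivots) == 0:
                    continue
                pivot = pivots[0] + rank
                m[[rank, pivot]] = m[[pivot, rank]]           # move pivot row up
                below = np.nonzero(m[rank + 1:, col])[0] + rank + 1
                m[below] ^= m[rank]                           # clear the column below the pivot
                rank += 1
            return rank

        # Hollow triangle: 3 vertices, 3 edges (ab, bc, ca), no 2-simplices.
        boundary_1 = [[1, 0, 1],   # vertex a lies on edges ab and ca
                      [1, 1, 0],   # vertex b lies on edges ab and bc
                      [0, 1, 1]]   # vertex c lies on edges bc and ca
        r1 = gf2_rank(boundary_1)
        beta_0 = 3 - r1            # vertices minus rank of boundary_1 -> 1 component
        beta_1 = (3 - r1) - 0      # edges minus rank of boundary_1, minus rank of boundary_2 (none) -> 1 loop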