    Learning Determinantal Point Processes

    Determinantal point processes (DPPs), which arise in random matrix theory and quantum physics, are natural models for subset selection problems where diversity is preferred. Among many remarkable properties, DPPs offer tractable algorithms for exact inference, including computing marginal probabilities and sampling; however, an important open question has been how to learn a DPP from labeled training data. In this paper we propose a natural feature-based parameterization of conditional DPPs, and show how it leads to a convex and efficient learning formulation. We analyze the relationship between our model and binary Markov random fields with repulsive potentials, which are qualitatively similar but computationally intractable. Finally, we apply our approach to the task of extractive summarization, where the goal is to choose a small subset of sentences conveying the most important information from a set of documents. In this task there is a fundamental tradeoff between sentences that are highly relevant to the collection as a whole, and sentences that are diverse and not repetitive. Our parameterization allows us to naturally balance these two characteristics. We evaluate our system on data from the DUC 2003/04 multi-document summarization task, achieving state-of-the-art results.

    Empirical Limitations on High Frequency Trading Profitability

    Addressing the ongoing examination of high-frequency trading practices in financial markets, we report the results of an extensive empirical study estimating the maximum possible profitability of the most aggressive such practices, and arrive at figures that are surprisingly modest. By "aggressive" we mean any trading strategy exclusively employing market orders and relatively short holding periods. Our findings highlight the tension between execution costs and trading horizon confronted by high-frequency traders, and provide a controlled and large-scale empirical perspective on the high-frequency debate that has heretofore been absent. Our study employs a number of novel empirical methods, including the simulation of an "omniscient" high-frequency trader who can see the future and act accordingly.

    Subset-Based Instance Optimality in Private Estimation

    We propose a new definition of instance optimality for differentially private estimation algorithms. Our definition requires an optimal algorithm to compete, simultaneously for every dataset $D$, with the best private benchmark algorithm that (a) knows $D$ in advance and (b) is evaluated by its worst-case performance on large subsets of $D$. That is, the benchmark algorithm need not perform well when potentially extreme points are added to $D$; it only has to handle the removal of a small number of real data points that already exist. This makes our benchmark significantly stronger than those proposed in prior work. We nevertheless show, for real-valued datasets, how to construct private algorithms that achieve our notion of instance optimality when estimating a broad class of dataset properties, including means, quantiles, and $\ell_p$-norm minimizers. For means in particular, we provide a detailed analysis and show that our algorithm simultaneously matches or exceeds the asymptotic performance of existing algorithms under a range of distributional assumptions.