66 research outputs found
Learning Determinantal Point Processes
Determinantal point processes (DPPs), which arise in random matrix theory and
quantum physics, are natural models for subset selection problems where
diversity is preferred. Among many remarkable properties, DPPs offer tractable
algorithms for exact inference, including computing marginal probabilities and
sampling; however, an important open question has been how to learn a DPP from
labeled training data. In this paper we propose a natural feature-based
parameterization of conditional DPPs, and show how it leads to a convex and
efficient learning formulation. We analyze the relationship between our model
and binary Markov random fields with repulsive potentials, which are
qualitatively similar but computationally intractable. Finally, we apply our
approach to the task of extractive summarization, where the goal is to choose a
small subset of sentences conveying the most important information from a set
of documents. In this task there is a fundamental tradeoff between sentences
that are highly relevant to the collection as a whole, and sentences that are
diverse and not repetitive. Our parameterization allows us to naturally balance
these two characteristics. We evaluate our system on data from the DUC 2003/04
multi-document summarization task, achieving state-of-the-art results.
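The tractable exact inference the abstract mentions rests on the identity P(S ⊆ Y) = det(K_S), where K = L(L + I)^{-1} is the marginal kernel of an L-ensemble DPP. A minimal numpy sketch (the kernel, item count, and subset here are illustrative, not from the paper):

```python
import numpy as np

# Build a small positive semidefinite L-ensemble kernel from random features.
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))    # 5 items, 3 features each
L = B @ B.T                    # L-ensemble kernel (PSD by construction)

# Marginal kernel K = L (L + I)^{-1}; then P(S subset of Y) = det(K_S).
K = L @ np.linalg.inv(L + np.eye(5))

S = [0, 2]                     # subset whose inclusion probability we want
marginal = np.linalg.det(K[np.ix_(S, S)])
print(marginal)                # a probability in [0, 1]
```

Because K's eigenvalues lie in [0, 1), every principal minor is a valid probability, which is what makes marginal queries tractable in closed form.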
Empirical Limitations on High Frequency Trading Profitability
Addressing the ongoing examination of high-frequency trading practices in
financial markets, we report the results of an extensive empirical study
estimating the maximum possible profitability of the most aggressive such
practices, and arrive at figures that are surprisingly modest. By "aggressive"
we mean any trading strategy exclusively employing market orders and relatively
short holding periods. Our findings highlight the tension between execution
costs and trading horizon confronted by high-frequency traders, and provide a
controlled and large-scale empirical perspective on the high-frequency debate
that has heretofore been absent. Our study employs a number of novel empirical
methods, including the simulation of an "omniscient" high-frequency trader who
can see the future and act accordingly.
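The "omniscient" trader simulation can be sketched as a small dynamic program: with full knowledge of future prices, choose non-overlapping market-order round trips, each held for at most a fixed horizon and charged a fixed execution cost. This toy version (the prices, horizon, and cost values are made up for illustration, and it is not the paper's methodology) conveys the idea:

```python
from functools import lru_cache

def omniscient_profit(prices, horizon, cost):
    """Max profit of a clairvoyant trader using only market orders,
    holding each (long) position for at most `horizon` steps and
    paying `cost` per round-trip trade. Simple DP over entry times."""
    n = len(prices)

    @lru_cache(maxsize=None)
    def best(t):
        if t >= n - 1:
            return 0.0
        # Option 1: stay flat at time t.
        profit = best(t + 1)
        # Option 2: buy at t, sell at t+h within the holding horizon.
        for h in range(1, min(horizon, n - 1 - t) + 1):
            profit = max(profit, prices[t + h] - prices[t] - cost + best(t + h))
        return profit

    return best(0)

prices = [100.0, 101.5, 100.8, 102.0, 101.0, 103.2]
print(omniscient_profit(prices, horizon=2, cost=0.1))
```

Even this upper-bound construction illustrates the tension the abstract describes: shorter horizons force more round trips, so execution costs are paid more often.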
Subset-Based Instance Optimality in Private Estimation
We propose a new definition of instance optimality for differentially private
estimation algorithms. Our definition requires an optimal algorithm to compete,
simultaneously for every dataset, with the best private benchmark algorithm
that (a) knows the dataset in advance and (b) is evaluated by its worst-case
performance on large subsets of the dataset. That is, the benchmark algorithm
need not perform well when potentially extreme points are added to the dataset;
it only has to handle the removal of a small number of real data points that
already exist.
This makes our benchmark significantly stronger than those proposed in prior
work. We nevertheless show, for real-valued datasets, how to construct private
algorithms that achieve our notion of instance optimality when estimating a
broad class of dataset properties, including means, quantiles, and
ℓp-norm minimizers. For means in particular, we provide a detailed
analysis and show that our algorithm simultaneously matches or exceeds the
asymptotic performance of existing algorithms under a range of distributional
assumptions.
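For context, a standard non-instance-optimal baseline for private mean estimation is clipping plus Laplace noise calibrated to worst-case sensitivity. The sketch below is that classical baseline, not the paper's instance-optimal algorithm; the data distribution, clipping range, and ε are all illustrative assumptions:

```python
import numpy as np

def laplace_mean(data, lo, hi, epsilon, rng):
    """ε-differentially-private mean via clipping + Laplace noise.
    After clipping to [lo, hi], one point changes the mean by at most
    (hi - lo) / n, so Laplace noise with scale (hi - lo) / (n * ε) suffices."""
    x = np.clip(np.asarray(data, dtype=float), lo, hi)
    n = len(x)
    scale = (hi - lo) / (n * epsilon)
    return x.mean() + rng.laplace(0.0, scale)

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=1.0, size=10_000)
print(laplace_mean(data, lo=0.0, hi=10.0, epsilon=1.0, rng=rng))
```

The noise here is calibrated to the worst case over all datasets in the clipped range; the subset-based benchmark in the abstract demands far better on well-behaved inputs, which is what makes matching it nontrivial.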