
    Markov Properties of Discrete Determinantal Point Processes

    Determinantal point processes (DPPs) are probabilistic models for repulsion. When used to represent the occurrence of random subsets of a finite base set, DPPs make it possible to model global negative associations in a mathematically elegant and direct way. Discrete DPPs have become popular and computationally tractable models for machine learning tasks that require the selection of diverse objects, and have been successfully applied to numerous real-life problems. Despite their popularity, the statistical properties of such models have not been adequately explored. In this note, we derive the Markov properties of discrete DPPs and show how they can be expressed using graphical models.
    Comment: 9 pages, 1 figure
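The negative association described in the abstract above can be illustrated with a small numerical sketch. It relies on the standard definition of a discrete DPP through a marginal kernel K, where the probability that a set A is contained in the random subset Y equals the determinant of K restricted to A. The 3-item kernel and the helper name `inclusion_probability` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import numpy as np

def inclusion_probability(K, subset):
    """Return P(subset is contained in Y) = det(K[subset, subset])."""
    idx = list(subset)
    if not idx:
        return 1.0  # the empty set is always contained in Y
    return float(np.linalg.det(K[np.ix_(idx, idx)]))

# Hypothetical marginal kernel: symmetric, eigenvalues 0.8, 0.2 and 0.4,
# all in [0, 1], so it defines a valid DPP on the items {0, 1, 2}.
K = np.array([[0.5, 0.3, 0.0],
              [0.3, 0.5, 0.0],
              [0.0, 0.0, 0.4]])

p0 = inclusion_probability(K, [0])
p1 = inclusion_probability(K, [1])
p2 = inclusion_probability(K, [2])
p01 = inclusion_probability(K, [0, 1])  # K00*K11 - K01**2
p02 = inclusion_probability(K, [0, 2])  # K00*K22 because K02 == 0

print("P(0 and 1 in Y):", p01, "vs product of marginals:", p0 * p1)
print("P(0 and 2 in Y):", p02, "vs product of marginals:", p0 * p2)
```

Items 0 and 1 have a nonzero off-diagonal entry, so they appear together with probability 0.16, below the 0.25 that independence would give. Items 0 and 2 have a zero entry, so their pairwise inclusion probability factorizes.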

    Advances in the Theory of Determinantal Point Processes

    The theory of determinantal point processes has its roots in mathematical physics work from the 1960s, but only in recent years has it been developed beyond several specific examples. While there is a rich probabilistic theory, there are still many open questions in this area, and its applications to statistics and machine learning are still largely unexplored. Our contributions are threefold. First, we develop the theory of determinantal point processes on a finite set. While there is a small body of literature on this topic, we offer a new perspective that allows us to unify and extend previous results. Second, we investigate several new kernels. We describe these processes explicitly, and investigate the new discrete distribution which arises from our computations. Finally, we show how the parameters of a determinantal point process over a finite ground set with a symmetric kernel may be computed if infinite samples are available. This algorithm is a vital step towards the use of determinantal point processes as a general statistical model.

    Efficient Failure Pattern Identification of Predictive Algorithms

    Given a (machine learning) classifier and a collection of unlabeled data, how can we efficiently identify misclassification patterns present in this dataset? To address this problem, we propose a human-machine collaborative framework that consists of a team of human annotators and a sequential recommendation algorithm. The recommendation algorithm is conceptualized as a stochastic sampler that, in each round, queries the annotators for the true labels of a subset of samples and obtains feedback on whether the samples are misclassified. The sampling mechanism needs to balance between discovering new patterns of misclassification (exploration) and confirming the potential patterns of classification (exploitation). We construct a determinantal point process, whose intensity balances the exploration-exploitation trade-off through the weighted update of the posterior at each round, to form the generator of the stochastic sampler. The numerical results empirically demonstrate the competitive performance of our framework on multiple datasets at various signal-to-noise ratios.
    Comment: 19 pages, Accepted for UAI202

    Intertwining wavelets or Multiresolution analysis on graphs through random forests

    We propose a new method for performing multiscale analysis of functions defined on the vertices of a finite connected weighted graph. Our approach relies on a random spanning forest to downsample the set of vertices, and on approximate solutions of a Markov intertwining relation to provide a subgraph structure and a filter bank leading to a wavelet basis of the set of functions. Our construction involves two parameters q and q'. The first one controls the mean number of kept vertices in the downsampling, while the second one is a tuning parameter between space localization and frequency localization. We provide an explicit reconstruction formula, bounds on the reconstruction operator norm and on the error in the intertwining relation, and a Jackson-like inequality. These bounds suggest a way to choose the parameters q and q'. We illustrate the method by numerical experiments.
    Comment: 39 pages, 12 figures
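The downsampling step mentioned in the abstract above can be sketched as follows. The abstract does not say how the forest is drawn, so this sketch assumes Wilson's algorithm with a killing rate q, a common way to sample random rooted spanning forests. The function name `sample_rooted_forest`, the graph encoding, and the 6-cycle example are hypothetical. The roots of the sampled forest play the role of the kept vertices, and a larger q tends to produce more roots.

```python
import random

def sample_rooted_forest(adj, q, rng):
    """Sample a rooted spanning forest with Wilson's algorithm (assumed method).

    adj maps each node to a dict {neighbor: positive weight}; q > 0 is the
    killing rate. Returns (roots, parent), where parent[v] is the next node
    on the path from a non-root v towards its root.
    """
    in_forest, roots, parent = set(), set(), {}
    for start in adj:
        if start in in_forest:
            continue
        # Random walk from start until it is killed or hits the forest.
        # Keeping only the last exit from each node erases loops implicitly.
        last_exit = {}
        u = start
        while u not in in_forest:
            total = sum(adj[u].values())
            if rng.random() < q / (q + total):
                last_exit[u] = None  # walk killed here: u becomes a root
                break
            v = rng.choices(list(adj[u]), weights=list(adj[u].values()))[0]
            last_exit[u] = v
            u = v
        # Attach the loop-erased path to the forest.
        u = start
        while u not in in_forest:
            in_forest.add(u)
            if last_exit[u] is None:
                roots.add(u)
                break
            parent[u] = last_exit[u]
            u = last_exit[u]
    return roots, parent

# Hypothetical example: an unweighted cycle on 6 vertices.
n = 6
graph = {i: {(i - 1) % n: 1.0, (i + 1) % n: 1.0} for i in range(n)}
roots, parent = sample_rooted_forest(graph, q=1.0, rng=random.Random(0))
print("kept vertices (roots):", sorted(roots))
```

Every vertex ends up either as a root or with exactly one parent, so following `parent` from any vertex leads to a root of its tree.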