Clustering and Inference From Pairwise Comparisons
Given a set of pairwise comparisons, the classical ranking problem computes a single ranking that best represents the preferences of all users. In this paper, we study the problem of inferring individual preferences, which arises in the context of making personalized recommendations. In particular, we assume that there are n users of r types; users of the same type provide similar pairwise comparisons for m items according to the Bradley-Terry model. We propose an efficient algorithm that accurately estimates the individual preferences for almost all users given enough pairwise comparisons per type, and its sample complexity is near optimal when r grows only logarithmically with m or n. Our algorithm has three steps: first, for each user, compute the net-win vector, which is a projection of the user's m(m-1)/2-dimensional vector of pairwise comparisons onto an m-dimensional linear subspace; second, cluster the users based on their net-win vectors; third, estimate a single preference for each cluster separately. The net-win vectors are much less noisy than the high-dimensional vectors of pairwise comparisons, and clustering is more accurate after the projection, as confirmed by numerical experiments. Moreover, we show that, when a cluster is only approximately correct, the maximum likelihood estimate for the Bradley-Terry model is still close to the true preference.
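To make the three steps concrete, here is a minimal sketch in Python (with n users, m items, and r types as above). The (user, winner, loser) input format, k-means as the clustering step, and SciPy's generic optimizer for the Bradley-Terry fit are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

def net_win_vectors(comparisons, n_users, n_items):
    """Step 1: collapse each user's comparisons into an m-dimensional
    net-win vector (wins minus losses per item)."""
    V = np.zeros((n_users, n_items))
    for u, winner, loser in comparisons:
        V[u, winner] += 1.0
        V[u, loser] -= 1.0
    return V

def bradley_terry_mle(comps, n_items):
    """Step 3: fit one Bradley-Terry score vector by maximizing the
    log-likelihood, i.e. minimizing sum log(1 + exp(-(theta_w - theta_l)))."""
    winners = np.array([w for _, w, _ in comps])
    losers = np.array([l for _, _, l in comps])
    def neg_ll(theta):
        d = theta[winners] - theta[losers]
        return np.sum(np.log1p(np.exp(-d)))
    theta = minimize(neg_ll, np.zeros(n_items)).x
    return theta - theta.mean()  # fix the additive degree of freedom

def estimate_preferences(comparisons, n_users, n_items, n_types):
    V = net_win_vectors(comparisons, n_users, n_items)
    # Step 2: cluster users in the low-dimensional net-win space.
    labels = KMeans(n_clusters=n_types, n_init=10).fit_predict(V)
    prefs = [bradley_terry_mle([c for c in comparisons if labels[c[0]] == cl],
                               n_items)
             for cl in range(n_types)]
    return labels, prefs
```

Clustering on the m-dimensional net-win vectors rather than the raw m(m-1)/2-dimensional comparison vectors is the noise-reduction step the abstract emphasizes.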
Optimal interval clustering: Application to Bregman clustering and statistical mixture learning
We present a generic dynamic programming method to compute the optimal clustering of n scalar elements into k pairwise disjoint intervals. This case includes 1D Euclidean k-means, k-medoids, k-medians, k-centers, etc. We extend the method to incorporate cluster size constraints and show how to choose the appropriate k by model selection. Finally, we illustrate and refine the method on two case studies: Bregman clustering and statistical mixture learning maximizing the complete likelihood.
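For concreteness, here is a minimal sketch of the dynamic program instantiated with the 1D k-means (sum-of-squared-errors) cost; swapping in another Bregman divergence only changes interval_cost. Function names are illustrative, and this O(k n^2) version favors clarity over speed.

```python
import numpy as np

def interval_cost(prefix, prefix_sq, i, j):
    """SSE of sorted points x[i..j] (inclusive) around their mean,
    computed in O(1) from prefix sums."""
    n = j - i + 1
    s = prefix[j + 1] - prefix[i]
    sq = prefix_sq[j + 1] - prefix_sq[i]
    return sq - s * s / n

def optimal_interval_clustering(x, k):
    """D[c][i] = optimal cost of splitting the first i sorted points
    into c intervals; the recurrence tries every start j of the last one."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    prefix_sq = np.concatenate(([0.0], np.cumsum(x * x)))
    D = np.full((k + 1, n + 1), np.inf)
    D[0][0] = 0.0
    arg = np.zeros((k + 1, n + 1), dtype=int)
    for c in range(1, k + 1):
        for i in range(c, n + 1):
            for j in range(c - 1, i):  # last interval covers x[j..i-1]
                cost = D[c - 1][j] + interval_cost(prefix, prefix_sq, j, i - 1)
                if cost < D[c][i]:
                    D[c][i], arg[c][i] = cost, j
    # Backtrack the interval boundaries (half-open [start, end) indices).
    bounds, i = [], n
    for c in range(k, 0, -1):
        bounds.append((arg[c][i], i))
        i = arg[c][i]
    return D[k][n], bounds[::-1]
```

For example, optimal_interval_clustering([1, 2, 10, 11, 12], 2) returns the boundaries [(0, 2), (2, 5)], separating {1, 2} from {10, 11, 12}, which is the globally optimal 1D 2-means solution.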
Estimation of intrinsic dimension via clustering
The problem of estimating the intrinsic dimension of a set of points in high dimensional space is a critical issue for a wide range of disciplines, including genomics, finance, and networking. The computational complexity of current estimation techniques depends on either the ambient or the intrinsic dimension, which may cause these methods to become intractable for large data sets. In this paper, we present a clustering-based methodology that exploits the inherent self-similarity of data to efficiently estimate the intrinsic dimension of a set of points. When the data satisfies a specified general clustering condition, we prove that the estimated dimension approaches the true Hausdorff dimension. Experiments show that the clustering-based approach allows for more efficient and accurate intrinsic dimension estimation compared with all prior techniques, even when the data does not conform to obvious self-similarity structure. Finally, we present empirical results which show that the clustering-based estimation allows for a natural partitioning of the data points that lie on separate manifolds of varying intrinsic dimension.
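The abstract reports results but does not spell out the estimator, so the sketch below shows only the generic self-similarity idea such methods build on: cover the data at two scales and read a dimension off the covering-number growth N(r) ~ r^(-d). The greedy covering routine and function names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def greedy_cover_count(X, r):
    """Number of balls of radius r needed to cover the rows of X,
    chosen greedily (a standard 2-approximation of the covering number)."""
    remaining = X.copy()
    count = 0
    while len(remaining):
        center = remaining[0]
        dists = np.linalg.norm(remaining - center, axis=1)
        remaining = remaining[dists > r]  # drop everything this ball covers
        count += 1
    return count

def doubling_dimension_estimate(X, r):
    """Self-similarity N(r) ~ r^(-d) gives d ~ log(N(r/2)/N(r)) / log 2."""
    n_coarse = greedy_cover_count(X, r)
    n_fine = greedy_cover_count(X, r / 2)
    return np.log(n_fine / n_coarse) / np.log(2)
```

On points sampled from a 2-dimensional plane embedded in a 10-dimensional ambient space, the estimate hovers near 2 for moderate r, independent of the ambient dimension; that scale-invariance is the property the clustering condition in the paper formalizes.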
Pairwise Covariates-adjusted Block Model for Community Detection
One of the most fundamental problems in network analysis is community detection. The stochastic block model (SBM) is a widely used model for network data, and many estimation methods have been developed for it, with their community detection consistency results established. However, the SBM is restricted by the strong assumption that all nodes in the same community are stochastically equivalent, which may not be suitable for practical applications. We introduce a pairwise covariates-adjusted stochastic block model (PCABM), a generalization of the SBM that incorporates pairwise covariate information. We study the maximum likelihood estimates of the coefficients for the covariates as well as the community assignments, and show that both are consistent under suitable sparsity conditions. Spectral clustering with adjustment (SCWA) is introduced to solve PCABM efficiently. Under certain conditions, we derive the error bound of community estimation under SCWA and show that it is community detection consistent. PCABM compares favorably with the SBM and the degree-corrected stochastic block model (DCBM) over a wide range of simulated and real networks when covariate information is accessible.
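As a rough illustration of the adjustment idea behind SCWA, the sketch below divides each edge by the estimated pairwise covariate effect exp(z_ij^T beta) and then runs vanilla spectral clustering on the adjusted matrix. This is a sketch under assumed inputs, not the paper's exact procedure; A, Z, and beta_hat are hypothetical names for the adjacency matrix, the pairwise covariate tensor, and the fitted coefficients.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering_with_adjustment(A, Z, beta_hat, k):
    """A: (n, n) adjacency matrix; Z: (n, n, p) tensor whose entry z_ij is
    the covariate vector for pair (i, j); beta_hat: (p,) fitted coefficients;
    k: number of communities."""
    effect = np.exp(Z @ beta_hat)        # (n, n) matrix of exp(z_ij^T beta)
    A_adj = A / effect                   # divide out the covariate effect
    A_adj = (A_adj + A_adj.T) / 2        # re-symmetrize numerically
    # Cluster the k leading eigenvectors (by eigenvalue magnitude),
    # as in standard adjacency spectral clustering.
    vals, vecs = np.linalg.eigh(A_adj)
    idx = np.argsort(np.abs(vals))[::-1][:k]
    U = vecs[:, idx]
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

The point of the adjustment is that, after dividing out the covariate effect, the residual matrix behaves like an SBM adjacency matrix, so the usual spectral machinery applies.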
Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making
In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap. We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation strategies based on ranking and clustering. Our main contribution is an in-depth evaluation of these strategies using computer and human-based experiments. We show that our proposed elicitation strategies outperform the currently used pairwise methods, and we found that users prefer ranking most. Our experiments further show that utilising monotonicity information in GPs, by using a linear prior mean at the start and virtual comparisons to the nadir and ideal points, increases performance. We demonstrate our decision support framework in a real-world study on traffic regulation, conducted with the city of Amsterdam. (AAMAS 2018; source code at https://github.com/lmzintgraf/gp_pref_elici)
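To make the elicitation idea concrete, here is a minimal sketch of how one elicited ranking, plus the virtual comparisons to the nadir and ideal points mentioned above, could be expanded into pairwise training data for a preference model. The GP fitting itself is omitted, and all names are illustrative rather than taken from the linked code.

```python
import numpy as np

def ranking_to_comparisons(ranked_items, nadir, ideal):
    """Expand one elicited ranking (best first, each item a vector of
    objective values) into (winner, loser) pairs, and append the virtual
    monotonicity comparisons: every item beats the nadir point and loses
    to the ideal point."""
    pairs = [(ranked_items[i], ranked_items[j])
             for i in range(len(ranked_items))
             for j in range(i + 1, len(ranked_items))]
    for x in ranked_items:
        pairs.append((x, nadir))   # anything is preferred to the nadir
        pairs.append((ideal, x))   # the ideal point is preferred to anything
    return pairs

# Example: a ranking over three two-objective policy values.
ranking = [np.array([0.9, 0.8]), np.array([0.5, 0.7]), np.array([0.2, 0.1])]
pairs = ranking_to_comparisons(ranking,
                               nadir=np.array([0.0, 0.0]),
                               ideal=np.array([1.0, 1.0]))
```

A ranking of n items yields n(n-1)/2 ordinary pairs plus 2n virtual ones, which is one reason ranking-based elicitation extracts more information per query than a single pairwise comparison.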