Clustering and Inference From Pairwise Comparisons
Given a set of pairwise comparisons, the classical ranking problem computes a single ranking that best represents the preferences of all users. In this paper, we study the problem of inferring individual preferences, which arises in the context of making personalized recommendations. In particular, we assume that there are n users of r types; users of the same type provide similar pairwise comparisons for m items according to the Bradley-Terry model. We propose an efficient algorithm that accurately estimates the individual preferences for almost all users given enough pairwise comparisons per type, and its sample complexity is near optimal when r grows only logarithmically with m or n. Our algorithm has three steps: first, for each user, compute the net-win vector, which is a projection of the user's m(m-1)/2-dimensional vector of pairwise comparisons onto an m-dimensional linear subspace; second, cluster the users based on their net-win vectors; third, estimate a single preference for each cluster separately. The net-win vectors are much less noisy than the high-dimensional vectors of pairwise comparisons, and clustering is more accurate after the projection, as confirmed by numerical experiments. Moreover, we show that, when a cluster is only approximately correct, the maximum likelihood estimate for the Bradley-Terry model is still close to the true preference.
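To make the three steps concrete, here is a minimal sketch in Python (with n users, m items, and r types as above). The (user, winner, loser) input format, k-means as the clustering step, and SciPy's generic optimizer for the Bradley-Terry fit are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

def net_win_vectors(comparisons, n_users, n_items):
    """Step 1: collapse each user's comparisons into an m-dimensional
    net-win vector (wins minus losses per item)."""
    V = np.zeros((n_users, n_items))
    for u, winner, loser in comparisons:
        V[u, winner] += 1.0
        V[u, loser] -= 1.0
    return V

def bradley_terry_mle(comps, n_items):
    """Step 3: fit one Bradley-Terry score vector by maximizing the
    log-likelihood, i.e. minimizing sum log(1 + exp(-(theta_w - theta_l)))."""
    winners = np.array([w for _, w, _ in comps])
    losers = np.array([l for _, _, l in comps])
    def neg_ll(theta):
        d = theta[winners] - theta[losers]
        return np.sum(np.log1p(np.exp(-d)))
    theta = minimize(neg_ll, np.zeros(n_items)).x
    return theta - theta.mean()  # fix the additive degree of freedom

def estimate_preferences(comparisons, n_users, n_items, n_types):
    V = net_win_vectors(comparisons, n_users, n_items)
    # Step 2: cluster users in the low-dimensional net-win space.
    labels = KMeans(n_clusters=n_types, n_init=10).fit_predict(V)
    prefs = [bradley_terry_mle([c for c in comparisons if labels[c[0]] == cl],
                               n_items)
             for cl in range(n_types)]
    return labels, prefs
```

Clustering on the m-dimensional net-win vectors rather than the raw m(m-1)/2-dimensional comparison vectors is the noise-reduction step the abstract emphasizes.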
Optimal interval clustering: Application to Bregman clustering and statistical mixture learning
We present a generic dynamic programming method to compute the optimal clustering of n scalar elements into k pairwise disjoint intervals. This case includes 1D Euclidean k-means, k-medoids, k-medians, k-centers, etc. We extend the method to incorporate cluster size constraints and show how to choose the appropriate k by model selection. Finally, we illustrate and refine the method on two case studies: Bregman clustering and statistical mixture learning maximizing the complete likelihood.
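For concreteness, here is a minimal sketch of the dynamic program instantiated with the 1D k-means (sum-of-squared-errors) cost; swapping in another Bregman divergence only changes interval_cost. Function names are illustrative, and this O(k n^2) version favors clarity over speed.

```python
import numpy as np

def interval_cost(prefix, prefix_sq, i, j):
    """SSE of sorted points x[i..j] (inclusive) around their mean,
    computed in O(1) from prefix sums."""
    n = j - i + 1
    s = prefix[j + 1] - prefix[i]
    sq = prefix_sq[j + 1] - prefix_sq[i]
    return sq - s * s / n

def optimal_interval_clustering(x, k):
    """D[c][i] = optimal cost of splitting the first i sorted points
    into c intervals; the recurrence tries every start j of the last one."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    prefix_sq = np.concatenate(([0.0], np.cumsum(x * x)))
    D = np.full((k + 1, n + 1), np.inf)
    D[0][0] = 0.0
    arg = np.zeros((k + 1, n + 1), dtype=int)
    for c in range(1, k + 1):
        for i in range(c, n + 1):
            for j in range(c - 1, i):  # last interval covers x[j..i-1]
                cost = D[c - 1][j] + interval_cost(prefix, prefix_sq, j, i - 1)
                if cost < D[c][i]:
                    D[c][i], arg[c][i] = cost, j
    # Backtrack the interval boundaries (half-open [start, end) indices).
    bounds, i = [], n
    for c in range(k, 0, -1):
        bounds.append((arg[c][i], i))
        i = arg[c][i]
    return D[k][n], bounds[::-1]
```

For example, optimal_interval_clustering([1, 2, 10, 11, 12], 2) returns the boundaries [(0, 2), (2, 5)], separating {1, 2} from {10, 11, 12}, which is the globally optimal 1D 2-means solution.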
Estimation of intrinsic dimension via clustering
The problem of estimating the intrinsic dimension of a set of points in high dimensional space is a critical issue for a wide range of disciplines, including genomics, finance, and networking. The computational complexity of current estimation techniques depends on either the ambient or the intrinsic dimension, which may cause these methods to become intractable for large data sets. In this paper, we present a clustering-based methodology that exploits the inherent self-similarity of data to efficiently estimate the intrinsic dimension of a set of points. When the data satisfies a specified general clustering condition, we prove that the estimated dimension approaches the true Hausdorff dimension. Experiments show that the clustering-based approach allows for more efficient and accurate intrinsic dimension estimation compared with all prior techniques, even when the data does not conform to obvious self-similarity structure. Finally, we present empirical results which show that the clustering-based estimation allows for a natural partitioning of the data points that lie on separate manifolds of varying intrinsic dimension.
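The abstract reports results but does not spell out the estimator, so the sketch below shows only the generic self-similarity idea such methods build on: cover the data at two scales and read a dimension off the covering-number growth N(r) ~ r^(-d). The greedy covering routine and function names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def greedy_cover_count(X, r):
    """Number of balls of radius r needed to cover the rows of X,
    chosen greedily (a standard 2-approximation of the covering number)."""
    remaining = X.copy()
    count = 0
    while len(remaining):
        center = remaining[0]
        dists = np.linalg.norm(remaining - center, axis=1)
        remaining = remaining[dists > r]  # drop everything this ball covers
        count += 1
    return count

def doubling_dimension_estimate(X, r):
    """Self-similarity N(r) ~ r^(-d) gives d ~ log(N(r/2)/N(r)) / log 2."""
    n_coarse = greedy_cover_count(X, r)
    n_fine = greedy_cover_count(X, r / 2)
    return np.log(n_fine / n_coarse) / np.log(2)
```

On points sampled from a 2-dimensional plane embedded in a 10-dimensional ambient space, the estimate hovers near 2 for moderate r, independent of the ambient dimension; that scale-invariance is the property the clustering condition in the paper formalizes.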
Pairwise Covariates-adjusted Block Model for Community Detection
One of the most fundamental problems in network analysis is community detection. The stochastic block model (SBM) is a widely used model for network data, and many estimation methods have been developed for it, with their community detection consistency results established. However, the SBM is restricted by the strong assumption that all nodes in the same community are stochastically equivalent, which may not be suitable for practical applications. We introduce a pairwise covariates-adjusted stochastic block model (PCABM), a generalization of the SBM that incorporates pairwise covariate information. We study the maximum likelihood estimates of the coefficients for the covariates as well as the community assignments, and show that both are consistent under suitable sparsity conditions. Spectral clustering with adjustment (SCWA) is introduced to solve PCABM efficiently. Under certain conditions, we derive the error bound of community estimation under SCWA and show that it is community detection consistent. PCABM compares favorably with the SBM and the degree-corrected stochastic block model (DCBM) over a wide range of simulated and real networks when covariate information is accessible.
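As a rough illustration of the adjustment idea behind SCWA, the sketch below divides each edge by the estimated pairwise covariate effect exp(z_ij^T beta) and then runs vanilla spectral clustering on the adjusted matrix. This is a sketch under assumed inputs, not the paper's exact procedure; A, Z, and beta_hat are hypothetical names for the adjacency matrix, the pairwise covariate tensor, and the fitted coefficients.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering_with_adjustment(A, Z, beta_hat, k):
    """A: (n, n) adjacency matrix; Z: (n, n, p) tensor whose entry z_ij is
    the covariate vector for pair (i, j); beta_hat: (p,) fitted coefficients;
    k: number of communities."""
    effect = np.exp(Z @ beta_hat)        # (n, n) matrix of exp(z_ij^T beta)
    A_adj = A / effect                   # divide out the covariate effect
    A_adj = (A_adj + A_adj.T) / 2        # re-symmetrize numerically
    # Cluster the k leading eigenvectors (by eigenvalue magnitude),
    # as in standard adjacency spectral clustering.
    vals, vecs = np.linalg.eigh(A_adj)
    idx = np.argsort(np.abs(vals))[::-1][:k]
    U = vecs[:, idx]
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

The point of the adjustment is that, after dividing out the covariate effect, the residual matrix behaves like an SBM adjacency matrix, so the usual spectral machinery applies.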
Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making
In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap. We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation strategies based on ranking and clustering. Our main contribution is an in-depth evaluation of these strategies using computer and human-based experiments. We show that our proposed elicitation strategies outperform the currently used pairwise methods, and we found that users prefer ranking most. Our experiments further show that utilising monotonicity information in GPs, by using a linear prior mean at the start and virtual comparisons to the nadir and ideal points, increases performance. We demonstrate our decision support framework in a real-world study on traffic regulation, conducted with the city of Amsterdam. (AAMAS 2018; source code at https://github.com/lmzintgraf/gp_pref_elici)
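To make the elicitation idea concrete, here is a minimal sketch of how one elicited ranking, plus the virtual comparisons to the nadir and ideal points mentioned above, could be expanded into pairwise training data for a preference model. The GP fitting itself is omitted, and all names are illustrative rather than taken from the linked code.

```python
import numpy as np

def ranking_to_comparisons(ranked_items, nadir, ideal):
    """Expand one elicited ranking (best first, each item a vector of
    objective values) into (winner, loser) pairs, and append the virtual
    monotonicity comparisons: every item beats the nadir point and loses
    to the ideal point."""
    pairs = [(ranked_items[i], ranked_items[j])
             for i in range(len(ranked_items))
             for j in range(i + 1, len(ranked_items))]
    for x in ranked_items:
        pairs.append((x, nadir))   # anything is preferred to the nadir
        pairs.append((ideal, x))   # the ideal point is preferred to anything
    return pairs

# Example: a ranking over three two-objective policy values.
ranking = [np.array([0.9, 0.8]), np.array([0.5, 0.7]), np.array([0.2, 0.1])]
pairs = ranking_to_comparisons(ranking,
                               nadir=np.array([0.0, 0.0]),
                               ideal=np.array([1.0, 1.0]))
```

A ranking of n items yields n(n-1)/2 ordinary pairs plus 2n virtual ones, which is one reason ranking-based elicitation extracts more information per query than a single pairwise comparison.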