1,291 research outputs found

    Inferring Rankings Using Constrained Sensing

    Full text link
    We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over nn elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size β‰ͺ\ll domain size). Our recovery method is based on finding the sparsest solution (through β„“0\ell_0 optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through β„“0\ell_0 optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as nβ†’βˆžn \to \infty. β„“0\ell_0 optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, β„“1\ell_1 optimization, to find the sparsest solution. However, we show that β„“1\ell_1 optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as nβ†’βˆžn \to \infty. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible.Comment: 19 page

    Compressive Network Analysis

    Full text link
    Modern data acquisition routinely produces massive amounts of network data. Though many methods and models have been proposed to analyze such data, the research of network data is largely disconnected with the classical theory of statistical learning and signal processing. In this paper, we present a new framework for modeling network data, which connects two seemingly different areas: network data analysis and compressed sensing. From a nonparametric perspective, we model an observed network using a large dictionary. In particular, we consider the network clique detection problem and show connections between our formulation with a new algebraic tool, namely Randon basis pursuit in homogeneous spaces. Such a connection allows us to identify rigorous recovery conditions for clique detection problems. Though this paper is mainly conceptual, we also develop practical approximation algorithms for solving empirical problems and demonstrate their usefulness on real-world datasets

    Minimax-optimal Inference from Partial Rankings

    Full text link
    This paper studies the problem of inferring a global preference based on the partial rankings provided by many users over different subsets of items according to the Plackett-Luce model. A question of particular interest is how to optimally assign items to users for ranking and how many item assignments are needed to achieve a target estimation error. For a given assignment of items to users, we first derive an oracle lower bound of the estimation error that holds even for the more general Thurstone models. Then we show that the Cram\'er-Rao lower bound and our upper bounds inversely depend on the spectral gap of the Laplacian of an appropriately defined comparison graph. When the system is allowed to choose the item assignment, we propose a random assignment scheme. Our oracle lower bound and upper bounds imply that it is minimax-optimal up to a logarithmic factor among all assignment schemes and the lower bound can be achieved by the maximum likelihood estimator as well as popular rank-breaking schemes that decompose partial rankings into pairwise comparisons. The numerical experiments corroborate our theoretical findings.Comment: 16 pages, 2 figure

    Summary Based Structures with Improved Sublinear Recovery for Compressed Sensing

    Get PDF
    We introduce a new class of measurement matrices for compressed sensing, using low order summaries over binary sequences of a given length. We prove recovery guarantees for three reconstruction algorithms using the proposed measurements, including β„“1\ell_1 minimization and two combinatorial methods. In particular, one of the algorithms recovers kk-sparse vectors of length NN in sublinear time poly(klog⁑N)\text{poly}(k\log{N}), and requires at most Ξ©(klog⁑Nlog⁑log⁑N)\Omega(k\log{N}\log\log{N}) measurements. The empirical oversampling constant of the algorithm is significantly better than existing sublinear recovery algorithms such as Chaining Pursuit and Sudocodes. In particular, for 103≀N≀10810^3\leq N\leq 10^8 and k=100k=100, the oversampling factor is between 3 to 8. We provide preliminary insight into how the proposed constructions, and the fast recovery scheme can be used in a number of practical applications such as market basket analysis, and real time compressed sensing implementation

    On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization

    Full text link
    Revealed preference theory studies the possibility of modeling an agent's revealed preferences and the construction of a consistent utility function. However, modeling agent's choices over preference orderings is not always practical and demands strong assumptions on human rationality and data-acquisition abilities. Therefore, we propose a simple generative choice model where agents are assumed to generate the choice probabilities based on latent factor matrices that capture their choice evaluation across multiple attributes. Since the multi-attribute evaluation is typically hidden within the agent's psyche, we consider a signaling mechanism where agents are provided with choice information through private signals, so that the agent's choices provide more insight about his/her latent evaluation across multiple attributes. We estimate the choice model via a novel multi-stage matrix factorization algorithm that minimizes the average deviation of the factor estimates from choice data. Simulation results are presented to validate the estimation performance of our proposed algorithm.Comment: 6 pages, 2 figures, to be presented at CISS conferenc

    A Topic Modeling Approach to Ranking

    Full text link
    We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model can be formally reduced to the estimation of topics in a statistically equivalent topic modeling problem. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable consistency as well as sample and computational complexity guarantees. We demonstrate that the new approach is empirically competitive with the current state-of-the-art approaches in predicting preferences on some semi-synthetic and real world datasets

    Inferring rankings using constrained sensing

    Get PDF
    We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over n elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size <;<; domain size). Our recovery method is based on finding the sparsest solution (through l[subscript 0] optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through l[subscript 0] optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as n β†’ ∞. β„“0 optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, β„“[subscript 1] optimization, to find the sparsest solution. However, we show that β„“[subscript 1] optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as n β†’ ∞. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible

    KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization

    Full text link
    We consider the image classification problem via kernel collaborative representation classification with locality constrained dictionary (KCRC-LCD). Specifically, we propose a kernel collaborative representation classification (KCRC) approach in which kernel method is used to improve the discrimination ability of collaborative representation classification (CRC). We then measure the similarities between the query and atoms in the global dictionary in order to construct a locality constrained dictionary (LCD) for KCRC. In addition, we discuss several similarity measure approaches in LCD and further present a simple yet effective unified similarity measure whose superiority is validated in experiments. There are several appealing aspects associated with LCD. First, LCD can be nicely incorporated under the framework of KCRC. The LCD similarity measure can be kernelized under KCRC, which theoretically links CRC and LCD under the kernel method. Second, KCRC-LCD becomes more scalable to both the training set size and the feature dimension. Example shows that KCRC is able to perfectly classify data with certain distribution, while conventional CRC fails completely. Comprehensive experiments on many public datasets also show that KCRC-LCD is a robust discriminative classifier with both excellent performance and good scalability, being comparable or outperforming many other state-of-the-art approaches
    • …
    corecore