7,688 research outputs found

    Distance Metric Learning with Eigenvalue Optimization

    Get PDF
    Copyright Ā© 2012 Yiming Ying and Peng Li.The main theme of this paper is to develop a novel eigenvalue optimization framework for learning a Mahalanobis metric. Within this context, we introduce a novel metric learning approach called DML-eig which is shown to be equivalent to a well-known eigenvalue optimization problem called minimizing the maximal eigenvalue of a symmetric matrix (Overton, 1988; Lewis and Overton, 1996). Moreover, we formulate LMNN (Weinberger et al., 2005), one of the state-of-the-art metric learning methods, as a similar eigenvalue optimization problem. This novel framework not only provides new insights into metric learning but also opens new avenues to the design of efļ¬cient metric learning algorithms. Indeed, ļ¬rst-order algorithms are developed for DML-eig and LMNN which only need the computation of the largest eigenvector of a matrix per iteration. Their convergence characteristics are rigorously established. Various experiments on benchmark data sets show the competitive performance of our new approaches. In addition, we report an encouraging result on a difļ¬cult and challenging face veriļ¬cation data set called Labeled Faces in the Wild (LFW)

    Characterizing and Modeling the Dynamics of Activity and Popularity

    Full text link
    Social media, regarded as two-layer networks consisting of users and items, turn out to be the most important channels for access to massive information in the era of Web 2.0. The dynamics of human activity and item popularity is a crucial issue in social media networks. In this paper, by analyzing the growth of user activity and item popularity in four empirical social media networks, i.e., Amazon, Flickr, Delicious and Wikipedia, it is found that cross links between users and items are more likely to be created by active users and to be acquired by popular items, where user activity and item popularity are measured by the number of cross links associated with users and items. This indicates that users generally trace popular items, overall. However, it is found that the inactive users more severely trace popular items than the active users. Inspired by empirical analysis, we propose an evolving model for such networks, in which the evolution is driven only by two-step random walk. Numerical experiments verified that the model can qualitatively reproduce the distributions of user activity and item popularity observed in empirical networks. These results might shed light on the understandings of micro dynamics of activity and popularity in social media networks.Comment: 13 pages, 6 figures, 2 table

    A Hybrid Approach for Biomarker Discovery from Microarray Gene Expression Data for Cancer Classification

    Get PDF
    Microarrays allow researchers to monitor the gene expression patterns for tens of thousands of genes across a wide range of cellular responses, phenotype and conditions. Selecting a small subset of discriminate genes from thousands of genes is important for accurate classification of diseases and phenotypes. Many methods have been proposed to find subsets of genes with maximum relevance and minimum redundancy, which can distinguish accurately between samples with different labels. To find the minimum subset of relevant genes is often referred as biomarker discovery. Two main approaches, filter and wrapper techniques, have been applied to biomarker discovery. In this paper, we conducted a comparative study of different biomarker discovery methods, including six filter methods and three wrapper methods. We then proposed a hybrid approach, FR-Wrapper, for biomarker discovery. The aim of this approach is to find an optimum balance between the precision of the biomarker discovery and the computation cost, by taking advantages of both filter methodā€™s efficiency and wrapper methodā€™s high accuracy. Our hybrid approach applies Fisherā€™s ratio, a simple method easy to understand and implement, to filter out most of the irrelevant genes, then a wrapper method is employed to reduce the redundancy. The performance of FR-Wrapper approach is evaluated over four widely used microarray datasets. Analysis of experimental results reveals that the hybrid approach can achieve the goal of maximum relevance with minimum redundancy
    • ā€¦
    corecore