9 research outputs found

    Identifying networks with common organizational principles

    Full text link
    Many complex systems can be represented as networks, and the problem of network comparison is becoming increasingly relevant. There are many techniques for network comparison, from simply comparing network summary statistics to sophisticated but computationally costly alignment-based approaches. Yet it remains challenging to accurately cluster networks that differ in size and density but are hypothesized to be structurally similar. In this paper, we address this problem by introducing a new network comparison methodology that is aimed at identifying common organizational principles in networks. The methodology is simple, intuitive and applicable in a wide variety of settings, ranging from the functional classification of proteins to tracking the evolution of a world trade network. Comment: 26 pages, 7 figures.
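    Below is a small illustrative sketch (not the paper's methodology) of the simplest baseline the abstract mentions: comparing networks through a handful of summary statistics. The particular statistics and the networkx-based implementation are our own choices.

        # A minimal sketch, assuming networkx and a few size-independent
        # summary statistics chosen purely for illustration.
        import networkx as nx
        import numpy as np

        def summary_vector(G):
            """Collect a few density-independent descriptors of a graph."""
            degrees = np.array([d for _, d in G.degree()], dtype=float)
            return np.array([
                nx.density(G),
                nx.average_clustering(G),
                degrees.std() / (degrees.mean() + 1e-12),   # degree coefficient of variation
                nx.degree_assortativity_coefficient(G),
            ])

        def network_distance(G1, G2):
            """Euclidean distance between summary vectors (a crude comparison)."""
            return float(np.linalg.norm(summary_vector(G1) - summary_vector(G2)))

        if __name__ == "__main__":
            A = nx.erdos_renyi_graph(100, 0.05, seed=1)
            B = nx.barabasi_albert_graph(200, 3, seed=2)
            print(network_distance(A, B))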

    Regularized Optimal Transport and the Rot Mover's Distance

    Full text link
    This paper presents a unified framework for smooth convex regularization of discrete optimal transport problems. In this context, the regularized optimal transport turns out to be equivalent to a matrix nearness problem with respect to Bregman divergences. Our framework thus naturally generalizes a previously proposed regularization based on the Boltzmann-Shannon entropy, related to the Kullback-Leibler divergence and solved with the Sinkhorn-Knopp algorithm. We call the regularized optimal transport distance the rot mover's distance, in reference to the classical earth mover's distance. We develop two generic schemes, which we respectively call the alternate scaling algorithm and the non-negative alternate scaling algorithm, to efficiently compute the regularized optimal plans depending on whether the domain of the regularizer lies within the non-negative orthant or not. These schemes are based on Dykstra's algorithm with alternate Bregman projections, and further exploit the Newton-Raphson method when applied to separable divergences. We enhance the separable case with a sparse extension to deal with high data dimensions. We also instantiate our proposed framework and discuss the inherent specificities for well-known regularizers and statistical divergences in the machine learning and information geometry communities. Finally, we demonstrate the merits of our methods with experiments using synthetic data to illustrate the effect of different regularizers and penalties on the solutions, as well as real-world data for a pattern recognition application to audio scene classification.
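    The entropic special case mentioned above (Boltzmann-Shannon regularization solved by Sinkhorn-Knopp alternate scaling) can be sketched in a few lines. The generalized alternate scaling schemes of the paper are not reproduced here, and the variable names below are ours.

        # A minimal Sinkhorn-Knopp sketch for entropy-regularized optimal transport.
        import numpy as np

        def sinkhorn(mu, nu, C, eps=0.1, n_iter=500):
            """Entropic OT plan between histograms mu, nu with cost matrix C."""
            K = np.exp(-C / eps)                 # Gibbs kernel
            u = np.ones_like(mu)
            for _ in range(n_iter):
                v = nu / (K.T @ u)               # scale columns to match nu
                u = mu / (K @ v)                 # scale rows to match mu
            return u[:, None] * K * v[None, :]   # regularized transport plan

        if __name__ == "__main__":
            n = 5
            mu = np.full(n, 1.0 / n)
            nu = np.full(n, 1.0 / n)
            C = (np.arange(n)[:, None] - np.arange(n)[None, :]) ** 2.0
            P = sinkhorn(mu, nu, C)
            print(P.sum(axis=1), (P * C).sum())  # marginals and transport cost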

    Optimal Projection Method Determination by Logdet Divergence and Perturbed Von-Neumann Divergence

    Get PDF

    Investigation of Orthogonal Polynomial Kernels as Similarity Functions for Pattern Classification by Support Vector Machines

    Get PDF
    A kernel function is an important component of the support vector machine (SVM) kernel-based classifier. This is due to the elegant mathematical characteristics of a kernel, which amount to mapping non-linearly separable classes to an implicit higher-dimensional feature space where they can become linearly separable, and hence easier to classify. These characteristics are prescribed by the underpinning positive semi-definite (PSD) property. The properties of this feature space can, however, be difficult to interpret, which makes it hard to customize or select an appropriate kernel for the classification task at hand. Moreover, the high dimensionality of the feature space does not usually provide apparent and intuitive information about the natural representations of the data in the input space, as the construction of this feature space is only implicit.

    On the other hand, SVM kernels have also been regarded in many contexts as similarity functions that measure the resemblance between two patterns, which may come from the same or different classes. However, despite the elegant theory of PSD kernels and its remarkable implications for the performance of many learning algorithms, relatively few research efforts seem to have studied kernels from this similarity perspective. Given that patterns from the same class share more similar characteristics than those belonging to different classes, this similarity perspective can provide more tangible means of crafting or selecting appropriate kernels than the properties of implicit high-dimensional feature spaces that one might not even be able to calculate. This thesis therefore aims to: (i) investigate the similarity-based properties that can be exploited to characterise kernels (with a focus on the so-called “orthogonal polynomial kernels”) when used as similarity functions, and (ii) assess the influence of these properties on the performance of the SVM classifier.

    An appropriate similarity-based model is defined in the thesis based on what the shape of an SVM kernel should ideally look like when it is used to measure the similarity between its two inputs. The model proposes that the similarity curve should be maximized when the two kernel inputs are identical and should decay monotonically as they differ more and more from each other. Motivated by the pictorial characteristics of the Chebyshev kernels reported in the literature, the thesis adopts this kernel-shape perspective to also study other orthogonal polynomial kernels (such as the Legendre and Hermite kernels), to underpin the assessment of the proposed ideal shape of the similarity curve for kernel-based pattern classification by SVMs. The analysis of these polynomial kernels revealed that they are naturally constructed from smaller kernel building blocks, which are combined by summation and multiplication operations. A novel similarity fusion framework is therefore developed in this thesis to investigate the effect of these fusion operations on the shape characteristics of the kernels and on their classification performance. The framework is developed in three stages: Stage 1 kernels are the building blocks constructed from only the polynomial order n (the highest order under consideration); Stage 2 kernels combine all the Stage 1 kernel blocks (from order 0 to n) through a summation fusion operation; and Stage 3 kernels finally combine Stage 2 kernels with another kernel via a multiplication fusion operation.
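    As a rough illustration of the three-stage fusion just described, the following sketch builds Stage 1 blocks from Chebyshev polynomials, sums them into a Stage 2 kernel, and multiplies by another kernel (a Gaussian, our own choice) for Stage 3. The exact kernel definitions and normalizations used in the thesis may differ.

        # Illustrative three-stage polynomial kernel fusion, assuming Chebyshev blocks.
        import numpy as np
        from numpy.polynomial import chebyshev as C

        def stage1_block(x, z, n):
            """Stage 1: kernel block built from the order-n polynomial only."""
            Tn = C.Chebyshev.basis(n)
            return float(np.sum(Tn(x) * Tn(z)))

        def stage2_kernel(x, z, n):
            """Stage 2: summation fusion of all blocks from order 0 to n."""
            return sum(stage1_block(x, z, k) for k in range(n + 1))

        def stage3_kernel(x, z, n, gamma=1.0):
            """Stage 3: multiplication fusion with another kernel (here a Gaussian)."""
            rbf = np.exp(-gamma * np.sum((x - z) ** 2))
            return stage2_kernel(x, z, n) * rbf

        if __name__ == "__main__":
            x = np.array([0.2, -0.4, 0.1])
            z = np.array([0.3, -0.5, 0.0])
            print(stage3_kernel(x, z, n=4))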
    The analysis of the shape characteristics of these three-stage polynomial kernels revealed that their inherent fusion operations are synergistic in nature: they bring the kernel shapes closer to the ideal similarity function model, hence enabling the calculation of more accurate similarity measures and, accordingly, better classification performance. Experimental results showed that the summative and multiplicative fusion operations improved the classification accuracy by average factors of 17.35% and 19.16%, respectively, depending on the dataset and the polynomial function employed. On the other hand, the shapes of the Stage 2 polynomial kernels have also been shown to oscillate after a certain threshold within the standard normalized input space of [-1, 1]. A simple adaptive data normalization approach is therefore proposed to confine the data to the threshold window where these kernels exhibit the sought-after ideal shape characteristics, hence eliminating the possibility that any data point lies outside the range where these oscillations are observed. The adaptive data normalization approach accordingly leads to a more accurate calculation of similarity measures and improves the classification performance. Compared with the standard normalized input space, experimental results (performed on the Stage 2 kernels) demonstrate the effectiveness of the proposed adaptive data normalization approach, with an average accuracy improvement factor of 11.772%, depending on the dataset and the polynomial function utilized.

    Finally, a new perspective is also introduced whereby the use of orthogonal polynomials is perceived as a way of transforming the input space to another vector space, of the same dimensionality as the input space, prior to the kernel calculation step. Based on this perspective, a novel processing approach based on vector concatenation is proposed which, unlike the previous approaches, ensures that the quantities processed by each polynomial order are always formulated in vector form. In this way, the attributes embedded in the structure of the original vectors are kept intact. The proposed concatenated processing approach can also be used with any polynomial function, regardless of the parity combination of its monomials, whether they are only odd, only even, or a combination of both. Moreover, the Gaussian kernel is proposed to be evaluated on the vectors processed by the polynomial kernels (instead of the linear kernel used in the previous approaches), owing to the more accurate similarity shape characteristics of the Gaussian kernel, as well as its renowned ability to implicitly map the input space to a feature space of higher dimensionality. Experimental results demonstrate the superiority of the concatenated approach for all three polynomial-kernel stages of the developed similarity fusion framework and for all the polynomial functions under investigation. When the Gaussian kernel is evaluated on the vectors processed using the concatenated approach, the observed results show a statistically significant improvement in the average classification accuracy of 22.269%, compared to when the linear kernel is evaluated on the vectors processed using the previously proposed approaches.
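    The concatenated processing idea in the final paragraph can likewise be sketched as follows: each polynomial order is applied element-wise, the transformed vectors are concatenated, and a Gaussian kernel is evaluated on the result. Normalization details and the choice of polynomial family below are our own assumptions.

        # Hedged sketch of concatenated polynomial processing followed by a Gaussian kernel.
        import numpy as np
        from numpy.polynomial import chebyshev as C

        def concat_transform(x, n):
            """Apply orders 0..n element-wise and concatenate the transformed vectors."""
            return np.concatenate([C.Chebyshev.basis(k)(x) for k in range(n + 1)])

        def gaussian_on_concat(x, z, n, gamma=0.5):
            """Gaussian kernel evaluated on the concatenated polynomial-processed vectors."""
            dx = concat_transform(x, n) - concat_transform(z, n)
            return float(np.exp(-gamma * np.dot(dx, dx)))

        if __name__ == "__main__":
            x = np.array([0.2, -0.4, 0.1])
            z = np.array([0.3, -0.5, 0.0])
            print(gaussian_on_concat(x, z, n=4))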

    Learning SVM classifiers with indefinite kernels

    No full text
    Recently, training support vector machines with indefinite kernels has attracted great attention in the machine learning community. In this paper, we tackle this problem by formulating a joint optimization model over SVM classifications and kernel principal component analysis. We first reformulate kernel principal component analysis as a general kernel transformation framework, and then incorporate it into the SVM classification to formulate a joint optimization model. The proposed model has the advantage of making consistent kernel transformations over training and test samples. It can be used for both binary and multiclass classification problems. Our experimental results on both synthetic and real-world data sets show that the proposed model can significantly outperform related approaches.
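    The paper's joint SVM/kernel-PCA optimization is not reproduced here. As a point of reference, the sketch below shows a much simpler baseline for the same setting, in which an indefinite kernel matrix is made PSD by clipping its negative eigenvalues before being passed to an SVM with a precomputed kernel (scikit-learn; all names are ours).

        # Spectrum-clipping baseline for SVM training with an indefinite kernel.
        import numpy as np
        from sklearn.svm import SVC

        def clip_spectrum(K):
            """Project a symmetric indefinite kernel matrix onto the PSD cone."""
            w, V = np.linalg.eigh(K)
            return (V * np.clip(w, 0.0, None)) @ V.T

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            X = rng.normal(size=(60, 5))
            y = (X[:, 0] > 0).astype(int)
            # A sigmoid (tanh) kernel is a classic example of an indefinite kernel.
            K = np.tanh(0.5 * X @ X.T - 1.0)
            clf = SVC(kernel="precomputed").fit(clip_spectrum(K), y)
            print(clf.score(clip_spectrum(K), y))  # training accuracy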
