22 research outputs found
Efficient Semidefinite Spectral Clustering via Lagrange Duality
We propose an efficient approach to semidefinite spectral clustering (SSC),
which addresses the Frobenius normalization with the positive semidefinite
(p.s.d.) constraint for spectral clustering. Compared with the original
Frobenius norm approximation based algorithm, the proposed algorithm can more
accurately find the closest doubly stochastic approximation to the affinity
matrix by considering the p.s.d. constraint. In this paper, SSC is formulated
as a semidefinite programming (SDP) problem. In order to solve the high
computational complexity of SDP, we present a dual algorithm based on the
Lagrange dual formalization. Two versions of the proposed algorithm are
proffered: one with less memory usage and the other with faster convergence
rate. The proposed algorithm has much lower time complexity than that of the
standard interior-point based SDP solvers. Experimental results on both UCI
data sets and real-world image data sets demonstrate that 1) compared with the
state-of-the-art spectral clustering methods, the proposed algorithm achieves
better clustering performance; and 2) our algorithm is much more efficient and
can solve larger-scale SSC problems than those standard interior-point SDP
solvers.Comment: 13 page
G-Protein Coupled Receptor Signaling Architecture of Mammalian Immune Cells
A series of recent studies on large-scale networks of signaling and metabolic systems revealed that a certain network structure often called “bow-tie network” are observed. In signaling systems, bow-tie network takes a form with diverse and redundant inputs and outputs connected via a small numbers of core molecules. While arguments have been made that such network architecture enhances robustness and evolvability of biological systems, its functional role at a cellular level remains obscure. A hypothesis was proposed that such a network function as a stimuli-reaction classifier where dynamics of core molecules dictate downstream transcriptional activities, hence physiological responses against stimuli. In this study, we examined whether such hypothesis can be verified using experimental data from Alliance for Cellular Signaling (AfCS) that comprehensively measured GPCR related ligands response for B-cell and macrophage. In a GPCR signaling system, cAMP and Ca2+ act as core molecules. Stimuli-response for 32 ligands to B-Cells and 23 ligands to macrophages has been measured. We found that ligands with correlated changes of cAMP and Ca2+ tend to cluster closely together within the hyperspaces of both cell types and they induced genes involved in the same cellular processes. It was found that ligands inducing cAMP synthesis activate genes involved in cell growth and proliferation; cAMP and Ca2+ molecules that increased together form a feedback loop and induce immune cells to migrate and adhere together. In contrast, ligands without a core molecules response are scattered throughout the hyperspace and do not share clusters. G-protein coupling receptors together with immune response specific receptors were found in cAMP and Ca2+ activated clusters. Analyses have been done on the original software applicable for discovering ‘bow-tie’ network architectures within the complex network of intracellular signaling where ab initio clustering has been implemented as well. Groups of potential transcription factors for each specific group of genes were found to be partly conserved across B-Cell and macrophage. A series of findings support the hypothesis
Doubly Stochastic Neighbor Embedding on Spheres
Stochastic Neighbor Embedding (SNE) methods minimize the divergence between the similarity matrix of a high-dimensional data set and its counterpart from a low-dimensional embedding, leading to widely applied tools for data visualization. Despite their popularity, the current SNE methods experience a crowding problem when the data include highly imbalanced similarities. This implies that the data points with higher total similarity tend to get crowded around the display center. To solve this problem, we introduce a fast normalization method and normalize the similarity matrix to be doubly stochastic such that all the data points have equal total similarities. Furthermore, we show empirically and theoretically that the doubly stochasticity constraint often leads to embeddings which are approximately spherical. This suggests replacing a flat space with spheres as the embedding space. The spherical embedding eliminates the discrepancy between the center and the periphery in visualization, which efficiently resolves the crowding problem. We compared the proposed method (DOSNES) with the state-of-the-art SNE method on three real-world datasets and the results clearly indicate that our method is more favorable in terms of visualization quality. DOSNES is freely available at http://yaolubrain.github.io/dosnes/. (C) 2019 The Authors. Published by Elsevier B.V.Peer reviewe
Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry
Convex optimization is a well-established research area with applications in
almost all fields. Over the decades, multiple approaches have been proposed to
solve convex programs. The development of interior-point methods allowed
solving a more general set of convex programs known as semi-definite programs
and second-order cone programs. However, it has been established that these
methods are excessively slow for high dimensions, i.e., they suffer from the
curse of dimensionality. On the other hand, optimization algorithms on manifold
have shown great ability in finding solutions to nonconvex problems in
reasonable time. This paper is interested in solving a subset of convex
optimization using a different approach. The main idea behind Riemannian
optimization is to view the constrained optimization problem as an
unconstrained one over a restricted search space. The paper introduces three
manifolds to solve convex programs under particular box constraints. The
manifolds, called the doubly stochastic, symmetric and the definite multinomial
manifolds, generalize the simplex also known as the multinomial manifold. The
proposed manifolds and algorithms are well-adapted to solving convex programs
in which the variable of interest is a multidimensional probability
distribution function. Theoretical analysis and simulation results testify the
efficiency of the proposed method over state of the art methods. In particular,
they reveal that the proposed framework outperforms conventional generic and
specialized solvers, especially in high dimensions
An Efficient Approach to Solve the Large-Scale Semidefinite Programming Problems
Solving the large-scale problems with semidefinite programming (SDP) constraints is of great importance in modeling and model reduction of complex system, dynamical system, optimal control, computer vision, and machine learning. However, existing SDP solvers are of large complexities and thus unavailable to deal with large-scale problems. In this paper, we solve SDP using matrix generation, which is an extension of the classical column generation. The exponentiated gradient algorithm is also used
to solve the special structure subproblem of matrix generation. The numerical experiments show that our approach is efficient and scales very well with the problem dimension. Furthermore, the proposed algorithm is applied for a clustering problem. The experimental results on real datasets imply that the proposed approach outperforms the traditional interior-point SDP solvers in terms of efficiency and scalability