
    Sequence-based Multiscale Model (SeqMM) for High-throughput chromosome conformation capture (Hi-C) data analysis

    In this paper, I introduce a Sequence-based Multiscale Model (SeqMM) for biomolecular data analysis. In combination with the spectral graph method, I reveal the essential difference between global scale models and local scale ones in structure clustering, i.e., different optimization of Euclidean (or spatial) distances and sequential (or genomic) distances. More specifically, clusters from global scale models optimize Euclidean distance relations. Local scale models, on the other hand, result in clusters that optimize the genomic distance relations. For biomolecular data, Euclidean distances and sequential distances are two independent variables, which can never be optimized simultaneously in data clustering. However, the sequence scale in my SeqMM can work as a tuning parameter that balances these two variables and delivers different clusterings depending on the purpose. Further, my SeqMM is used to explore the hierarchical structures of chromosomes. I find that at the global scale, the Fiedler vector from my SeqMM bears a great similarity to the principal vector from principal component analysis, and can be used to study genomic compartments. In TAD analysis, I find that TADs evaluated at different scales are not consistent and vary a lot. Particularly when the sequence scale is small, the calculated TAD boundaries are dramatically different. Even for regions with high contact frequencies, TAD regions show no obvious consistency. However, when the scale value increases further, although TADs are still quite different, TAD boundaries in these high contact frequency regions become more and more consistent. Finally, I find that for a fixed local scale, my method can deliver very robust TAD boundaries across different cluster numbers. Comment: 22 pages, 13 figures
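The Fiedler vector mentioned in the abstract above can be illustrated with a minimal sketch. This is not the SeqMM method itself: the toy contact matrix, the function name `fiedler_vector`, and the use of an unnormalized graph Laplacian are all assumptions made for illustration.

```python
import numpy as np

def fiedler_vector(contacts):
    """Return the eigenvector of the second-smallest eigenvalue of the
    unnormalized graph Laplacian built from a contact (weight) matrix."""
    w = np.asarray(contacts, dtype=float)
    w = (w + w.T) / 2.0            # enforce symmetry
    np.fill_diagonal(w, 0.0)       # ignore self-contacts
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, 1]

# Hypothetical toy contact map: two 3-bin blocks with strong contacts
# inside each block and weak contacts between the blocks.
n = 6
contacts = np.full((n, n), 0.1)
contacts[:3, :3] = 1.0
contacts[3:, 3:] = 1.0

v = fiedler_vector(contacts)
labels = (v > 0).astype(int)       # the sign of each entry gives a two-way split
print(labels)
```

The sign pattern of the vector separates the two blocks, which is the same idea as reading compartments off a leading eigenvector. Which block receives label 0 or 1 depends on the arbitrary sign of the eigenvector.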

    Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks

    Complex biological systems have been successfully modeled by biochemical and genetic interaction networks, typically gathered from high-throughput (HTP) data. These networks can be used to infer functional relationships between genes or proteins. Using the intuition that the topological role of a gene in a network relates to its biological function, local or diffusion-based "guilt-by-association" and graph-theoretic methods have had success in inferring gene functions. Here we seek to improve function prediction by integrating diffusion-based methods with a novel dimensionality reduction technique to overcome the incomplete and noisy nature of network data. In this paper, we introduce diffusion component analysis (DCA), a framework that plugs in a diffusion model and learns a low-dimensional vector representation of each node to encode the topological properties of a network. As a proof of concept, we demonstrate DCA's substantial improvement over state-of-the-art diffusion-based approaches in predicting protein function from molecular interaction networks. Moreover, our DCA framework can integrate multiple networks from heterogeneous sources, consisting of genomic information, biochemical experiments and other resources, to even further improve function prediction. Yet another layer of performance gain is achieved by integrating the DCA framework with support vector machines that take our node vector representations as features. Overall, our DCA framework provides a novel representation of nodes in a network that can be used as a plug-in architecture to other machine learning algorithms to decipher topological properties of and obtain novel insights into interactomes. Comment: RECOMB 201
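As a rough illustration of the kind of diffusion model such a framework could plug in, the sketch below computes random walk with restart (RWR) diffusion states on a toy network. It covers only the diffusion step, not the low-dimensional embedding, and it is an assumption-laden sketch rather than the authors' implementation: the function name `rwr_diffusion_states`, the restart probability of 0.5, and the path graph are all hypothetical.

```python
import numpy as np

def rwr_diffusion_states(adjacency, restart_prob=0.5):
    """Random walk with restart on an undirected graph.

    Column i of the result is the stationary distribution of a walk
    that restarts at node i with probability `restart_prob`.
    Assumes every node has at least one edge.
    """
    a = np.asarray(adjacency, dtype=float)
    transition = a / a.sum(axis=0)   # column-stochastic transition matrix
    n = a.shape[0]
    # Closed form of s = (1 - p) * B @ s + p * e_i, solved for all i at once
    return restart_prob * np.linalg.inv(
        np.eye(n) - (1.0 - restart_prob) * transition
    )

# Hypothetical toy network: a path graph 0 - 1 - 2 - 3
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
states = rwr_diffusion_states(adjacency)
print(np.round(states, 3))
```

Each column is a probability distribution over nodes, so nodes with similar columns occupy similar topological positions. A dimensionality reduction step would then compress these columns into short feature vectors.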

    Fundamentals of direct limit Lie theory

    We show that every countable direct system of finite-dimensional real or complex Lie groups has a direct limit in the category of Lie groups modelled on locally convex spaces. This enables us to push all basic constructions of finite-dimensional Lie theory to the case of direct limit groups. In particular, we obtain an analogue of Lie's third theorem: Every countable-dimensional real or complex locally finite Lie algebra is enlargible, i.e., it is the Lie algebra of some regular Lie group (a suitable direct limit group). Comment: 33 pages (v2: Lemma 7.12 and Proposition 7.13 corrected, clearer distinction between analyticity and convenient analyticity

    BRST Formulation of 4-Monopoles

    A supersymmetric gauge invariant action is constructed over any 4-dimensional Riemannian manifold describing Witten's theory of 4-monopoles. The topological supersymmetric algebra closes off-shell. The multiplets include the auxiliary fields and the Wess-Zumino fields in an unusual way, arising naturally from BRST gauge fixing. A new canonical approach over Riemann manifolds is followed, using a Morse function as a Euclidean time and taking into account the BRST boundary conditions that come from the BFV formulation. This allows a construction of the effective action starting from gauge principles. Comment: 18 pages, Amste

    Connectedness of Higgs bundle moduli for complex reductive Lie groups

    We carry out an intrinsic approach to the study of the connectedness of the moduli space $\mathcal{M}_G$ of $G$-Higgs bundles, over a compact Riemann surface, when $G$ is a complex reductive (not necessarily connected) Lie group. We prove that the number of connected components of $\mathcal{M}_G$ is indexed by the corresponding topological invariants. In particular, this gives an alternative proof of the counting by J. Li of the number of connected components of the moduli space of flat $G$-connections in the case in which $G$ is connected and semisimple. Comment: Due to some mistake the authors did not appear in the previous version. Fixed this. Final version; to appear in the Asian Journal of Mathematics. 19 pages