47 research outputs found

    Arena3D: visualizing time-driven phenotypic differences in biological systems

    Background: Elucidating the genotype-phenotype connection is one of the big challenges of modern molecular biology. To fully understand this connection, it is necessary to consider the underlying networks and the time factor. In this context of data deluge and heterogeneous information, visualization plays an essential role in interpreting complex and dynamic topologies. Thus, software that is able to bring the network, phenotypic and temporal information together is needed. Arena3D has been previously introduced as a tool that facilitates link discovery between processes. It uses a layered display to separate different levels of information while emphasizing the connections between them. We present novel developments of the tool for the visualization and analysis of dynamic genotype-phenotype landscapes. Results: Version 2.0 introduces novel features that allow handling time course data in a phenotypic context. Gene expression levels or other measures can be loaded and visualized at different time points and phenotypic comparison is facilitated through clustering and correlation display or highlighting of impacting changes through time. Similarity scoring allows the identification of global patterns in dynamic heterogeneous data. In this paper we demonstrate the utility of the tool on two distinct biological problems of different scales. First, we analyze a medium scale dataset that looks at perturbation effects of the pluripotency regulator Nanog in murine embryonic stem cells. Dynamic cluster analysis suggests alternative indirect links between Nanog and other proteins in the core stem cell network. Moreover, recurrent correlations from the epigenetic to the translational level are identified. Second, we investigate a large scale dataset consisting of genome-wide knockdown screens for human genes essential in the mitotic process. Here, a potential new role for the gene lsm14a in cytokinesis is suggested. We also show how phenotypic patterning allows for extensive comparison and identification of high impact knockdown targets. Conclusions: We present a new visualization approach for perturbation screens with multiple phenotypic outcomes. The novel functionality implemented in Arena3D enables effective understanding and comparison of temporal patterns within morphological layers, to help with the system-wide analysis of dynamic processes. Arena3D is available free of charge for academics as a downloadable standalone application from: http://arena3d.org/.

    Use of machine learning algorithms to classify binary protein sequences as highly-designable or poorly-designable

    Background: By using a standard Support Vector Machine (SVM) with a Sequential Minimal Optimization (SMO) method of training, Naïve Bayes, and other machine learning algorithms, we are able to distinguish between two classes of protein sequences: those folding to highly-designable conformations, and those folding to poorly- or non-designable conformations. Results: First, we generate all possible compact lattice conformations for the specified shape (a hexagon or a triangle) on the 2D triangular lattice. Then we generate all possible binary hydrophobic/polar (H/P) sequences and, using a specified energy function, thread them through all of these compact conformations. If, for a given sequence, the lowest energy is obtained for a particular lattice conformation, we assume that this sequence folds to that conformation. Highly-designable conformations have many H/P sequences folding to them, while poorly-designable conformations have few or no H/P sequences. We classify sequences as folding to either highly- or poorly-designable conformations. We randomly selected subsets of the sequences belonging to highly-designable and poorly-designable conformations and used them to train several different standard machine learning algorithms. Conclusion: By using these machine learning algorithms with ten-fold cross-validation, we are able to classify the two classes of sequences with high accuracy, in some cases exceeding 95%.
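    The enumeration-and-threading step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the shape is a tiny triangle with three sites per side, the energy is the common HP-model choice of -1 per non-bonded H-H contact (the abstract only says "a specified energy function"), and a conformation is identified by its contact map so that mirror images and rotations of the same fold are not counted twice. The function names are hypothetical.

```python
from itertools import product

# Axial-coordinate neighbor offsets on the 2D triangular lattice.
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def triangle_sites(side):
    """Lattice sites of a triangle with `side` sites along each edge."""
    return {(q, r) for q in range(side) for r in range(side - q)}

def compact_paths(sites):
    """Yield every self-avoiding walk that visits all sites exactly once."""
    def extend(path, visited):
        if len(path) == len(sites):
            yield tuple(path)
            return
        q, r = path[-1]
        for dq, dr in NEIGHBORS:
            nxt = (q + dq, r + dr)
            if nxt in sites and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                yield from extend(path, visited)
                path.pop()
                visited.remove(nxt)
    for start in sorted(sites):
        yield from extend([start], {start})

def contact_map(path):
    """Pairs of chain indices that are lattice neighbors but not bonded."""
    index = {site: i for i, site in enumerate(path)}
    contacts = set()
    for (q, r), i in index.items():
        for dq, dr in NEIGHBORS:
            j = index.get((q + dq, r + dr))
            if j is not None and j > i + 1:
                contacts.add((i, j))
    return frozenset(contacts)

def energy(seq, contacts):
    """Assumed HP energy: -1 for each non-bonded H-H contact."""
    return -sum(1 for i, j in contacts if seq[i] == "H" and seq[j] == "H")

def designability(side=3):
    """Map each conformation to the sequences that have it as unique ground state."""
    sites = triangle_sites(side)
    conformations = {contact_map(p) for p in compact_paths(sites)}
    folds = {c: [] for c in conformations}
    for letters in product("HP", repeat=len(sites)):
        seq = "".join(letters)
        energies = {c: energy(seq, c) for c in conformations}
        best = min(energies.values())
        winners = [c for c, e in energies.items() if e == best]
        if len(winners) == 1:  # degenerate ground states count as non-folding
            folds[winners[0]].append(seq)
    return folds
```

    Calling `designability(3)` returns a dictionary whose value lengths are the designability counts; sorting conformations by that length would give the highly- versus poorly-designable split that a classifier could then be trained on. The search is exhaustive, so it is only practical for very small shapes.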

    Knotted vs. Unknotted Proteins: Evidence of Knot-Promoting Loops

    Knotted proteins, because of their ability to fold reversibly into the same topologically entangled conformation, are the object of an increasing number of experimental and theoretical studies. The aim of the present investigation is to assess, on the basis of presently available structural data, the extent to which knotted proteins are isolated instances in sequence or structure space, and to use comparative schemes to understand whether specific protein segments can be associated with the occurrence of a knot in the native state. A significant sequence homology is found among a sizeable group of knotted and unknotted proteins. In this family, knotted members occupy a primary sub-branch of the phylogenetic tree and differ from unknotted ones only by additional loop segments. These "knot-promoting" loops, whose virtual bridging eliminates the knot, are found in various types of knotted proteins. Valuable insight into how knots form, or are encoded, in proteins could be obtained by targeting these regions in future computational studies or excision experiments.

    Fast and Accurate Resonance Assignment of Small-to-Large Proteins by Combining Automated and Manual Approaches

    The process of resonance assignment is fundamental to most NMR studies of protein structure and dynamics. Unfortunately, the manual assignment of residues is tedious and time-consuming, and can represent a significant bottleneck for further characterization. Furthermore, while automated approaches have been developed, they are often limited in their accuracy, particularly for larger proteins. Here, we address this by introducing the software COMPASS, which, by combining automated resonance assignment with manual intervention, achieves accuracy approaching that of manual assignment at greatly accelerated speeds. Moreover, by including the option to compensate for isotope shift effects in deuterated proteins, COMPASS is far more accurate for larger proteins than existing automated methods. COMPASS is an open-source project licensed under the GNU General Public License and is available for download from http://www.liu.se/forskning/foass/tidigare-foass/patrik-lundstrom/software?l=en. Source code and binaries for Linux, Mac OS X and Microsoft Windows are available. Funding agencies: Swedish Research Council [Dnr. 2012-5136].

    Distance constraints solved geometrically

    International Symposium on Advances in Robot Kinematics (ARK), 2004, Sestri Levante (Italy). Most geometric constraint problems can be reduced to assigning coordinates to a set of points from a subset of their pairwise distances. By exploiting this fact, this paper presents an algorithm that solves distance constraint systems by iteratively reducing and expanding the dimension of the problem. In general, these projection/backprojection iterations tighten the ranges for the possible solutions but, if at a given point no progress is made, the algorithm bisects the search space and proceeds recursively on both subproblems. This branch-and-prune strategy is shown to converge to all solutions. Peer reviewed.
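    The bisect-and-recurse part of this strategy can be sketched on a toy problem. This is not the paper's algorithm: the projection/backprojection tightening is replaced by a plain interval feasibility test, and the problem is reduced to locating one unknown point from its distances to fixed anchors. The names `dist_range`, `feasible` and `branch_and_prune` are hypothetical.

```python
import math

def dist_range(box, anchor):
    """Smallest and largest distance from `anchor` to an axis-aligned box."""
    lo2 = hi2 = 0.0
    for (lo, hi), a in zip(box, anchor):
        nearest = min(max(a, lo), hi)  # clamp the anchor coordinate into the interval
        farthest = lo if abs(a - lo) > abs(a - hi) else hi
        lo2 += (a - nearest) ** 2
        hi2 += (a - farthest) ** 2
    return math.sqrt(lo2), math.sqrt(hi2)

def feasible(box, constraints):
    """A box survives only if every required distance is attainable inside it."""
    for anchor, d in constraints:
        dmin, dmax = dist_range(box, anchor)
        if not (dmin <= d <= dmax):
            return False
    return True

def branch_and_prune(box, constraints, tol=1e-3):
    """Return boxes of width <= tol that may contain a solution.

    `box` is a list of (lo, hi) intervals, one per coordinate;
    `constraints` is a list of (anchor_point, distance) pairs.
    """
    stack, leaves = [list(box)], []
    while stack:
        b = stack.pop()
        if not feasible(b, constraints):
            continue  # prune: no point of this box can satisfy all constraints
        widths = [hi - lo for lo, hi in b]
        k = max(range(len(b)), key=widths.__getitem__)
        if widths[k] <= tol:
            leaves.append(b)
            continue
        lo, hi = b[k]
        mid = 0.5 * (lo + hi)
        # branch: bisect the widest coordinate and keep both halves
        stack.append(b[:k] + [(lo, mid)] + b[k + 1:])
        stack.append(b[:k] + [(mid, hi)] + b[k + 1:])
    return leaves
```

    For example, `branch_and_prune([(-10, 10), (-10, 10)], [((0, 0), 5.0), ((6, 0), 5.0)])` should return small boxes clustered around the two intersection points of the circles, (3, 4) and (3, -4). Because pruning only discards boxes that provably contain no solution, every solution inside the initial box is covered by some returned leaf, which is the sense in which such a search reaches all solutions.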