44 research outputs found

    Structural evolution drives diversification of the large LRR-RLK gene family

    Cells are continuously exposed to chemical signals that they must discriminate between and respond to appropriately. In embryophytes, the leucine-rich repeat receptor-like kinases (LRR-RLKs) are signal receptors critical in development and defense. LRR-RLKs have diversified to hundreds of genes in many plant genomes. Although the family has been intensively studied, a well-resolved LRR-RLK gene tree has remained elusive. To resolve the LRR-RLK gene tree, we developed an improved gene discovery method based on iterative hidden Markov model searching and phylogenetic inference. We used this method to infer complete gene trees for each of the LRR-RLK subclades and reconstructed the deepest nodes of the full gene family. We discovered that the LRR-RLK gene family is even larger than previously thought, and that protein domain gains and losses are prevalent. These structural modifications, some of which likely predate embryophyte diversification, led to misclassification of some LRR-RLK variants as members of other gene families. Our work corrects this misclassification. Our results reveal ongoing structural evolution generating novel LRR-RLK genes. These new genes are raw material for the diversification of signaling in development and defense. Our methods also enable phylogenetic reconstruction in any large gene family.

    Transcriptional Regulation of Sorghum Stem Composition: Key Players Identified Through Co-expression Gene Network and Comparative Genomics Analyses

    Most sorghum biomass accumulates in stem secondary cell walls (SCW). Because sorghum stems are used as raw material for feed, energy, and fiber-reinforced polymers, identifying the genes responsible for SCW establishment is highly important. Drawing on studies in model species, most of the structural genes contributing to SCW biosynthesis in sorghum have been proposed, whereas their regulatory factors remain mostly undetermined. The role of several MYB and NAC transcription factors in SCW regulation has been validated in Arabidopsis and a few other species. In this study, we contribute to recent efforts in grasses to uncover the mechanisms underlying SCW establishment. We report updated phylogenies of NAC and MYB in nine species and exploit findings from other species to highlight candidate regulators of SCW in sorghum. We acquired expression data during sorghum internode development and used co-expression analyses to determine groups of co-expressed genes that are likely to be involved in SCW establishment. We identified two groups of co-expressed genes with multiple lines of evidence for involvement in SCW building. Gene enrichment analysis of MYB and NAC genes provided evidence that, while the functions of the NAC SECONDARY WALL THICKENING PROMOTING FACTOR (NST) genes and the SECONDARY WALL-ASSOCIATED NAC DOMAIN PROTEIN gene appear to be conserved in sorghum, NAC master regulators of SCW in sorghum may not be as tissue-compartmentalized as in Arabidopsis. We showed that for every homolog of the key SCW MYB in Arabidopsis, a similar role is expected in sorghum. In addition, we unveiled sorghum MYB and NAC genes that had not previously been identified as being involved in cell wall regulation. Although specific validation of the MYB and NAC genes uncovered in this study is needed, we provide a network of sorghum genes involved in SCW at both the structural and regulatory levels.

    A Satisfiability-based Approach for Embedding Generalized Tanglegrams on Level Graphs

    A tanglegram is a pair of trees on the same set of leaves, with matching leaves in the two trees joined by an edge. Tanglegrams are widely used in computational biology to compare evolutionary histories of species. In this paper we formulate two related combinatorial embedding problems concerning tanglegrams in terms of CNF formulas. The first is known as the planar embedding problem and the second as the crossing minimization problem. We show that our satisfiability formulation of these problems can handle a much more general case with more than two trees, which need not be binary or complete, are defined on arbitrary sets of leaves, and are allowed to vary their layouts.
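
To make the crossing minimization objective concrete, the sketch below counts the crossings between inter-tree edges for one fixed layout of a two-tree tanglegram. It only illustrates the quantity being minimized and is not the CNF formulation from the paper; the function name `count_crossings` and the representation of each layout as a list of leaf labels in drawing order are assumptions made here.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count crossing inter-tree edges for fixed leaf orders of two trees.

    Each leaf label appears once in both orders. The edge for a label joins
    its position on the left to its position on the right, and two edges
    cross exactly when their endpoints appear in opposite relative order.
    """
    if sorted(left_order) != sorted(right_order):
        raise ValueError("both layouts must contain the same leaf labels")
    right_pos = {leaf: i for i, leaf in enumerate(right_order)}
    # Right-hand positions, read off in left-hand drawing order.
    seq = [right_pos[leaf] for leaf in left_order]
    # Every inversion in this sequence corresponds to one crossing.
    return sum(1 for a, b in combinations(seq, 2) if a > b)


if __name__ == "__main__":
    print(count_crossings(["a", "b", "c", "d"], ["a", "b", "c", "d"]))
    print(count_crossings(["a", "b", "c", "d"], ["b", "a", "d", "c"]))
```

A pair of leaf orders with zero crossings under this count corresponds to a drawing in which no inter-tree edges intersect. A solver would additionally have to search over the leaf orders that each tree permits.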

    Phylogeny.fr: robust phylogenetic analysis for the non-specialist

    Phylogenetic analyses are central to many research areas in biology and typically involve the identification of homologous sequences, their multiple alignment, the phylogenetic reconstruction and the graphical representation of the inferred tree. The Phylogeny.fr platform transparently chains programs to automatically perform these tasks. It is primarily designed for biologists with no experience in phylogeny, but can also meet the needs of specialists: the former will find up-to-date tools chained in a phylogeny pipeline to analyze their data in a simple and robust way, while the latter will be able to easily build and run sophisticated analyses. Phylogeny.fr offers three main modes. The ‘One Click’ mode targets non-specialists and provides a ready-to-use pipeline chaining programs with recognized accuracy and speed: MUSCLE for multiple alignment, PhyML for tree building, and TreeDyn for tree rendering. All parameters are set up to suit most studies, and users only have to provide their input sequences to obtain a ready-to-print tree. The ‘Advanced’ mode uses the same pipeline but allows users to customize the parameters of each program. The ‘A la Carte’ mode offers more flexibility and sophistication, as users can build their own pipeline by selecting and setting up the required steps from a large choice of tools to suit their specific needs. Prior to phylogenetic analysis, users can also collect neighbors of a query sequence by running BLAST on general or specialized databases. A guide tree then helps to select neighbor sequences to be used as input for the phylogeny pipeline. Phylogeny.fr is available at: http://www.phylogeny.fr

    PhyloPattern: regular expressions to identify complex patterns in phylogenetic trees

    Background: To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise. Complex architectures must be recognized within the trees, so that associated information can be extracted. Results: Here, we present a new software library, PhyloPattern, for automating tree manipulations and analysis. PhyloPattern includes three main modules, which address essential tasks in high-throughput phylogenetic tree analysis: node annotation, pattern matching, and tree comparison. PhyloPattern thus allows the programmer to focus on: i) the use of predefined or user-defined annotation functions to perform immediate or deferred evaluation of node properties, ii) the search for user-defined patterns in large phylogenetic trees, iii) the pairwise comparison of trees by dynamically generating patterns from one tree and applying them to the other. Conclusion: PhyloPattern greatly simplifies and accelerates the work of the computer scientist in the evolutionary biology field. The library has been used to automatically identify phylogenetic evidence for domain shuffling or gene loss events in the evolutionary histories of protein sequences. However, any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.

    PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data

    Background: Of the 5,484 predicted proteins of Plasmodium falciparum, the main causative agent of malaria, about 60% do not have sufficient sequence similarity with proteins in other organisms to warrant provision of functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. Results: We present PlasmoDraft (http://atgc.lirmm.fr/PlasmoDraft/), a database of Gene Ontology (GO) annotation predictions for P. falciparum genes based on postgenomic data. Predictions of PlasmoDraft are achieved with a Guilt By Association method named Gonna. This involves (1) a predictor that proposes GO annotations for a gene based on the similarity of its profile (measured with transcriptome, proteome or interactome data) with genes already annotated by GeneDB; (2) a procedure that estimates the confidence of the predictions achieved with each data source; (3) a procedure that combines all data sources to provide a global summary and confidence estimate of the predictions. Gonna has been applied to all P. falciparum genes using most publicly available transcriptome, proteome and interactome data sources. Gonna provides predictions for numerous genes without any annotations. For example, 2,434 genes without any annotations in the Biological Process ontology are associated with specific GO terms (e.g. Rosetting, Antigenic variation), and among these, 841 have confidence values above 50%. In the Cellular Component and Molecular Function ontologies, 1,905 and 1,540 uncharacterized genes are associated with specific GO terms, respectively (740 and 329 with confidence values above 50%). Conclusion: All predictions along with their confidence values have been compiled in PlasmoDraft, which thus provides an extensive database of GO annotation predictions that can be achieved with these data sources. The database can be accessed in different ways. A global view allows for a quick inspection of the GO terms that are predicted with high confidence, depending on the various data sources. A gene view and a GO term view allow for the search of potential GO terms attached to a given gene, and genes that potentially belong to a given GO term.

    Detection of gene orthology from gene co-expression and protein interaction networks

    Background: Ortholog detection methods present a powerful approach for finding genes that participate in similar biological processes across different organisms, extending our understanding of interactions between genes across different pathways, and understanding the evolution of gene families. Results: We exploit features derived from the alignment of protein-protein interaction networks and gene coexpression networks to reconstruct KEGG orthologs, using the decision tree, Naive Bayes and Support Vector Machine classification algorithms. The protein-protein interaction networks, for Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens, were extracted from the DIP repository; the gene coexpression networks, for Mus musculus, Homo sapiens and Sus scrofa, were extracted from NCBI's Gene Expression Omnibus. Conclusions: The performance of our classifiers in reconstructing KEGG orthologs is compared against a basic reciprocal BLAST hit approach. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit.
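
The reciprocal BLAST hit baseline named in this abstract can be sketched as follows. This is a minimal illustration that assumes similarity scores (for example, BLAST bit scores) have already been computed and parsed into plain dictionaries; the function names and the toy gene identifiers are hypothetical and are not taken from BiNA.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score that was
    computed beforehand (assumed here; no BLAST run happens in this sketch).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Toy scores for two hypothetical genomes A and B.
    a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50,
              ("a2", "b1"): 120, ("a2", "b2"): 90}
    b_vs_a = {("b1", "a1"): 210, ("b1", "a2"): 110,
              ("b2", "a2"): 95, ("b2", "a1"): 40}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, `a2` prefers `b1` but `b1` prefers `a1`, so only the pair (`a1`, `b1`) is reported. This strictness is what makes the approach a simple baseline.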

    South Green Galaxy: a suite of tools for plant genomics
