22 research outputs found

    The calculation of information and organismal complexity

    Background: It is difficult to measure precisely the phenotypic complexity of living organisms. Here we propose a method to calculate the minimal amount of genomic information needed to construct an organism (effective information) as a measure of organismal complexity, using permutation and combination formulas and Shannon's information concept. Results: The calculated information correlates quite well with the intuitive organismal phenotypic complexity defined by traditional taxonomy and evolutionary theory. From viruses to human beings, the effective information gradually increases, from thousands of bits to hundreds of millions of bits. The simpler the organism, the less the information; the more complex the organism, the more the information. About 13% of the human genome is estimated to be effective information or functional sequence. Conclusions: Effective information can be used as a quantitative measure of the phenotypic complexity of living organisms and also as an estimate of the functional fraction of a genome. Reviewers: This article was reviewed by Dr. Lavanya Kannan (nominated by Dr. Arcady Mushegian), Dr. Chao Chen, and Dr. ED Rietman (nominated by Dr. Marc Vidal).
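As a rough illustration of the Shannon information concept this abstract refers to, the sketch below computes the entropy of a nucleotide sequence in bits. It is not the authors' effective-information calculation, whose permutation and combination formulas are not given in the abstract; the function names `shannon_bits_per_symbol` and `total_information_bits` are hypothetical, and the toy sequences are made up.

```python
import math
from collections import Counter


def shannon_bits_per_symbol(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    n = len(seq)
    counts = Counter(seq)
    # H = -sum(p * log2(p)) over the observed symbols
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def total_information_bits(seq: str) -> float:
    """Entropy per symbol times sequence length (an upper-bound style estimate)."""
    return shannon_bits_per_symbol(seq) * len(seq)


if __name__ == "__main__":
    # Toy sequences only; real genomes would be read from a FASTA file.
    for s in ("ACGTACGT", "AAAAAAAA", "AACCAACC"):
        print(s, shannon_bits_per_symbol(s), total_information_bits(s))
```

A sequence that uses all four nucleotides equally often carries at most 2 bits per position under this measure, which is why a frequency-only estimate like this one is far coarser than a measure restricted to functional sequence.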

    Hormone-Dependent Expression of a Steroidogenic Acute Regulatory Protein Natural Antisense Transcript in MA-10 Mouse Tumor Leydig Cells

    Cholesterol transport is essential for many physiological processes, including steroidogenesis. In steroidogenic cells hormone-induced cholesterol transport is controlled by a protein complex that includes steroidogenic acute regulatory protein (StAR). Star is expressed as 3.5-, 2.8-, and 1.6-kb transcripts that differ only in their 3′-untranslated regions. Because these transcripts share the same promoter, mRNA stability may be involved in their differential regulation and expression. Recently, the identification of natural antisense transcripts (NATs) has added another level of regulation to eukaryotic gene expression. Here we identified a new NAT that is complementary to the spliced Star mRNA sequence. Using 5′ and 3′ RACE, strand-specific RT-PCR, and ribonuclease protection assays, we demonstrated that Star NAT is expressed in MA-10 Leydig cells and steroidogenic murine tissues. Furthermore, we established that human chorionic gonadotropin stimulates Star NAT expression via cAMP. Our results show that sense-antisense Star RNAs may be coordinately regulated since they are co-expressed in MA-10 cells. Overexpression of Star NAT had a differential effect on the expression of the different Star sense transcripts following cAMP stimulation. Meanwhile, the levels of StAR protein and progesterone production were downregulated in the presence of Star NAT. Our data identify antisense transcription as an additional mechanism involved in the regulation of steroid biosynthesis

    Structural and functional annotation of the porcine immunome

    Background: The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems. Results: The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type I interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. 
A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited accelerated evolution, compared to 4.1% across the entire genome. Conclusions: This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response

    BayGO: Bayesian analysis of ontology term enrichment in microarray data

    BACKGROUND: The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for a system-level analysis. This procedure tries to summarize the information focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the former approach has been used to deal with the ontology term enrichment problem. RESULTS: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source-code is freely available at in three versions: Linux, which can be easily incorporated into pre-existent pipelines; Windows, to be controlled interactively; and as a web-tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. CONCLUSION: The Bayesian model accounts for the fact that, eventually, not all the genes from a given category are observable in microarray data due to low intensity signal, quality filters, genes that were not spotted and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis
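For orientation, the "common significance analysis" that this abstract contrasts BayGO with is often carried out as a hypergeometric over-representation test. The sketch below shows only that conventional test, under that assumption; it is not BayGO's Bayesian model, which the abstract does not specify. The function name `hypergeom_enrichment_pvalue` and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background (e.g. all genes on the array)
    K: background genes annotated with the ontology term
    n: genes in the selected list (e.g. differentially expressed)
    k: selected genes annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Made-up counts: 4000 genes, 80 in the term, 200 selected, 12 overlapping.
    print(hypergeom_enrichment_pvalue(N=4000, K=80, n=200, k=12))
```

A test of this kind answers whether the overlap is larger than expected by chance; it does not by itself quantify the strength of association between the term and differential expression, which is the distinction the abstract draws.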

    RNA-SEQ applied to the peacock blenny Salaria pavo: unveiling the gene networks and signalling pathways behind phenotypic plasticity in a littoral fish

    Phenotypic plasticity is the ability of an individual genome to produce different phenotypes depending on environmental cues. These plastic responses rely on diverse genomic mechanisms and allow an organism to maximize its fitness in a variety of social and physical settings. The development of next-generation sequencing (NGS) technologies, especially RNA sequencing (RNA-Seq), has made it possible to investigate the distinct patterns of gene expression known to underlie plastic phenotypes in species of ecological interest. In teleost fishes, changes in phenotype are often observed during the reproductive season, with shifts and adjustments in dominance status that can lead to the co-existence of multiple reproductive morphs within the same population. One such example is the peacock blenny Salaria pavo (Risso, 1810), a species in which the intensity of mating competition varies among populations due to nest-site availability, such that two different levels of plasticity arise: 1) intraspecific variation in reproductive behaviour for males, which can follow either of two developmental pathways: grow directly into nest-holder males, or behave first as female mimics to sneak fertilizations (sneaker males) and later transition into nest-holder males; and 2) inter-population variation in the courting roles of females and nest-holder males. This system provides the ideal basis for applying RNA-Seq methods to study plasticity, since differences in reproductive traits within and among populations can reveal which genetic and genomic mechanisms underpin the observed variation in behavioural response to changes in the social environment. However, the genomic information available for this species was scarce, and hence multiple sequencing techniques were used and the methodologies applied were optimized throughout the work. In this thesis, we start by obtaining a de novo transcriptome assembly to develop the first genetic markers for this species (Chapter 2). 
These microsatellites were used to elucidate the reproductive success (i.e. mating success and fertilization success) of male alternative reproductive tactics (ARTs), which can be used as a proxy of Darwinian fitness (Chapter 3). In this study, we detected a fertilization success of 95% for nest-holder males, and showed that the social environment has a stronger influence than morphological variables on the proportion of fertilizations lost by nest-holder males of this species. Taking advantage of the developed transcriptome, we used high-throughput sequencing to obtain expression profiles for male morphs (i.e. intraspecific variation) and females in this species, and focused on the role of differential gene expression in the evolution of sequential ARTs that involve the expression of both male and female traits (Chapter 4). Additionally, we show how the distinct behavioural repertoires are facilitated by distinct neurogenomic states, which discriminate not only sex but also male morphs. Lastly, using two different target tissues, gonads and forebrain, we focus on the genomic regulation of sex roles in courtship behaviour between females and males from two populations under different selective regimes (inter-population variation): the Portuguese coastal population with reversed sex roles and the rocky Italian population with ‘conventional’ sex roles (Chapter 5). Here we demonstrate that variation in gene expression at the brain level segregates individuals by population rather than by sex, indicating that plasticity in behaviour across populations drives variation in neurogenomic expression. On the other hand, at the gonad level, variation in gene expression segregates individuals by sex and then by population, indicating that sexual selection is also acting at the intrasexual level, particularly in nest-holder males, paralleling differences in gonadal investment. 
However, the genomic mechanisms underlying courtship behaviour were not fully elucidated, and more studies are necessary.
    Phenotypic plasticity is the capacity of the same genome to produce different behavioural phenotypes depending on the environmental cues received. These plastic responses depend on diverse genomic mechanisms and allow the individual to maximize its fitness in a variety of ecological environments. The development of high-throughput sequencing technologies over the last decade, collectively termed “Next Generation Sequencing” (NGS), has allowed the establishment of analysis methods and genomic tools that can be applied to all ecological systems of interest in biology, without the prior existence of a curated genome. In particular, RNA sequencing technology, known as RNA-Seq, has made it possible to investigate the gene expression profiles known to be decisive in the emergence of plastic phenotypes, and consequently allows phenotypes to be determined in distinct states of genomic expression. In teleost fishes, modifications of the behavioural phenotype are frequently observed during the reproductive period, such as changes and adjustments in dominance status that can lead to the coexistence of individuals with different reproductive tactics within the same population. 
One such example is the peacock blenny Salaria pavo (Risso, 1810), in which the intensity of intra- and intersexual competition varies among populations and is modulated by the availability of nesting sites, such that two different levels of plasticity arise: 1) intraspecific variation in reproductive behaviour in males, which can follow one of two developmental pathways: invest in growth and become nest-holder males in their first breeding season, or first follow a parasitic male tactic in which they invest in sneaked fertilizations and later in development transition to nest-holder male; and 2) inter-population variation in the courtship roles of females and nest-holder males. The parasitic males, known in this species as “sneakers”, have a feature that makes them unique: besides mimicking female morphology, they can also mimic female courtship behaviour directed at the nest-holder male, which allows them to approach the males' nests discreetly and fertilize part of the eggs that the females deposit. This system is the ideal basis for applying RNA-Seq methods to study this phenotypic plasticity, since differences in reproductive traits within and among populations can reveal which genetic and genomic mechanisms underlie the observed variation in response to changes in the ecological environment. However, the genomic information available for this species is limited, and therefore different sequencing techniques and different analysis methods were used and optimized throughout this work. This thesis comprises four studies; the first begins with the sequencing of an RNA library from a pool of multiple individuals and tissues, in order to capture genetic diversity and develop the first genetic markers for this species (Chapter 2). 
Using these markers, polymorphic microsatellites, it was possible to genotype a fraction of the individuals of the population at Culatra Island (Ria Formosa, Portugal), as well as eggs taken from target nests, in order to perform paternity analyses (Chapter 3). In this study, it was possible to estimate the egg fertilization success of each alternative reproductive tactic and to use it as a representative measure of the fitness of each tactic in this species. The results indicate a fertilization success of 95% for nest-holder males, and we show that the social environment has a greater influence than morphological variables on the proportion of fertilizations not obtained by nest-holder males, when compared with previous studies. Using the transcriptome obtained in the first study, we proceeded with the genomic characterization of each of the phenotypes present in the Culatra Island population: females, nest-holder males, sneaker males and transitional males (males that invest only in growth, without reproducing, and consequently transition from sneaker to nest-holder male) (Chapter 4). To this end, the brain transcriptome of each of these phenotypes was deeply sequenced and expression profiles were obtained for males and females of this species, with the study focusing on the role of differential gene expression in the evolution of sequential alternative reproductive tactics that involve the expression of both male and female traits. The results show how distinct behavioural repertoires are facilitated by distinct neurogenomic states, which discriminate not only sex but also the alternative reproductive tactics. 
Finally, using two target tissues, gonads and forebrain, we focused on the genomic regulation of sex roles in courtship behaviour between females and nest-holder males from two populations under different selective regimes: the Portuguese coastal population with reversed sex roles and the rocky Italian population with ‘conventional’ sex roles (Chapter 5). The results show that at the brain level, variation in gene expression segregates individuals by population and not by sex, indicating that plasticity in behaviour between populations induces greater variation in neurogenomic expression. On the other hand, at the gonad level, variation in gene expression segregates individuals by sex and also by population, indicating that sexual selection is acting at the intrasexual level, particularly in nest-holder males, in line with differences detected between populations in the investment allocated to the gonads. However, the genomic mechanisms underlying courtship behaviour were not fully elucidated, and more studies are necessary.
    The work presented here was developed at Instituto Gulbenkian de Ciência (IGC) in Oeiras, with the support of both ISPA – Instituto Universitário in Lisbon, for the maintenance of live fish, and Centro de Ciências do Mar (CCMAR) at Universidade do Algarve, for logistics and support during fieldwork in Ria Formosa

    Conservation genomics: speciation of the Neotropical damselfly species Megaloprepus caerulatus – as a model for insect speciation in tropical rainforests

    The work presented in this thesis is located at the interface between ecology, evolution and developmental biology. It addresses theories and questions in population biology, phylogeography and speciation as well as methodological approaches for applying Next Generation Sequencing (NGS) data. In the center of this thesis stands the world’s largest extant damselfly, Megaloprepus caerulatus, as a model system for primary rainforests

    Analyses of CRN effectors (Crinkler and Necrosis) of the oomycete Aphanomyces euteiches

    The oomycete Aphanomyces euteiches is an important pathogen infecting roots of legumes (pea, alfalfa...) and the model legume Medicago truncatula. Oomycetes and other microbial eukaryotic pathogens secrete and deliver effector molecules into host intracellular compartments (intracellular/cytoplasmic effectors) to manipulate plant functions and promote infection. CRN (Crinkling and Necrosis) proteins are a large class of intracellular, nuclear-localized effectors commonly found in oomycetes and recently described in true fungi; their host targets, virulence roles, and secretion and host-delivery mechanisms are poorly understood. We undertook the functional characterization of the CRN proteins AeCRN5 and AeCRN13 of A. euteiches and of BdCRN13, the AeCRN13 homolog from the chytrid fungal pathogen of amphibians Batrachochytrium dendrobatidis. Gene and protein expression studies showed that AeCRN5 and AeCRN13 are expressed during infection of M. truncatula roots. Preliminary immunolocalization studies on AeCRN13 in infected roots indicated that the protein is secreted and translocated into root cells, showing for the first time CRN secretion and translocation into the host during infection. Heterologous expression of the AeCRNs and BdCRN13 in plant and amphibian cells indicated that these proteins localize to host nuclei, where their activities perturb host physiology. By developing an in vivo FRET-FLIM-based assay, we showed that these CRNs target host nucleic acids: AeCRN5 targets plant RNA, while AeCRN13 and BdCRN13 bind DNA. Both CRN13 proteins exhibit an HNH-like motif commonly found in endonucleases, and we further demonstrated that both display nuclease activity in vivo, inducing double-stranded DNA breaks. 
This work reveals a new mode of action of intracellular eukaryotic effectors and brings new insight into the activities of CRN proteins not only in oomycetes but, for the first time, also in true fungi

    Expression profiling of drug response-from genes to pathways

    Understanding individual response to a drug (what determines its efficacy and tolerability) is the major bottleneck in current drug development and clinical trials. Intracellular response and metabolism, for example through cytochrome P-450 enzymes, may either enhance or decrease the effect of different drugs, depending on the genetic variant. Microarrays offer the potential to screen the genetic composition of the individual patient. However, experiments are “noisy” and must be accompanied by solid and robust data analysis. Furthermore, recent research aims to combine high-throughput data with methods of mathematical modeling, enabling problem-oriented assistance in the drug discovery process. This article will discuss state-of-the-art DNA array technology platforms and the basic elements of data analysis and bioinformatics research in drug discovery. Going beyond single-gene analysis, we will present a new method for interpreting gene expression changes in the context of entire pathways. Furthermore, we will introduce the concept of systems biology as a new paradigm for drug development and highlight our recent research: the development of a modeling and simulation platform for biomedical applications. We discuss the potential of systems biology for modeling the drug response of the individual patient

    Phylogenomics of vertebrate serpins

    Kumar A. Phylogenomics of vertebrate serpins. Bielefeld (Germany): Bielefeld University; 2010. The serpins constitute a superfamily of proteins that fold into a conserved tertiary structure and employ a sophisticated, irreversible suicide mechanism of inhibition. More than 6000 serpins have been identified, occurring in all three forms of life: the eukaryotes, the prokaryotes and the archaea. Vertebrate serpins can be conveniently classified into six groups (V1 to V6), based on three independent biological features: gene organization, diagnostic amino acid sites and rare indels. In the present work, the phylogenetic relationships of serpins from Nematostella vectensis, Strongylocentrotus purpuratus, Ciona intestinalis, four fish species, frog, chicken and mammals were investigated, using gene architecture analyses and stringent criteria for identification of orthologs. With some deviations, all vertebrate serpin genes fit into one of the six exon/intron gene classes previously identified, dating the existence and maintenance of these gene organizations to before or close to the divergence of fishes. Group V1 and V2 gene families underwent rapid adaptive radiation along the lineages leading to mammals, as indicated by an up to nine-fold increase in the number of family members, accompanied by rapid functional diversification. In contrast, gene groups V3 to V6 display rather conservative evolution, with little change since the divergence of fishes and the other vertebrates. The orthology assessment indicates that all vertebrates are equipped with a subset of strongly conserved serpins with functions that can be clearly correlated with basic vertebrate-specific physiology. None of the serpin genes from C. intestinalis shares a common exon-intron architecture with any of the vertebrate serpin gene classes, nor was it possible to identify orthologs of vertebrate serpins. 
The lack of gene architecture similarity and the complete absence of orthology between urochordate and vertebrate serpins indicate that major changes with bursts of character acquisition must have occurred during evolution of serpins in the time interval separating urochordates from chordates, indicating massive intron gains or losses and events providing the C- and N-terminal sequence extensions characteristic of today's vertebrate serpins. Lancelet and sea urchin genomes, in contrast, share one orthologous serpin with vertebrates. Rare genomic characters are used to show that orthologs of neuroserpin, a prominent representative of vertebrate group V3 serpin genes, exist in early diverging deuterostomes and probably also in cnidarians, indicating that the origin of a mammalian serpin can be traced back far in the history of eumetazoans. A C-terminal address code assigning association with secretory pathway organelles is present in all neuroserpin orthologs, suggesting that supervision of cellular export/import routes by antiproteolytic serpins is an ancient trait. Phylogenomic comparisons show that, after establishment of canonical exon-intron patterns in the serpin superfamily at the dawn of vertebrate evolution, multiple intron acquisition events occurred during diversification of a lineage of actinopterygian fishes. The novel introns were acquired within a limited time interval (on an evolutionary timescale), and no such events were observed in other groups of vertebrates. Examination of the sequences flanking the intron insertion points revealed that the genetic requirements for acquisition of novel introns might be less stringent than previously suggested. Finally, we argue that genome compaction, a phenomenon associated with the fish lineage showing preferential intron gain, might promote intron acquisition

    Genomics 4.0 : syntenic gene and genome duplication drives diversification of plant secondary metabolism and innate immunity in flowering plants : advanced pattern analytics in duplicate genomes

    Johannes A. Hofberger (1, 2, 3)
    1. Biosystematics Group, Wageningen University & Research Center, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands (August 2012 – December 2013)
    2. Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands (December 2010 – July 2012)
    3. Chinese Academy of Sciences/Max Planck Partner Institute for Computational Biology, 320 Yueyang Road, Shanghai 200031, PR China (January 2014 – December 2014)
    TWO-SENTENCE SUMMARY: Large-scale comparative analysis of Big Data from next generation sequencing provides powerful means to exploit the potential of nature in the context of plant breeding and biotechnology. In this thesis, we combine various computational methods for genome-wide identification of gene families involved in (a) plant innate immunity and (b) biosynthesis of defense-related plant secondary metabolites across 21 species, assess dynamics that affected evolution of underlying traits during 250 million years of flowering plant radiation and provide data on more than 4500 loci that can underpin crop improvement for future food and life quality.
    GENERAL ABSTRACT: As sessile organisms, plants are permanently exposed to a plethora of potentially harmful microbes and other pests. The surprising resilience to infections observed in successful lineages is due to a complex defense network fighting off invading pathogens. Within this network, a sophisticated plant innate immune system is accompanied by a multitude of specialized biosynthetic pathways that generate more than 200,000 secondary metabolites with ecological, agricultural, energy and medicinal importance. The rapid diversification of associated genes was accompanied by a series of duplication events in virtually all plant species, including local duplication of short sequences as well as multiplication of all chromosomes due to meiotic errors (plant polyploidy). In a comparative genomics approach, we combined several bioinformatics techniques for large-scale identification of multi-domain and multi-gene families that are involved in plant innate immunity or defense-related secondary metabolite pathways across 21 representative flowering plant genomes. We introduced a framework to trace duplicate gene copies back to distinct ancient duplication events, thereby unravelling a differential impact of gene and genome duplication on the molecular evolution of target genes. Comparing the genomic context among homologs within and between species from a phylogenomics perspective, we discovered orthologs conserved within genomic regions that remained structurally immobile during flowering plant radiation. In summary, we described a complex interplay of gene and genome duplication that increased the genetic versatility of disease resistance and secondary metabolite pathways, thereby expanding the playground for functional diversification and thus plant trait innovation and success. Our findings give fascinating insights into evolution across lineages and can underpin crop improvement for food, fiber and biofuels production