37 research outputs found

    1000 Genomes Selection Browser 1.0: A genome browser dedicated to signatures of natural selection in modern humans

    This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented and applied a large number of neutrality tests as well as summary statistics informative for the action of selection such as Tajima's D, CLR, Fay and Wu's H, Fu and Li's F* and D*, XPEHH, ΔiHH, iHS, FST, ΔDAF and XPCLR among others to low coverage sequencing data from the 1000 genomes project (Phase 1; release April 2012). We have implemented a publicly available genome-wide browser to communicate the results from three different populations of West African, Northern European and East Asian ancestry (YRI, CEU, CHB). Information is provided in UCSC-style format to facilitate the integration with the rich UCSC browser tracks and an access page is provided with instructions and for convenient visualization. We believe that this expandable resource will facilitate the interpretation of signals of selection on different temporal, geographical and genomic scales. © 2013 The Author(s). Published by Oxford University Press. Ministerio de Ciencia y Tecnología (Spain); Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101); Subprogram BMC [BFU2010-19443 awarded to J.B.]; Post-doctoral scholarship from the Volkswagenstiftung [Az: I/85 198 to J.E.]; Spanish government [BFU-2008-01046; SAF2011-29239]; The Spanish government FPI scholarships [BES-2009-017731 and BES-2011-04502 to G.M.D. and M.P., respectively]; PhD fellowship from ‘Acción Estratégica de Salud, en el marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008-2011’ from Instituto de Salud Carlos III (to P.L.). Funding for open access charge: Prof. Jaume Bertranpetit. Peer Reviewed

    1000 Genomes Selection Browser 1.0: a genome browser dedicated to signatures of natural selection in modern humans

    ABSTRACT Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented and applied a large number of neutrality tests as well as summary statistics informative for the action of selection such as Tajima's D, CLR

    Taxonomic distribution and origins of the extended LHC (light-harvesting complex) antenna protein superfamily

    Background: The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families. Results: In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping. Conclusions: The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily.

    Testing the thrifty gene hypothesis: the Gly482Ser variant in PPARGC1A is associated with BMI in Tongans

    Background: The thrifty gene hypothesis posits that, in populations that experienced periods of feast and famine, natural selection favoured individuals carrying thrifty alleles that promote the storage of fat and energy. Polynesians likely experienced long periods of cold stress and starvation during their settlement of the Pacific and today have high rates of obesity and type 2 diabetes (T2DM), possibly due to past positive selection for thrifty alleles. Alternatively, T2DM risk alleles may simply have drifted to high frequency in Polynesians. To identify thrifty alleles in Polynesians, we previously examined evidence of positive selection on T2DM-associated SNPs and identified a T2DM risk allele at unusually high frequency in Polynesians. We suggested that the risk allele of the Gly482Ser variant in the PPARGC1A gene was driven to high frequency in Polynesians by positive selection and therefore possibly represented a thrifty allele in the Pacific. Methods: Here we examine whether PPARGC1A is a thrifty gene in Pacific populations by testing for an association between Gly482Ser genotypes and BMI in two Pacific populations (Maori and Tongans) and by evaluating the frequency of the risk allele of the Gly482Ser variant in a sample of worldwide populations. Results: We find that the Gly482Ser variant is associated with BMI in Tongans but not in Maori. In a sample of 58 populations worldwide, we also show that the 482Ser risk allele reaches its highest frequency in the Pacific. Conclusion: The association between Gly482Ser genotypes and BMI in Tongans together with the worldwide frequency distribution of the Gly482Ser risk allele suggests that PPARGC1A remains a candidate thrifty gene in Pacific populations.

    Introducing evolutionary biologists to the analysis of big data: guidelines to organize extended bioinformatics training courses

    Research in evolutionary biology has been progressively influenced by big data such as massive genome and transcriptome sequencing data, scalar measurements of several phenotypes on tens to thousands of individuals, as well as worldwide environmental data collected at an increasingly detailed scale. The handling and analysis of such data require computational skills that usually exceed the abilities of most traditionally trained evolutionary biologists. Here we discuss the advantages, challenges and considerations for organizing and running bioinformatics training courses of 2–3 weeks in length to introduce evolutionary biologists to the computational analysis of big data. Extended courses have the advantage of offering trainees the opportunity to learn a more comprehensive set of complementary topics and skills and allowing for more time to practice newly acquired competences. Many organizational aspects are common to any course, such as the need to define precise learning objectives and the selection of appropriate and highly motivated instructors and trainees, among others. However, other features assume particular importance in extended bioinformatics training courses. To successfully implement a learning-by-doing philosophy, sufficient and enthusiastic teaching assistants (TAs) are necessary to offer prompt help to trainees. Further, a good balance between theoretical background and practice time needs to be provided, and the schedule should include enough flexibility for extra review sessions or further discussions if desired. A final project enables trainees to apply their newly learned skills to real data or case studies of their interest. To promote a friendly atmosphere throughout the course and to build a close-knit community after the course, time should be allowed for some scientific discussions and social activities. In addition, so as not to exhaust trainees and TAs, some leisure time needs to be organized. Finally, all organization should be done while keeping the budget within fair limits. In order to create a sustainable course that constantly improves and adapts to the trainees’ needs, gathering short- and long-term feedback after the end of the course is important. Based on our experience we have collected a set of recommendations to effectively organize and run extended bioinformatics training courses for evolutionary biologists, which we want to share here with the community. They offer a complementary way for the practical teaching of modern evolutionary biology and reaching out to the biological community. Peer reviewed

    Evolution of the extended LHC protein superfamily in photosynthesis

    No full text
    In photosynthesis, sunlight interacts with colorful photosynthetic pigments like the chlorophylls, carotenoids and phycobilins. The first two of these pigments can be bound by members of the extended light-harvesting complex (LHC) protein superfamily and are organised in order to take on functions in the collection of or in the defense against sunlight. The extended LHC superfamily comprises several protein families, like the LHCs, the photosystem II subunit S (PSBS), the red algal lineage chlorophyll a/b-binding (CAB)-like proteins (RedCAP), and several LHC-like proteins. Some of these groups are very old, likely over two billion years, and they show a characteristic distribution across different groups of photosynthetic organisms, like cyanobacteria, red algae, algae with secondary plastids, green algae or plants. In this work we aim to disentangle the evolutionary history of this complex protein superfamily and to use the results to inform functional studies of different LHC-like proteins in plants and diatoms. After careful searches of homologous protein sequences in public sequence databases, we developed a coherent classification system of the different protein families, in part based on hidden Markov model analyses. With this approach, we identified many new LHC-like proteins, including several from the model plant species Arabidopsis thaliana, and described new families, like the RedCAP from red algae and complex algae with red plastids, and new subfamilies of two-helix proteins from glaucophytes, red algae, diatoms and plants. A group of newly found RedCAP and LHC-like proteins from the diatom Phaeodactylum tricornutum was of sufficient interest for functional follow-up experiments, done by collaborators. The results of these mRNA expression and cellular targeting experiments in combination with evolutionary analyses were used to make inferences about possible functions of these proteins. Results from reverse genetics experiments on the LHC-like one-helix proteins (OHP) 1 and 2 done by others in the Adamska lab were interpreted in an evolutionary framework. Specifically, ohp1 and ohp2 knockout mutants of A. thaliana were extremely sensitive to light, so that they had to be grown under very low light conditions and on sugar-supplemented medium. This pointed to fundamentally important functions of these proteins in photoprotection of photosystem I, a point that could be supported by their taxonomic distributions and conservation patterns across algae and plants. The main result of this work was an improved model for the evolution of the extended LHC protein family. By adjusting different phylogenetic methods to our questions, we showed that LHC and PSBS, as well as other eukaryotic three-helix proteins, have evolved independently, contrary to previous suggestions. Likely, they were derived from a pool of two-helix stress-enhanced proteins (SEPs). Over the last billions of years and in a still ongoing process, adaptational processes including the evolution of new protein functions, origin of novel protein families and secondary losses of others, as well as lineage-specific family expansions have shaped this protein superfamily. This has allowed algae and plants to survive and thrive in a multitude of environments, thereby changing our planet forever.

    Phylogeography of Haplochromine Fish in the Lake Victoria Region

    No full text
    The three Great East African Lakes are important model systems for evolutionary research. Among other things, the massive adaptive radiations of cichlid fish are of interest. The so-called species flock of several hundred endemic cichlid species that occurs in the region of Lake Victoria was derived from a single founder population and therefore has a monophyletic origin. It is uncertain, however, when and where this adaptive radiation took place and how the lake was populated. Previous studies were not able to give sufficient answers, especially when one considers a recent geological study which suggests the complete desiccation of Lake Victoria about 18,000 to 12,400 years ago. Therefore, the phylogeography of cichlid populations from nearby waters was characterized in the present study. Most of these waters were examined in this context for the first time. Mitochondrial DNA (D-loop) of 70 cichlids from the region was sequenced and analyzed, together with unpublished sequences from Lake Kivu and sequences from previous studies. In this way a new picture of the history of Lake Victoria's cichlid species flock could be drawn. The classification of the cichlids from the different waters could be resolved without doubt, whereas interpretation of the data raises new questions. The phylogeographic analyses suggest that the genetic prerequisites for the adaptive radiation did not necessarily arise in Lake Victoria itself, but possibly in another deep body of water like Lake Kivu. A large founder population could have populated Lake Victoria less than 12,400 years ago. Following that, the special conditions within the lake made possible the astonishingly fast radiation of several hundred species.

    Taxonomic distribution and origins of the extended LHC (light-harvesting complex) antenna protein superfamily

    No full text
    Background: The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families. Results: In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping. Conclusions: The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily. This work was supported by grants from the Deutsche Forschungsgemeinschaft (AD-92/7-2) and the Konstanz University to IA; JE was supported by a grant (I/82 750) from the Volkswagenstiftung, "Förderungsinitiative Evolutionsbiologie".

    A novel type of light-harvesting antenna protein of red algal origin in algae with secondary plastids

    Background: Light, the driving force of photosynthesis, can be harmful when present in excess; therefore, any light harvesting system requires photoprotection. Members of the extended light-harvesting complex (LHC) protein superfamily are involved in light harvesting as well as in photoprotection and are found in the red and green plant lineages, with a complex distribution pattern of subfamilies in the different algal lineages. Results: Here, we demonstrate that the recently discovered “red lineage chlorophyll a/b-binding-like proteins” (RedCAPs) form a monophyletic family within this protein superfamily. The occurrence of RedCAPs was found to be restricted to the red algal lineage, including red algae (with primary plastids) as well as cryptophytes, haptophytes and heterokontophytes (with secondary plastids of red algal origin). Expression of a full-length RedCAP:GFP fusion construct in the diatom Phaeodactylum tricornutum confirmed the predicted plastid localisation of RedCAPs. Furthermore, we observed that, similarly to the fucoxanthin chlorophyll a/c-binding light-harvesting antenna proteins, RedCAP transcripts in diatoms were also regulated in a diurnal way at standard light conditions and strongly repressed at high light intensities. Conclusions: The absence of RedCAPs from the green lineage implies that RedCAPs evolved in the red lineage after separation from the green lineage. During the evolution of secondary plastids, RedCAP genes therefore must have been transferred from the nucleus of the endocytobiotic alga to the nucleus of the host cell, a process that involved complementation with pre-sequences allowing import of the gene product into the secondary plastid bound by four membranes. Based on light-dependent transcription and on localisation data, we propose that RedCAPs might participate in the light (intensity and quality)-dependent structural or functional reorganisation of the light-harvesting antennae of the photosystems upon dark to light shifts as regularly experienced by diatoms in nature. Remarkably, in plastids of the red lineage as well as in green lineage plastids, the phycobilisome based cyanobacterial light harvesting system has been replaced by light harvesting systems that are based on members of the extended LHC protein superfamily, either for one of the photosystems (PS I of red algae) or for both (diatoms). In their proposed function, the RedCAP protein family may thus have played a role in the evolutionary structural remodelling of light-harvesting antennae in the red lineage. This work was supported by the Universität Konstanz and by grants of the Deutsche Forschungsgemeinschaft (KR 1661/3 to PGK, AD 92/7-2 to IA, LA 2368/2-1 to JL). JE was supported by a grant (I/82 750) from the Volkswagenstiftung (“Förderungsinitiative Evolutionsbiologie”).