145 research outputs found

    Towards realistic benchmarks for multiple alignments of non-coding sequences

    Background: With the continued development of new computational tools for multiple sequence alignment, it is necessary today to develop benchmarks that aid the selection of the most effective tools. Simulation-based benchmarks have been proposed to meet this necessity, especially for non-coding sequences. However, it is not clear if such benchmarks truly represent real sequence data from any given group of species, in terms of the difficulty of alignment tasks. Results: We find that the conventional simulation approach, which relies on empirically estimated values for various parameters such as substitution rate or insertion/deletion rates, is unable to generate synthetic sequences reflecting the broad genomic variation in conservation levels. We tackle this problem with a new method for simulating non-coding sequence evolution, by relying on genome-wide distributions of evolutionary parameters rather than their averages. We then generate synthetic data sets to mimic orthologous sequences from the Drosophila group of species, and show that these data sets truly represent the variability observed in genomic data in terms of the difficulty of the alignment task. This allows us to make significant progress towards estimating the alignment accuracy of current tools in an absolute sense, going beyond only a relative assessment of different tools. We evaluate six widely used multiple alignment tools in the context of Drosophila non-coding sequences, and find the accuracy to be significantly different from previously reported values. Interestingly, the performance of most tools degrades more rapidly when there are more insertions than deletions in the data set, suggesting an asymmetric handling of insertions and deletions, even though none of the evaluated tools explicitly distinguishes these two types of events. We also examine the accuracy of two existing tools for annotating insertions versus deletions, and find their performance to be close to optimal in Drosophila non-coding sequences if provided with the true alignments. Conclusion: We have developed a method to generate benchmarks for multiple alignments of Drosophila non-coding sequences, and shown it to be more realistic than traditional benchmarks. Apart from helping to select the most effective tools, these benchmarks will help practitioners of comparative genomics deal with the effects of alignment errors, by providing accurate estimates of the extent of these errors.

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping, along with shared synteny from closely related fish species, to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that spans ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades, with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.

    MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics

    Background: The development, in the last decade, of stochastic heuristics implemented in robust application software has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of phylogeny inference software is often dictated not by the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Results: Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA) together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for setting parameters, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs on 32- and 64-bit systems, and takes advantage of multiprocessor and multicore computers. Conclusions: The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics in a single software package will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these algorithms. MetaPIGA v2.0 gives access both to high customization for the phylogeneticist and to an ergonomic interface and functionalities assisting the non-specialist in sound inference of large phylogenetic trees using nucleotide sequences. MetaPIGA v2.0 and its extensive user manual are freely available to academics at http://www.metapiga.org.

    The Tetraodon nigroviridis reference transcriptome: Developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome

    Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of the maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during the maternal to zygotic transition and annotated 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates, and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleosts and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies.

    The Caenorhabditis elegans Gene mfap-1 Encodes a Nuclear Protein That Affects Alternative Splicing

    RNA splicing is a major regulatory mechanism for controlling eukaryotic gene expression. By generating various splice isoforms from a single pre-mRNA, alternative splicing plays a key role in promoting the evolving complexity of metazoans. Numerous splicing factors have been identified. However, the in vivo functions of many splicing factors remain to be understood. In vivo studies are essential for understanding the molecular mechanisms of RNA splicing and the biology of numerous RNA splicing-related diseases. We previously isolated a Caenorhabditis elegans mutant defective in an essential gene from a genetic screen for suppressors of the rubberband Unc phenotype of unc-93(e1500) animals. This mutant contains missense mutations in two adjacent codons of the C. elegans microfibrillar-associated protein 1 gene mfap-1. mfap-1(n4564 n5214) suppresses the Unc phenotypes of different rubberband Unc mutants in a pattern similar to that of mutations in the splicing factor genes uaf-1 (the C. elegans U2AF large subunit gene) and sfa-1 (the C. elegans SF1/BBP gene). We used the endogenous gene tos-1 as a reporter for splicing and detected increased intron 1 retention and exon 3 skipping of tos-1 transcripts in mfap-1(n4564 n5214) animals. Using a yeast two-hybrid screen, we isolated splicing factors as potential MFAP-1 interactors. Our studies indicate that C. elegans mfap-1 encodes a splicing factor that can affect alternative splicing. National Natural Science Foundation (China) (Grant 30971639); United States. National Institutes of Health (Grant GM24663)

    Ecoregional Analysis of Nearshore Sea-Surface Temperature in the North Pacific

    The quantification and description of sea surface temperature (SST) is critically important because it can influence the distribution, migration, and invasion of marine species; furthermore, SSTs are expected to be affected by climate change. To better understand present temperature regimes, we assembled a 29-year nearshore time series of mean monthly SSTs along the North Pacific coastline using remotely sensed satellite data collected with the Advanced Very High Resolution Radiometer (AVHRR) instrument. We then used the dataset to describe nearshore (<20 km offshore) SST patterns of 16 North Pacific ecoregions delineated by the Marine Ecoregions of the World (MEOW) hierarchical schema. Annual mean temperature varied from 3.8°C along the Kamchatka ecoregion to 24.8°C in the Cortezian ecoregion. There are smaller annual ranges and less variability in SST in the Northeast Pacific relative to the Northwest Pacific. Within the 16 ecoregions, 31–94% of the variance in SST is explained by the annual cycle, with the annual cycle explaining the least variation in the Northern California ecoregion and the most variation in the Yellow Sea ecoregion. Clustering on mean monthly SSTs of each ecoregion showed a clear break between the ecoregions within the Warm and Cold Temperate provinces of the MEOW schema, though several of the ecoregions contained within the provinces did not show a significant difference in mean seasonal temperature patterns. Comparison of these temperature patterns with previous biogeographic classifications and the Large Marine Ecosystems (LMEs) showed both similarities and differences. Finally, we provide a web link to the processed data for use by other researchers.

    Heterogeneous Nuclear Ribonucleoprotein K Interacts with Abi-1 at Postsynaptic Sites and Modulates Dendritic Spine Morphology

    BACKGROUND: Abelson-interacting protein 1 (Abi-1) plays an important role in dendritic branching and synapse formation in the central nervous system. It is localized at the postsynaptic density (PSD) and rapidly translocates to the nucleus upon synaptic stimulation. At PSDs, Abi-1 is in a complex with several other proteins including WASP/WAVE or cortactin, thereby regulating the actin cytoskeleton via the Arp 2/3 complex. PRINCIPAL FINDINGS: We identified heterogeneous nuclear ribonucleoprotein K (hnRNPK), a 65 kDa ssDNA/RNA-binding protein that is involved in multiple intracellular signaling cascades, as a binding partner of Abi-1 at postsynaptic sites. The interaction with the Abi-1 SH3 domain is mediated by the hnRNPK-interaction (KI) domain. We further show that during brain development, hnRNPK expression becomes increasingly restricted to granule cells of the cerebellum and hippocampal neurons, where it localizes in the cell nucleus as well as in the spine/dendritic compartment. The downregulation of hnRNPK in cultured hippocampal neurons by RNAi results in an enlarged dendritic tree and a significant increase in filopodia formation. This is accompanied by a decrease in the number of mature synapses. Both effects therefore mimic the neuronal morphology after downregulation of Abi-1 mRNA in neurons. CONCLUSIONS: Our findings demonstrate a novel interplay between hnRNPK and Abi-1 in the nucleus and at synaptic sites and show obvious similarities between the knockdown phenotypes of the two proteins. This indicates that hnRNPK and Abi-1 act synergistically in a multiprotein complex that regulates the crucial balance between filopodia formation and synaptic maturation in neurons.

    Developmental Programming Mediated by Complementary Roles of Imprinted Grb10 in Mother and Pup

    Developmental programming links growth in early life with health status in adulthood. Although environmental factors such as maternal diet can influence the growth and adult health status of offspring, the genetic influences on this process are poorly understood. Using the mouse as a model, we identify the imprinted gene Grb10 as a mediator of nutrient supply and demand in the postnatal period. The combined actions of Grb10 expressed in the mother, controlling supply, and Grb10 expressed in the offspring, controlling demand, jointly regulate offspring growth. Furthermore, Grb10 determines the proportions of lean and fat tissue during development, thereby influencing energy homeostasis in the adult. Most strikingly, we show that the development of normal lean/fat proportions depends on the combined effects of Grb10 expressed in the mother, which has the greater effect on offspring adiposity, and Grb10 expressed in the offspring, which influences lean mass. These distinct functions of Grb10 in mother and pup act complementarily, which is consistent with a coadaptation model of imprinting evolution, a model predicted but for which there is limited experimental evidence. In addition, our findings identify Grb10 as a key genetic component of developmental programming, and highlight the need for a better understanding of mother-offspring interactions at the genetic level in predicting adult disease risk

    Cholesterol Pathways Affected by Small Molecules That Decrease Sterol Levels in Niemann-Pick Type C Mutant Cells

    Niemann-Pick type C (NPC) disease is a genetically inherited multi-lipid storage disorder with impaired efflux of cholesterol from lysosomal storage organelles. The effect of screen-selected cholesterol-lowering compounds on the major sterol pathways was studied in CT60 mutant CHO cells lacking NPC1 protein. Each of the selected chemicals decreases cholesterol in the lysosomal storage organelles of NPC1 mutant cells through one or more of the following mechanisms: increased cholesterol efflux from the cell, decreased uptake of low-density lipoproteins, and/or increased levels of cholesteryl esters. Several chemicals promote efflux of cholesterol to extracellular acceptors in both non-NPC and NPC1 mutant cells. The uptake of low-density lipoprotein-derived cholesterol is inhibited by some of the studied compounds. Results herein provide the information for prioritized further studies in identifying molecular targets of the chemicals. This approach proved successful in the identification of seven chemicals as novel inhibitors of lysosomal acid lipase (Rosenbaum et al, Biochim. Biophys. Acta. 2009, 1791:1155-1165).