
    Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

    Background: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. Results: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. Conclusions: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.

    Identification and Classification of Conserved RNA Secondary Structures in the Human Genome

    The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3′UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization

    A user's guide to the Encyclopedia of DNA elements (ENCODE)

    The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome

    Aberrant B cell repertoire selection associated with HIV neutralizing antibody breadth

    A goal of HIV vaccine development is to elicit antibodies with neutralizing breadth. Broadly neutralizing antibodies (bNAbs) to HIV often have unusual sequences with long heavy-chain complementarity-determining region loops, high somatic mutation rates and polyreactivity. A subset of HIV-infected individuals develops such antibodies, but it is unclear whether this reflects systematic differences in their antibody repertoires or is a consequence of rare stochastic events involving individual clones. We sequenced antibody heavy-chain repertoires in a large cohort of HIV-infected individuals with bNAb responses or no neutralization breadth and uninfected controls, identifying consistent features of bNAb repertoires, encompassing thousands of B cell clones per individual, with correlated T cell phenotypes. These repertoire features were not observed during chronic cytomegalovirus infection in an independent cohort. Our data indicate that the development of numerous B cell lineages with antibody features associated with autoreactivity may be a key aspect in the development of HIV neutralizing antibody breadth

    HIV-1 envelope gp41 antibodies can originate from terminal ileum B cells that share cross-reactivity with commensal bacteria.

    Monoclonal antibodies derived from blood plasma cells of acute HIV-1-infected individuals are predominantly targeted to the HIV Env gp41 and cross-reactive with commensal bacteria. To understand this phenomenon, we examined anti-HIV responses in ileum B cells using recombinant antibody technology and probed their relationship to commensal bacteria. The dominant ileum B cell response was to Env gp41. Remarkably, a majority (82%) of the ileum anti-gp41 antibodies cross-reacted with commensal bacteria, and of those, 43% showed non-HIV-1 antigen polyreactivity. Pyrosequencing revealed shared HIV-1 antibody clonal lineages between ileum and blood. Mutated immunoglobulin G antibodies cross-reactive with both Env gp41 and microbiota could also be isolated from the ileum of HIV-1 uninfected individuals. Thus, the gp41 commensal bacterial antigen cross-reactive antibodies originate in the intestine, and the gp41 Env response in HIV-1 infection can be derived from a preinfection memory B cell pool triggered by commensal bacteria that cross-react with Env

    Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus.

    Current human immunodeficiency virus-1 (HIV-1) vaccines elicit strain-specific neutralizing antibodies. However, cross-reactive neutralizing antibodies arise in approximately 20% of HIV-1-infected individuals, and details of their generation could provide a blueprint for effective vaccination. Here we report the isolation, evolution and structure of a broadly neutralizing antibody from an African donor followed from the time of infection. The mature antibody, CH103, neutralized approximately 55% of HIV-1 isolates, and its co-crystal structure with the HIV-1 envelope protein gp120 revealed a new loop-based mechanism of CD4-binding-site recognition. Virus and antibody gene sequencing revealed concomitant virus evolution and antibody maturation. Notably, the unmutated common ancestor of the CH103 lineage avidly bound the transmitted/founder HIV-1 envelope glycoprotein, and evolution of antibody neutralization breadth was preceded by extensive viral diversification in and near the CH103 epitope. These data determine the viral and antibody evolution leading to induction of a lineage of HIV-1 broadly neutralizing antibodies, and provide insights into strategies to elicit similar antibodies by vaccination