
    Genomics: From Microbes to Man

    Sir John Crawford Memorial Lecture delivered by J. Craig Venter, President and Chief Scientific Officer of Celera Genomics Corporation and founder and Chairman of the Board of The Institute for Genomic Research (TIGR), during CGIAR International Centers Week 2000. Venter describes the development of new technologies and methods for rapidly characterizing and sequencing genomes. He refers to collaboration between TIGR and ILRI on the sequencing of the genome of Theileria parva, the tick-borne parasite that causes East Coast Fever, as a step toward development of a vaccine. More generally, he discusses the significance of comparing genes across species, and the potential applications of this knowledge in developing vaccines and predicting individual susceptibility to specific diseases.

    An Efficient Algorithm For Chinese Postman Walk on Bi-directed de Bruijn Graphs

    Sequence assembly from short reads is an important problem in biology. It is known that solving the sequence assembly problem exactly on a bi-directed de Bruijn graph or a string graph is intractable. However, finding a Shortest Double stranded DNA string (SDDNA) containing all the k-long words in the reads seems to be a good heuristic to get close to the original genome. This problem is equivalent to finding a cyclic Chinese Postman (CP) walk on the underlying unweighted bi-directed de Bruijn graph built from the reads. The Chinese Postman walk Problem (CPP) is solved by reducing it to a general bi-directed flow on this graph, which runs in $O(|E|^2 \log^2(|V|))$ time. In this paper we show that the cyclic CPP on bi-directed graphs can be solved without reducing it to bi-directed flow. We present a $\Theta(p(|V| + |E|) \log(|V|) + (d_{max} p)^3)$ time algorithm to solve the cyclic CPP on a weighted bi-directed de Bruijn graph, where $p = \max\{|\{v \mid d_{in}(v) - d_{out}(v) > 0\}|, |\{v \mid d_{in}(v) - d_{out}(v) < 0\}|\}$ and $d_{max} = \max_v |d_{in}(v) - d_{out}(v)|$. Our algorithm performs asymptotically better than the bi-directed flow algorithm when the number of imbalanced nodes p is much smaller than the number of nodes in the bi-directed graph. In our experiments on various datasets, the value of p/|V| lies between 0.08% and 0.13% with 95% probability.
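
    The quantities p and d_max in the abstract above depend only on per-node degree imbalance. The following is a minimal sketch of how they could be computed, assuming an ordinary directed multigraph given as a list of (u, v) edges; it does not model bi-directed edge orientations, and the function name `imbalance_stats` is hypothetical, not taken from the paper.

```python
from collections import Counter


def imbalance_stats(edges):
    """Return (p, d_max) for a directed multigraph given as (u, v) pairs.

    Simplifying assumption: plain directed edges, not the bi-directed
    edges of the paper's de Bruijn graph.
    """
    diff = Counter()  # diff[v] = d_in(v) - d_out(v)
    for u, v in edges:
        diff[u] -= 1  # edge leaves u
        diff[v] += 1  # edge enters v

    surplus = sum(1 for d in diff.values() if d > 0)  # nodes with d_in > d_out
    deficit = sum(1 for d in diff.values() if d < 0)  # nodes with d_in < d_out
    p = max(surplus, deficit)
    d_max = max((abs(d) for d in diff.values()), default=0)
    return p, d_max


if __name__ == "__main__":
    print(imbalance_stats([("a", "b"), ("a", "c"), ("b", "c")]))
```

    When every node is balanced (p = 0), the graph already admits an Eulerian tour on each connected component, which is why the cost of the approach scales with the number of imbalanced nodes rather than with the whole graph.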

    Evaluating African horse sickness virus in horses and field-caught Culicoides biting midges on the East Rand, Gauteng Province, South Africa

    A prospective study was undertaken during 2013 and 2014 to determine the prevalence of African horse sickness virus (AHSV) in Culicoides midges and the incidence of infection caused by the virus in 28 vaccinated resident horses on two equine establishments on the East Rand, Gauteng Province, South Africa. Field-caught Culicoides midges together with whole blood samples from participating horses were collected every two weeks at each establishment. Culicoides midges and blood samples were tested for the presence of AHSV RNA by real-time quantitative reverse transcription polymerase chain reaction. Nine immunised horses became infected with AHSV during the study period, although infections were subclinical. African horse sickness virus was also identified from a field-collected midge pool. The observations recapitulate previously published data from another setting, and further investigation is warranted to determine what role subclinical infection plays in the disease's epidemiology.

    Evolution of allostery in the cyclic nucleotide binding module

    Analysis of cyclic nucleotide binding (CNB) domains shows that they have evolved to sense a wide variety of second messenger signals; a mechanism for allosteric regulation by CNB domains is proposed.

    Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, and Interpreting Novel, Deep Branches in Marker Gene Phylogenetic Trees

    BACKGROUND: Most of our knowledge about the ancient evolutionary history of organisms has been derived from data associated with specific known organisms (i.e., organisms that we can study directly such as plants, metazoans, and culturable microbes). Recently, however, a new source of data for such studies has arrived: DNA sequence data generated directly from environmental samples. Such metagenomic data has enormous potential in a variety of areas including, as we argue here, in studies of very early events in the evolution of gene families and of species. METHODOLOGY/PRINCIPAL FINDINGS: We designed and implemented new methods for analyzing metagenomic data and used them to search the Global Ocean Sampling (GOS) expedition data set for novel lineages in three gene families commonly used in phylogenetic studies of known and unknown organisms: small subunit rRNA and the recA and rpoB superfamilies. Though the methods available could not accurately identify very deeply branched ss-rRNAs (largely due to difficulties in making robust sequence alignments for novel rRNA fragments), our analysis revealed the existence of multiple novel branches in the recA and rpoB gene families. Analysis of available sequence data likely from the same genomes as these novel recA and rpoB homologs was then used to further characterize the possible organismal source of the novel sequences. CONCLUSIONS/SIGNIFICANCE: Of the novel recA and rpoB homologs identified in the metagenomic data, some likely come from uncharacterized viruses while others may represent ancient paralogs not yet seen in any cultured organism. A third possibility is that some come from novel cellular lineages that are only distantly related to any organisms for which sequence data is currently available. If there exist any major, but so-far-undiscovered, deeply branching lineages in the tree of life, we suggest that methods such as those described herein currently offer the best way to search for them.

    Designer diatom episomes delivered by bacterial conjugation.

    Eukaryotic microalgae hold great promise for the bioproduction of fuels and higher value chemicals. However, compared with model genetic organisms such as Escherichia coli and Saccharomyces cerevisiae, characterization of the complex biology and biochemistry of algae and strain improvement have been hampered by inefficient genetic tools. To date, many algal species are transformable only via particle bombardment, and the introduced DNA is integrated randomly into the nuclear genome. Here we describe the first nuclear episomal vector for diatoms and a plasmid delivery method via conjugation from Escherichia coli to the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana. We identify a yeast-derived sequence that enables stable episome replication in these diatoms even in the absence of antibiotic selection and show that episomes are maintained as closed circles at a copy number equivalent to native chromosomes. This highly efficient genetic system facilitates high-throughput functional characterization of algal genes and accelerates molecular phytoplankton research.

    Genetic Variation in an Individual Human Exome

    There is much interest in characterizing the variation in a human individual, because this may elucidate what contributes significantly to a person's phenotype, thereby enabling personalized genomics. We focus here on the variants in a person's ‘exome,’ which is the set of exons in a genome, because the exome is believed to harbor much of the functional variation. We provide an analysis of the ∼12,500 variants that affect the protein coding portion of an individual's genome. We identified ∼10,400 nonsynonymous single nucleotide polymorphisms (nsSNPs) in this individual, of which ∼15–20% are rare in the human population. We predict ∼1,500 nsSNPs affect protein function and these tend to be heterozygous, rare, or novel. Of the ∼700 coding indels, approximately half tend to have lengths that are a multiple of three, which causes insertions/deletions of amino acids in the corresponding protein, rather than introducing frameshifts. Coding indels also occur frequently at the termini of genes, so even if an indel causes a frameshift, an alternative start or stop site in the gene can still be used to make a functional protein. In summary, we reduced the set of ∼12,500 nonsilent coding variants by ∼8-fold to a set of variants that are most likely to have major effects on their proteins' functions. This is our first glimpse of an individual's exome and a snapshot of the current state of personalized genomics. The majority of coding variants in this individual are common and appear to be functionally neutral. Our results also indicate that some variants can be used to improve the current NCBI human reference genome. As more genomes are sequenced, many rare variants and non-SNP variants will be discovered. We present an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
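
    The length rule for coding indels mentioned in the abstract above (a length that is a multiple of three inserts or deletes whole amino acids instead of shifting the reading frame) can be illustrated with a minimal sketch. It assumes variants are given as reference and alternate allele strings, as in VCF-style records; the function name `indel_effect` is hypothetical and is not the authors' pipeline.

```python
def indel_effect(ref, alt):
    """Classify a coding variant by its net length change.

    Assumption: ref and alt are allele strings for a variant lying
    entirely within one coding exon.
    """
    delta = abs(len(alt) - len(ref))  # net bases inserted or deleted
    if delta == 0:
        return "not an indel"
    # A multiple of three adds or removes whole codons; anything else
    # shifts the reading frame downstream of the variant.
    return "in-frame" if delta % 3 == 0 else "frameshift"


if __name__ == "__main__":
    for ref, alt in [("A", "ATTT"), ("ACG", "A"), ("A", "G")]:
        print(ref, alt, indel_effect(ref, alt))
```

    This check looks only at length. As the abstract notes, a frameshifting indel near a gene terminus may still leave a functional protein, so the label is a first-pass filter rather than a prediction of effect.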