103 research outputs found

    Linked read technology for assembling large complex and polyploid genomes

    Background: Short read DNA sequencing technologies have revolutionized genome assembly by providing high accuracy and throughput data at low cost. But it remains challenging to assemble short read data, particularly for large, complex and polyploid genomes. The linked read strategy has the potential to enhance the value of short reads for genome assembly because all reads originating from a single long molecule of DNA share a common barcode. However, the majority of studies to date that have employed linked reads were focused on human haplotype phasing and genome assembly. Results: Here we describe a de novo maize B73 genome assembly generated via linked read technology which contains ~172,000 scaffolds with an N50 of 89 kb that cover 50% of the genome. Based on comparisons to the B73 reference genome, 91% of linked read contigs are accurately assembled. Because it was possible to identify errors with >76% accuracy using machine learning, it may be possible to identify and potentially correct systematic errors. Complex polyploids represent one of the last grand challenges in genome assembly. Linked read technology was able to successfully resolve the two subgenomes of the recent allopolyploid, proso millet (Panicum miliaceum). Our assembly covers ~83% of the 1 Gb genome and consists of 30,819 scaffolds with an N50 of 912 kb. Conclusions: Our analysis provides a framework for future de novo genome assemblies using linked reads, and we suggest computational strategies that if implemented have the potential to further improve linked read assemblies, particularly for repetitive genomes.
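The N50 values quoted in this abstract (89 kb and 912 kb) follow a standard assembly statistic: the length L such that scaffolds of length L or longer contain at least half of the total assembled bases. The sketch below shows one common way to compute it. It is not the authors' pipeline, and the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths.

    N50 is the largest length L such that sequences of length >= L
    together account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, not data from the paper):
# the total is 20, and 6 + 5 = 11 is the first running sum that reaches half.
print(n50([2, 3, 4, 5, 6]))
```

Some reports use half of the estimated genome size rather than half of the assembly size as the threshold (often called NG50); that variant would only change the value compared against `running`.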

    A Novel Method of Characterizing Genetic Sequences: Genome Space with Biological Distance and Applications

    Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results, and (2) the computation time required for multiple alignments makes it impossible to analyse the phylogeny of a whole genome. This motivates us to create a new approach to characterize genetic sequences. To each DNA sequence, we associate a natural vector based on the distributions of nucleotides. This produces a one-to-one correspondence between the DNA sequence and its natural vector. We define the distance between two DNA sequences to be the distance between their associated natural vectors. This creates a genome space with a biological distance which makes global comparison of genomes with the same topology possible. We use our proposed method to analyze the genomes of the new influenza A (H1N1) virus, human rhinoviruses (HRV) and mammalian mitochondria. The result shows that a triple-reassortant swine virus circulating in North America and the Eurasian swine virus belong to the lineage of the influenza A (H1N1) virus. For the HRV and mammalian mitochondrial genomes, the results coincide with biologists' analyses. Our approach provides a powerful new tool for analyzing and annotating genomes and their phylogenetic relationships. Whole or partial genomes can be handled more easily and more quickly than using multiple alignment methods. Once a genome space has been constructed, it can be stored in a database. There is no need to reconstruct the genome space for subsequent applications, whereas in multiple alignment methods, realignment is needed to add new sequences. Furthermore, one can make a global comparison of all genomes simultaneously, which no other existing method can achieve.
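The abstract does not spell out how the natural vector is built, so the following is only a sketch under stated assumptions. It assumes a 12-dimensional form: for each of A, C, G and T, the count, the mean position (1-based), and a second central moment of the positions scaled by the count and the sequence length. It also assumes Euclidean distance between vectors. The function names `natural_vector` and `natural_distance` are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"


def natural_vector(seq):
    """Return an assumed 12-dimensional natural vector for a DNA string.

    Layout: [n_A, n_C, n_G, n_T, mu_A, ..., mu_T, D2_A, ..., D2_T], where
    n_k is the count of base k, mu_k the mean 1-based position of k, and
    D2_k = sum((pos - mu_k)^2) / (n_k * n) with n the sequence length.
    Bases that do not occur contribute zeros (an assumption of this sketch).
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in NUCLEOTIDES:
        positions = [i for i, ch in enumerate(seq, start=1) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(positions) / n_k
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def natural_distance(seq_a, seq_b):
    """Euclidean distance between the natural vectors of two sequences."""
    return math.dist(natural_vector(seq_a), natural_vector(seq_b))


# Toy sequences (not data from the paper); no alignment step is needed.
print(natural_distance("ACGTACGT", "ACGTTGCA"))
```

Because each sequence maps to a fixed-length vector, adding a new genome only requires computing one more vector, which matches the abstract's point that no realignment is needed.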

    A review of abnormalities in the perception of visual illusions in schizophrenia

    Specific abnormalities of vision in schizophrenia have been observed to affect high-level and some low-level integration mechanisms, suggesting that people with schizophrenia may experience anomalies across different stages in the visual system affecting either early or late processing or both. Here, we review the research into visual illusion perception in schizophrenia and the issues which previous research has faced. One general finding that emerged from the literature is that those with schizophrenia are mostly immune to the effects of high-level illusory displays, but this effect is not consistent across all low-level illusions. The present review suggests that this resistance is due to the weakening of top–down perceptual mechanisms and may be relevant to the understanding of symptoms of visual distortion rather than hallucinations as previously thought.

    Sequencing, de novo annotation and analysis of the first Anguilla anguilla transcriptome: EeelBase opens new perspectives for the study of the critically endangered European eel

    Background: Once highly abundant, the European eel (Anguilla anguilla L.; Anguillidae; Teleostei) is considered to be critically endangered and on the verge of extinction, as the stock has declined by 90-99% since the 1980s. Yet, the species is poorly characterized at the molecular level with little sequence information available in public databases. Results: The first European eel transcriptome was obtained by 454 FLX Titanium sequencing of a normalized cDNA library, produced from a pool of 18 glass eels (juveniles) from the French Atlantic coast and two sites on the Mediterranean coast. Over 310,000 reads were assembled into a total of 19,631 transcribed contigs, with an average length of 531 nucleotides. Overall 36% of the contigs were annotated to known protein/nucleotide sequences and 35 putative miRNAs were identified. Conclusions: This study represents the first transcriptome analysis for a critically endangered species. EeelBase, a dedicated database of annotated transcriptome sequences of the European eel, is freely available at http://compgen.bio.unipd.it/eeelbase. Considering the multiple factors potentially involved in the decline of the European eel, including anthropogenic factors such as pollution and human-introduced diseases, our results will provide a rich source of data to discover and identify new genes, characterize gene expression, as well as for identification of genetic markers scattered across the genome to be used in various applications.

    Transcriptome Sequencing and De Novo Analysis for Yesso Scallop (Patinopecten yessoensis) Using 454 GS FLX

    BACKGROUND: Bivalves comprise 30,000 extant species, constituting the second largest group of mollusks. However, limited genetic research has focused on this group of animals so far, which is, in part, due to the lack of genomic resources. The advent of high-throughput sequencing technologies enables generation of genomic resources in a short time and at a minimal cost, and therefore provides a turning point for bivalve research. In the present study, we performed de novo transcriptome sequencing to first produce a comprehensive expressed sequence tag (EST) dataset for the Yesso scallop (Patinopecten yessoensis). RESULTS: In a single 454 sequencing run, 805,330 reads were produced and then assembled into 32,590 contigs, with about six-fold sequencing coverage. A total of 25,237 unique protein-coding genes were identified from a variety of developmental stages and adult tissues based on sequence similarities with known proteins. As determined by GO annotation and KEGG pathway mapping, functional annotation of the unigenes recovered diverse biological functions and processes. Transcripts putatively involved in growth, reproduction and stress/immune-response were identified. More than 49,000 single nucleotide polymorphisms (SNPs) and 2,700 simple sequence repeats (SSRs) were also detected. CONCLUSION: Our data provide the most comprehensive transcriptomic resource currently available for P. yessoensis. Candidate genes potentially involved in growth, reproduction, and stress/immunity-response were identified, and are worthy of further investigation. A large number of SNPs and SSRs were also identified and ready for marker development. This resource should lay an important foundation for future genetic or genomic studies on this species.

    Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content

    Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which are thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, has not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences between the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.

    Mu Transposon Insertion Sites and Meiotic Recombination Events Co-Localize with Epigenetic Marks for Open Chromatin across the Maize Genome

    The Mu transposon system of maize is highly active, with each of the ∼50–100 copies transposing on average once each generation. The approximately one dozen distinct Mu transposons contain highly similar ∼215 bp terminal inverted repeats (TIRs) and generate 9-bp target site duplications (TSDs) upon insertion. Using a novel genome walking strategy that uses these conserved TIRs as primer binding sites, Mu insertion sites were amplified from Mu stocks and sequenced via 454 technology. 94% of ∼965,000 reads carried Mu TIRs, demonstrating the specificity of this strategy. Among these TIRs, 21 novel Mu TIRs were discovered, revealing additional complexity of the Mu transposon system. The distribution of >40,000 non-redundant Mu insertion sites was strikingly non-uniform, such that rates increased in proportion to distance from the centromere. An identified putative Mu transposase binding consensus site does not explain this non-uniformity. An integrated genetic map containing more than 10,000 genetic markers was constructed and aligned to the sequence of the maize reference genome. Recombination rates (cM/Mb) are also strikingly non-uniform, with rates increasing in proportion to distance from the centromere. Mu insertion site frequencies are strongly correlated with recombination rates. Gene density does not fully explain the chromosomal distribution of Mu insertion and recombination sites, because pronounced preferences for the distal portion of the chromosome are still observed even after accounting for gene density. The similarity of the distributions of Mu insertions and meiotic recombination sites suggests that common features, such as chromatin structure, are involved in site selection for both Mu insertion and meiotic recombination. The finding that Mu insertions and meiotic recombination sites both concentrate in genomic regions marked with epigenetic marks of open chromatin provides support for the hypothesis that open chromatin enhances rates of both Mu insertion and meiotic recombination.

    Diabetes, periodontitis, and the subgingival microbiota

    Both type 1 and type 2 diabetes have been associated with increased severity of periodontal disease for many years. More recently, the impact of periodontal disease on glycaemic control has been investigated. The role of the oral microbiota in this two-way relationship is at this stage unknown. Further studies, of a longitudinal nature and investigating a wider array of bacterial species, are required in order to conclusively determine if there is a difference in the oral microbiota of diabetics and non-diabetics and whether this difference accounts, on the one hand, for the increased severity of periodontal disease and on the other for the poorer glycaemic control seen in diabetics.